The function loops over the variables of the data frame and compares the values of each one with those expected for the usual R formats, in order to correct problems such as miscoded missing values or dates stored as character. It also assigns to each variable the most appropriate SPSS format.
The name of the identification variable in the data frame. It is used to list the individuals for whom any problem arose while the function was running.
force
If TRUE, run format_corrector even if the "fixed.formats" attribute is already TRUE.
rate.miss.date
The maximum rate of missing date fields that the function should accept. In any case, the function reports which fields have been lost.
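The idea behind a maximum rate of lost dates can be sketched in base R. This is only an illustration under assumptions, not the package's code: the names raw, parsed, lost and max_rate, the day/month/year format and the threshold value 0.5 are all hypothetical.

```r
# Illustration only: not the implementation used by format_corrector.
raw <- c("19/11/2006", "05/10/2011", "not a date", "22/01/1956")

# Values that do not match the format become NA
parsed <- as.Date(raw, format = "%d/%m/%Y")

# A field is "lost" when it had a value before conversion but is NA afterwards
lost <- is.na(parsed) & !is.na(raw)
rate_lost <- mean(lost)

max_rate <- 0.5                 # hypothetical threshold, plays the role of rate.miss.date
accept <- rate_lost <= max_rate # keep the conversion only if few fields were lost

which(lost)                     # positions of the fields that were lost
```

Reporting the positions of the lost fields, whether or not the conversion is accepted, mirrors the behaviour described for rate.miss.date.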
Details
If a date variable is not in chron format, it must follow the rules below; otherwise the function leaves it as character:
-- the date separator must be one of "-", "/" or ".";
-- the hour separator must be ":".
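A minimal base R sketch of how the three accepted date separators could be handled with a single format string. The helper normalise_date_strings is hypothetical and not part of ImportExport, and the day/month/year order is assumed from the example data.

```r
# Hypothetical helper: map "-" and "." to "/" so one format fits all inputs.
normalise_date_strings <- function(x) {
  gsub("[-.]", "/", as.character(x))
}

raw <- c("19/11/2006", "05-10-2011", "09.02.1906")
parsed <- as.Date(normalise_date_strings(raw), format = "%d/%m/%Y")

# A string with any other separator does not match the format and stays NA
unparsed <- as.Date(normalise_date_strings("19 11 2006"), format = "%d/%m/%Y")
```

A value that cannot be parsed comes back as NA, which is where a check on the rate of lost fields would apply.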
Value
A single data frame which results from the function.
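The interplay between the force argument and the "fixed.formats" attribute can be pictured with the sketch below. It assumes that the attribute is stored on the data frame itself; the wrapper correct_once and the "runs" counter are hypothetical and only stand in for the real correction work.

```r
# Hypothetical wrapper, not ImportExport code.
correct_once <- function(df, force = FALSE) {
  already_fixed <- isTRUE(attr(df, "fixed.formats", exact = TRUE))
  if (already_fixed && !force) {
    return(df)                       # nothing to do: formats were fixed before
  }
  # Placeholder for the actual correction: count how often it ran
  runs <- attr(df, "runs", exact = TRUE)
  attr(df, "runs") <- if (is.null(runs)) 1L else runs + 1L
  attr(df, "fixed.formats") <- TRUE  # mark the data frame as corrected
  df
}

x <- data.frame(c = 101:105)
once   <- correct_once(x)                   # runs the correction
twice  <- correct_once(once)                # skipped, attribute is TRUE
forced <- correct_once(once, force = TRUE)  # runs again because force = TRUE
```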
Note
This function may not be fully optimised, so it may run into problems when correcting very large data frames.
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(ImportExport)
Loading required package: xlsx
Loading required package: rJava
Loading required package: xlsxjars
Loading required package: gdata
gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.
gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.
Attaching package: 'gdata'
The following object is masked from 'package:stats':
nobs
The following object is masked from 'package:utils':
object.size
The following object is masked from 'package:base':
startsWith
Loading required package: Hmisc
Loading required package: lattice
Loading required package: survival
Loading required package: Formula
Loading required package: ggplot2
Attaching package: 'Hmisc'
The following object is masked from 'package:gdata':
combine
The following objects are masked from 'package:base':
format.pval, round.POSIXt, trunc.POSIXt, units
Loading required package: chron
Loading required package: RODBC
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/ImportExport/format_corrector.Rd_%03d_medium.png", width=480, height=480)
> ### Name: format_corrector
> ### Title: Identify and corrects variable formats
> ### Aliases: format_corrector
> ### Keywords: format_corrector
>
> ### ** Examples
>
> require(ImportExport)
> a<-c(1,NA,3,5,".")
> b<-c("19/11/2006","05/10/2011","09/02/1906","22/01/1956","10/10/2010")
> c<-101:105
> x<-data.frame(a,b,c)
> sapply(x,class)
        a         b         c 
 "factor"  "factor" "integer" 
> x_corr<-format_corrector(x)
-----Fixing variable ' a '---------
The following SPSS format has been assigned: F2.0
-----Fixing variable ' b '---------
The following SPSS format has been assigned: DATE11
-----Fixing variable ' c '---------
The following SPSS format has been assigned: F4.0
> sapply(x_corr,class)
$a
[1] "numeric"
$b
[1] "dates" "times"
$c
[1] "numeric"
> dev.off()
null device
1
>
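In the example above, variable a starts as a factor because the "." entry forces the whole vector to character. A base R sketch of an equivalent manual conversion follows; treating "." and the empty string as missing is an assumption made for illustration, not a statement about the codes that format_corrector recognises.

```r
# Manual equivalent for one column; illustration only.
a <- factor(c(1, NA, 3, 5, "."))

a_chr <- as.character(a)             # go through character, not the factor codes
a_chr[a_chr %in% c(".", "")] <- NA   # assumed missing-value codes
a_num <- as.numeric(a_chr)

class(a_num)
```

Converting through as.character first matters: calling as.numeric directly on a factor returns its internal level codes rather than the values.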