Last data update: 2014.03.03
categorize R Documentation
Categorize and Decategorize Variables in a Data Frame
Description
The function categorize defines categories for variables in a data frame,
with category indices starting at a user-defined value (e.g. 0 or 1).
Continuous variables can be categorized by discretizing them into
quantile groups.
The function decategorize performs the reverse operation.
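The quantile-based discretization described above can be illustrated with a minimal base R sketch. The helper quantile_groups is hypothetical and is not part of miceadds; it assumes that groups are formed from empirical quantiles and numbered from lowest, which may differ in detail from what categorize does internally. Variables with a single distinct value are not handled.

```r
# Hypothetical helper, not part of miceadds: bin a numeric vector into
# n_classes quantile groups and number the groups starting at `lowest`.
quantile_groups <- function(x, n_classes, lowest = 0) {
  probs <- seq(0, 1, length.out = n_classes + 1)
  # unique() drops duplicated break points that arise when x has many ties
  breaks <- unique(stats::quantile(x, probs = probs, na.rm = TRUE))
  # labels = FALSE returns integer group codes 1, 2, ...; NA stays NA
  groups <- cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)
  groups + (lowest - 1)
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, NA)
g <- quantile_groups(x, n_classes = 4, lowest = 0)
print(g)
```

Missing values are kept as NA, so the discretized variable can still be passed to an imputation routine.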
Usage
categorize(dat, categorical = NULL, quant = NULL, lowest = 0)
decategorize(dat, categ_design = NULL)
Arguments
dat
Data frame
categorical
Vector of names of the variables that should be converted into categories,
with category indices beginning at the integer lowest
quant
Named vector giving the number of classes for each variable.
Variables are categorized into quantile groups. The names of the
vector must be the variable names.
lowest
Lowest category index. Default is 0.
categ_design
Data frame containing information about the
categorization, as returned by categorize.
Value
For categorize, a list with entries
data
Converted data frame
categ_design
Data frame containing information
about the categorization
For decategorize, a data frame.
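The round trip between the two functions can be pictured with a small base R sketch. It is not the miceadds implementation: the helpers recode_to_index and restore_from_index and the column names orig and categ are hypothetical, since this page does not describe the columns of categ_design. The sketch only assumes that the design table stores, for each original value, the category index assigned to it.

```r
# Hypothetical sketch, not the miceadds implementation: recode the distinct
# values of a variable to consecutive integers starting at `lowest`, and keep
# a lookup table so that the recoding can be reversed later.
recode_to_index <- function(x, lowest = 0) {
  values <- sort(unique(x[!is.na(x)]))
  design <- data.frame(orig = values,
                       categ = seq_along(values) - 1 + lowest)
  list(data = design$categ[match(x, design$orig)], design = design)
}

# Reverse operation: map category indices back to the original values.
restore_from_index <- function(codes, design) {
  design$orig[match(codes, design$categ)]
}

x <- c(10, 30, 20, NA, 30)
res <- recode_to_index(x, lowest = 1)
print(res$data)
x_back <- restore_from_index(res$data, res$design)
print(x_back)
```

Keeping the lookup table separate from the recoded data is what lets values imputed on the category scale be mapped back to the original scale afterwards.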
Author(s)
Alexander Robitzsch
Examples
## Not run:
library(mice)
library(miceadds)
#############################################################################
# EXAMPLE 1: Categorize questionnaire data
#############################################################################
data(data.smallscale , package="miceadds")
dat <- data.smallscale
# (0) select dataset
dat <- dat[ , 9:20 ]
summary(dat)
# select the variables to be categorized (columns 2 to 6)
categorical <- colnames(dat)[2:6]
# (1) categorize data
res <- categorize( dat , categorical=categorical )
# (2) multiple imputation using the mice package
dat2 <- res$data
VV <- ncol(dat2)
impMethod <- rep( "sample" , VV ) # define random sampling imputation method
names(impMethod) <- colnames(dat2)
imp <- mice::mice( as.matrix(dat2) , method = impMethod , maxit=1 , m=1 )
dat3 <- mice::complete(imp,action=1)
# (3) decategorize dataset
dat3a <- decategorize( dat3 , categ_design = res$categ_design )
#############################################################################
# EXAMPLE 2: Categorize ordinal and continuous data
#############################################################################
data(data.ma01,package="miceadds")
dat <- data.ma01
summary(dat[,-c(1:2)] )
# define variables to be categorized
categorical <- c("books" , "paredu" )
# define the number of quantile classes for the continuous variables
quant <- c(6,5,11)
names(quant) <- c("math" , "read" , "hisei")
# categorize data
res <- categorize( dat , categorical = categorical , quant=quant)
str(res)
## End(Not run)