Last data update: 2014.03.03

mice.impute.2l.pls2

R Documentation

Imputation using Partial Least Squares for Dimension Reduction

Description

This function imputes a variable with missing values using partial least squares (PLS) regression (Mevik & Wehrens, 2007) to reduce the dimension of the predictor space.
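
As a rough illustration of the idea, the sketch below extracts a few PLS factors from the observed cases and then regresses `y` on them. It is a small base-R PLS1 routine (NIPALS-style deflation) written for this example only, not the package's implementation, which relies on the pls package. The helper name `pls_scores` and the simulated data are assumptions.

```r
set.seed(1)
n <- 200; p <- 30
x  <- matrix(rnorm(n * p), n, p)                 # many complete predictors
y  <- as.vector(x[, 1:5] %*% rep(0.5, 5) + rnorm(n))
ry <- runif(n) > 0.3                             # TRUE = observed
y[!ry] <- NA

# Hypothetical helper: PLS1 scores for all rows, with weights and
# loadings estimated from the observed rows only.
pls_scores <- function(x, y, ry, ncomp) {
  ctr   <- colMeans(x[ry, , drop = FALSE])
  x_all <- sweep(x, 2, ctr)                      # center with observed-case means
  x_obs <- x_all[ry, , drop = FALSE]
  y_obs <- y[ry] - mean(y[ry])
  scores <- matrix(0, nrow(x), ncomp)
  for (a in seq_len(ncomp)) {
    w <- crossprod(x_obs, y_obs)                 # direction of maximal covariance with y
    w <- w / sqrt(sum(w^2))
    s_obs  <- x_obs %*% w                        # scores of observed rows
    p_load <- crossprod(x_obs, s_obs) / sum(s_obs^2)
    s_all  <- x_all %*% w                        # scores of all rows
    scores[, a] <- s_all
    x_obs <- x_obs - s_obs %*% t(p_load)         # deflate before the next factor
    x_all <- x_all - s_all %*% t(p_load)
  }
  scores
}

facs <- pls_scores(x, y, ry, ncomp = 3)          # 30 predictors reduced to 3 factors
fit  <- lm(y[ry] ~ facs[ry, ])                   # regression on the factors
pred <- as.vector(cbind(1, facs[!ry, , drop = FALSE]) %*% coef(fit))
```

Here `pred` holds only predicted means for the missing cases. An actual imputation step would add random variation, for example through the `norm` or `pmm` methods selected with `pls.impMethod`.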

Usage

mice.impute.2l.pls2(y, ry, x, type, pls.facs = NULL, 
   pls.impMethod = "pmm", pls.print.progress = TRUE, 
   imputationWeights = rep(1, length(y)), pcamaxcols = 1E+09, 
   tricube.pmm.scale = NULL, min.int.cor = 0, min.all.cor=0, 
   N.largest = 0, pls.title = NULL, print.dims = TRUE,
   pls.maxcols=5000 , ...)

mice.impute.2l.pls(y, ry, x, type, pls.facs = NULL, 
   pls.impMethod = "tricube.pmm2", pls.method = NULL, 
   pls.print.progress = TRUE, imputationWeights = rep(1, length(y)), 
   pcamaxcols = 1E+09, tricube.pmm.scale = NULL, min.int.cor = 0, min.all.cor=0, 
   N.largest = 0, pls.title = NULL, print.dims = TRUE, ...)

Arguments

`y`	Incomplete data vector of length `n`
`ry`	Vector of missing data pattern (`FALSE` – missing, `TRUE` – observed)
`x`	Matrix (`n` x `p`) of complete covariates.
`type`	Role of each variable, coded as follows:
	`type=1` – variable is used as a predictor
	`type=4` – create interactions of the specified variable with all other predictors
	`type=5` – create a quadratic term of the specified variable
	`type=6` – if some interactions are specified, ignore the variables with entry `6` when creating interactions
	`type=-2` – specification of a cluster variable. The cluster mean of the outcome `y` (computed without the subject under study) is included as a further predictor in the imputation.
`pls.facs`	Number of factors used in PLS regression. This argument can also be specified as a list defining different numbers of factors for all variables to be imputed.
`pls.impMethod`	Imputation method based on the PLS regression model:
	`norm` – normal linear regression
	`pmm` – predictive mean matching (`pmm` method from mice)
	`pmm5` – predictive mean matching (`pmm5` method from miceadds)
	`tricube.pmm`/`tricube.pmm2` – predictive mean matching with tricube kernel
	`xplsfacs` – create only PLS factors of the regression model
`pls.method`	Calculation method of PLS regression. See `pls::plsr` (pls) for more details.
`pls.print.progress`	Print progress during PLS regression.
`imputationWeights`	Vector of sample weights to be used in imputation models.
`pcamaxcols`	Maximum number of principal components.
`tricube.pmm.scale`	Scale factor for tricube predictive mean matching.
`min.int.cor`	Minimum absolute correlation for an interaction of two predictors to be included in the PLS regression model.
`min.all.cor`	Minimum absolute correlation for inclusion in the PLS regression model.
`N.largest`	Number of variables with the largest absolute correlations to be included.
`pls.title`	Title for progress print in console output.
`print.dims`	An optional logical indicating whether dimensions of inputs should be printed.
`pls.maxcols`	Maximum number of interactions to be created.
`...`	Further arguments to be passed.
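
The `type` codes and the correlation thresholds can be pictured as steps that expand and then screen the predictor matrix before PLS is applied. The base-R sketch below shows one plausible reading of `type=4`, `type=5`, `type=-2` and `min.int.cor`. It is not the package's internal code: the simulated data, the helper `loo_cluster_mean`, and the choice to screen interactions by their correlation with `y` are assumptions.

```r
set.seed(42)
n <- 100
x <- cbind(a = rnorm(n), b = rnorm(n), c = rnorm(n))
y <- x[, "a"] * x[, "b"] + x[, "c"]^2 + rnorm(n)

# Hypothetical type codes for the three predictors
type <- c(a = 4, b = 1, c = 5)

# type = 5: add a quadratic term of the variable
quad <- x[, type == 5, drop = FALSE]^2
colnames(quad) <- paste0(colnames(quad), "^2")

# type = 4: interactions of the variable with all other predictors
inter <- list()
for (v in names(type)[type == 4]) {
  for (o in setdiff(colnames(x), v)) {
    inter[[paste0(v, ":", o)]] <- x[, v] * x[, o]
  }
}
inter <- do.call(cbind, inter)

# min.int.cor-style screening (assumed here: absolute correlation with y)
min.int.cor <- 0.2
r_int <- abs(cor(inter, y))[, 1]
inter <- inter[, r_int >= min.int.cor, drop = FALSE]

design <- cbind(x, quad, inter)   # expanded predictor matrix passed on to PLS

# type = -2: cluster mean of y that leaves out the subject under study.
# Assumption: only observed values of y enter the mean.
loo_cluster_mean <- function(y, ry, cluster) {
  yo <- ifelse(ry, y, 0)                          # missing values contribute nothing
  s  <- ave(yo, cluster, FUN = sum)               # cluster sum of observed y
  k  <- ave(as.numeric(ry), cluster, FUN = sum)   # observed cases per cluster
  (s - yo) / (k - ry)                             # drop own value if it was observed
}

cluster <- rep(1:20, each = 5)
ry <- rep(c(TRUE, TRUE, TRUE, TRUE, FALSE), 20)
cm <- loo_cluster_mean(y, ry, cluster)
```

With many `type=4` entries the number of interaction columns grows quickly, which is the situation `pls.maxcols` and the correlation thresholds are meant to control.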

Details

The function mice.impute.2l.pls2 uses kernelpls.fit2 instead of kernelpls.fit from the pls package, which makes it somewhat faster.

Value

A vector of length nmis=sum(!ry) containing the imputations if pls.impMethod != "xplsfacs". If pls.impMethod == "xplsfacs", a matrix of PLS factors is returned instead.
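
To make the shape of the return value concrete, the following sketch runs a deliberately simplified predictive mean matching step on a stand-in factor matrix and returns one imputed value per missing entry. It omits the parameter draw that proper PMM implementations such as the one in mice include. The helper `simple_pmm`, its `donors` argument, and the simulated `facs` matrix are assumptions made for this example.

```r
set.seed(7)
n <- 50
facs <- matrix(rnorm(n * 2), n, 2)          # stand-in for two PLS factors
y  <- as.vector(1 + facs %*% c(1, -0.5) + rnorm(n, sd = 0.3))
ry <- seq_len(n) > 10                       # first 10 cases are missing
y[!ry] <- NA

# Hypothetical helper: for each missing case, draw an observed value of y
# from the `donors` cases whose predicted means are closest.
simple_pmm <- function(y, ry, x, donors = 5) {
  fit  <- lm.fit(cbind(1, x[ry, , drop = FALSE]), y[ry])
  yhat <- as.vector(cbind(1, x) %*% fit$coefficients)
  obs  <- which(ry)
  vapply(which(!ry), function(i) {
    d    <- abs(yhat[obs] - yhat[i])        # distance in predicted means
    pool <- obs[order(d)[seq_len(donors)]]  # nearest observed cases
    y[pool[sample.int(length(pool), 1)]]    # borrow one donor's observed value
  }, numeric(1))
}

imp <- simple_pmm(y, ry, facs)
length(imp) == sum(!ry)                     # one imputation per missing value
```

Because every imputed value is copied from an observed case, the imputations stay within the range of the observed data, which is the usual motivation for choosing a `pmm` variant over `norm`.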

Author(s)

Alexander Robitzsch

References

Mevik, B. H., & Wehrens, R. (2007). The pls package: Principal component and partial least squares regression in R. Journal of Statistical Software, 18, 1-24.

Examples

## Not run: 
#############################################################################
# EXAMPLE 1: PLS imputation method for internet data
#############################################################################	

library(miceadds)   # provides data.internet and mice.1chain
data(data.internet)
dat <- data.internet

# specify predictor matrix
predictorMatrix <- matrix( 1 , ncol(dat) , ncol(dat) )
rownames(predictorMatrix) <- colnames(predictorMatrix) <- colnames(dat)
diag( predictorMatrix) <- 0

# use PLS imputation method for all variables
impMethod <- rep( "2l.pls2" , ncol(dat) )
names(impMethod) <- colnames(dat)

# define predictors for interactions (entries with type 4 in predictorMatrix)
predictorMatrix[c("IN1","IN15","IN16"),c("IN1","IN3","IN10","IN13")] <- 4
# define predictors which should appear as linear and quadratic terms (type 5)
predictorMatrix[c("IN1","IN8","IN9","IN10","IN11"),c("IN1","IN2","IN7","IN5")] <- 5

# use 9 PLS factors for all variables
pls.facs <- as.list( rep( 9 , length(impMethod) ) )
names(pls.facs) <- names(impMethod)
pls.facs$IN1 <- 15   # use 15 PLS factors for variable IN1

# use norm as the default PLS imputation method, with pmm5 for IN1 and IN6
pls.impMethod <- as.list( rep("norm" , length(impMethod) ) )
names(pls.impMethod) <- names(impMethod)
pls.impMethod[ c("IN1","IN6")] <- "pmm5"   

# Model 1: Three parallel chains
imp1 <- mice::mice(data = dat , method = impMethod ,
     m=3 , maxit=5 , predictorMatrix = predictorMatrix ,
     pls.facs = pls.facs , # number of PLS factors
     pls.impMethod = pls.impMethod ,  # Imputation Method in PLS imputation
     pls.print.progress = TRUE )
summary(imp1)

# Model 2: One long chain
imp2 <- miceadds::mice.1chain(data = dat , method = impMethod ,
     burnin=10 , iter=21 , Nimp=3 , predictorMatrix = predictorMatrix ,
     pls.facs = pls.facs , pls.impMethod = pls.impMethod )
summary(imp2)

## End(Not run)