Imputes the observed values of variables from a reference observation to a
target observation. Also imputes a value for a reference from other
references, a practice that is useful for validation (see yai). Variables
not available in the original data may be imputed using the argument ancillaryData.
Usage
## S3 method for class 'yai'
impute(object,ancillaryData=NULL,method="closest",
method.factor=method,k=NULL,vars=NULL,
observed=TRUE,...)
Arguments
object
an object of class yai.
ancillaryData
a data frame of variables that may not have been used in
the original call to yai. There must be one row for
each reference observation, no missing data, and row names must match those used
in the reference observations.
method
the method used to compute the imputed values for continuous variables,
as follows:
closest: use the single neighbor that is closest (this is the default and is
always used when k=1);
mean: the mean of the k neighbors is taken;
median: the median of the k neighbors is taken;
dstWeighted: a weighted mean is taken over the k neighbors, where the
weights are 1/(1+d) and d is the distance to the neighbor.
method.factor
the method used to compute the imputed values for factors, as follows:
closest: use the single neighbor that is closest (this is the default and is
always used when k=1);
mean or median: this is actually the mode, that is, the factor level that occurs
most often among the k neighbors;
dstWeighted: a mode where the count for each level is the sum of the weights
(1/(1+d)) rather than each neighbor having a weight of 1.
k
the number of neighbors to use in averages; when NULL, all present are used.
vars
a character vector of variables to impute. When NULL, the behaviour depends
on the value of ancillaryData: when it is NULL, the Y-variables are imputed;
otherwise, all variables present in ancillaryData are imputed.
observed
when TRUE, columns are created for observed values (those from the
target observations) as well as imputed values (those from the
reference observations).
...
passed to other methods, currently not used.
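The options for method and method.factor described above can be sketched in base R for a single target observation. This is only an illustration of the arithmetic, not the code used by the package: the helper names impute_continuous and impute_factor, the neighbor values, and the distances are all hypothetical, and ties between factor levels are broken here by which.max, which may differ from what impute does.

```r
# Hypothetical data: one variable's values (and one factor's levels) for the
# k = 3 nearest reference observations of a target, with their distances d.
neighbor_values <- c(2, 3, 7)
neighbor_levels <- c("a", "b", "b")
neighbor_dist   <- c(0, 2, 2)

# Hypothetical helper (not part of yaImpute): combine continuous values.
impute_continuous <- function(values, d, method = "closest") {
  switch(method,
    closest     = values[which.min(d)],                    # nearest neighbor only
    mean        = mean(values),                            # plain mean of the k neighbors
    median      = median(values),                          # plain median of the k neighbors
    dstWeighted = weighted.mean(values, w = 1 / (1 + d)),  # weights 1/(1+d)
    stop("unknown method: ", method))
}

# Hypothetical helper (not part of yaImpute): combine factor levels.
impute_factor <- function(levels_seen, d, method = "closest") {
  levels_seen <- as.character(levels_seen)
  if (method == "closest") return(levels_seen[which.min(d)])
  # mean and median both count each neighbor once; dstWeighted sums 1/(1+d)
  w <- if (method == "dstWeighted") 1 / (1 + d) else rep(1, length(d))
  score <- tapply(w, levels_seen, sum)
  names(score)[which.max(score)]
}

cont <- sapply(c("closest", "mean", "median", "dstWeighted"),
               function(m) impute_continuous(neighbor_values, neighbor_dist, m))
fact <- sapply(c("closest", "mean", "dstWeighted"),
               function(m) impute_factor(neighbor_levels, neighbor_dist, m))
print(cont)
print(fact)
```

With these made-up inputs the unweighted mode and the distance-weighted mode pick different levels, because the single closest neighbor carries more weight than the two farther ones combined.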
Value
An object of class c("impute.yai","data.frame"), with rownames
identifying observations and column names identifying variables. When
observed=TRUE, additional columns are created with a suffix of .o.
NAs fill columns of observed values when no
corresponding value is known, as is the case for Y-variables from
target observations.
Scale factors for each variable are
returned as an attribute (see attributes).
Examples
require(yaImpute)
data(iris)
# form some test data
refs <- sample(rownames(iris),50) # 50 randomly chosen reference observations
x <- iris[,1:3] # Sepal.Length Sepal.Width Petal.Length
y <- iris[refs,4:5] # Petal.Width Species
# build a yai object using mahalanobis
mal <- yai(x=x,y=y,method="mahalanobis")
# output a data frame of observed and imputed values
# of all variables and observations.
impute(mal)
# impute every variable in iris, supplied as ancillary data,
# and plot the result
malImp <- impute(mal,ancillaryData=iris)
plot(malImp)
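The example above relies on the default method, closest. As a further sketch, the arguments k, method, and vars might be combined as follows. It assumes the yai object is built with k=5 so that several neighbors are available to impute; the seed, the object names mal5 and imp, and the choice of Petal.Width are arbitrary.

```r
library(yaImpute)
data(iris)

set.seed(1)                          # arbitrary seed so the sample is repeatable
refs <- sample(rownames(iris), 50)   # reference observations
x <- iris[, 1:3]                     # X-variables, known for all observations
y <- iris[refs, 4:5]                 # Y-variables, known only for references

# keep 5 neighbors per observation so impute() has several to combine
mal5 <- yai(x = x, y = y, method = "mahalanobis", k = 5)

# distance-weighted mean of Petal.Width over the 3 nearest references
imp <- impute(mal5, method = "dstWeighted", k = 3, vars = "Petal.Width")
head(imp)
```

Because observed is left at TRUE, the result should hold both the imputed column Petal.Width and the observed column Petal.Width.o.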