Kernel Principal Components Analysis is a nonlinear form of principal
component analysis.
Usage
## S4 method for signature 'formula'
kpca(x, data = NULL, na.action, ...)
## S4 method for signature 'matrix'
kpca(x, kernel = "rbfdot", kpar = list(sigma = 0.1),
features = 0, th = 1e-4, na.action = na.omit, ...)
## S4 method for signature 'kernelMatrix'
kpca(x, features = 0, th = 1e-4, ...)
## S4 method for signature 'list'
kpca(x, kernel = "stringdot", kpar = list(length = 4, lambda = 0.5),
features = 0, th = 1e-4, na.action = na.omit, ...)
Arguments
x
the data matrix indexed by row, a formula describing the
model, a kernel matrix of class kernelMatrix, or a list of character vectors
data
an optional data frame containing the variables in
the model (when using a formula).
kernel
the kernel function used in training and predicting.
This parameter can be set to any function of class kernel that computes a dot product between two
vector arguments. kernlab provides the most popular kernel functions,
which can be selected by setting the kernel parameter to one of the following
strings:
rbfdot      Radial Basis ("Gaussian") kernel function
polydot     Polynomial kernel function
vanilladot  Linear kernel function
tanhdot     Hyperbolic tangent kernel function
laplacedot  Laplacian kernel function
besseldot   Bessel kernel function
anovadot    ANOVA RBF kernel function
splinedot   Spline kernel function
The kernel parameter can also be set to a user-defined function of
class kernel by passing the function name as an argument.
kpar
the list of hyper-parameters (kernel parameters).
This is a list which contains the parameters to be used with the
kernel function. Valid parameters for the existing kernels are:
sigma inverse kernel width for the Radial Basis
kernel function "rbfdot" and the Laplacian kernel "laplacedot".
degree, scale, offset for the Polynomial kernel "polydot".
scale, offset for the Hyperbolic tangent kernel
function "tanhdot".
sigma, order, degree for the Bessel kernel "besseldot".
sigma, degree for the ANOVA kernel "anovadot".
Hyper-parameters for user-defined kernels can be passed through the
kpar parameter as well.
features
the number of features (principal components) to
return (default: 0, which returns all components whose eigenvalue exceeds th).
th
the eigenvalue threshold below which principal
components are ignored (only used when features = 0; default: 0.0001).
na.action
A function to specify the action to be taken if NAs are
found. The default action is na.omit, which leads to rejection of cases
with missing values on any required variable. An alternative
is na.fail, which causes an error if NA cases
are found. (NOTE: If given, this argument must be named.)
...
additional parameters
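As an illustration of the kernel and kpar arguments described above, the following sketch fits a model three ways: with a kernel name plus kpar, with a kernel object created by rbfdot(), and with a user-defined function tagged with class "kernel". The iris data, the sigma value, and the toy kernel polyk are arbitrary choices made for this illustration, not recommendations.

```r
library(kernlab)

# Numeric part of iris as a plain matrix (illustrative data only)
x <- as.matrix(iris[, -5])

# 1. Kernel given by name, hyper-parameters supplied through kpar
kp_name <- kpca(x, kernel = "rbfdot", kpar = list(sigma = 0.2), features = 2)

# 2. Kernel given as a kernel object; its hyper-parameters are already set
kp_obj <- kpca(x, kernel = rbfdot(sigma = 0.2), features = 2)

# 3. User-defined kernel (hypothetical example): an inhomogeneous
#    quadratic kernel, marked as class "kernel" so kpca accepts it
polyk <- function(x, y) (sum(x * y) + 1)^2
class(polyk) <- "kernel"
kp_user <- kpca(x, kernel = polyk, features = 2)

# The first two fits use the same kernel, so their eigenvalues should agree
all.equal(unname(eig(kp_name)), unname(eig(kp_obj)))
```

Passing a kernel object keeps the hyper-parameters attached to the kernel itself, which can be convenient when the same kernel is reused across several calls.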
Details
Using kernel functions one can efficiently compute
principal components in high-dimensional
feature spaces, related to input space by some non-linear map.
The data can be passed to the kpca function as a matrix or a
data.frame. In addition, kpca supports input in the form of a
kernel matrix of class kernelMatrix, or a list of character
vectors, in which case a string kernel has to be used.
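The kernel matrix input mentioned above can be sketched as follows. The example precomputes a kernel matrix with kernelMatrix() and passes it to kpca, then repeats the analysis from the raw data for comparison. The data set and the sigma value are assumptions chosen only for illustration.

```r
library(kernlab)

x <- as.matrix(iris[, -5])
rbf <- rbfdot(sigma = 0.2)

# Precompute the kernel matrix; kernelMatrix() returns an object
# of class kernelMatrix
K <- kernelMatrix(rbf, x)

# kpca on the precomputed kernel matrix (no kernel or kpar needed)
kp_K <- kpca(K, features = 2)

# Same analysis starting from the raw data, for comparison
kp_x <- kpca(x, kernel = rbf, features = 2)

# Compare the eigenvalues obtained by the two routes
all.equal(unname(eig(kp_K)), unname(eig(kp_x)))
```

Supplying a precomputed kernel matrix is mainly useful when the matrix is expensive to compute or comes from a similarity measure that is not available as a kernel function.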
Value
An S4 object containing the principal component vectors along with the
corresponding eigenvalues.
pcv
a matrix containing the principal component vectors (column
wise)
eig
The corresponding eigenvalues
rotated
The original data projected (rotated) on the principal components
xmatrix
The original data matrix
All slots of the object can be accessed through accessor functions of
the same name (for example pcv, eig, rotated and xmatrix).
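A minimal sketch of reading the returned slots through their accessor functions, and of how features and th determine how many components those slots hold. The data set and parameter values are illustrative assumptions.

```r
library(kernlab)

x <- as.matrix(iris[, -5])

# Request exactly two components
kp2 <- kpca(x, kernel = "rbfdot", kpar = list(sigma = 0.2), features = 2)

dim(pcv(kp2))      # principal component vectors, one column per component
eig(kp2)           # eigenvalues of the retained components
dim(rotated(kp2))  # training data projected on the components
dim(xmatrix(kp2))  # the original data matrix

# With features = 0, every component whose eigenvalue exceeds th is kept
kp_all <- kpca(x, kernel = "rbfdot", kpar = list(sigma = 0.2),
               features = 0, th = 1e-4)
length(eig(kp_all))  # number of components retained under this threshold
```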
Note
The predict function can be used to embed new data in the new space.
Examples
# example using the iris data
library(kernlab)
data(iris)
test <- sample(1:150, 20)
kpc <- kpca(~., data = iris[-test, -5], kernel = "rbfdot",
            kpar = list(sigma = 0.2), features = 2)
# print the principal component vectors
pcv(kpc)
# plot the data projection on the components
plot(rotated(kpc), col = as.integer(iris[-test, 5]),
     xlab = "1st Principal Component", ylab = "2nd Principal Component")
# embed remaining points
emb <- predict(kpc, iris[test, -5])
points(emb, col = as.integer(iris[test, 5]))