R: Find a few approximate largest singular values and...
irlba
R Documentation
Find a few approximate largest singular values and corresponding
singular vectors of a matrix.
Description
The augmented implicitly restarted Lanczos bi-diagonalization (IRLBA)
algorithm finds a few approximate largest singular values and corresponding
singular vectors of a sparse or dense matrix using a method of Baglama and
Reichel. It is a fast and memory-efficient way to compute a partial SVD.
Usage
irlba(A, nv = 5, nu, maxit = 1000, work = nv + 5, reorth = TRUE,
tol = 1e-05, v = NULL, right_only = FALSE, verbose = FALSE, scale,
center, du, ds, dv, shift, mult)
Arguments
A
numeric real- or complex-valued matrix or real-valued sparse matrix.
nv
number of right singular vectors to estimate.
nu
number of left singular vectors to estimate (defaults to nv).
maxit
maximum number of iterations.
work
working subspace dimension, larger values can speed convergence at the cost of more memory use.
reorth
logical value indicating full TRUE or cheaper one-sided FALSE reorthogonalization.
tol
convergence is determined when ||AV - US|| < tol*||A||, where the spectral norm ||A|| is approximated by the largest estimated singular value, and U, V, S are the matrices corresponding to the estimated left and right singular vectors, and diagonal matrix of estimated singular values, respectively.
v
optional starting vector or output from a previous run of irlba used to restart the algorithm from where it left off (see the notes).
right_only
logical value indicating return only the right singular vectors (TRUE) or both sets of vectors (FALSE).
verbose
logical value that when TRUE prints status messages during the computation.
scale
optional column scaling vector whose values divide each column of A; must be as long as the number of columns of A (see notes).
center
optional column centering vector whose values are subtracted from each column of A; must be as long as the number of columns of A and may not be used together with the deflation options below (see notes).
du
optional subspace deflation vector (see notes).
ds
optional subspace deflation scalar (see notes).
dv
optional subspace deflation vector (see notes).
shift
optional shift value (square matrices only, see notes).
mult
optional custom matrix multiplication function (default is '%*%', see notes).
Value
Returns a list with entries:
d max(nu, nv) approximate singular values
u nu approximate left singular vectors (only when right_only=FALSE)
v nv approximate right singular vectors
iter The number of Lanczos iterations carried out
mprod The total number of matrix vector products carried out
Note
The syntax of irlba partially follows svd, with an important
exception. The usual R svd function always returns a complete set of
singular values, even if the number of singular vectors nu or nv
is set less than the maximum. The irlba function returns a number of
estimated singular values equal to the maximum of the number of specified
singular vectors nu and nv.
Use the optional scale parameter to implicitly scale each column of
the matrix A by the values in the scale vector, computing the
truncated SVD of the column-scaled sweep(A,2,scale,FUN=`/`), or
equivalently, A %*% diag(1/scale), without explicitly forming the
scaled matrix. scale must be a non-zero vector of length equal
to the number of columns of A.
Use the optional center parameter to implicitly subtract the values
in the center vector from each column of A, computing the
truncated SVD of sweep(A,2,center,FUN=`-`),
without explicitly forming the centered matrix. This option may not be
used together with the general rank 1 deflation options. center
must be a vector of length equal to the number of columns of A.
This option may be used to efficiently compute principal components without
explicitly forming the centered matrix (which can, importantly, preserve
sparsity in the matrix). See the examples.
Use the optional deflation parameters to compute the rank-one deflated
SVD of A - ds*du %*% t(dv), where
t(du) %*% A - ds * t(dv) == 0. For
example, the triple ds,du,dv may be a known singular value
and corresponding singular vectors. Or ds=1 and dv
and du represent a vector of column means of A and of ones,
respectively. This is a rarely used option, but it is used internally
by the center option and the two sets of parameters are
prevented from being used in combination.
Specify an optional alternative matrix multiplication operator in the
mult parameter. mult must be a function of two arguments,
and must handle both cases where one argument is a vector and the other
a matrix. See the examples. Special care must be taken when deflation
is also desired; see the package vignette for details.
Use the v option to supply a starting vector for the iterative
method. A random vector is used by default. Optionally set v to
the ouput of a previous run of irlba to restart the method, adding
additional singular values/vectors without recomputing the already computed
subspace.
References
Augmented Implicitly Restarted Lanczos Bidiagonalization Methods, J. Baglama and L. Reichel, SIAM J. Sci. Comput. 2005.
See Also
svd, prcomp, partial_eigen
Examples
set.seed(1)
A <- matrix(runif(200),nrow=20)
S <- irlba(A)
S$d
# Compare with svd
svd(A)$d[1:5]
# Principal components
P <- irlba(A, nv=1, center=colMeans(A))
# Compare with prcomp (might vary up to sign)
cbind(P$v, prcomp(A)$rotation[,1])
# A custom matrix multiplication function that scales the columns of A
# (cf the scale option). This function scales the columns of A to unit norm.
col_scale <- sqrt(apply(A,2,crossprod))
mult <- function(x,y)
{
# check if x is a plain, row or column vector
if (is.vector(x) || ncol(x)==1 || nrow(x)==1)
{
return((x %*% y)/col_scale)
}
# else x is the matrix
x %*% (y/col_scale)
}
irlba(A, 3, mult=mult)$d
# Compare with:
irlba(A, 3, scale=col_scale)$d
# Compare with:
svd(sweep(A,2,col_scale,FUN=`/`))$d[1:3]