als
R Documentation
alternating least squares multivariate curve resolution (MCR-ALS)
Description
This is an implementation of alternating least squares
multivariate curve resolution (MCR-ALS). Given a dataset in matrix
form d1, the dataset is decomposed as d1=C %*% t(S)
where the columns of C and S represent components
contributing to the data in each of the two ways in which the matrix is
resolved. In forming the decomposition, the components in each way
may be constrained by, e.g., non-negativity, unimodality,
selectivity, normalization of S and closure of C. Note
that if more than one dataset is to be analyzed simultaneously, then
the matrix S is assumed to be the same for every dataset in the
bilinear decomposition of each dataset into matrices C and
S.
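The bilinear model can be illustrated with a small base R sketch. It simulates a data matrix from known C and S and then performs one unconstrained least-squares step to recover S with C held fixed. The sizes, the simulated profiles and the names S_hat and rss are assumptions made for illustration; als() itself alternates such steps for S and C and applies the requested constraints, which this sketch does not.

```r
set.seed(1)
m <- 50; n <- 30; comp <- 2   # illustrative sizes, not taken from the package data

# Simulated non-negative profiles C (m x comp) and spectra S (n x comp)
C <- cbind(dnorm(1:m, mean = 20, sd = 4), dnorm(1:m, mean = 30, sd = 5))
S <- matrix(runif(n * comp), nrow = n, ncol = comp)

# Bilinear model: the m x n data matrix is C %*% t(S)
d1 <- C %*% t(S)

# One unconstrained least-squares step: with C fixed, solve C %*% X = d1
# for X = t(S). No constraints are applied in this sketch.
S_hat <- t(qr.solve(C, d1))

# Residual sum of squares of the fitted decomposition
rss <- sum((d1 - C %*% t(S_hat))^2)
```

Because the simulated data are noise-free, S_hat reproduces S up to numerical error and rss is close to zero.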
Arguments
CList
list with the same length as PsiList where each
element is a matrix of dimension m by comp and
represents the matrix C for each dataset
PsiList
list of datasets, where each dataset is a matrix of dimension
m by n
S
matrix with n rows and comp columns,
often representing (mass) spectra
WList
An optional list with the same length as PsiList,
where each element is a matrix of dimension m by n giving
the weight of that datapoint; note that if closure or normalization
constraints are applied, then both are applied after the application
of weights.
thresh
numeric value that defaults to .001; if
((oldrss - rss) / oldrss) < thresh then the optimization stops,
where oldrss is the residual sum of squares at iteration
x-1 and rss is the residual sum of squares at iteration
x
maxiter
The maximum number of iterations to perform (where an
iteration is the optimization of either S or C)
forcemaxiter
Logical indicating whether maxiter
iterations should be performed even if the residual difference
drops below thresh.
optS1st
logical indicating whether the first constrained least
squares regression should estimate S or CList.
x
optional vector of labels for the rows of C, which are
used in the application of unimodality constraints.
x2
optional vector of labels for the rows of S, which are
used in the application of unimodality constraints.
baseline
logical indicating whether a baseline component is
present; if baseline=TRUE then this component is exempt from
the unimodality and non-negativity constraints
fixed
list with the same length as PsiList in which each
element is a vector of the indices of the components to fix to zero
in each dataset
nonnegS
logical indicating whether the components (columns) of
the matrix S should be constrained to non-negative values
nonnegC
logical indicating whether the components (columns) of
the matrix C should be constrained to non-negative values
uniC
logical indicating whether unimodality constraints should be
applied to the columns of C
uniS
logical indicating whether unimodality constraints should be
applied to the columns of S
normS
numeric indicating whether the spectra are normalized; if
normS>0, the spectra are normalized. If normS==1 the
maximum of the spectrum of each component is constrained to be equal
to one; if normS > 0 && normS!=1 then the norm of the
spectrum of each component is constrained to be equal to one.
closureC
list; if the length is zero, then
no closure constraints are applied. If the length is not zero, it
should be equal to the number of datasets in the analysis, and contain
numeric vectors consisting of the desired value of
the sum of each row of the concentration matrix.
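The normS and closureC conventions described above can be sketched in base R as follows. The helpers normalize_spectra and apply_closure are hypothetical and are not part of the package; the sketch assumes that "norm" means the Euclidean norm and that closure is achieved by simple row rescaling, either of which may differ from what als() does internally.

```r
# Hypothetical helper: normalize the columns of S following the normS convention
normalize_spectra <- function(S, normS = 0) {
  if (normS <= 0) return(S)            # normS = 0: no normalization
  scale_by <- if (normS == 1) {
    apply(S, 2, max)                   # maximum of each spectrum becomes 1
  } else {
    sqrt(colSums(S^2))                 # Euclidean norm becomes 1 (assumed norm)
  }
  sweep(S, 2, scale_by, "/")
}

# Hypothetical helper: rescale each row of C so that it sums to `target`
# (`target` is a single value or one value per row)
apply_closure <- function(C, target) {
  C * (target / rowSums(C))
}

S <- cbind(c(1, 2, 4), c(3, 0, 4))
S_max  <- normalize_spectra(S, normS = 1)   # column maxima equal to one
S_unit <- normalize_spectra(S, normS = 2)   # column norms equal to one

C <- cbind(c(1, 3), c(1, 1))
C_closed <- apply_closure(C, target = c(1, 1))   # each row sums to one
```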
Value
A list with components:
CList
A list with the same length as the number of datasets,
containing the optimized matrix C at termination scaled by
the optimized amplitudes for that dataset from AList.
S
The optimized matrix S at termination.
rss
The residual sum of squares at termination.
resid
A list with the same length as the number of datasets,
containing the residual matrix for each dataset
iter
The number of iterations performed before termination.
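The returned rss and iter values follow from the stopping rule given for thresh. As a rough check, the sketch below recomputes the relative decrease (oldrss - rss) / oldrss from the RSS values printed in the example run reported under Results. The names rss_trace, rd and stop_iter are illustrative only, and the printed RSS values are rounded, so the last digits may differ slightly from the RD column.

```r
# RSS values copied from the example run (initial value, then iterations 1 to 4)
rss_trace <- c(3.039967e+13, 1.330703e+12, 153488187, 102433454, 102351694)
thresh <- 0.001   # default value of the thresh argument

# Relative decrease (oldrss - rss) / oldrss at each iteration
rd <- -diff(rss_trace) / head(rss_trace, -1)

# First iteration whose relative decrease falls below thresh
stop_iter <- which(rd < thresh)[1]
```

Here stop_iter is 4, which is consistent with the run ending after four iterations when forcemaxiter is not set.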
Note
This function was used to solve problems described in
van Stokkum IHM, Mullen KM, Mihaleva VV. Global analysis of multiple
gas chromatography-mass spectrometry (GC/MS) data sets: A method for
resolution of co-eluting components with comparison to MCR-ALS.
Chemometrics and Intelligent Laboratory Systems 2009; 95(2): 150-163.
in conjunction with the package TIMP. For the code to reproduce
the examples in this paper, see examples_chemo.zip included in the
inst directory of the package source code.
References
Garrido M, Rius FX, Larrechi MS. Multivariate curve resolution
alternating least squares (MCR-ALS) applied to spectroscopic data from
monitoring chemical reactions processes. Analytical and
Bioanalytical Chemistry 2008; 390:2059-2066.
Jonsson P, Johansson A, Gullberg J, Trygg J, A J, Grung B, Marklund S,
Sjostrom M, Antti H, Moritz T. High-throughput data analysis for
detecting and identifying differences between samples in GC/MS-based
metabolomic analyses. Analytical Chemistry 2005; 77:5635-5642.
Tauler R. Multivariate curve resolution applied to second order data.
Chemometrics and Intelligent Laboratory Systems 1995; 30:133-146.
Tauler R, Smilde A, Kowalski B. Selectivity, local rank, three-way data
analysis and ambiguity in multivariate curve resolution. Journal of
Chemometrics 1995; 9:31-58.
See Also
matchFactor, multiex, multiex1, plotS
Examples
## load 2 matrix datasets into variables d1 and d2
## load starting values for elution profiles
## into variables Cstart1 and Cstart2
## load time labels as x, m/z values as x2
data(multiex)
## starting values for elution profiles
matplot(x,Cstart1,type="l")
matplot(x,Cstart2,type="l",add=TRUE)
## using MCR-ALS, improve estimates for mass spectra S and the two
## matrices of elution profiles
## apply unimodality constraints to the elution profile estimates
## note that the starting estimates for S just contain a dummy matrix
test0 <- als(CList=list(Cstart1,Cstart2),S=matrix(1,nrow=400,ncol=2),
PsiList=list(d1,d2), x=x, x2=x2, uniC=TRUE, normS=0)
## plot the estimated mass spectra
plotS(test0$S,x2)
## the known mass spectra are contained in the variable S
## can compare the matching factor of each estimated spectrum to
## that in S
matchFactor(S[,1],test0$S[,1])
matchFactor(S[,2],test0$S[,2])
## plot the estimated elution profiles
## this shows the relative abundance of the 2nd component is low
matplot(x,test0$CList[[1]],type="l")
matplot(x,test0$CList[[2]],type="l",add=TRUE)
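To make the spectrum comparison in the example easier to interpret, here is a minimal sketch of a matching factor, assuming it is the normalized dot product (cosine similarity) of two spectra. That definition is an assumption here and should be checked against ?matchFactor; match_factor_sketch is a hypothetical stand-in, not the package's own code.

```r
# Hypothetical stand-in: normalized dot product of two spectra
match_factor_sketch <- function(u, v) {
  sum(u * v) / sqrt(sum(u^2) * sum(v^2))
}

ref <- c(0, 2, 5, 1)

# A scaled copy of the same spectrum: differs only in amplitude
mf_same <- match_factor_sketch(ref, 3 * ref)

# Two spectra with no peaks in common
mf_orth <- match_factor_sketch(c(1, 0), c(0, 1))
```

Under this definition a value near one means the two spectra have the same shape regardless of scale, and a value of zero means they share no signal.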
Results
> library(ALS)
Loading required package: nnls
Loading required package: Iso
Iso 0.0-17
> ### Name: als
> ### Title: alternating least squares multivariate curve resolution
> ### (MCR-ALS)
> ### Aliases: als
> ### Keywords: optimize
>
> ### ** Examples
>
> ## load 2 matrix datasets into variables d1 and d2
> ## load starting values for elution profiles
> ## into variables Cstart1 and Cstart2
> ## load time labels as x, m/z values as x2
> data(multiex)
>
> ## starting values for elution profiles
> matplot(x,Cstart1,type="l")
> matplot(x,Cstart2,type="l",add=TRUE)
>
> ## using MCR-ALS, improve estimates for mass spectra S and the two
> ## matrices of elution profiles
> ## apply unimodality constraints to the elution profile estimates
> ## note that the starting estimates for S just contain a dummy matrix
>
> test0 <- als(CList=list(Cstart1,Cstart2),S=matrix(1,nrow=400,ncol=2),
+ PsiList=list(d1,d2), x=x, x2=x2, uniC=TRUE, normS=0)
Initial RSS 3.039967e+13
Iteration (opt. S): 1, RSS: 1.330703e+12, RD: 0.9562264
Iteration (opt. C): 2, RSS: 153488187, RD: 0.9998847
Iteration (opt. S): 3, RSS: 102433454, RD: 0.3326297
Iteration (opt. C): 4, RSS: 102351694, RD: 0.0007981757
Initial RSS / Final RSS = 3.039967e+13 / 102351694 = 297011.9
>
> ## plot the estimated mass spectra
> plotS(test0$S,x2)
>
> ## the known mass spectra are contained in the variable S
> ## can compare the matching factor of each estimated spectrum to
> ## that in S
> matchFactor(S[,1],test0$S[,1])
[,1]
[1,] 0.9999994
> matchFactor(S[,2],test0$S[,2])
[,1]
[1,] 0.9999917
>
> ## plot the estimated elution profiles
> ## this shows the relative abundance of the 2nd component is low
> matplot(x,test0$CList[[1]],type="l")
> matplot(x,test0$CList[[2]],type="l",add=TRUE)