R Graphical Manual

Browse All

Last data update: 2014.03.03

R: Extension of PCA to analyze multiple data sets

mbpca

R Documentation

Extension of PCA to analyze multiple data sets

Description

Three approaches are supplied in this function, consensus PCA (CPCA), generalized CCA (GCCA) and multiple co-inertia analsyis (MCIA).

Usage

mbpca(x, ncomp, method, k = "all", center = TRUE, scale = FALSE, 
  option = "uniform", maxiter = 1000, moa = TRUE, verbose = TRUE, 
  svd.solver = c("svd", "fast.svd", "propack"))

Arguments

`x`	A `list` of `matrix` or `data.frame`, where rows are variables and columns are samples. The columns among the matrices need to be match but the variables do not need to be.
`ncomp`	An integer; the number of components to calculate. To calculate more components requires longer computational time.
`method`	A character string could be one of c("globalScore", "blockScore", "blockLoading"). The "globalScore" approach equals consensus PCA; The "blockScore" approach equals generalized canonical correlation analysis (GCCA); The "blockLoading" approach equals multiple co-inertia anaysis (MCIA);
`k`	The absolute number (if k >= 1) or the proportion (if 0<k<1) of non-zero coefficients for the variable loading vectors. It could be a single value or a vector has the same length as x so the sparsity of individual matrix could be different.
`center`	Logical; if the variables should be centered
`scale`	Logical; if the variables should be scaled
`option`	A charater string could be one of c("lambda1", "inertia", "uniform") to indicate how the different matrices should be normalized. If "lambda1", the matrix is divided by its the first singular value, if "inertia", the matrix is divided by its total inertia (sum of square), if "uniform", none of them would be done.
`maxiter`	Integer; Maximum number of iterations in the algorithm
`moa`	Logical; whether the output should be converted to an object of class `moa-class`
`verbose`	Logical; whether the process (# of PC) should be printed
`svd.solver`	A charater string could be one of c("svd", "fast.svd", "propack"). The default "fast.svd " has a good compromise between the robustness and speed. "propack" is the fastest but may failed to converge in practice.

Details

details need to update

Value

An object of class moa-class (if moa=TRUE) or an list object contains the following elements:

tb - the block scores

pb - the block loadings

t - the global scores

w - the wegihts of block scores to construct the global scor

Note

no note now

Author(s)

Chen Meng

References

reference need to be updated

Examples


data("NCI60_4arrays")
tumorType <- sapply(strsplit(colnames(NCI60_4arrays$agilent), split="\."), "[", 1)
colcode <- as.factor(tumorType)
levels(colcode) <- c("red", "green", "blue", "cyan", "orange", 
                     "gray25", "brown", "gray75", "pink")
colcode <- as.character(colcode)



moa <- mbpca(NCI60_4arrays, ncomp = 10, k = "all", method = "globalScore", option = "lambda1", 
             center=TRUE, scale=FALSE)
plot(moa, value="eig", type=2)
r <- bootMbpca(moa, mc.cores = 1, B=6, replace = FALSE, resample = "sample")

moas <- mbpca(NCI60_4arrays, ncomp = 3, k = 0.1, method = "globalScore", option = "lambda1", 
              center=TRUE, scale=FALSE)


scr <- moaScore(moa)
scrs <- moaScore(moas)
diag(cor(scr[, 1:3], scrs))

layout(matrix(1:2, 1, 2))
plot(scrs[, 1:2], col=colcode, pch=20)
legend("topright", legend = unique(tumorType), col=unique(colcode), pch=20)
plot(scrs[, 2:3], col=colcode, pch=20)

gap <- moGap(moas, K.max = 12, cluster = "hcl")
gap$nClust


hcl <- hclust(dist(scrs))
cls <- cutree(hcl, k=4)
clsColor <- as.factor(cls)
levels(clsColor) <- c("red", "blue", "orange", "pink")
clsColor <- as.character((clsColor))

heatmap(t(scrs[hcl$order, ]), ColSideColors = colcode[hcl$order], Rowv = NA, Colv=NA)
heatmap(t(scrs[hcl$order, ]), ColSideColors = clsColor[hcl$order], Rowv = NA, Colv=NA)

genes <- moaCoef(moas)
genes$nonZeroCoef$agilent.V1.neg

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(mogsa)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/mogsa/mbpca.Rd_%03d_medium.png", width=480, height=480)
> ### Name: mbpca
> ### Title: Extension of PCA to analyze multiple data sets
> ### Aliases: mbpca
> ### Keywords: multi-blcok PCA GCCA CPCA MCIA
> 
> ### ** Examples
> 
> 
> data("NCI60_4arrays")
> tumorType <- sapply(strsplit(colnames(NCI60_4arrays$agilent), split="\."), "[", 1)
> colcode <- as.factor(tumorType)
> levels(colcode) <- c("red", "green", "blue", "cyan", "orange", 
+                      "gray25", "brown", "gray75", "pink")
> colcode <- as.character(colcode)
> 
> 
> 
> moa <- mbpca(NCI60_4arrays, ncomp = 10, k = "all", method = "globalScore", option = "lambda1", 
+              center=TRUE, scale=FALSE)
calculating component 1 ...
calculating component 2 ...
calculating component 3 ...
calculating component 4 ...
calculating component 5 ...
calculating component 6 ...
calculating component 7 ...
calculating component 8 ...
calculating component 9 ...
calculating component 10 ...
> plot(moa, value="eig", type=2)
> r <- bootMbpca(moa, mc.cores = 1, B=6, replace = FALSE, resample = "sample")
method is set to 'globalScore'.
> 
> moas <- mbpca(NCI60_4arrays, ncomp = 3, k = 0.1, method = "globalScore", option = "lambda1", 
+               center=TRUE, scale=FALSE)
calculating component 1 ...
calculating component 2 ...
calculating component 3 ...
> 
> 
> scr <- moaScore(moa)
> scrs <- moaScore(moas)
> diag(cor(scr[, 1:3], scrs))
      PC1       PC2       PC3 
0.9741884 0.9889647 0.9546203 
> 
> layout(matrix(1:2, 1, 2))
> plot(scrs[, 1:2], col=colcode, pch=20)
> legend("topright", legend = unique(tumorType), col=unique(colcode), pch=20)
> plot(scrs[, 2:3], col=colcode, pch=20)
> 
> gap <- moGap(moas, K.max = 12, cluster = "hcl")
> gap$nClust
   firstSEmax Tibs2001SEmax   globalSEmax      firstmax     globalmax 
            4             4             7             4             7 
> 
> 
> hcl <- hclust(dist(scrs))
> cls <- cutree(hcl, k=4)
> clsColor <- as.factor(cls)
> levels(clsColor) <- c("red", "blue", "orange", "pink")
> clsColor <- as.character((clsColor))
> 
> heatmap(t(scrs[hcl$order, ]), ColSideColors = colcode[hcl$order], Rowv = NA, Colv=NA)
> heatmap(t(scrs[hcl$order, ]), ColSideColors = clsColor[hcl$order], Rowv = NA, Colv=NA)
> 
> genes <- moaCoef(moas)
> genes$nonZeroCoef$agilent.V1.neg
                           id         coef
FGD3_agilent     FGD3_agilent -0.105242702
TMC6_agilent     TMC6_agilent -0.045236957
GMFG_agilent     GMFG_agilent -0.042502839
IQGAP2_agilent IQGAP2_agilent -0.001185483
> 
> 
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>