a numeric matrix of expression data of genes expressed in at least one sample.
ntops
an integer vector giving the numbers of top variable genes to use, with
variability measured by MAD (median absolute deviation).
K.max
an integer value specifying the maximal number of clusters for which GAP
statistics are computed.
nboot
an integer value specifying the number of bootstraps, passed as argument B
to the function clusGap.
gapsmat
a numeric matrix of GAP statistics.
gapsSE
standard errors of means of the GAP statistics.
Details
The GAP statistic is a widely used method for estimating the number of
clusters in a data set. It compares the observed within-cluster dispersion
with the dispersion expected under a reference null distribution. To identify
the optimal number of clusters, the GAP statistic is computed for k=1 to
K.max, using nboot bootstraps, for each of the ntops sets of top variable
genes in the AMC data set.
The function figGAP is designed to visualize GAP curves.
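
The procedure described above can be sketched with clusGap from the cluster
package. This is a minimal illustration and not the implementation of
compGapStats: the toy matrix expr, the single ntop value, and the choice of
kmeans as the clustering function are all assumptions made for the example.

```r
library(cluster)  # provides clusGap() and maxSE()

set.seed(1)

# Toy stand-in for an expression matrix (genes x samples); NOT the AMC data.
expr <- matrix(rnorm(200 * 30), nrow = 200,
               dimnames = list(paste0("g", 1:200), paste0("s", 1:30)))
# Shift two groups of samples on the first 50 genes to create some structure.
expr[1:50, 11:20] <- expr[1:50, 11:20] + 3
expr[1:50, 21:30] <- expr[1:50, 21:30] - 3

# Rank genes by MAD and keep the ntop most variable ones.
ntop <- 50
mads <- apply(expr, 1, mad)
top  <- expr[order(mads, decreasing = TRUE)[seq_len(ntop)], ]

# Cluster samples (rows of t(top)) for k = 1..K.max with B bootstraps.
# kmeans is an assumption here; any function returning $cluster would do.
gap <- clusGap(t(top), FUNcluster = kmeans, K.max = 6, B = 10,
               nstart = 5, verbose = FALSE)

gaps <- gap$Tab[, "gap"]     # GAP statistic for each k
se   <- gap$Tab[, "SE.sim"]  # standard error for each k

# Smallest k with Gap(k) >= Gap(k+1) - SE(k+1), as in Tibshirani et al. (2001).
k.opt <- maxSE(gaps, se, method = "Tibs2001SEmax")
```

Repeating the same steps for several values of ntop gives one GAP curve per
gene set, which is the kind of result figGAP is meant to display.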
Value
compGapStats returns a list containing gapsmat (a numeric matrix of
GAP statistics) and gapsSE (standard errors of means of the GAP
statistics).
De Sousa E Melo, F. and Wang, X. and Jansen, M. et al. Poor prognosis colon
cancer is defined by a molecularly distinct subtype and precursor lesion.
accepted
Tibshirani, Robert and Walther, Guenther and Hastie, Trevor (2001). Estimating
the number of clusters in a data set via the gap statistic. Journal of the
Royal Statistical Society: Series B (Statistical Methodology), 63(2), 411-423.
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(DeSousa2013)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/DeSousa2013/compGapStats.Rd_%03d_medium.png", width=480, height=480)
> ### Name: compGapStats
> ### Title: Computing Gap statistics to identify the optimal number of
> ### subtypes
> ### Aliases: compGapStats figGAP
>
> ### ** Examples
>
> data(ge.CRC, package="DeSousa2013")
> ge.CRC <- ge.all[selPbs, ]
> gaps <- compGapStats(ge.CRC, ntops=c(2, 4)*1000, K.max=6, nboot=10)
> figGAP(gaps$gapsmat, gaps$gapsSE)
>
> dev.off()
null device
1
>