the snpgdsPCASNPLoadingClass object, returned from
snpgdsPCASNPLoading
gdsobj
an object of class SNPGDSFileClass,
a SNP GDS file
sample.id
a vector of sample IDs specifying the selected samples;
if NULL, all samples are used
num.thread
the number of CPU cores used
verbose
if TRUE, show information
Details
The samples specified by sample.id are usually different from the samples
used to calculate the SNP loadings.
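A minimal sketch of that typical use, assuming the bundled HapMap example file and a purely illustrative odd/even split into "reference" and "new" samples (the names ref.id, new.id, pca, snp.load and proj are hypothetical, not part of the package):

```r
library(SNPRelate)

# open the example dataset shipped with SNPRelate
genofile <- snpgdsOpen(snpgdsExampleFileName())
sample.id <- read.gdsn(index.gdsn(genofile, "sample.id"))

# illustrative split (an assumption for this sketch):
# odd-indexed samples define the axes, even-indexed samples are projected
ref.id <- sample.id[seq(1, length(sample.id), by=2)]
new.id <- sample.id[seq(2, length(sample.id), by=2)]

# PCA and SNP loadings from the reference samples only
pca <- snpgdsPCA(genofile, sample.id=ref.id, eigen.cnt=8)
snp.load <- snpgdsPCASNPLoading(pca, genofile)

# project the held-out samples onto the existing axes
proj <- snpgdsPCASampLoading(snp.load, genofile, sample.id=new.id)

# one row per projected sample, one column per retained eigenvector
dim(proj$eigenvect)

snpgdsClose(genofile)
```

The rows of proj$eigenvect can then be plotted alongside pca$eigenvect to see where the held-out samples fall relative to the reference samples.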
Value
Returns a snpgdsPCAClass object, which is a list with the following components:
sample.id
the sample ids used in the analysis
snp.id
the SNP ids used in the analysis
eigenval
eigenvalues
eigenvect
eigenvectors, “# of samples” x “eigen.cnt”
TraceXTX
the trace of the genetic covariance matrix
Bayesian
whether Bayesian normalization is used
Author(s)
Xiuwen Zheng
References
Patterson N, Price AL, Reich D (2006)
Population structure and eigenanalysis. PLoS Genetics 2:e190.
Zhu X, Li S, Cooper RS, Elston RC (2008)
A unified association analysis approach for family and unrelated samples
correcting for stratification. Am J Hum Genet 82(2):352-365.
See Also
snpgdsPCA, snpgdsPCACorr,
snpgdsPCASNPLoading
Examples
# open an example dataset (HapMap)
genofile <- snpgdsOpen(snpgdsExampleFileName())
sample.id <- read.gdsn(index.gdsn(genofile, "sample.id"))
PCARV <- snpgdsPCA(genofile, eigen.cnt=8)
SnpLoad <- snpgdsPCASNPLoading(PCARV, genofile)
# calculate sample eigenvectors from SNP loadings
SL <- snpgdsPCASampLoading(SnpLoad, genofile, sample.id=sample.id[1:100])
diff <- PCARV$eigenvect[1:100,] - SL$eigenvect
summary(c(diff))
# ~ ZERO
# close the genotype file
snpgdsClose(genofile)
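The near-zero difference in the example above is what the linear algebra predicts when the projected samples are the same ones that defined the axes. The following base-R sketch illustrates this on simulated genotypes. It assumes a Patterson-style normalization by allele frequency and a simple scaling of the loadings; SNPRelate's internal scaling may differ, and all variable names here are hypothetical.

```r
set.seed(1)
n <- 20; m <- 200                        # simulated samples x SNPs
p.true <- runif(m, 0.1, 0.9)
G <- sapply(p.true, function(p) rbinom(n, 2, p))   # n x m matrix of 0/1/2

# drop SNPs that happen to be monomorphic in the simulated data
p <- colMeans(G) / 2
keep <- p > 0 & p < 1
G <- G[, keep]; p <- p[keep]; m <- ncol(G)

# center by 2p and scale by sqrt(p(1-p)) (assumed normalization)
X <- sweep(G, 2, 2 * p, "-")
X <- sweep(X, 2, sqrt(p * (1 - p)), "/")

# eigen-decomposition of the n x n sample covariance matrix
eig <- eigen(tcrossprod(X) / m, symmetric=TRUE)
k <- 4
V <- eig$vectors[, 1:k]
lambda <- eig$values[1:k]

# SNP loadings: t(X) %*% V, an m x k matrix
snp.load <- crossprod(X, V)

# project the first 10 samples back: X %*% t(X) %*% V = m * V %*% diag(lambda),
# so dividing column j by m * lambda[j] recovers the rows of V
V.sub <- sweep(X[1:10, ] %*% snp.load, 2, m * lambda, "/")

# differences are at the level of floating-point rounding
summary(c(V[1:10, ] - V.sub))
```

For samples that were not part of the original decomposition, the same projection places them on the existing axes, which is the use case described in Details.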
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(SNPRelate)
Loading required package: gdsfmt
SNPRelate -- supported by Streaming SIMD Extensions 2 (SSE2)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/SNPRelate/snpgdsPCASampLoading.Rd_%03d_medium.png", width=480, height=480)
> ### Name: snpgdsPCASampLoading
> ### Title: Project individuals onto existing principal component axes
> ### Aliases: snpgdsPCASampLoading
> ### Keywords: PCA GDS GWAS
>
> ### ** Examples
>
> # open an example dataset (HapMap)
> genofile <- snpgdsOpen(snpgdsExampleFileName())
>
> sample.id <- read.gdsn(index.gdsn(genofile, "sample.id"))
>
> PCARV <- snpgdsPCA(genofile, eigen.cnt=8)
Principal Component Analysis (PCA) on SNP genotypes:
Excluding 365 SNPs on non-autosomes
Excluding 1 SNP (monomorphic: TRUE, < MAF: NaN, or > missing rate: NaN)
Working space: 279 samples, 8722 SNPs
using 1 (CPU) core
PCA: the sum of all selected genotypes (0, 1 and 2) = 2446510
Wed Jul 6 05:34:48 2016 (internal increment: 1744)
[==================================================] 100%, completed
Wed Jul 6 05:34:48 2016 Begin (eigenvalues and eigenvectors)
Wed Jul 6 05:34:48 2016 Done.
> SnpLoad <- snpgdsPCASNPLoading(PCARV, genofile)
SNP loading:
Working space: 279 samples, 8722 SNPs
Using 1 (CPU) core.
Using the top 8 eigenvectors.
SNP Loading: the sum of all selected genotypes (0, 1 and 2) = 2446510
SNP Loading: Wed Jul 6 05:34:48 2016 0%
SNP Loading: Wed Jul 6 05:34:48 2016 100%
>
> # calculate sample eigenvectors from SNP loadings
> SL <- snpgdsPCASampLoading(SnpLoad, genofile, sample.id=sample.id[1:100])
Sample loading:
Working space: 100 samples, 8722 SNPs
Using 1 (CPU) core.
Using the top 8 eigenvectors.
Sample Loading: the sum of all selected genotypes (0, 1 and 2) = 878146
Sample Loading: Wed Jul 6 05:34:48 2016 0%
Sample Loading: Wed Jul 6 05:34:48 2016 100%
>
> diff <- PCARV$eigenvect[1:100,] - SL$eigenvect
> summary(c(diff))
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-1.832e-15 -6.939e-17 -9.975e-18  1.506e-17  8.327e-17  3.442e-15
> # ~ ZERO
>
> # close the genotype file
> snpgdsClose(genofile)
>
> dev.off()
null device
1
>