an object of class SNPGDSFileClass,
a SNP GDS file
sample.id
a vector of sample id specifying selected samples;
if NULL, all samples are used
snp.id
a vector of snp id specifying selected SNPs;
if NULL, all SNPs are used
autosome.only
if TRUE, use autosomal SNPs only; if a numeric or character
value, keep only the SNPs on the specified chromosome
remove.monosnp
if TRUE, remove monomorphic SNPs
maf
use only the SNPs with minor allele frequency (MAF) >= maf;
if NaN, no MAF threshold is applied
missing.rate
use only the SNPs with missing rate <= missing.rate;
if NaN, no missing-rate threshold is applied
num.thread
the number of (CPU) cores used; if NA, detect
the number of cores automatically
verbose
if TRUE, print progress information
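As an illustration of how the filtering arguments above combine, the following is a minimal sketch rather than a recommended setting. The threshold values (0.05) and the names filter.args and diss.filtered are assumptions made for this example; the call itself only runs when the SNPRelate package is installed.

```r
# Hypothetical filter settings, chosen for illustration only.
filter.args <- list(
    autosome.only  = TRUE,   # drop SNPs on non-autosomes
    remove.monosnp = TRUE,   # drop monomorphic SNPs
    maf            = 0.05,   # keep SNPs with MAF >= 0.05
    missing.rate   = 0.05,   # keep SNPs with missing rate <= 0.05
    num.thread     = 1,      # use a single core
    verbose        = FALSE   # suppress progress messages
)

if (requireNamespace("SNPRelate", quietly = TRUE)) {
    library(SNPRelate)
    genofile <- snpgdsOpen(snpgdsExampleFileName())
    # The GDS object is passed first, followed by the named filters.
    diss.filtered <- do.call(snpgdsDiss, c(list(genofile), filter.args))
    snpgdsClose(genofile)
    # Number of SNPs that passed the filters.
    length(diss.filtered$snp.id)
}
```

Leaving maf and missing.rate at NaN disables the corresponding threshold, as described above.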
Details
The minor allele frequency and missing rate for each SNP passed in
snp.id are calculated over all the samples in sample.id.
Further details will be described in the future.
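To make the per-SNP quantities concrete, here is a small base R sketch on a toy genotype matrix. It is not the package's internal code: the matrix geno, its SNP and sample labels, and the thresholds are all made up for illustration, and genotypes are assumed to be coded as 0, 1 or 2 copies of one allele with NA for a missing call.

```r
# Toy genotype matrix: rows are SNPs, columns are samples.
geno <- matrix(c(0, 1, 2, NA,
                 0, 0, 0, 0,
                 2, 2, 1, 1),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("snp1", "snp2", "snp3"),
                               paste0("s", 1:4)))

# Missing rate: fraction of samples without a genotype call.
missing.rate <- rowMeans(is.na(geno))

# Frequency of the counted allele among non-missing calls,
# then the minor allele frequency as the smaller of the two.
allele.freq <- rowMeans(geno, na.rm = TRUE) / 2
maf <- pmin(allele.freq, 1 - allele.freq)

# Apply illustrative thresholds and drop monomorphic SNPs (maf == 0).
keep <- maf >= 0.05 & missing.rate <= 0.25 & maf > 0
```

In this toy data snp2 is monomorphic and is dropped, while snp1 and snp3 are kept.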
Value
Returns an object of class "snpgdsDissClass" with the following components:
sample.id
the sample ids used in the analysis
snp.id
the SNP ids used in the analysis
diss
a matrix of pairwise individual dissimilarities
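The sketch below shows one way the three components might be used together. It builds a mock object by hand so that it runs without a GDS file: the sample ids, SNP ids and dissimilarity values are invented, and it is an assumption of this example that the rows and columns of diss follow the order of sample.id.

```r
# Mock object with the three documented components; values are made up.
mock <- list(
    sample.id = c("A", "B", "C"),
    snp.id    = 1:5,
    diss      = matrix(c(0.0, 0.2, 0.8,
                         0.2, 0.0, 0.7,
                         0.8, 0.7, 0.0), nrow = 3, byrow = TRUE)
)
class(mock) <- "snpgdsDissClass"

# Label the matrix with the sample ids (assumed to be in matching order).
d <- mock$diss
dimnames(d) <- list(mock$sample.id, mock$sample.id)

# Find the least dissimilar pair of distinct samples.
idx <- which(d == min(d[upper.tri(d)]), arr.ind = TRUE)[1, ]
closest <- rownames(d)[idx]

# The matrix can also be handed to base R clustering via as.dist().
hc.base <- hclust(as.dist(d), method = "average")
```

Here hclust() from base R stands in for snpgdsHCluster only to keep the example self-contained.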
Author(s)
Xiuwen Zheng
References
Zheng X. 2013. Statistical Prediction of HLA Alleles and
Relatedness Analysis in Genome-Wide Association Studies. PhD dissertation,
Department of Biostatistics, University of Washington.
Weir BS, Zheng X. 2015. SNPs and SNVs in Forensic Science.
Forensic Science International: Genetics Supplement Series.
See Also
snpgdsHCluster
Examples
# open an example dataset (HapMap)
genofile <- snpgdsOpen(snpgdsExampleFileName())
pop.group <- as.factor(read.gdsn(index.gdsn(
    genofile, "sample.annot/pop.group")))
pop.level <- levels(pop.group)
diss <- snpgdsDiss(genofile)
hc <- snpgdsHCluster(diss)
# close the genotype file
snpgdsClose(genofile)
# split the dendrogram into groups
set.seed(100)
rv <- snpgdsCutTree(hc, label.H=TRUE, label.Z=TRUE)
# draw dendrogram
snpgdsDrawTree(rv, main="HapMap Phase II",
    edgePar=list(col=rgb(0.5,0.5,0.5, 0.75), t.col="black"))
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(SNPRelate)
Loading required package: gdsfmt
SNPRelate -- supported by Streaming SIMD Extensions 2 (SSE2)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/SNPRelate/snpgdsDiss.Rd_%03d_medium.png", width=480, height=480)
> ### Name: snpgdsDiss
> ### Title: Individual dissimilarity analysis
> ### Aliases: snpgdsDiss
> ### Keywords: GDS GWAS
>
> ### ** Examples
>
> # open an example dataset (HapMap)
> genofile <- snpgdsOpen(snpgdsExampleFileName())
>
> pop.group <- as.factor(read.gdsn(index.gdsn(
+ genofile, "sample.annot/pop.group")))
> pop.level <- levels(pop.group)
>
> diss <- snpgdsDiss(genofile)
Individual dissimilarity analysis on SNP genotypes:
Excluding 365 SNPs on non-autosomes
Excluding 1 SNP (monomorphic: TRUE, < MAF: NaN, or > missing rate: NaN)
Working space: 279 samples, 8722 SNPs
using 1 (CPU) core
Dissimilarity: the sum of all selected genotypes (0, 1 and 2) = 2446510
Dissimilarity: Wed Jul 6 05:34:33 2016 0%
Dissimilarity: Wed Jul 6 05:34:34 2016 100%
> hc <- snpgdsHCluster(diss)
>
> # close the genotype file
> snpgdsClose(genofile)
>
>
> # split
> set.seed(100)
> rv <- snpgdsCutTree(hc, label.H=TRUE, label.Z=TRUE)
Determine groups by permutation (Z threshold: 15, outlier threshold: 5):
Create 3 groups.
>
> # draw dendrogram
> snpgdsDrawTree(rv, main="HapMap Phase II",
+ edgePar=list(col=rgb(0.5,0.5,0.5, 0.75), t.col="black"))
>
> dev.off()
null device
1
>