Last data update: 2014.03.03

snpgdsPairScore    R Documentation

Genotype Score for Pairs of Individuals

Description

Calculate the genotype score for pairs of individuals based on an identity-by-state (IBS) measure.

Usage

snpgdsPairScore(gdsobj, sample1.id, sample2.id, snp.id=NULL,
    method=c("IBS", "GVH", "HVG"),
    type=c("per.pair", "per.snp", "matrix", "gds.file"),
    with.id=TRUE, output=NULL, verbose=TRUE)

Arguments

gdsobj

an object of class SNPGDSFileClass, a SNP GDS file

sample1.id

a vector of sample IDs specifying the first individual of each pair (the patient in Details); if NULL, all samples are used

sample2.id

a vector of sample IDs specifying the second individual of each pair (the donor in Details); if NULL, all samples are used

snp.id

a vector of SNP IDs specifying selected SNPs; if NULL, all SNPs are used

method

"IBS" – identity-by-state score, "GVH" or "HVG", see Details

type

"per.pair", "per.snp", "matrix" or "gds.file", see Value

with.id

if TRUE, the returned list includes "sample.id" and "snp.id"; see Value

output

the name of the output GDS file, used if type="gds.file"

verbose

if TRUE, show information

Details

Patient (sample1.id)  Coded Genotype  Donor (sample2.id)  Coded Genotype  IBS  GVH  HVG
AA                    0               AA                  0               2    0    0
AA                    0               AB                  1               1    0    1
AA                    0               BB                  2               0    2    2
AB                    1               AA                  0               1    1    0
AB                    1               AB                  1               2    0    0
AB                    1               BB                  2               1    1    0
BB                    2               AA                  0               0    2    2
BB                    2               AB                  1               1    0    1
BB                    2               BB                  2               2    0    0
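
The table above can be read as three 3-by-3 lookup tables indexed by the coded patient and donor genotypes. The following is a minimal base-R sketch of that reading, not the package's implementation; the names score_tables and pair_score are hypothetical and exist only for illustration.

```r
# Lookup tables transcribed from the Details table.
# Rows: patient genotype, columns: donor genotype (coded 0 = AA, 1 = AB, 2 = BB).
geno <- c("AA", "AB", "BB")
score_tables <- list(
    IBS = matrix(c(2, 1, 0,
                   1, 2, 1,
                   0, 1, 2), nrow=3, byrow=TRUE, dimnames=list(geno, geno)),
    GVH = matrix(c(0, 0, 2,
                   1, 0, 1,
                   2, 0, 0), nrow=3, byrow=TRUE, dimnames=list(geno, geno)),
    HVG = matrix(c(0, 1, 2,
                   0, 0, 0,
                   2, 1, 0), nrow=3, byrow=TRUE, dimnames=list(geno, geno))
)

# Hypothetical helper: score vectors of coded genotypes (0, 1 or 2) pairwise.
pair_score <- function(patient, donor, method=c("IBS", "GVH", "HVG")) {
    method <- match.arg(method)
    tab <- score_tables[[method]]
    # shift the 0-based genotype codes to 1-based matrix indices
    tab[cbind(patient + 1, donor + 1)]
}

# three patient/donor pairs: AA/BB, AB/AB, BB/AA
pair_score(c(0, 1, 2), c(2, 1, 0), method="IBS")
pair_score(c(0, 1, 2), c(2, 1, 0), method="GVH")
```

In this table the IBS column equals 2 minus the absolute difference of the two coded genotypes, whereas GVH and HVG are not symmetric in patient and donor.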

Value

Return a list:

sample.id

the sample ids used in the analysis, if with.id=TRUE

snp.id

the SNP ids used in the analysis, if with.id=TRUE

score

a matrix of genotype scores:

if type="per.pair", a (number of pairs)-by-3 matrix whose columns give the average score, the standard deviation and the number of valid SNPs;

if type="per.snp", a 3-by-(number of SNPs) matrix whose rows give the average score, the standard deviation and the number of valid individual pairs;

if type="matrix", a (number of pairs)-by-(number of SNPs) matrix with one row per pair of individuals;

if type="gds.file", the scores are written to the GDS file named by output (see Examples)
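
To illustrate how the three matrix shapes relate, the sketch below derives "per.pair" and "per.snp" style summaries from a small pairs-by-SNPs score matrix in base R. The toy matrix is invented for illustration, and the sketch assumes that missing scores are stored as NA and that the standard deviation is the sample standard deviation computed by sd(); this page confirms neither assumption.

```r
# Toy score matrix: 2 pairs (rows) by 4 SNPs (columns), one missing score.
score <- matrix(c(2, 1, NA, 2,
                  1, 1,  2, 0), nrow=2, byrow=TRUE)

# "per.pair" layout: one row per pair, columns Avg / SD / Num
per_pair <- cbind(
    Avg = rowMeans(score, na.rm=TRUE),
    SD  = apply(score, 1, sd, na.rm=TRUE),
    Num = rowSums(!is.na(score)))   # number of valid SNPs per pair

# "per.snp" layout: rows Avg / SD / Num, one column per SNP
per_snp <- rbind(
    Avg = colMeans(score, na.rm=TRUE),
    SD  = apply(score, 2, sd, na.rm=TRUE),
    Num = colSums(!is.na(score)))   # number of valid pairs per SNP

dim(per_pair)
dim(per_snp)
```

The row and column labels Avg, SD and Num follow the labels that appear in the example output of this page.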

Author(s)

Xiuwen Zheng

References

Warren, E. H., Zhang, X. C., Li, S., Fan, W., Storer, B. E., Chien, J. W., Boeckh, M. J., et al. (2012). Effect of MHC and non-MHC donor/recipient genetic disparity on the outcome of allogeneic HCT. Blood, 120(14), 2796-806. doi:10.1182/blood-2012-04-347286

See Also

snpgdsIBS

Examples

# open an example dataset (HapMap)
genofile <- snpgdsOpen(snpgdsExampleFileName())

# autosomal SNPs
selsnp <- snpgdsSelectSNP(genofile, autosome.only=TRUE, remove.monosnp=FALSE)

# sample ID
sample.id <- read.gdsn(index.gdsn(genofile, "sample.id"))
father.id <- read.gdsn(index.gdsn(genofile, "sample.annot/father.id"))

offspring.id <- sample.id[father.id != ""]
father.id <- father.id[father.id != ""]


# calculate average genotype scores for each pair of individuals
z1 <- snpgdsPairScore(genofile, offspring.id, father.id, snp.id=selsnp,
    method="IBS", type="per.pair")
names(z1)
z1$score


# calculate average genotype scores for each SNP
z2 <- snpgdsPairScore(genofile, offspring.id, father.id, snp.id=selsnp,
    method="IBS", type="per.snp")
names(z2)
mean(z2$score["Avg",])
mean(z2$score["SD",])

plot(z2$score["Avg",], pch=20, cex=0.75, xlab="SNP Index", ylab="IBS score")


# calculate a matrix of genotype scores over samples and SNPs
z3 <- snpgdsPairScore(genofile, offspring.id, father.id, snp.id=selsnp,
    method="IBS", type="matrix")
dim(z3$score)


# output the score matrix to a GDS file
snpgdsPairScore(genofile, offspring.id, father.id, snp.id=selsnp,
    method="IBS", type="gds.file", output="tmp.gds")
(f <- snpgdsOpen("tmp.gds"))
snpgdsClose(f)


# close the file
snpgdsClose(genofile)

unlink("tmp.gds", force=TRUE)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)

> library(SNPRelate)
Loading required package: gdsfmt
SNPRelate -- supported by Streaming SIMD Extensions 2 (SSE2)
> ### Name: snpgdsPairScore
> ### Title: Genotype Score for Pairs of Individuals
> ### Aliases: snpgdsPairScore
> ### Keywords: GDS GWAS
> 
> ### ** Examples
> 
> # open an example dataset (HapMap)
> genofile <- snpgdsOpen(snpgdsExampleFileName())
> 
> # autosomal SNPs
> selsnp <- snpgdsSelectSNP(genofile, autosome.only=TRUE, remove.monosnp=FALSE)
Excluding 365 SNPs on non-autosomes
> 
> # sample ID
> sample.id <- read.gdsn(index.gdsn(genofile, "sample.id"))
> father.id <- read.gdsn(index.gdsn(genofile, "sample.annot/father.id"))
> 
> offspring.id <- sample.id[father.id != ""]
> father.id <- father.id[father.id != ""]
> 
> 
> # calculate average genotype scores for each pair of individuals
> z1 <- snpgdsPairScore(genofile, offspring.id, father.id, snp.id=selsnp,
+     method="IBS", type="per.pair")
Working space: 120 samples, 8723 SNPs
Method: IBS
Genotype Score:	the sum of all selected genotypes (0, 1 and 2) = 1050236
> names(z1)
[1] "sample.id" "snp.id"    "score"    
> z1$score
           Avg        SD  Num
 [1,] 1.717526 0.4515055 8684
 [2,] 1.732468 0.4426979 8627
 [3,] 1.705618 0.4565496 8669
 [4,] 1.719926 0.4500914 8637
 [5,] 1.734739 0.4425402 8682
 [6,] 1.730484 0.4439950 8634
 [7,] 1.714005 0.4524242 8654
 [8,] 1.711685 0.4537672 8678
 [9,] 1.713018 0.4533966 8680
[10,] 1.717940 0.4513069 8679
[11,] 1.731137 0.4433941 8681
[12,] 1.731879 0.4430065 8664
[13,] 1.729090 0.4447137 8704
[14,] 1.736193 0.4412444 8673
[15,] 1.718750 0.4504039 8672
[16,] 1.728134 0.4449469 8655
[17,] 1.725300 0.4469056 8664
[18,] 1.710132 0.4552509 8666
[19,] 1.718901 0.4516104 8666
[20,] 1.727315 0.4456246 8684
[21,] 1.737098 0.4404983 8623
[22,] 1.730181 0.4441525 8628
[23,] 1.730303 0.4438278 8669
[24,] 1.735549 0.4413276 8667
[25,] 1.715718 0.4518636 8678
[26,] 1.710225 0.4541930 8655
[27,] 1.707154 0.4561096 8653
[28,] 1.721590 0.4487608 8606
[29,] 1.719473 0.4500544 8648
[30,] 1.727619 0.4452107 8668
[31,] 1.733002 0.4426783 8648
[32,] 1.718634 0.4502083 8608
[33,] 1.730570 0.4442080 8685
[34,] 1.730805 0.4440868 8674
[35,] 1.715097 0.4519076 8631
[36,] 1.720217 0.4499450 8671
[37,] 1.701913 0.4589817 8521
[38,] 1.724301 0.4476639 8687
[39,] 1.733771 0.4422714 8688
[40,] 1.711804 0.4544812 8633
[41,] 1.718909 0.4498149 8652
[42,] 1.733318 0.4425108 8677
[43,] 1.708593 0.4572243 8658
[44,] 1.731902 0.4432553 8661
[45,] 1.734972 0.4416351 8667
[46,] 1.731575 0.4434268 8643
[47,] 1.710620 0.4547676 8691
[48,] 1.727727 0.4456772 8609
[49,] 1.724769 0.4474335 8640
[50,] 1.728358 0.4461330 8629
[51,] 1.732209 0.4436158 8656
[52,] 1.711701 0.4535089 8623
[53,] 1.716359 0.4518116 8680
[54,] 1.713773 0.4527913 8633
[55,] 1.732182 0.4436323 8629
[56,] 1.703022 0.4579658 8637
[57,] 1.719082 0.4502431 8668
[58,] 1.709856 0.4548698 8675
[59,] 1.727723 0.4456745 8686
[60,] 1.732596 0.4428909 8676
> 
> 
> # calculate average genotype scores for each SNP
> z2 <- snpgdsPairScore(genofile, offspring.id, father.id, snp.id=selsnp,
+     method="IBS", type="per.snp")
Working space: 120 samples, 8723 SNPs
Method: IBS
Genotype Score:	the sum of all selected genotypes (0, 1 and 2) = 1050236
> names(z2)
[1] "sample.id" "snp.id"    "score"    
> mean(z2$score["Avg",])
[1] 1.722745
> mean(z2$score["SD",])
[1] 0.4150805
> 
> plot(z2$score["Avg",], pch=20, cex=0.75, xlab="SNP Index", ylab="IBS score")
> 
> 
> # calculate a matrix of genotype scores over samples and SNPs
> z3 <- snpgdsPairScore(genofile, offspring.id, father.id, snp.id=selsnp,
+     method="IBS", type="matrix")
Working space: 120 samples, 8723 SNPs
Method: IBS
Genotype Score:	the sum of all selected genotypes (0, 1 and 2) = 1050236
> dim(z3$score)
[1]   60 8723
> 
> 
> # output the score matrix to a GDS file
> snpgdsPairScore(genofile, offspring.id, father.id, snp.id=selsnp,
+     method="IBS", type="gds.file", output="tmp.gds")
Working space: 120 samples, 8723 SNPs
Method: IBS
Output: /home/ddbj/DataUpdator-rgm3/target/tmp.gds
Genotype Score:	the sum of all selected genotypes (0, 1 and 2) = 1050236
> (f <- snpgdsOpen("tmp.gds"))
File: /home/ddbj/DataUpdator-rgm3/target/tmp.gds (173.5K)
+    [  ]
|--+ sample.id   { Str8 60 ZIP_ra(36.1%), 347B }
|--+ snp.id   { Int32 8723 ZIP_ra(34.8%), 11.9K }
|--+ snp.position   { Int32 8723 ZIP_ra(94.7%), 32.3K }
|--+ snp.chromosome   { Int32 8723 ZIP_ra(0.38%), 132B }
\--+ genotype   { Bit2 60x8723, 127.8K } *
> snpgdsClose(f)
> 
> 
> # close the file
> snpgdsClose(genofile)
> 
> unlink("tmp.gds", force=TRUE)