Last data update: 2014.03.03

R: Batch Effects of Genotyping
batchTestR Documentation

Batch Effects of Genotyping

Description

batchChisqTest calculates Chi-square values for batches from 2-by-2 tables of SNPs, comparing each batch with the other batches. batchFisherTest calculates Fisher's exact test values.

Usage

batchChisqTest(genoData, batchVar, snp.include = NULL,
               chrom.include = 1:22, sex.include = c("M", "F"),
               scan.exclude = NULL, return.by.snp = FALSE,
               correct = TRUE, verbose = TRUE)

batchFisherTest(genoData, batchVar, snp.include = NULL,
                chrom.include = 1:22, sex.include = c("M", "F"),
                scan.exclude = NULL, return.by.snp = FALSE,
                conf.int = FALSE, verbose = TRUE)

Arguments

genoData

GenotypeData object

batchVar

A character string indicating which annotation variable should be used as the batch.

snp.include

A vector containing the IDs of SNPs to include.

chrom.include

Integer vector with codes for chromosomes to include. Ignored if snp.include is not NULL. Default is 1:22 (autosomes). Use 23, 24, 25, 26, 27 for X, XY, Y, M, Unmapped respectively

sex.include

Character vector with sex to include. Default is c("M", "F"). If sex chromosomes are present in chrom.include, only one sex is allowed.

scan.exclude

A vector containing the IDs of scans to be excluded.

return.by.snp

Logical value to indicate whether snp-by-batch matrices should be returned.

conf.int

Logical value to indicate if a confidence interval should be computed.

correct

Logical value to specify whether to apply the Yates continuity correction.

verbose

Logical value specifying whether to show progress information.

Details

Because of potential batch effects due to sample processing and genotype calling, batches are an important experimental design factor.

batchChisqTest calculates the Chi square values from 2-by-2 table for each SNP, comparing each batch with the other batches.

batchFisherTest calculates Fisher's Exact Test from 2-by-2 table for each SNP, comparing each batch with the other batches.

For each SNP and each batch, batch effect is evaluated by a 2-by-2 table: # of A alleles, and # of B alleles in the batch, versus # of A alleles, and # of B alleles in the other batches. Monomorphic SNPs are set to NA for all batches.

The default behavior is to combine allele frequencies from males and females and return results for autosomes only. If results for sex chromosomes (X or Y) are desired, use chrom.include with values 23 and/or 25 and sex.include="M" or "F".

If there are only two batches, the calculation is only performed once and the values for each batch will be identical.

Value

batchChisqTest returns a list with the following elements:

mean.chisq

a vector of mean chi-squared values for each batch.

lambda

a vector of genomic inflation factor computed as median(chisq) / 0.456 for each batch.

chisq

a matrix of chi-squared values with SNPs as rows and batches as columns. Only returned if return.by.snp=TRUE.

batchFisherTest returns a list with the following elements:

mean.or

a vector of mean odds-ratio values for each batch. mean.or is computed as 1/mean(pmin(or, 1/or)) since the odds ratio is >1 when the batch has a higher allele frequency than the other batches and <1 for the reverse.

lambda

a vector of genomic inflation factor computed as median(-2*log(pval) / 1.39 for each batch.

Each of the following is a matrix with SNPs as rows and batches as columns, and is only returned if return.by.snp=TRUE:

pval

P value

oddsratio

Odds ratio

confint.low

Low value of the confidence interval for the odds ratio. Only returned if conf.int=TRUE.

confint.high

High value of the confidence interval for the odds ratio. Only returned if conf.int=TRUE.

batchChisqTest and batchFisherTest both also return the following if return.by.snp=TRUE:

allele.counts

matrix with total number of A and B alleles over all batches.

min.exp.freq

matrix of minimum expected allele frequency with SNPs as rows and batches as columns.

Author(s)

Xiuwen Zheng, Stephanie Gogarten

See Also

GenotypeData, chisq.test, fisher.test

Examples

library(GWASdata)
file <- system.file("extdata", "illumina_geno.gds", package="GWASdata")
gds <- GdsGenotypeReader(file)
data(illuminaScanADF)
genoData <-  GenotypeData(gds, scanAnnot=illuminaScanADF)

# autosomes only, sexes combined (default)
res.chisq <- batchChisqTest(genoData, batchVar="plate")
res.chisq$mean.chisq
res.chisq$lambda

# X chromosome for females
res.chisq <- batchChisqTest(genoData, batchVar="status",
  chrom.include=23, sex.include="F", return.by.snp=TRUE)
head(res.chisq$chisq)

# Fisher exact test of "status" on X chromosome for females
res.fisher <- batchFisherTest(genoData, batchVar="status",
  chrom.include=23, sex.include="F", return.by.snp=TRUE)
qqPlot(res.fisher$pval)

close(genoData)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(GWASTools)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/GWASTools/batchTest.Rd_%03d_medium.png", width=480, height=480)
> ### Name: batchTest
> ### Title: Batch Effects of Genotyping
> ### Aliases: batchChisqTest batchFisherTest
> ### Keywords: htest
> 
> ### ** Examples
> 
> library(GWASdata)
> file <- system.file("extdata", "illumina_geno.gds", package="GWASdata")
> gds <- GdsGenotypeReader(file)
> data(illuminaScanADF)
> genoData <-  GenotypeData(gds, scanAnnot=illuminaScanADF)
> 
> # autosomes only, sexes combined (default)
> res.chisq <- batchChisqTest(genoData, batchVar="plate")
Tue Jul  5 21:58:13 2016	Load genotype dataset ...
Tue Jul  5 21:58:13 2016	1/77	WG0052807-AMP2
Tue Jul  5 21:58:13 2016	2/77	WG0052808-AMP2
Tue Jul  5 21:58:13 2016	3/77	WG0052809-AMP2
Tue Jul  5 21:58:13 2016	4/77	WG0052810-AMP2
Tue Jul  5 21:58:13 2016	5/77	WG0052811-AMP2
Tue Jul  5 21:58:13 2016	6/77	WG0052813-AMP2
Tue Jul  5 21:58:13 2016	7/77	WG0052814-AMP2
Tue Jul  5 21:58:13 2016	8/77	WG0052815-AMP2
Tue Jul  5 21:58:13 2016	9/77	WG0052816-AMP2
Tue Jul  5 21:58:14 2016	10/77	WG0052817-AMP2
Tue Jul  5 21:58:14 2016	11/77	WG0052818-AMP2
Tue Jul  5 21:58:14 2016	12/77	WG0053486-AMP2
Tue Jul  5 21:58:14 2016	13/77	WG0053487-AMP2
Tue Jul  5 21:58:14 2016	14/77	WG0053488-AMP2
Tue Jul  5 21:58:14 2016	15/77	WG0053489-AMP2
Tue Jul  5 21:58:14 2016	16/77	WG0053490-AMP2
Tue Jul  5 21:58:14 2016	17/77	WG0053491-AMP2
Tue Jul  5 21:58:14 2016	18/77	WG0053492-AMP2
Tue Jul  5 21:58:14 2016	19/77	WG0053493-AMP2
Tue Jul  5 21:58:14 2016	20/77	WG0053511-AMP2
Tue Jul  5 21:58:14 2016	21/77	WG0053512-AMP2
Tue Jul  5 21:58:14 2016	22/77	WG0053513-AMP2
Tue Jul  5 21:58:14 2016	23/77	WG0053514-AMP2
Tue Jul  5 21:58:14 2016	24/77	WG0054868-AMP2
Tue Jul  5 21:58:14 2016	25/77	WG0054869-AMP2
Tue Jul  5 21:58:14 2016	26/77	WG0054870-AMP2
Tue Jul  5 21:58:14 2016	27/77	WG0054871-AMP2
Tue Jul  5 21:58:14 2016	28/77	WG0058561-AMP2
Tue Jul  5 21:58:14 2016	29/77	WG0058562-AMP2
Tue Jul  5 21:58:14 2016	30/77	WG0058563-AMP2
Tue Jul  5 21:58:14 2016	31/77	WG0058564-AMP2
Tue Jul  5 21:58:14 2016	32/77	WG0058565-AMP2
Tue Jul  5 21:58:14 2016	33/77	WG0058566-AMP2
Tue Jul  5 21:58:14 2016	34/77	WG0058567-AMP2
Tue Jul  5 21:58:14 2016	35/77	WG0060468-AMP2
Tue Jul  5 21:58:14 2016	36/77	WG0060469-AMP2
Tue Jul  5 21:58:14 2016	37/77	WG0060470-AMP2
Tue Jul  5 21:58:14 2016	38/77	WG0060472-AMP2
Tue Jul  5 21:58:14 2016	39/77	WG0060473-AMP2
Tue Jul  5 21:58:14 2016	40/77	WG0060474-AMP2
Tue Jul  5 21:58:14 2016	41/77	WG0060475-AMP2
Tue Jul  5 21:58:14 2016	42/77	WG0060476-AMP2
Tue Jul  5 21:58:14 2016	43/77	WG0060477-AMP2
Tue Jul  5 21:58:14 2016	44/77	WG0060478-AMP2
Tue Jul  5 21:58:14 2016	45/77	WG0060479-AMP2
Tue Jul  5 21:58:14 2016	46/77	WG0061251-AMP2
Tue Jul  5 21:58:14 2016	47/77	WG0061252-AMP2
Tue Jul  5 21:58:14 2016	48/77	WG0061253-AMP2
Tue Jul  5 21:58:14 2016	49/77	WG0061254-AMP2
Tue Jul  5 21:58:14 2016	50/77	WG0061255-AMP2
Tue Jul  5 21:58:14 2016	51/77	WG0061256-AMP2
Tue Jul  5 21:58:14 2016	52/77	WG0061257-AMP2
Tue Jul  5 21:58:14 2016	53/77	WG0061258-AMP2
Tue Jul  5 21:58:14 2016	54/77	WG0061291-AMP2
Tue Jul  5 21:58:14 2016	55/77	WG0061293-AMP2
Tue Jul  5 21:58:14 2016	56/77	WG0061294-AMP2
Tue Jul  5 21:58:14 2016	57/77	WG0061533-AMP2
Tue Jul  5 21:58:14 2016	58/77	WG0061534-AMP2
Tue Jul  5 21:58:14 2016	59/77	WG0061536-AMP2
Tue Jul  5 21:58:14 2016	60/77	WG0061537-AMP2
Tue Jul  5 21:58:14 2016	61/77	WG0061538-AMP2
Tue Jul  5 21:58:14 2016	62/77	WG0061539-AMP2
Tue Jul  5 21:58:14 2016	63/77	WG0061540-AMP2
Tue Jul  5 21:58:14 2016	64/77	WG0063164-AMP2
Tue Jul  5 21:58:14 2016	65/77	WG0063165-AMP2
Tue Jul  5 21:58:14 2016	66/77	WG0063166-AMP2
Tue Jul  5 21:58:14 2016	67/77	WG0063167-AMP2
Tue Jul  5 21:58:14 2016	68/77	WG0065237-AMP2
Tue Jul  5 21:58:14 2016	69/77	WG0065238-AMP2
Tue Jul  5 21:58:14 2016	70/77	WG0065239-AMP2
Tue Jul  5 21:58:14 2016	71/77	WG0065240-AMP2
Tue Jul  5 21:58:14 2016	72/77	WG0065241-AMP2
Tue Jul  5 21:58:14 2016	73/77	WG0065242-AMP2
Tue Jul  5 21:58:14 2016	74/77	WG0065243-AMP2
Tue Jul  5 21:58:14 2016	75/77	WG0065244-AMP2
Tue Jul  5 21:58:14 2016	76/77	WG0065245-AMP2
Tue Jul  5 21:58:14 2016	77/77	WG0065246-AMP2
> res.chisq$mean.chisq
WG0052807-AMP2 WG0052808-AMP2 WG0052809-AMP2 WG0052810-AMP2 WG0052811-AMP2 
     0.1870673      0.1738736      0.2075799      0.2151794      0.2751302 
WG0052813-AMP2 WG0052814-AMP2 WG0052815-AMP2 WG0052816-AMP2 WG0052817-AMP2 
     0.2396656      0.2098169      0.2563460      0.2820289      0.2921853 
WG0052818-AMP2 WG0053486-AMP2 WG0053487-AMP2 WG0053488-AMP2 WG0053489-AMP2 
     0.2629742      0.1629365      0.1897105      0.1924877      0.1655379 
WG0053490-AMP2 WG0053491-AMP2 WG0053492-AMP2 WG0053493-AMP2 WG0053511-AMP2 
     0.1713587      0.1704014      0.2154593      0.1683599      0.1722597 
WG0053512-AMP2 WG0053513-AMP2 WG0053514-AMP2 WG0054868-AMP2 WG0054869-AMP2 
     0.1712416      0.2155127      0.1685254      0.2272651      0.3502184 
WG0054870-AMP2 WG0054871-AMP2 WG0058561-AMP2 WG0058562-AMP2 WG0058563-AMP2 
     0.3067215      0.2653476      0.1736053      0.2070897      0.2138571 
WG0058564-AMP2 WG0058565-AMP2 WG0058566-AMP2 WG0058567-AMP2 WG0060468-AMP2 
     0.1976060      0.1668862      0.1882501      0.2032808      0.2600943 
WG0060469-AMP2 WG0060470-AMP2 WG0060472-AMP2 WG0060473-AMP2 WG0060474-AMP2 
     0.2570877      0.2709914      0.1545591      0.2069080      0.1580206 
WG0060475-AMP2 WG0060476-AMP2 WG0060477-AMP2 WG0060478-AMP2 WG0060479-AMP2 
     0.1627799      0.1699246      0.1971154      0.2566097      0.2263927 
WG0061251-AMP2 WG0061252-AMP2 WG0061253-AMP2 WG0061254-AMP2 WG0061255-AMP2 
     0.1691294      0.1964273      0.2554832      0.2278570      0.2752351 
WG0061256-AMP2 WG0061257-AMP2 WG0061258-AMP2 WG0061291-AMP2 WG0061293-AMP2 
     0.3112525      0.2397782      0.2101280      0.2748663      0.3344871 
WG0061294-AMP2 WG0061533-AMP2 WG0061534-AMP2 WG0061536-AMP2 WG0061537-AMP2 
     0.3437120      0.1626261      0.1898040      0.1661546      0.1552296 
WG0061538-AMP2 WG0061539-AMP2 WG0061540-AMP2 WG0063164-AMP2 WG0063165-AMP2 
     0.2070136      0.1572482      0.1629521      0.1975130      0.1674576 
WG0063166-AMP2 WG0063167-AMP2 WG0065237-AMP2 WG0065238-AMP2 WG0065239-AMP2 
     0.1876182      0.2013255      0.3098652      0.2663549      0.2564204 
WG0065240-AMP2 WG0065241-AMP2 WG0065242-AMP2 WG0065243-AMP2 WG0065244-AMP2 
     0.2825724      0.2923346      0.2629825      0.2600943      0.2576963 
WG0065245-AMP2 WG0065246-AMP2 
     0.2697447      0.2872269 
> res.chisq$lambda
WG0052807-AMP2 WG0052808-AMP2 WG0052809-AMP2 WG0052810-AMP2 WG0052811-AMP2 
             0              0              0              0              0 
WG0052813-AMP2 WG0052814-AMP2 WG0052815-AMP2 WG0052816-AMP2 WG0052817-AMP2 
             0              0              0              0              0 
WG0052818-AMP2 WG0053486-AMP2 WG0053487-AMP2 WG0053488-AMP2 WG0053489-AMP2 
             0              0              0              0              0 
WG0053490-AMP2 WG0053491-AMP2 WG0053492-AMP2 WG0053493-AMP2 WG0053511-AMP2 
             0              0              0              0              0 
WG0053512-AMP2 WG0053513-AMP2 WG0053514-AMP2 WG0054868-AMP2 WG0054869-AMP2 
             0              0              0              0              0 
WG0054870-AMP2 WG0054871-AMP2 WG0058561-AMP2 WG0058562-AMP2 WG0058563-AMP2 
             0              0              0              0              0 
WG0058564-AMP2 WG0058565-AMP2 WG0058566-AMP2 WG0058567-AMP2 WG0060468-AMP2 
             0              0              0              0              0 
WG0060469-AMP2 WG0060470-AMP2 WG0060472-AMP2 WG0060473-AMP2 WG0060474-AMP2 
             0              0              0              0              0 
WG0060475-AMP2 WG0060476-AMP2 WG0060477-AMP2 WG0060478-AMP2 WG0060479-AMP2 
             0              0              0              0              0 
WG0061251-AMP2 WG0061252-AMP2 WG0061253-AMP2 WG0061254-AMP2 WG0061255-AMP2 
             0              0              0              0              0 
WG0061256-AMP2 WG0061257-AMP2 WG0061258-AMP2 WG0061291-AMP2 WG0061293-AMP2 
             0              0              0              0              0 
WG0061294-AMP2 WG0061533-AMP2 WG0061534-AMP2 WG0061536-AMP2 WG0061537-AMP2 
             0              0              0              0              0 
WG0061538-AMP2 WG0061539-AMP2 WG0061540-AMP2 WG0063164-AMP2 WG0063165-AMP2 
             0              0              0              0              0 
WG0063166-AMP2 WG0063167-AMP2 WG0065237-AMP2 WG0065238-AMP2 WG0065239-AMP2 
             0              0              0              0              0 
WG0065240-AMP2 WG0065241-AMP2 WG0065242-AMP2 WG0065243-AMP2 WG0065244-AMP2 
             0              0              0              0              0 
WG0065245-AMP2 WG0065246-AMP2 
             0              0 
> 
> # X chromosome for females
> res.chisq <- batchChisqTest(genoData, batchVar="status",
+   chrom.include=23, sex.include="F", return.by.snp=TRUE)
Tue Jul  5 21:58:14 2016	Load genotype dataset ...
Tue Jul  5 21:58:14 2016	1/2	0
> head(res.chisq$chisq)
                0         1
1029644 1.8777647 1.8777647
1029716 0.8607194 0.8607194
1029753 0.0000000 0.0000000
1029754 0.7582069 0.7582069
1029822 0.0000000 0.0000000
1029830 0.8960438 0.8960438
> 
> # Fisher exact test of "status" on X chromosome for females
> res.fisher <- batchFisherTest(genoData, batchVar="status",
+   chrom.include=23, sex.include="F", return.by.snp=TRUE)
Tue Jul  5 21:58:14 2016	Load genotype dataset ...
Tue Jul  5 21:58:14 2016	1/2	0
> qqPlot(res.fisher$pval)
> 
> close(genoData)
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>