Last data update: 2014.03.03

ampliconduo R Documentation

Apply Fisher's Exact Tests To Two Amplicon Frequency Sets Of The Same Sample

Description

Implements Fisher's exact test to detect amplicons whose read numbers deviate significantly between two amplicon sets of the same sample. The p-values of Fisher's exact test are corrected for multiple testing by computing the false discovery rates q. This function is intended to help identify reads that may result from experimental artefacts. (The calculation can take some time, depending on the size of the data sets and the available computing power.)

Usage

ampliconduo(A, B = NULL, sample.names = NULL, correction = "fdr", ...)

Arguments

A

A list or a data frame containing amplicon occurrences, i.e. the number of reads per amplicon (integer values).

B

Optional. A list or a data frame containing amplicon occurrences.

sample.names

Optional. A character vector or list with names for the amplicon pairs.

correction

Optional. Specifies the correction method for the p-values from Fisher's exact test. Accepts one of the following character strings: "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr" or "none". Default is "fdr". For more details see p.adjust.

...

Arguments passed to the internally called fisher.test function.
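
As a small illustration of what the correction values mean, the following sketch calls p.adjust directly on a handful of made-up p-values. It relies on base R only; the p-values are invented for the example and are not taken from the package data.

```r
# Toy p-values, made up purely for illustration.
p <- c(0.001, 0.012, 0.030, 0.200, 0.650)

# In p.adjust, "fdr" is an alias for "BH" (Benjamini & Hochberg).
q_fdr <- p.adjust(p, method = "fdr")
q_bh  <- p.adjust(p, method = "BH")

# Bonferroni multiplies each p-value by the number of tests, capped at 1.
q_bonf <- p.adjust(p, method = "bonferroni")

# "none" leaves the p-values unchanged.
q_none <- p.adjust(p, method = "none")

print(data.frame(p, q_fdr, q_bonf, q_none))
```

Because "fdr" and "BH" select the same method, passing either value as correction should give the same adjusted values.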

Details

If only A is specified, list elements 1 & 2, 3 & 4, etc. of A are assumed to be amplicon data of the same sample. If both A and B are specified, the ith frequency sets of A and B are combined. Within each amplicon data pair, frequencies at corresponding positions in the lists are assumed to belong to the same amplicon. The two frequency sets that belong to the same sample, an ampliconduo, must have the same length. The ampliconduo function iterates over all amplicon pairs and performs the following tasks:

  • Amplicons with frequency zero in both samples are removed. Position information is retained.

  • For each amplicon, Fisher's exact test is performed using fisher.test. The p-value, odds ratio and confidence interval are returned. The arguments conf.level, or and alternative can be passed to the fisher.test call via the ... argument. Default values are conf.level = 0.95, or = 1 and alternative = "two.sided".

  • The p-values are corrected using the p.adjust function. By default, the method of Benjamini & Hochberg (1995) is used. The adjustment method can be changed by setting the correction argument to one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr" or "none". See p.adjust.
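
The three tasks above can be sketched for a single amplicon pair with base R alone. This is a minimal illustration, not the package's implementation. The helper name fisher_per_amplicon, the column name p.adjusted and the toy counts are hypothetical. The layout of the 2x2 table (reads of one amplicon versus all remaining reads of the same set) is an assumption, since this page does not state how the table is built.

```r
# Hypothetical helper: mimics the described workflow for ONE amplicon pair.
# ASSUMPTION: each 2x2 table compares the reads of one amplicon with all
# remaining reads of the same set; the package may build its table differently.
fisher_per_amplicon <- function(freqA, freqB, correction = "fdr", ...) {
  stopifnot(length(freqA) == length(freqB))

  # Task 1: drop amplicons that are absent from both sets.
  keep <- which(freqA > 0 | freqB > 0)
  a <- freqA[keep]
  b <- freqB[keep]
  totalA <- sum(a)
  totalB <- sum(b)

  # Task 2: one Fisher's exact test per amplicon; "..." is forwarded to
  # fisher.test (e.g. conf.level, or, alternative).
  tests <- lapply(seq_along(a), function(i) {
    tab <- matrix(c(a[i], totalA - a[i], b[i], totalB - b[i]), nrow = 2)
    fisher.test(tab, ...)
  })

  res <- data.frame(
    freqA  = a,
    freqB  = b,
    p      = vapply(tests, function(t) t$p.value, numeric(1)),
    OR     = vapply(tests, function(t) unname(t$estimate), numeric(1)),
    CI.low = vapply(tests, function(t) t$conf.int[1], numeric(1)),
    CI.up  = vapply(tests, function(t) t$conf.int[2], numeric(1))
  )
  # Row names keep the original positions of the retained amplicons.
  rownames(res) <- keep

  # Task 3: correct the p-values for multiple testing.
  res$p.adjusted <- p.adjust(res$p, method = correction)
  res
}

# Toy read counts, invented for this example.
freqA <- c(120, 0, 35, 4, 0, 60)
freqB <- c(110, 0, 2, 5, 1, 58)
out <- fisher_per_amplicon(freqA, freqB, conf.level = 0.9)
print(out)
```

In this toy input the second amplicon is zero in both sets and is dropped, while the row names of the result still refer to positions 1, 3, 4, 5 and 6 of the input.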

The AmpliconDuo package implements further methods to visualize and filter the returned ampliconduo data frames.

Value

A list of data frames, one for each amplicon pair, referred to below as ampliconduo data frames. List entries are named according to the specified sample.names, or numbered.

Each ampliconduo data frame has 9 columns:

  • freqA: frequencies of amplicon set A

  • freqB: frequencies of amplicon set B (taken from argument B if specified)

  • p: p-values calculated with Fisher's exact test

  • OR: odds ratio calculated with Fisher's exact test

  • CI.low: lower confidence limit for OR

  • CI.up: upper confidence limit for OR

  • rejected: logical, indicating whether the amplicon was rejected

  • sample: sample name taken from sample.names if specified; identical for all rows in a given data frame

Author(s)

Anja Lange and Daniel Hoffmann

References

Y Benjamini and Y Hochberg. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1):289-300, 1995.

See Also

fisher.test, used to calculate the p-value, odds ratio and confidence interval;

p.adjust, called to correct the p-values;

methods to visualize or further manipulate the ampliconduo data frames:

plotAmpliconduo.set, plotAmpliconduo, discordance.delta,

Examples


## loads read numbers from example amplicon data sets
data(ampliconfreqs)
data(site.f)

## generate ampliconduo data frames 
ampliconduos.a <- ampliconduo(A = ampliconfreqs[, 1:4], sample.names = site.f[1:2])
ampliconduos.b <- ampliconduo(A = ampliconfreqs[c(1, 3)],
                              B = ampliconfreqs[c(2, 4)],
                              sample.names = site.f[1:2],
                              conf.level = 0.9)

## frequency plot
plotAmpliconduo.set(ampliconduos.a)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(AmpliconDuo)
Loading required package: ggplot2
Loading required package: xtable
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/AmpliconDuo/ampliconduo.Rd_%03d_medium.png", width=480, height=480)
> ### Name: ampliconduo
> ### Title: Apply Fisher's Exact Tests To Two Amplicon Frequency Sets Of The
> ###   Same Sample
> ### Aliases: ampliconduo
> ### Keywords: htest
> 
> ### ** Examples
> 
> 
> ## loads read numbers from example amplicon data sets
> data(ampliconfreqs)
> data(site.f)
> 
> ## generate ampliconduo data frames 
> ampliconduos.a <- ampliconduo(A = ampliconfreqs[,1:4], sample.names = site.f[1:2])
..> ampliconduos.b <- ampliconduo(A = ampliconfreqs[c(1,3)],
+ B = ampliconfreqs[c(2,4)], sample.names = site.f[1:2],
+ conf.level = 0.9)
..> 
> ## frequency plot
> plotAmpliconduo.set(ampliconduos.a)
> 
> dev.off()
null device 
          1 
>