empPvals
R Documentation
Calculate p-values from a set of observed test statistics and
simulated null test statistics
Description
Calculates p-values from a set of observed test statistics and
simulated null test statistics.
Usage
empPvals(stat, stat0, pool = TRUE)
Arguments
stat
A vector of calculated test statistics.
stat0
A vector or matrix of simulated or data-resampled null test
statistics.
pool
If FALSE, stat0 must be a matrix with the number of rows equal to
the length of stat. Default is TRUE.
Details
The argument stat must be defined so that larger values are
"more extreme", i.e., deviate further from the null hypothesis.
Examples include an F-statistic or the absolute value of a t-statistic. The
argument stat0 should be calculated analogously on data that
represent observations from the null hypothesis distribution. The p-value
for each entry of stat is the proportion of values from stat0 that are
greater than or equal to that entry. If pool=TRUE is
selected, then all of stat0 is used in calculating the p-value for a
given entry of stat. If pool=FALSE, then it is assumed that
stat0 is a matrix, where stat0[i,] is used to calculate the
p-value for stat[i]. The function empPvals calculates
"pooled" p-values faster than using a for-loop.
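The calculation described above can be sketched in base R. This is only an illustration of the stated definition, not the package's source: the name emp_pvals_sketch and the toy data are hypothetical, and the packaged empPvals may differ in details such as tie handling or a lower bound on the returned p-values.

```r
# Hypothetical helper that follows the description in this section;
# it is not the implementation shipped in the qvalue package.
emp_pvals_sketch <- function(stat, stat0, pool = TRUE) {
  if (pool) {
    # Pooled: compare each observed statistic against every null statistic.
    null_all <- as.vector(stat0)
    vapply(stat, function(s) mean(null_all >= s), numeric(1))
  } else {
    # Test-specific: row i of stat0 is the null sample for stat[i].
    stat0 <- as.matrix(stat0)
    stopifnot(nrow(stat0) == length(stat))
    # stat is recycled down the columns, so entry [i, j] is compared to stat[i].
    rowMeans(stat0 >= stat)
  }
}

# Toy data (made up for illustration): 3 tests, 4 null statistics per test.
stat  <- c(0.5, 2.0, 3.5)
stat0 <- matrix(c(0, 1, 2, 3,
                  1, 2, 3, 4,
                  0, 0, 1, 5), nrow = 3, byrow = TRUE)

p_pool <- emp_pvals_sketch(stat, stat0, pool = TRUE)
p_spec <- emp_pvals_sketch(stat, stat0, pool = FALSE)
```

With pool = TRUE each p-value is based on all 12 null values; with pool = FALSE each is based only on the 4 values in the matching row, so the two vectors generally differ.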
See page 18 of the Supporting Information in Storey et al. (2005) PNAS
(http://www.pnas.org/content/suppl/2005/08/26/0504609102.DC1/04609SuppAppendix.pdf)
for an explanation as to why calculating p-values from pooled empirical
null statistics and then estimating FDR on these p-values is equivalent to
directly thresholding the test statistics themselves and utilizing an
analogous FDR estimator.
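One ingredient of that equivalence is easy to check numerically: a pooled empirical p-value is a non-increasing function of the test statistic, so ranking tests by p-value matches ranking them by statistic (up to ties). The sketch below uses simulated, hypothetical data and base R only; it illustrates the monotonicity and is not a reproduction of the argument in the Supporting Information.

```r
set.seed(1)

# Hypothetical data: 50 observed statistics and 5000 pooled null statistics.
stat  <- abs(rnorm(50, mean = 1))
stat0 <- abs(rnorm(5000))

# Pooled p-values as defined in this section.
p <- vapply(stat, function(s) mean(stat0 >= s), numeric(1))

# Walk through the tests from largest to smallest statistic;
# the p-values encountered should never decrease.
ord <- order(stat, decreasing = TRUE)
monotone <- all(diff(p[ord]) >= 0)
```

Because of this monotonicity, a cutoff applied to the pooled p-values selects tests in the same order as a cutoff applied to the statistics themselves.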
Value
A vector of p-values calculated as described above.
Author(s)
John D. Storey
References
Storey JD, Tibshirani R. (2003) Statistical significance for
genome-wide experiments. Proceedings of the National Academy of Sciences,
100 (16), 9440-9445. http://www.pnas.org/content/100/16/9440.full
Storey JD, Xiao W, Leek JT, Tompkins RG, Davis RW. (2005) Significance
analysis of time course microarray experiments. Proceedings of the
National Academy of Sciences, 102 (36), 12837-12842. http://www.pnas.org/content/102/36/12837.full.pdf?with-ds=yes
See Also
qvalue
Examples
# load the package and import data
library(qvalue)
data(hedenfalk)
stat <- hedenfalk$stat
# null statistics; must be a matrix with one row per entry of stat,
# because pool=FALSE is used below
stat0 <- hedenfalk$stat0
# calculate p-values
p.pooled <- empPvals(stat=stat, stat0=stat0)
p.testspecific <- empPvals(stat=stat, stat0=stat0, pool=FALSE)
# compare pooled to test-specific p-values
qqplot(p.pooled, p.testspecific)
abline(0, 1)  # reference line y = x