R Graphical Manual

Browse All

Last data update: 2014.03.03

R: Gene-by-gene and posterior p-value calculation function.

genediff

R Documentation

Gene-by-gene and posterior p-value calculation function.

Description

Computes two sets of p-values per gene or probe via gene-by-gene ANOVA, using both the gene-specific MSE and the posterior MSE for each term in the ANOVA. P-values are not adjusted for multiple testing.
Assumes a fixed effects model and that the correct denominator for all comparisons is the MSE.

Usage

genediff(eS, model = NULL, method = c("MLE", "MOM", "MOMlog"), 
verbose = TRUE)

Arguments

`eS`	An `ExpressionSet` object. Any transformation and normalization of `exprs(eS)` should be conducted prior to use in `genediff`.
`model`	Model used for comparison; see details and `LMGene`.
`method`	Method by which posterior p-values are calculated. Default `"MLE"`.
`verbose`	If `TRUE`, the prior degrees of freedom and mean reciprocal precision are printed. See details.

Details

The argument eS must be an ExpressionSet object from the Biobase package. If you have data in a matrix and information about experimental design factors, then you can use neweS to convert the data into an ExpressionSet object. Please see neweS for more detail.

The model argument is an optional character string, constructed like the right-hand side of a formula for lm. It specifies which of the variables in the ExpressionSet will be used in the model and whether interaction terms will be included. If model=NULL, it uses all variables from the ExpressionSet without interactions. Be careful of using interaction terms with factors; this often leads to overfitting, which will yield an error.

The method argument specifies how the adjusted MSE and degrees of freedom should be calculated for use in computation of the posterior p-values:

"MLE": Default. Calculate adjusted MSE and degrees of freedom by maximum likelihood estimation, as described in Wright and Simon (2003).
"MOM": Calculate adjusted MSE and degrees of freedom by method of moments, as described in Rocke (2003).
"MOMlog": Calculate adjusted MSE and degrees of freedom by method of moments on log scale, as described in Smyth (2004). Uses functions fitFdist and trigammainverse from the package limma. Note that the method of Smyth (2004) is used here to calculate the posterior MSE, but not to directly calculate the posterior p-values.

All three methods assume that the gene-specific MSE's follow a gamma distribution with mean tau. (NB: Notation and parameterization vary somewhat between each of the source papers.) The mean of the gamma distribution, tau, is modeled with an inverse gamma prior with hyperparameters alpha and beta. Empirical Bayes methods are used to estimate the prior hyperparameters, either by maximum likelihood, method of moments, or method of moments on the log scale. The "posterior MSE" is the posterior mean of the variances given the observed gene-specific MSE's.

If verbose = TRUE, the function prints the estimated prior degrees of freedom, which equals twice the prior shape parameter alpha, and the estimated prior mean reciprocal precision, or 1/(alpha*beta).

All p-values are calculated from fixed-effects ANOVA F statistics, using either the gene-specific MSE or the posterior MSE as the denominator.

Value

A list with components:

`Gene.Specific`	A matrix of p-values calculated using the gene-specific MSE, with one row for each gene/probe and one column for each factor
`Posterior`	A matrix of p-values calculated using the posterior MSE, with one row for each gene/probe and one column for each factor

Author(s)

David Rocke, Geun-Cheol Lee, and Blythe Durbin-Johnson

References

Rocke, D.M. (2003) Heterogeneity of variance in gene expression microarray data. http://dmrocke.ucdavis.edu/papers/empbayes2.pdf

Smyth, G.K (2004) Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology 3, Article 3. http://www.bepress.com/sagmb/vol3/iss1/art3/

Wright, G.W. and Simon, R.M. (2003) A random variance model for detection of differential gene expression in small microarray experiments. Bioinformatics 19, 2448–2455.

http://dmrocke.ucdavis.edu

Examples

library(Biobase)
library(LMGene)

#data
data(sample.mat)
data(vlist)

raw.eS <- neweS(sample.mat, vlist)

# glog transform data
trans.eS <- transeS(raw.eS, lambda = 727, alpha = 56)

# calculate p-values
pvlist <- genediff(trans.eS)
pvlist$Posterior[1:5,]

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(LMGene)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: multtest
Loading required package: survival
Loading required package: affy

Attaching package: 'LMGene'

The following object is masked from 'package:base':

    norm

> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/LMGene/genediff.Rd_%03d_medium.png", width=480, height=480)
> ### Name: genediff
> ### Title: Gene-by-gene and posterior p-value calculation function.
> ### Aliases: genediff
> ### Keywords: models
> 
> ### ** Examples
> 
> library(Biobase)
> library(LMGene)
> 
> #data
> data(sample.mat)
> data(vlist)
> 
> raw.eS <- neweS(sample.mat, vlist)
> 
> # glog transform data
> trans.eS <- transeS(raw.eS, lambda = 727, alpha = 56)
> 
> # calculate p-values
> pvlist <- genediff(trans.eS)
Prior d.f. =  22.68697 
Prior mean reciprocal precision =  0.1982512 
> pvlist$Posterior[1:5,]
          patient       dose
[1,] 0.0079317225 0.06071940
[2,] 0.1223818164 0.09744905
[3,] 0.4233728219 0.16469200
[4,] 0.0024519273 0.17133653
[5,] 0.0002135652 0.18053914
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>