Last data update: 2014.03.03

computeLogRatio    R Documentation

Summary statistics for gene expression

Description

Compute per-gene summary statistics of the expression data in an ExpressionSet object.

Usage

computeLogRatio(e, reference, within = NULL, across = NULL, nReplicatesVar = 3, ...)

Arguments

e

An object of class ExpressionSet

reference

A list with two items: var and level - See details

within

Character vector - names of pData columns defining subgroups within which the computations are done separately - See details

across

Character vector - names of pData columns defining the groups that are compared to the reference - See details

nReplicatesVar

Integer - Minimum number of replicates to compute variances

...

...

Details

Summary statistics (means, variances, and differences to a reference or control) are computed on the 'exprs' slot of the ExpressionSet object. The computation is controlled by the arguments 'reference', 'within' and 'across'.

Differences and pooled variances are calculated against the sample(s) chosen as reference. The reference is specified by one level of a variable in the phenoData slot, e.g. column 'control' with level 'WT', or a boolean variable ('ref') coded 0 or 1. The 'var' and 'level' items of the 'reference' list together determine the reference group.
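As a rough illustration of these statistics, the base-R sketch below computes group means, differences to the reference group, and per-group variances on a toy matrix. It is not the a4Base implementation: the objects `expr`, `pheno` and `minReps` are made up for this example, pooling of variances is not shown, and returning NA for groups with too few replicates is an assumption about what a minimum-replicate threshold such as 'nReplicatesVar' could mean.

```r
# Toy data: 3 genes x 6 samples, values assumed to be on a log scale
expr <- matrix(c(5, 5, 5, 7, 7, 9,
                 2, 4, 3, 3, 3, 8,
                 1, 1, 1, 1, 1, 1),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("gene", 1:3), paste0("s", 1:6)))
# Stand-in for pData(e); 'control' is a hypothetical column name
pheno <- data.frame(control = c("WT", "WT", "WT", "KO", "KO", "HET"),
                    row.names = colnames(expr))

reference <- list(var = "control", level = "WT")
minReps <- 3  # hypothetical analogue of nReplicatesVar

# Column indices of the samples in each group
idxByGroup <- split(seq_len(ncol(expr)), pheno[[reference$var]])

# Per-gene mean in each group
groupMeans <- sapply(idxByGroup, function(idx) {
  rowMeans(expr[, idx, drop = FALSE])
})

# Difference of each group mean to the reference group mean
logRatio <- groupMeans - groupMeans[, reference$level]

# Per-gene variance, only for groups with at least minReps samples
groupVars <- sapply(idxByGroup, function(idx) {
  if (length(idx) >= minReps) {
    apply(expr[, idx, drop = FALSE], 1, var)
  } else {
    setNames(rep(NA_real_, nrow(expr)), rownames(expr))
  }
})
```

On log-scale data, a difference of means is a log ratio, which is why the reference column of `logRatio` is zero for every gene.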

All groups determined by combining the reference$var and across variables will be compared to the reference group. There are two ways to set this up:

  • Prepare a boolean variable that reflects only the reference group and specify all groupings in the across arguments. E.g.: reference=list(var = 'boolean', level = 1), across = c('compound','dose')

  • Add an extra column to the phenoData slot that contains all combinations, with a specific one for the reference group: for example, pData(e)$refvar <- paste(pData(e)$compound, pData(e)$dose, sep = '.') so as to use reference = list(var = 'refvar', level = 'comp1.dose1') as argument for reference.
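The two approaches can be sketched on a plain data frame standing in for pData(e). The column values, and the choice of comp1 at dose1 as the reference, are hypothetical. With a real ExpressionSet the new columns would be assigned to pData(e) instead.

```r
# Stand-in for pData(e); compound and dose values are made up
pheno <- data.frame(
  compound = c("comp1", "comp1", "comp2", "comp2"),
  dose     = c("dose1", "dose2", "dose1", "dose2"),
  stringsAsFactors = FALSE
)

# Approach 1: a 0/1 indicator that marks only the reference samples;
# all other groupings go into 'across'
pheno$boolean <- as.integer(pheno$compound == "comp1" & pheno$dose == "dose1")
reference1 <- list(var = "boolean", level = 1)
across1 <- c("compound", "dose")

# Approach 2: one column holding every compound/dose combination,
# with one of its values designating the reference group
pheno$refvar <- paste(pheno$compound, pheno$dose, sep = ".")
reference2 <- list(var = "refvar", level = "comp1.dose1")

# Both specifications pick out the same reference samples
isRef1 <- pheno[[reference1$var]] == reference1$level
isRef2 <- pheno[[reference2$var]] == reference2$level
sameReference <- identical(isRef1, isRef2)
```

Note that the columns are extracted with `$` (or `[[`), which returns vectors. Single-bracket indexing such as pheno['compound'] returns a one-column data frame, which paste() does not combine element by element.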

Sometimes computations need to be conducted within groups, and are thus nested. For example, when comparing treatment values of different cell lines, each cell line will have gene expression values for its own reference. The parameter 'within' lets you define such subgroups, for which computations are done separately and combined afterwards. Both 'within' and 'across' can be vectors of column names, whose unique combinations are used for the groupings.
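The nesting described above can be sketched in base R as follows: differences to the reference are computed separately within each cell line and the pieces are then recombined. This only illustrates the idea and is not the package's code. The data, the column names `cellline` and `treatment`, and the helper `diffToRef` are hypothetical, and the sketch works on individual samples rather than on group means.

```r
# Toy data: 2 genes x 6 samples from two cell lines, each with its own control
expr <- matrix(c(1, 3, 3, 10, 11, 11,
                 2, 2, 4,  5,  5,  9),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("gene1", "gene2"), paste0("s", 1:6)))
pheno <- data.frame(
  cellline  = c("A", "A", "A", "B", "B", "B"),
  treatment = c("ctrl", "trt", "trt", "ctrl", "trt", "trt"),
  stringsAsFactors = FALSE
)

# Subtract the per-gene mean of the reference samples from every sample
diffToRef <- function(x, isRef) {
  x - rowMeans(x[, isRef, drop = FALSE])
}

# 'within' analogue: process each cell line separately...
perLine <- lapply(split(seq_len(ncol(expr)), pheno$cellline), function(idx) {
  diffToRef(expr[, idx, drop = FALSE], pheno$treatment[idx] == "ctrl")
})

# ...then combine the pieces and restore the original sample order
combined <- do.call(cbind, perLine)[, colnames(expr)]
```

Without the split by cell line, the higher baseline of cell line B for gene1 would be mixed into the treatment effect.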

Value

Returns an object of class ExpressionSet. Its pData is inherited from the submitted ExpressionSet object; the computed statistics are added to the 'exprs' slot, and information describing them is added to the 'phenoData' slot.

Author(s)

Eric Lecoutre

See Also

plotLogRatio

Examples

if (require(ALL)){
data(ALL, package = "ALL")
ALL <- addGeneInfo(ALL)
ALL$BTtype <- as.factor(substr(ALL$BT,0,1))
ALL2 <- ALL[,ALL$BT != 'T1']  # omit subtype T1 as it only contains one sample
ALL2$BTtype <- as.factor(substr(ALL2$BT,0,1)) # create a vector with only T and B

# Test for differential expression between B and T cells
tTestResult <- tTest(ALL, "BTtype", probe2gene = FALSE)
topGenes <- rownames(tTestResult)[1:20]

# plot the log ratios versus subtype B of the top genes
LogRatioALL <- computeLogRatio(ALL2, reference = list(var = 'BT', level = 'B'))
a <- plotLogRatio(e = LogRatioALL[topGenes, ], openFile = FALSE, tooltipvalues = FALSE, device = 'X11',
		colorsColumnsBy = c('BTtype'), main = 'Top 20 genes most differentially expressed between T- and B-cells',
		orderBy = list(rows = "hclust"),
		probe2gene = TRUE)
}

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(a4Base)
Loading required package: grid
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: annaffy
Loading required package: GO.db

Loading required package: KEGG.db

KEGG.db contains mappings based on older data because the original
  resource was removed from the the public domain before the most
  recent update was produced. This package should now be considered
  deprecated and future versions of Bioconductor may not have it
  available.  Users who want more current data are encouraged to look
  at the KEGGREST or reactome.db packages

Loading required package: mpm
Loading required package: MASS

Attaching package: 'MASS'

The following object is masked from 'package:AnnotationDbi':

    select

Loading required package: KernSmooth
KernSmooth 2.23 loaded
Copyright M. P. Wand 1997-2009

mpm version 1.0-22

Loading required package: genefilter

Attaching package: 'genefilter'

The following object is masked from 'package:MASS':

    area

Loading required package: limma

Attaching package: 'limma'

The following object is masked from 'package:BiocGenerics':

    plotMA

Loading required package: multtest
Loading required package: glmnet
Loading required package: Matrix

Attaching package: 'Matrix'

The following object is masked from 'package:S4Vectors':

    expand

Loading required package: foreach
Loaded glmnet 2.0-5

Loading required package: a4Preproc
Loading required package: a4Core

Attaching package: 'a4Core'

The following object is masked from 'package:limma':

    topTable

Loading required package: gplots

Attaching package: 'gplots'

The following object is masked from 'package:multtest':

    wapply

The following object is masked from 'package:IRanges':

    space

The following object is masked from 'package:S4Vectors':

    space

The following object is masked from 'package:stats':

    lowess


a4Base version 1.20.0

> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/a4Base/computeLogRatio.Rd_%03d_medium.png", width=480, height=480)
> ### Name: computeLogRatio
> ### Title: Summary statistics for gene expression
> ### Aliases: computeLogRatio
> ### Keywords: manip data dplot
> 
> ### ** Examples
> 
> if (require(ALL)){
+ data(ALL, package = "ALL")
+ ALL <- addGeneInfo(ALL)
+ ALL$BTtype <- as.factor(substr(ALL$BT,0,1))
+ ALL2 <- ALL[,ALL$BT != 'T1']  # omit subtype T1 as it only contains one sample
+ ALL2$BTtype <- as.factor(substr(ALL2$BT,0,1)) # create a vector with only T and B
+ 
+ # Test for differential expression between B and T cells
+ tTestResult <- tTest(ALL, "BTtype", probe2gene = FALSE)
+ topGenes <- rownames(tTestResult)[1:20]
+ 
+ # plot the log ratios versus subtype B of the top genes 
+ LogRatioALL <- computeLogRatio(ALL2, reference=list(var='BT',level='B'))
+ a <- plotLogRatio(e=LogRatioALL[topGenes,],openFile=FALSE, tooltipvalues=FALSE, device='X11',
+ 		colorsColumnsBy=c('BTtype'), main = 'Top 20 genes most differentially between T- and B-cells',
+ 		orderBy = list(rows = "hclust"),
+ 		probe2gene = TRUE)
+ }
Loading required package: ALL
Loading required package: hgu95av2.db
Loading required package: org.Hs.eg.db


> 
> 
> 
> 
> 
> dev.off()
png 
  2 
>