getVarianceStabilizedData
R Documentation
Apply a variance stabilizing transformation (VST) to the count data
Description
This function calculates a variance stabilizing transformation (VST) from the
fitted dispersion-mean relation(s) and then transforms the count data (normalized
by division by the size factor), yielding a matrix
of values which are now approximately homoskedastic. This is useful as input
to statistical analyses requiring homoskedasticity.
Arguments
cds: a CountDataSet which also contains the fitted dispersion-mean relation
Details
For each sample (i.e., column of counts(cds)), the full variance function
is calculated from the raw variance (by scaling according to the size factor and adding
the shot noise). The function requires a blind estimate of the variance function, i.e.,
one ignoring conditions. Usually, this is achieved by calling estimateDispersions
with method="blind" before calling getVarianceStabilizedData or varianceStabilizingTransformation.
A typical workflow is shown in Section Variance stabilizing transformation in the package vignette.
If estimateDispersions was called with fitType="parametric",
a closed-form expression for the variance stabilizing transformation is used on the normalized
count data. The expression can be found in the file ‘vst.pdf’ which is distributed with the vignette.
If estimateDispersions was called with fitType="locfit",
the reciprocal of the square root of the variance of the normalized counts, as derived
from the dispersion fit, is numerically
integrated, and the integral (approximated by a spline function) is evaluated for each
count value in the column, yielding a transformed value.
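The numerical route described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the variance function (shot noise plus a constant dispersion alpha), the grid, and all names below are hypothetical. A constant dispersion is chosen only because the integral then has a closed form to check against.

```r
# Assumption: a hypothetical variance function for normalized counts q,
# with shot noise (q) plus a constant-dispersion term (alpha * q^2).
alpha <- 0.1
variance_fn <- function(q) q + alpha * q^2

# Integrand: reciprocal of the square root of the variance.
integrand <- function(q) 1 / sqrt(variance_fn(q))

# Grid of count values, evenly spaced on the asinh scale so that it is
# dense for small counts and sparse for large ones.
grid <- sinh(seq(asinh(0.01), asinh(10000), length.out = 200))

# Integrate piecewise between successive grid points (the first piece
# starts at 0) and accumulate, giving the integral from 0 to each point.
lower_bounds <- c(0, grid[-length(grid)])
pieces <- mapply(
  function(a, b) integrate(integrand, lower = a, upper = b)$value,
  lower_bounds, grid
)
integral_at_grid <- cumsum(pieces)

# Approximate the integral by a spline and evaluate it at arbitrary counts.
integral_spline <- splinefun(asinh(grid), integral_at_grid)
vst_numeric <- function(q) integral_spline(asinh(q))

# For this particular variance function the integral has a closed form,
# which serves as a check on the numerical result.
vst_exact <- function(q) 2 / sqrt(alpha) * asinh(sqrt(alpha * q))

q <- c(1, 10, 100, 1000)
max_abs_diff <- max(abs(vst_numeric(q) - vst_exact(q)))
print(max_abs_diff)
```

With a dispersion that depends on the mean, as a local fit would give, only variance_fn changes; the integration and spline steps stay the same.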
In both cases, the transformation is scaled such that, for large
counts, it becomes asymptotically equal to the
logarithm to base 2.
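To see what such a scaling can look like, here is a small sketch in base R. It assumes a hypothetical constant dispersion alpha, for which the unscaled transformation is 2/sqrt(alpha) * asinh(sqrt(alpha * q)); the names and the value of alpha are illustrative and not taken from the package. Because asinh(x) approaches log(2x) for large x, a multiplicative and an additive constant are enough to make the transformation approach log2(q).

```r
# Assumption: constant dispersion alpha (hypothetical value).
alpha <- 0.1

# Unscaled variance stabilizing transformation for variance q + alpha * q^2.
vst_unscaled <- function(q) 2 / sqrt(alpha) * asinh(sqrt(alpha * q))

# For large q: vst_unscaled(q) ~ (log(q) + log(4 * alpha)) / sqrt(alpha).
# Multiplying by sqrt(alpha) / log(2) and subtracting log2(4 * alpha)
# therefore gives a function that approaches log2(q).
vst_log2 <- function(q) sqrt(alpha) / log(2) * vst_unscaled(q) - log2(4 * alpha)

# The gap to log2 shrinks as the counts grow.
q <- 10^(1:6)
gap <- vst_log2(q) - log2(q)
print(data.frame(q = q, vst = vst_log2(q), log2 = log2(q), gap = gap))
```

The rescaling is affine, so it changes the constant value of the stabilized variance but not the fact that it is approximately constant.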
Limitations: In order to preserve normalization, the same
transformation has to be used for all samples. As a result, the
variance stabilization is only approximate. The more the size
factors differ, the more residual dependence of the variance on the
mean you will find in the transformed data. As shown in the vignette, you can use the function
meanSdPlot from the package vsn to see whether this
is a problem for your data.
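The idea behind this check can be sketched without any add-on package. The example below is a base-R stand-in for the diagnostic, not the vsn function itself, and it uses simulated Poisson counts with the classical square-root transformation rather than DESeq output; the helper mean_sd_trend and all parameter values are hypothetical. It summarizes the dependence of the per-gene standard deviation on the mean as a single rank correlation.

```r
set.seed(1)

# Simulated data (assumption): 500 genes, 50 samples, Poisson counts with
# gene means spread evenly on the log scale between 20 and 2000.
n_genes <- 500
n_samples <- 50
gene_means <- exp(seq(log(20), log(2000), length.out = n_genes))
counts <- matrix(
  rpois(n_genes * n_samples, lambda = rep(gene_means, times = n_samples)),
  nrow = n_genes
)

# Hypothetical helper: Spearman correlation between the rank of the
# per-gene mean and the per-gene standard deviation. Values near 0
# indicate little residual dependence of the spread on the mean.
mean_sd_trend <- function(mat) {
  row_means <- rowMeans(mat)
  row_sds <- apply(mat, 1, sd)
  cor(rank(row_means), row_sds, method = "spearman")
}

# Raw counts: the standard deviation grows with the mean.
trend_raw <- mean_sd_trend(counts)

# After 2 * sqrt(x), the variance stabilizing transformation for
# Poisson data, the dependence should largely disappear.
trend_vst <- mean_sd_trend(2 * sqrt(counts))

print(c(raw = trend_raw, transformed = trend_vst))
```

Applied to the matrix returned by getVarianceStabilizedData, a correlation clearly different from zero would point to the residual mean-variance dependence described above; a plot of the standard deviations against the ranked means shows the same thing graphically.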
Value
For varianceStabilizingTransformation, an ExpressionSet.
For getVarianceStabilizedData, a matrix of
the same dimension as the count data, containing the transformed
values.
> library(DESeq)
> ### Name: getVarianceStabilizedData
> ### Title: Apply a variance stabilizing transformation (VST) to the count
> ###   data
> ### Aliases: getVarianceStabilizedData varianceStabilizingTransformation
>
> ### ** Examples
>
> # Simulated example data set
> cds <- makeExampleCountDataSet()
> cds <- estimateSizeFactors( cds )
> # Blind dispersion estimate (ignores conditions), as required by the VST
> cds <- estimateDispersions( cds, method="blind" )
> vsd <- getVarianceStabilizedData( cds )
> # Per-gene variance against the rank of the mean for the condition "A" samples
> colsA <- conditions(cds) == "A"
> plot( rank( rowMeans( vsd[,colsA] ) ), genefilter::rowVars( vsd[,colsA] ) )