vsn2 fits the vsn model to the data
in x and returns a vsn object with
the fit parameters and the transformed data matrix.
The data are, typically, feature intensity readings from a
microarray, but this function may also be useful for other kinds of
intensity data that obey an additive-multiplicative error model.
To obtain an object of the same class as x, containing
the normalised data and the same metadata as x, use
fit = vsn2(x, ...)
nx = predict(fit, newdata=x)
or the wrapper justvsn.
Please see the vignette Introduction to vsn.
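As a minimal sketch of the two routes described above, the following uses the kidney example data shipped with vsn. The object names nx1 and nx2 are chosen here for illustration only, and the final comparison is a sanity check rather than a documented guarantee.

```r
library(vsn)
data("kidney")

# Two-step route: fit the model, then apply the fit to the same data.
fit <- vsn2(kidney)
nx1 <- predict(fit, newdata = kidney)

# One-step route: the wrapper fits and transforms in a single call.
nx2 <- justvsn(kidney)

# Both results have the class of the input (here an ExpressionSet).
class(nx1)
class(nx2)

# The two routes are expected to give matching normalised values.
all.equal(exprs(nx1), exprs(nx2))
```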
x
An object containing the data to which the model is fitted.
reference
Optional, a vsn object from
a previous fit. If this argument is specified, the data in x
are normalized "towards" an existing set of reference arrays whose
parameters are stored in the object reference. If this
argument is not specified, then the data in x are normalized
"among themselves". See Details for a more precise explanation.
strata
Optional, a factor or integer
whose length is nrow(x). It can
be used for stratified normalization (i.e. separate offsets a and
factors b for each level of strata). If missing, all
rows of x are assumed to come from one stratum.
If strata is an integer, its values must cover the range
1,…,n, where n is the number of strata.
lts.quantile
Numeric of length 1. The quantile that is used for
the resistant least trimmed sum of squares regression. Allowed
values are between 0.5 and 1. A value of 1 corresponds to ordinary
least sum of squares regression.
subsample
Integer of length 1. If its value is greater than 0,
the model parameters are
estimated from a subsample of the data of size subsample
only, yet the fitted transformation is
then applied to all data. For large datasets, this can substantially
reduce the CPU time and memory consumption at a negligible loss of
precision. Note that the AffyBatch method of vsn2 sets a value of
30000 for this parameter if it is missing from the function
call, which differs from the behaviour of the other methods.
backgroundsubtract
Logical of length 1: should
local background estimates be subtracted before fitting vsn?
foreground, background
Aligned character vectors of the same length,
naming the channels of x that should be used
as foreground and background values.
verbose
Logical. If TRUE, some messages are printed.
returnData
Logical. If TRUE, the transformed data are returned
in a slot of the resulting vsn object.
Setting this option to FALSE allows saving memory
if the data are not needed.
calib
Character of length 1. Allowed values are affine
and none. The default, affine, corresponds to the
behaviour in package versions <= 3.9, and to what is described in
references [1] and [2]. The option none is an experimental
new feature, in which no affine calibration is performed and only
two global variance stabilisation transformation parameters a
and b are fitted. This functionality might be useful in
conjunction with other calibration methods, such as quantile
normalisation; see the vignette Introduction to vsn.
pstart
Optional, a three-dimensional numeric array that
specifies start values for the iterative parameter
estimation algorithm.
If not specified, the function tries to guess useful start values.
The first dimension corresponds to the levels of strata,
the second dimension to the columns of x and the third dimension
must be 2, corresponding to offsets and factors.
minDataPointsPerStratum
The minimum number of data points per
stratum. Normally there is no need for the user to change this;
refer to the vignette for further documentation.
optimpar
Optional, a list with parameters for the likelihood
optimisation algorithm. Default parameters are taken from
defaultpar. See Details.
defaultpar
The default parameters for the likelihood
optimisation algorithm. Values in optimpar take precedence
over those in defaultpar. The purpose of this argument is to
expose the default values in this manual page; it is not
intended to be changed. Please use optimpar for that.
...
Arguments that get passed on to vsnMatrix.
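The following sketch shows how some of the arguments above might be combined. The stratification is artificial (rows are simply alternated between two made-up strata "A" and "B"), and the values chosen for lts.quantile and subsample are arbitrary illustrations, not recommendations.

```r
library(vsn)
data("kidney")

# Hypothetical stratification: alternate rows between two strata.
# Real strata would reflect a property of the features, not row parity.
strata <- factor(rep(c("A", "B"), length.out = nrow(kidney)))

fit <- vsn2(kidney,
            strata       = strata,   # separate offsets and factors per stratum
            lts.quantile = 0.75,     # more aggressive trimming (arbitrary value)
            subsample    = 2000L,    # estimate parameters from a subsample
            verbose      = FALSE)

# Fitted parameters: strata x columns of x x (offset, factor).
dim(coef(fit))
```

Because subsample draws rows at random, repeated calls can give slightly different parameter estimates unless the random seed is fixed with set.seed().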
Value
An object of class vsn.
Note on overall scale and location of the glog transformation
The data are returned on a glog scale to base 2. More precisely,
the transformed data are subject to the transformation
glog_2(f(b)*x + a) + c, where the function
glog_2(u) = log_2(u + sqrt(u*u + 1)) = asinh(u)/log(2) is called the
generalised logarithm, the offset a and the scaling parameter
b are the fitted model parameters
(see references), and f(x)=exp(x) is a parameter transformation that
allows ensuring positivity of the factor in front of x while
using an unconstrained optimisation over b [4].
The overall offset c is computed from the b's such that for
large x the transformation approximately corresponds to the
log_2 function. This is done separately for each stratum, but with the
same value across arrays. More precisely, if the element b[s,i]
of the array b is the scaling parameter for the s-th
stratum and the i-th array, then c[s] is computed as
-log2(2*f(mean(b[s,]))), where the mean is taken over the arrays.
The offset c is inconsequential for all differential
expression calculations, but many users like to see the data in a
range that they are familiar with.
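The relationship between the glog transformation and log_2 can be checked numerically in base R. This is a sketch for a single stratum and a single array with made-up parameter values a and b; the name c0 stands for the overall offset c, and its sign is chosen here so that glog_2(f(b)*x + a) + c0 approaches log_2(x) for large x.

```r
# Generalised logarithm to base 2, as defined in the text.
glog2 <- function(u) asinh(u) / log(2)

# Hypothetical fitted parameters (illustrative values only).
a <- -0.5    # offset
b <- -3      # unconstrained scaling parameter
f <- exp     # parameter transformation, so that f(b) > 0

# Overall offset for this single-array case (assumed sign, see above).
c0 <- -log2(2 * f(b))

# The full transformation.
h <- function(x) glog2(f(b) * x + a) + c0

# Compare with log2 over several orders of magnitude.
x <- c(10, 1e3, 1e5, 1e7)
cbind(x = x, h = h(x), log2 = log2(x), diff = h(x) - log2(x))
```

The diff column shrinks towards zero as x grows, while for small x the two functions differ markedly. That difference at low intensities is where the variance stabilisation takes effect.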
Specific behaviour of the different methods
vsn2 methods exist for
ExpressionSet,
NChannelSet,
AffyBatch (from the affy package),
RGList (from the limma package),
matrix and numeric.
If x is an NChannelSet, then
vsn2 is applied to the matrix that is obtained
by horizontally concatenating the color channels.
Optionally, available background estimates can be subtracted beforehand.
If x is an RGList, it is
converted into an NChannelSet
using a copy of Martin Morgan's code for RGList to
NChannelSet coercion, then the NChannelSet method is called.
Standalone versus reference normalisation
If the reference argument is not specified, then the model
parameters μ_k and σ are fit from the data in x.
This is the mode of operation described in [1], and it
was the only option in versions 1.X of this package.
If reference is specified, the model parameters
μ_k and σ are taken from it.
This allows for 'incremental' normalization [4].
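A sketch of the two modes, assuming the lymphoma example data shipped with vsn. The split into columns 1 to 8 and 9 to 16 is arbitrary and only serves to mimic a situation in which new arrays arrive after a reference set has been fitted.

```r
library(vsn)
data("lymphoma")

# Standalone mode: parameters are fitted from the data themselves.
ref <- vsn2(lymphoma[, 1:8])

# Reference mode: the remaining columns are normalised towards 'ref'.
# The split of the columns is arbitrary and for illustration only.
inc <- vsn2(lymphoma[, 9:16], reference = ref)

# Transformed data for the newly added columns.
dim(exprs(inc))
```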
Convergence of the iterative likelihood optimisation
L-BFGS-B uses three termination criteria:
(f_k - f_{k+1}) / max(|f_k|, |f_{k+1}|, 1) <= factr * epsmch,
where epsmch is the machine precision;
|gradient| < pgtol;
iterations > maxit.
These are set by the elements factr, pgtol and
maxit of optimpar. The remaining elements are
trace
An integer between 0 and 6, indicating the
verbosity level of L-BFGS-B; higher values
create more output.
cvg.niter
The number of iterations to be used in the least
trimmed sum of squares regression.
cvg.eps
Numeric. A convergence threshold for the least
trimmed sum of squares regression.
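To get a feel for the three L-BFGS-B termination criteria, one can experiment with base R's optim(), whose "L-BFGS-B" method accepts control elements with the same names factr, pgtol and maxit. This sketch does not touch the vsn likelihood; it minimises the Rosenbrock function, a standard toy problem, and the control values are arbitrary.

```r
# Rosenbrock test function and its gradient (toy problem, not vsn).
fr <- function(p) 100 * (p[2] - p[1]^2)^2 + (1 - p[1])^2
gr <- function(p) c(-400 * p[1] * (p[2] - p[1]^2) - 2 * (1 - p[1]),
                     200 * (p[2] - p[1]^2))

res <- optim(c(-1.2, 1), fr, gr, method = "L-BFGS-B",
             control = list(factr = 1e7,    # relative reduction of f
                            pgtol = 1e-8,   # projected gradient tolerance
                            maxit = 200))   # iteration limit

res$par          # estimated minimiser
res$convergence  # 0 means the run ended without hitting maxit or an error
res$message      # text describing which criterion stopped the run
res$counts       # number of function and gradient evaluations
```

Lowering maxit or loosening factr in this toy example shows how each criterion can end the optimisation early.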
Author(s)
Wolfgang Huber
References
[1] Variance stabilization applied to microarray data
calibration and to the quantification of differential expression,
Wolfgang Huber, Anja von Heydebreck, Holger Sueltmann, Annemarie
Poustka, Martin Vingron; Bioinformatics (2002) 18 Suppl.1 S96-S104.
[2] Parameter estimation for the calibration and variance stabilization
of microarray data,
Wolfgang Huber, Anja von Heydebreck, Holger Sueltmann,
Annemarie Poustka, and Martin Vingron;
Statistical Applications in Genetics and Molecular Biology (2003)
Vol. 2 No. 1, Article 3.
http://www.bepress.com/sagmb/vol2/iss1/art3.
[3] L-BFGS-B: Fortran Subroutines for Large-Scale Bound Constrained
Optimization, C. Zhu, R.H. Byrd, P. Lu and J. Nocedal,
Technical Report, Northwestern University (1996).
[4] Package vignette: Likelihood Calculations for vsn
See Also
justvsn, predict
Examples
data("kidney")
fit = vsn2(kidney) ## fit
nkid = predict(fit, newdata=kidney) ## apply fit
plot(exprs(nkid), pch=".")
abline(a=0, b=1, col="red")
Results
> library(vsn)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/vsn/vsn2.Rd_%03d_medium.png", width=480, height=480)
> ### Name: vsn2
> ### Title: Fit the vsn model
> ### Aliases: vsn2 vsn2-methods vsn2,matrix-method vsn2,numeric-method
> ### vsn2,ExpressionSet-method vsn2,AffyBatch-method vsn2,RGList-method
> ### vsn2,NChannelSet-method coerce,RGList,NChannelSet-method vsnMatrix
>
> ### ** Examples
>
> data("kidney")
>
> fit = vsn2(kidney) ## fit
> nkid = predict(fit, newdata=kidney) ## apply fit
>
> plot(exprs(nkid), pch=".")
> abline(a=0, b=1, col="red")
> dev.off()
null device
1
>