Advanced rank product method to identify
differentially expressed genes. It is possible to
combine data from different studies, e.g. data sets
generated at different laboratories.
Arguments
data
the data set that should be analyzed. Every
row of this data set must correspond to a gene.
cl
a vector containing the class labels of the
samples. In the two-class unpaired case, the label
of a sample is either 0 (e.g., control group) or 1
(e.g., case group). For one-class data, the label for
each sample should be 1.
origin
a vector containing the origin labels of the
samples. For example, for data sets generated at
multiple laboratories, the label is the same for
samples from one lab and different for samples
from different labs.
num.perm
number of permutations used in the calculation
of the null density. Default is 100.
logged
if "TRUE", the data have been log-transformed;
otherwise set it to "FALSE"
na.rm
if 'FALSE' (default), NA values will not
be used in computing ranks. If 'TRUE', the missing
values will be replaced by the gene-wise mean of
the non-missing values. Genes with all values missing
will be assigned "NA"
gene.names
if "NULL", no gene names will be attached
to the estimated percentage of false predictions (pfp).
plot
If "TRUE", plot the estimated pfp versus the rank
of each gene
rand
if specified, the random number generator
will be put in a reproducible state.
huge
If "TRUE", use an alternative method for computation that needs
considerably less memory. This allows the rank product to be
calculated for larger input data (see details). However, the result
will not contain the Orirank value.
Details
For input with n rows, m = m1 + m2 columns for two classes, and k
permutations, memory requirements are 2n with huge=TRUE instead of
n*k + n*m1*m2.
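As a rough worked example of the formula above, the sketch below plugs in the dimensions of the single-origin Golub subset used in the Examples: n = 3051 genes, k = 100 permutations, and 7 samples that are assumed here to split 4 and 3 between the two classes. The helper name rp_memory is hypothetical and not part of the package, and reading the result as a count of stored values is an assumption about the units.

```r
# Hypothetical helper (not part of RankProd): evaluates the memory formula
# from the text, 2n with huge=TRUE versus n*k + n*m1*m2 otherwise.
rp_memory <- function(n, m1, m2, k, huge = FALSE) {
  if (huge) 2 * n else n * k + n * m1 * m2
}

n  <- 3051   # genes (rows)
m1 <- 4      # samples in one class (assumed split of the 7 columns)
m2 <- 3      # samples in the other class
k  <- 100    # permutations

standard <- rp_memory(n, m1, m2, k)               # n*k + n*m1*m2
reduced  <- rp_memory(n, m1, m2, k, huge = TRUE)  # 2n

print(c(standard = standard, huge = reduced))
print(standard / reduced)   # equals (k + m1*m2) / 2
```

Because the ratio is (k + m1*m2)/2, the saving grows with both the number of permutations and the number of sample pairs.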
Value
A list with the results of identifying differentially expressed
genes between two classes. The identification consists of two parts:
the identification of up-regulated genes and of down-regulated genes
in class 2 compared to class 1.
pfp
estimated percentage of false positive predictions
(pfp) up to the position of each gene, for each of
the two identifications
pval
estimated p-value for each gene being up- and down-regulated
RPs
original rank product of each gene, for each of
the two identifications
RPrank
rank of the rank products of each gene in
ascending order
Orirank
original ranks in each comparison, which
are used to compute the rank product. Not present if huge=TRUE is used.
AveFC
fold change of average expression under class 1 over
that under class 2; if there are multiple origins, averaged
across all origins. Log-fold change if the data are log-scaled,
original fold change if the data are unlogged.
all.FC
fold change of class 1/class 2 under each origin.
Log-fold change if the data are log-scaled.
Note
The percentage of false predictions (pfp) is, in theory,
equivalent to the false discovery rate (FDR), and it
can be larger than 1.
The function looks for up- and down-regulated genes in two
separate steps, so two pfps are computed and used to identify
the genes that belong to each group.
The function can replace the function RP in the
same library. It is a more general version, as it is
able to handle data from different origins.
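To make the two separate steps concrete, here is a minimal base-R sketch of a rank product for a single origin, following the general idea in Breitling et al. (2004): every class 1 sample is compared with every class 2 sample, genes are ranked within each comparison, and the ranks are combined by a geometric mean. This is not the code used by RPadvance. The function name rank_product_sketch and the simulated data are hypothetical, the mapping of label 0 to class 1 is an assumption, and the permutation-based pfp and p-values are left out.

```r
# Minimal sketch, NOT the RankProd implementation.
# data: genes x samples matrix on the log scale; cl: 0/1 class labels.
# Assumption: label 0 is treated as class 1 and label 1 as class 2.
rank_product_sketch <- function(data, cl) {
  x1 <- data[, cl == 0, drop = FALSE]
  x2 <- data[, cl == 1, drop = FALSE]
  # One column of log ratios (class1/class2) per pair of samples.
  pairs  <- expand.grid(i = seq_len(ncol(x1)), j = seq_len(ncol(x2)))
  ratios <- x1[, pairs$i, drop = FALSE] - x2[, pairs$j, drop = FALSE]
  # Step 1: ascending ranks, so genes with class1 < class2 get small ranks.
  rank_up <- apply(ratios, 2, rank)
  # Step 2: descending ranks, so genes with class1 > class2 get small ranks.
  rank_down <- apply(-ratios, 2, rank)
  # Rank product: geometric mean of a gene's ranks over all comparisons.
  geo_mean <- function(r) exp(rowMeans(log(r)))
  cbind("class1 < class2" = geo_mean(rank_up),
        "class1 > class2" = geo_mean(rank_down))
}

set.seed(1)
data <- matrix(rnorm(50 * 6), nrow = 50)    # simulated: 50 genes, 6 samples
cl   <- c(0, 0, 0, 1, 1, 1)
data[1, cl == 1] <- data[1, cl == 1] + 20   # gene 1 strongly up in class 2
rps <- rank_product_sketch(data, cl)
print(head(rps[order(rps[, 1]), ], 3))      # smallest rank products first
```

A small rank product in the first column points to up-regulation in class 2, and a small value in the second column to down-regulation, which is why each direction needs its own pfp.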
References
Breitling, R., Armengaud, P., Amtmann, A., and Herzyk,
P. (2004) Rank Products: A simple, yet powerful, new method
to detect differentially regulated genes in
replicated microarray experiments, FEBS Letters, 573, 83-92
See Also
topGene, RP, plotRP, RSadvance
Examples
# Load the data of Golub et al. (1999). data(golub)
# contains a 3051x38 gene expression
# matrix called golub, a vector of length 38 called golub.cl
# that consists of the 38 class labels,
# and a matrix called golub.gnames whose third column
# contains the gene names.
data(golub)
##For data with single origin
subset <- c(1:4,28:30)
origin <- rep(1,7)
#identify genes
RP.out <- RPadvance(golub[,subset],golub.cl[subset],
origin,plot=FALSE,rand=123)
#For data from multiple origins
#Load the data arab in the package, which contains
# the expression of 22,081 genes
# of control and treatment groups from experiments
# independently conducted at two
# laboratories.
data(arab)
arab.origin #1 1 1 1 1 1 2 2 2 2
arab.cl #0 0 0 1 1 1 0 0 1 1
RP.adv.out <- RPadvance(arab,arab.cl,arab.origin,
num.perm=100,gene.names=arab.gnames,logged=TRUE,rand=123)
attributes(RP.adv.out)
head(RP.adv.out$pfp)
head(RP.adv.out$RPs)
head(RP.adv.out$AveFC)
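As a follow-up to the calls above, the pfp component can be filtered with ordinary matrix indexing. The sketch below is self-contained: it re-enters by hand the six pfp rows printed in the Results section rather than calling the package, and the cutoff of 1 is a hypothetical value chosen only so that one of these six rows is selected. With a real result, pfp would instead be taken from RP.adv.out$pfp, and a much smaller cutoff would normally be used.

```r
# The six pfp rows shown in the Results section, re-entered for illustration.
pfp <- matrix(c(1.1315068, 1.046186,
                1.1514074, 1.046474,
                1.2023585, 1.028143,
                1.1455596, 1.089375,
                1.0909831, 1.029432,
                0.8980702, 1.058403),
              ncol = 2, byrow = TRUE,
              dimnames = list(
                c("244901_at", "244902_at", "244903_at",
                  "244904_at", "244905_at", "244906_at"),
                c("class1 < class2", "class1 > class 2")))

cutoff <- 1   # hypothetical, illustrative only

# Genes whose pfp for "up in class 2" falls below the cutoff.
up_in_class2 <- rownames(pfp)[pfp[, "class1 < class2"] < cutoff]
# Genes whose pfp for "down in class 2" falls below the cutoff.
down_in_class2 <- rownames(pfp)[pfp[, "class1 > class 2"] < cutoff]

print(up_in_class2)
print(down_in_class2)
```

The two columns are filtered separately because each direction has its own pfp.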
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(RankProd)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/RankProd/RPadvance.Rd_%03d_medium.png", width=480, height=480)
> ### Name: RPadvance
> ### Title: Advanced Rank Product Analysis of Microarray
> ### Aliases: RPadvance
> ### Keywords: htest
>
> ### ** Examples
>
> # Load the data of Golub et al. (1999). data(golub)
> # contains a 3051x38 gene expression
> # matrix called golub, a vector of length called golub.cl
> # that consists of the 38 class labels,
> # and a matrix called golub.gnames whose third column
> # contains the gene names.
> data(golub)
>
> ##For data with single origin
> subset <- c(1:4,28:30)
> origin <- rep(1,7)
> #identify genes
> RP.out <- RPadvance(golub[,subset],golub.cl[subset],
+ origin,plot=FALSE,rand=123)
The data is from 1 different origins
Rank Product analysis for two-class case
Starting 100 permutations...
Computing pfp...
>
> #For data from multiple origins
>
> #Load the data arab in the package, which contains
> # the expression of 22,081 genes
> # of control and treatment group from the experiments
> #indenpently conducted at two
> #laboratories.
> data(arab)
> arab.origin #1 1 1 1 1 1 2 2 2 2
[1] 1 1 1 1 1 1 2 2 2 2
> arab.cl #0 0 0 1 1 1 0 0 1 1
[1] 0 0 0 1 1 1 0 0 1 1
> RP.adv.out <- RPadvance(arab,arab.cl,arab.origin,
+ num.perm=100,gene.names=arab.gnames,logged=TRUE,rand=123)
The data is from 2 different origins
Rank Product analysis for two-class case
Starting 100 permutations...
Computing pfp...
>
> attributes(RP.adv.out)
$names
[1] "pfp" "pval" "RPs" "RPrank" "Orirank" "AveFC" "all.FC"
> head(RP.adv.out$pfp)
          class1 < class2 class1 > class 2
244901_at       1.1315068         1.046186
244902_at       1.1514074         1.046474
244903_at       1.2023585         1.028143
244904_at       1.1455596         1.089375
244905_at       1.0909831         1.029432
244906_at       0.8980702         1.058403
> head(RP.adv.out$RPs)
          class1 < class2 class1 > class 2
244901_at        230.7605         177.5393
244902_at        221.8292         247.0897
244903_at        145.5476         182.6056
244904_at        224.8413         150.9007
244905_at        261.0395         191.0797
244906_at        105.1817         349.2341
> head(RP.adv.out$AveFC)
          log/unlog(class1/class2)
244901_at              0.060713984
244902_at             -0.005148036
244903_at             -0.008684345
244904_at              0.128771916
244905_at              0.030076779
244906_at             -0.217625131
>
> dev.off()
null device
1
>