Last data update: 2014.03.03

R: Advanced Rank Product Analysis of Microarray
RPadvance {RankProd}    R Documentation

Advanced Rank Product Analysis of Microarray

Description

Advanced rank product method for identifying differentially expressed genes. It can combine data from different studies, e.g. data sets generated at different laboratories.

Usage

    RPadvance(data, cl, origin, num.perm = 100, logged = TRUE,
              na.rm = FALSE, gene.names = NULL, plot = FALSE,
              rand = NULL, huge = FALSE)

Arguments

data

the data set that should be analyzed. Every row of this data set must correspond to a gene.

cl

a vector containing the class labels of the samples. In the two-class unpaired case, the label of a sample is either 0 (e.g., control group) or 1 (e.g., case group). For one-class data, the label for each sample should be 1.

origin

a vector containing the origin labels of the samples. For example, for data sets generated at multiple laboratories, the label is the same for samples from one lab and different for samples from different labs.

num.perm

number of permutations used to estimate the null density. Default is 100.

logged

if "TRUE" (default), the data have already been log-transformed; otherwise set it to "FALSE".

na.rm

if 'FALSE' (default), NA values will not be used in computing ranks. If 'TRUE', missing values will be replaced by the gene-wise mean of the non-missing values. A gene with all values missing will be assigned "NA".

gene.names

if "NULL" (default), no gene names will be attached to the estimated percentage of false predictions (pfp).

plot

If "TRUE", plot the estimated pfp versus the rank of each gene.

rand

if specified, the random number generator will be put in a reproducible state.

huge

If "TRUE", use an alternative method for computation. Using considerably less memory, this allows the Rank Product to be calculated for larger input data (see details). However, the result will not contain the Orirank value.

For input with n rows, m=m1+m2 columns for two classes, and k permutations, memory requirements are 2n with huge=TRUE instead of n*k+n*m1*m2.
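
As a rough illustration of the memory formula above, the following sketch evaluates both expressions for an input the size of the arab example data (22,081 genes, 5 samples per class, 100 permutations). The helper name rp_memory is hypothetical and not part of RankProd. The documentation does not state the unit, so the numbers are best read as relative sizes rather than bytes.

```r
# Hypothetical helper: evaluates the two memory expressions quoted above.
# n = rows (genes), m1/m2 = samples per class, k = permutations.
rp_memory <- function(n, m1, m2, k) {
  c(default = n * k + n * m1 * m2,  # huge = FALSE
    huge    = 2 * n)                # huge = TRUE
}

mem <- rp_memory(n = 22081, m1 = 5, m2 = 5, k = 100)
print(mem)
# Ratio of the default requirement to the huge = TRUE requirement
print(unname(mem["default"] / mem["huge"]))
```

Under these assumptions the default method needs about 62 times as much as huge=TRUE, and the gap widens with more permutations or more samples per class.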

Value

A list describing the genes identified as differentially expressed between two classes. The identification consists of two parts: genes up-regulated and genes down-regulated in class 2 compared to class 1.

pfp

estimated percentage of false positive predictions (pfp) up to the position of each gene, for each of the two identifications

pval

estimated p-value for each gene being up- and down-regulated

RPs

original rank product of each gene, for each of the two identifications

RPrank

rank of the rank products of each gene in ascending order

Orirank

original ranks in each comparison, which are used to compute the rank product. Not present if huge=TRUE is used.

AveFC

fold change of average expression under class 1 over that under class 2. With multiple origins, it is averaged across all origins. It is a log fold change if the data are on the log scale, and the original fold change if the data are unlogged.

all.FC

fold change of class 1/class 2 under each origin. It is a log fold change if the data are on the log scale.
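
The relationship between all.FC and AveFC can be sketched in base R for log-scale data. This is an illustration of the description above, not the package's internal code: the function name fold_change_by_origin and the toy matrix are hypothetical, and it assumes class 1 is the group labelled 0 and class 2 the group labelled 1.

```r
# Toy log-scale data: 2 genes, 6 samples from 2 origins (made-up values).
x <- rbind(g1 = c(2, 4, 1, 1, 3, 2),
           g2 = c(5, 5, 6, 8, 4, 6))
cl     <- c(0, 0, 1, 1, 0, 1)   # assumed: 0 = class 1, 1 = class 2
origin <- c(1, 1, 1, 1, 2, 2)

# Log fold change (class 1 minus class 2) computed separately per origin.
fold_change_by_origin <- function(x, cl, origin) {
  sapply(sort(unique(origin)), function(o) {
    in_o <- origin == o
    rowMeans(x[, in_o & cl == 0, drop = FALSE]) -
      rowMeans(x[, in_o & cl == 1, drop = FALSE])
  })
}

all_fc <- fold_change_by_origin(x, cl, origin)  # genes x origins
ave_fc <- rowMeans(all_fc)                      # averaged across origins
print(all_fc)
print(ave_fc)
```

For unlogged data the same idea would use ratios of class means instead of differences.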

Note

Percentage of false prediction (pfp) is, in theory, equivalent to the false discovery rate (FDR), and it can be larger than 1.
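
A small sketch may help show why pfp can exceed 1. It assumes the estimate described by Breitling et al. (2004): the average number of permuted rank products at or below a gene's observed rank product, divided by the gene's rank. The function estimate_pfp and the toy inputs are hypothetical and are not taken from the package's source.

```r
# Hypothetical sketch of a pfp estimate; not RankProd's internal code.
# rp_obs : observed rank products, one per gene
# rp_perm: permuted rank products, one column per permutation
estimate_pfp <- function(rp_obs, rp_perm) {
  n_perm <- ncol(rp_perm)
  # Expected number of false positives at each gene's threshold
  exp_false <- sapply(rp_obs, function(v) sum(rp_perm <= v) / n_perm)
  # Rank of each gene by observed rank product, smallest first
  rk <- rank(rp_obs, ties.method = "min")
  exp_false / rk
}

rp_obs  <- c(g1 = 2.0, g2 = 5.0, g3 = 9.0)        # toy values
rp_perm <- cbind(c(1.5, 1.8, 8.0), c(1.0, 6.0, 7.0))
pfp <- estimate_pfp(rp_obs, rp_perm)
print(pfp)
```

Here the permuted data contain on average 1.5 rank products as small as that of g1, which is more than the single gene selected at that threshold, so the ratio is 1.5.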

The function looks for up- and down-regulated genes in two separate steps, thus two pfps are computed and used to identify the genes that belong to each group.
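
The two separate steps can be sketched as follows for a single origin with log-scale data. This is a simplified illustration of the rank product idea (geometric mean of ranks over all pairwise comparisons), not the package's implementation; the helper rank_product and the toy matrix are hypothetical.

```r
# Toy log-scale data: 4 genes, 2 samples per class (made-up values).
expr <- rbind(gA = c(1.0, 1.2, 5.0, 5.5),
              gB = c(3.0, 3.1, 3.0, 3.2),
              gC = c(4.0, 4.2, 3.5, 3.6),
              gD = c(6.0, 6.3, 2.0, 2.1))
cl <- c(0, 0, 1, 1)                       # assumed: 0 = class 1, 1 = class 2
class1 <- expr[, cl == 0, drop = FALSE]
class2 <- expr[, cl == 1, drop = FALSE]

# Log ratios class 1 / class 2 for every pair of samples (genes x pairs).
pairs  <- expand.grid(i = seq_len(ncol(class1)), j = seq_len(ncol(class2)))
ratios <- mapply(function(i, j) class1[, i] - class2[, j], pairs$i, pairs$j)

# Geometric mean of the within-comparison ranks.
rank_product <- function(r) {
  ranks <- apply(r, 2, rank)              # rank genes in each comparison
  exp(rowMeans(log(ranks)))
}

rp_up   <- rank_product(ratios)           # small values: class1 < class2
rp_down <- rank_product(-ratios)          # small values: class1 > class2
print(rbind(rp_up, rp_down))
```

Ranking the ratios once in ascending and once in descending order yields two rank products per gene, which is why every result component has two columns.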

The function can replace the function RP in the same library. It is a more general version, as it is able to handle data from different origins.

Author(s)

Fangxin Hong fhong@salk.edu

References

Breitling, R., Armengaud, P., Amtmann, A., and Herzyk, P. (2004) Rank Products: A simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments, FEBS Letters, 573, 83-92

See Also

topGene, RP, plotRP, RSadvance

Examples

      # Load the data of Golub et al. (1999). data(golub)
      # provides a 3051x38 gene expression matrix called golub,
      # a vector of length 38 called golub.cl that holds the
      # 38 class labels, and a matrix called golub.gnames
      # whose third column contains the gene names.
      data(golub)

      ## For data from a single origin
      subset <- c(1:4,28:30)
      origin <- rep(1,7)
      # Identify differentially expressed genes
      RP.out <- RPadvance(golub[,subset],golub.cl[subset],
                           origin,plot=FALSE,rand=123)

      ## For data from multiple origins

      # Load the arab data set from the package, which contains
      # the expression of 22,081 genes in control and treatment
      # groups from experiments conducted independently at two
      # laboratories.
      data(arab)
      arab.origin #1 1 1 1 1 1 2 2 2 2
      arab.cl #0 0 0 1 1 1 0 0 1 1
      RP.adv.out <- RPadvance(arab,arab.cl,arab.origin,
                    num.perm=100,gene.names=arab.gnames,logged=TRUE,rand=123)

      attributes(RP.adv.out)
      head(RP.adv.out$pfp)
      head(RP.adv.out$RPs)
      head(RP.adv.out$AveFC)
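
Once the pfp matrix is available, candidate genes can be picked by thresholding each column separately. The sketch below does not call RPadvance; it rebuilds the first six pfp rows printed in the Results transcript so that it runs on its own. The cutoff of 1.0 is chosen only so that these six rows return something; a real analysis would use a much smaller value.

```r
# First six rows of RP.adv.out$pfp, copied from the transcript output.
ids <- c("244901_at", "244902_at", "244903_at",
         "244904_at", "244905_at", "244906_at")
pfp <- matrix(c(1.1315068, 1.046186,
                1.1514074, 1.046474,
                1.2023585, 1.028143,
                1.1455596, 1.089375,
                1.0909831, 1.029432,
                0.8980702, 1.058403),
              ncol = 2, byrow = TRUE,
              dimnames = list(ids, c("class1 < class2", "class1 > class 2")))

cutoff <- 1.0  # illustrative only
# Each direction is tested against its own pfp column.
up_in_class2   <- rownames(pfp)[pfp[, "class1 < class2"] < cutoff]
down_in_class2 <- rownames(pfp)[pfp[, "class1 > class 2"] < cutoff]
print(up_in_class2)
print(down_in_class2)
```

Only 244906_at falls below the cutoff in the first column, and no gene does in the second.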


Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(RankProd)
>       # Load the data of Golub et al. (1999). data(golub)
>       # provides a 3051x38 gene expression matrix called golub,
>       # a vector of length 38 called golub.cl that holds the
>       # 38 class labels, and a matrix called golub.gnames
>       # whose third column contains the gene names.
>       data(golub)
> 
>       ##For data with single origin
>       subset <- c(1:4,28:30)
>       origin <- rep(1,7)
>       #identify genes 
>       RP.out <- RPadvance(golub[,subset],golub.cl[subset],
+                            origin,plot=FALSE,rand=123)
 The data is from  1 different origins 
 
Rank Product analysis for two-class case 
 
Starting  100 permutations... 
Computing pfp... 
>       
>       #For data from multiple origins
>       
>       # Load the arab data set from the package, which contains
>       # the expression of 22,081 genes in control and treatment
>       # groups from experiments conducted independently at two
>       # laboratories.
>       data(arab)
>       arab.origin #1 1 1 1 1 1 2 2 2 2
 [1] 1 1 1 1 1 1 2 2 2 2
>       arab.cl #0 0 0 1 1 1 0 0 1 1
 [1] 0 0 0 1 1 1 0 0 1 1
>       RP.adv.out <- RPadvance(arab,arab.cl,arab.origin,
+                     num.perm=100,gene.names=arab.gnames,logged=TRUE,rand=123)
 The data is from  2 different origins 
 
Rank Product analysis for two-class case 
 
Starting  100 permutations... 
Computing pfp... 
> 
>       attributes(RP.adv.out)
$names
[1] "pfp"     "pval"    "RPs"     "RPrank"  "Orirank" "AveFC"   "all.FC" 

>       head(RP.adv.out$pfp)
          class1 < class2 class1 > class 2
244901_at       1.1315068         1.046186
244902_at       1.1514074         1.046474
244903_at       1.2023585         1.028143
244904_at       1.1455596         1.089375
244905_at       1.0909831         1.029432
244906_at       0.8980702         1.058403
>       head(RP.adv.out$RPs)
          class1 < class2 class1 > class 2
244901_at        230.7605         177.5393
244902_at        221.8292         247.0897
244903_at        145.5476         182.6056
244904_at        224.8413         150.9007
244905_at        261.0395         191.0797
244906_at        105.1817         349.2341
>       head(RP.adv.out$AveFC)
          log/unlog(class1/class2)
244901_at              0.060713984
244902_at             -0.005148036
244903_at             -0.008684345
244904_at              0.128771916
244905_at              0.030076779
244906_at             -0.217625131