Last data update: 2014.03.03

R: Calculate AUC value from a TCC-class object
calcAUCValueR Documentation

Calculate AUC value from a TCC-class object

Description

This function calculates AUC (Area under the ROC curve) value from a TCC-class object for simulation study.

Usage

calcAUCValue(tcc, t = 1)

Arguments

tcc

TCC-class object having values in both stat$rank and simulation$trueDEG fields.

t

numeric value (between 0 and 1) specifying the FPR (i.e., the x-axis of ROC curve). AUC value is calculated from 0 to t. The default is 1.

Details

This function is generally used after the estimateDE function that estimates p-values (and the derivatives such as the q-values and the ranks) for individual genes based on the statistical model for differential expression (DE) analysis. In case of the simulation analysis, we know which genes are DEGs or non-DEGs in advance and the information is stored in the simulation$trueDEG field of the TCC-class object tcc (i.e., tcc$simulation$trueDEG). The calcAUCValue function calculates the AUC value between the ranked gene list obtained by the estimateDE function and the truth obtained by the simulateReadCounts function. A well-ranked gene list should have a high AUC value (i.e., high sensitivity and specificity).

Value

numeric scalar.

Examples

# Analyzing a simulation data for comparing two groups
# (G1 vs. G2) with biological replicates.
# the first 200 genes are DEGs, where 180 are up-regulated in G1.
# The DE analysis is performed by an exact test in edgeR coupled
# with the DEGES/edgeR normalization factors.
tcc <- simulateReadCounts(Ngene = 1000, PDEG = 0.2,
                          DEG.assign = c(0.9, 0.1),
                          DEG.foldchange = c(4, 4), 
                          replicates = c(3, 3))
tcc <- calcNormFactors(tcc, norm.method = "tmm", test.method = "edger",
                       iteration = 1, FDR = 0.1, floorPDEG = 0.05)
tcc <- estimateDE(tcc, test.method = "edger", FDR = 0.1)
calcAUCValue(tcc)


# Analyzing a simulation data for comparing two groups
# (G1 vs. G2) without replicates.
# the levels of DE are 3-fold in G1 and 7-fold in G2
# The DE analysis is performed by an negative binomial test in
# DESeq coupled with the DEGES/DESeq normalization factors.
tcc <- simulateReadCounts(Ngene = 1000, PDEG = 0.2,
                          DEG.assign = c(0.9, 0.1),
                          DEG.foldchange = c(3, 7), 
                          replicates = c(1, 1))
tcc <- calcNormFactors(tcc, norm.method = "deseq", test.method = "deseq",
                       iteration = 1, FDR = 0.1, floorPDEG = 0.05)
tcc <- estimateDE(tcc, test.method = "deseq", FDR = 0.1)
calcAUCValue(tcc)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(TCC)
Loading required package: DESeq
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: locfit
locfit 1.5-9.1 	 2013-03-22
Loading required package: lattice
    Welcome to 'DESeq'. For improved performance, usability and
    functionality, please consider migrating to 'DESeq2'.
Loading required package: DESeq2
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Loading required package: SummarizedExperiment

Attaching package: 'DESeq2'

The following objects are masked from 'package:DESeq':

    estimateSizeFactorsForMatrix, getVarianceStabilizedData,
    varianceStabilizingTransformation

Loading required package: edgeR
Loading required package: limma

Attaching package: 'limma'

The following object is masked from 'package:DESeq2':

    plotMA

The following object is masked from 'package:DESeq':

    plotMA

The following object is masked from 'package:BiocGenerics':

    plotMA

Loading required package: baySeq
Loading required package: abind
Loading required package: perm
Loading required package: ROC

Attaching package: 'TCC'

The following object is masked from 'package:edgeR':

    calcNormFactors

> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/TCC/calcAUCValue.Rd_%03d_medium.png", width=480, height=480)
> ### Name: calcAUCValue
> ### Title: Calculate AUC value from a TCC-class object
> ### Aliases: calcAUCValue
> ### Keywords: methods
> 
> ### ** Examples
> 
> # Analyzing a simulation data for comparing two groups
> # (G1 vs. G2) with biological replicates.
> # the first 200 genes are DEGs, where 180 are up-regulated in G1.
> # The DE analysis is performed by an exact test in edgeR coupled
> # with the DEGES/edgeR normalization factors.
> tcc <- simulateReadCounts(Ngene = 1000, PDEG = 0.2,
+                           DEG.assign = c(0.9, 0.1),
+                           DEG.foldchange = c(4, 4), 
+                           replicates = c(3, 3))
TCC::INFO: Generating simulation data under NB distribution ...
TCC::INFO: (genesizes   :  1000 )
TCC::INFO: (replicates  :  3, 3 )
TCC::INFO: (PDEG        :  0.18, 0.02 )
> tcc <- calcNormFactors(tcc, norm.method = "tmm", test.method = "edger",
+                        iteration = 1, FDR = 0.1, floorPDEG = 0.05)
TCC::INFO: Calculating normalization factors using DEGES
TCC::INFO: (iDEGES pipeline : tmm - [ edger - tmm ] X 1 )
TCC::INFO: Done.
> tcc <- estimateDE(tcc, test.method = "edger", FDR = 0.1)
TCC::INFO: Identifying DE genes using edger ...
TCC::INFO: Done.
> calcAUCValue(tcc)
NA in cutpts forces recomputation using smallest gap
[1] 0.8970781
> 
> 
> # Analyzing a simulation data for comparing two groups
> # (G1 vs. G2) without replicates.
> # the levels of DE are 3-fold in G1 and 7-fold in G2
> # The DE analysis is performed by an negative binomial test in
> # DESeq coupled with the DEGES/DESeq normalization factors.
> tcc <- simulateReadCounts(Ngene = 1000, PDEG = 0.2,
+                           DEG.assign = c(0.9, 0.1),
+                           DEG.foldchange = c(3, 7), 
+                           replicates = c(1, 1))
TCC::INFO: Generating simulation data under NB distribution ...
TCC::INFO: (genesizes   :  1000 )
TCC::INFO: (replicates  :  1, 1 )
TCC::INFO: (PDEG        :  0.18, 0.02 )
> tcc <- calcNormFactors(tcc, norm.method = "deseq", test.method = "deseq",
+                        iteration = 1, FDR = 0.1, floorPDEG = 0.05)
TCC::INFO: Calculating normalization factors using DEGES
TCC::INFO: (iDEGES pipeline : deseq - [ deseq - deseq ] X 1 )
TCC::INFO: Done.
> tcc <- estimateDE(tcc, test.method = "deseq", FDR = 0.1)
TCC::INFO: Identifying DE genes using deseq ...
TCC::INFO: Done.
> calcAUCValue(tcc)
NA in cutpts forces recomputation using smallest gap
[1] 0.7105812
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>