R Graphical Manual

Last data update: 2014.03.03

perfDSC

R Documentation

Area Under the Precision-Recall Curve (AUPR), Belief Confusion Metric (BCM) and Correct Class Enrichment Metric (CCEM).

Description

This function implements the three metrics used in the IMPROVER Diagnostic Signature Challenge.

Usage

perfDSC(pred,gs)

Arguments

`pred`	A belief matrix, with rows corresponding to samples and columns to classes. Values are between 0 and 1, and each row sums to 1. The matrix must have row names. The belief values are the predictions made by a model.
`gs`	A matrix, with rows corresponding to samples and columns to classes, that gives the true (gold standard) class membership of the samples.
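
The constraints above can be checked before calling `perfDSC`. The following is a minimal sketch in base R; `labels_to_gs` and `check_beliefs` are hypothetical helper names introduced here for illustration and are not part of maPredictDSC. It assumes the gold standard is a 0/1 indicator matrix, as in the example further down.

```r
# Hypothetical helper (not part of maPredictDSC): build a 0/1 gold-standard
# matrix from a vector of class labels, one column per class.
labels_to_gs <- function(labels, sample_names) {
  labels <- factor(labels)
  gs <- sapply(levels(labels), function(cl) as.numeric(labels == cl))
  rownames(gs) <- sample_names
  gs
}

# Hypothetical helper: stop with an error unless pred is a matrix with row
# names, values in [0, 1], and rows that sum to 1 (within a tolerance).
check_beliefs <- function(pred, tol = 1e-8) {
  stopifnot(is.matrix(pred), !is.null(rownames(pred)))
  stopifnot(all(pred >= 0 & pred <= 1))
  stopifnot(all(abs(rowSums(pred) - 1) < tol))
  invisible(TRUE)
}

samples <- paste0("sample", 1:4)
gs_small <- labels_to_gs(c("A", "A", "B", "C"), samples)

pred_small <- rbind(c(0.8, 0.2, 0),
                    c(0.5, 0.3, 0.2),
                    c(0.1, 0.9, 0),
                    c(0, 0.25, 0.75))
dimnames(pred_small) <- list(samples, c("A", "B", "C"))
check_beliefs(pred_small)
```

Using a tolerance rather than an exact comparison avoids false alarms when row sums differ from 1 only by floating-point rounding.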

Details

See the publication listed under References for the definitions of the three metrics.

Value

A named numeric vector of performance metrics. In the example run shown under Results, its elements are named BCM, CCEM, AUPR and AUC.
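
Because the result is a named vector, individual metrics can be extracted by name. The sketch below uses a stand-in vector `res` (a name chosen here for illustration) whose names and values are copied from the example output under Results, so it runs without the package installed.

```r
# Stand-in for the value returned by perfDSC(pred, gs); the names and
# numbers are copied from the example output, not recomputed here.
res <- c(BCM = 0.8750000, CCEM = 0.8958333, AUPR = 0.9875000, AUC = 0.9947917)

# Extract a single metric by name (unname() drops the label).
bcm <- unname(res["BCM"])

# Select several metrics at once and round them for reporting.
report <- round(res[c("BCM", "CCEM")], 3)

# Name of the metric with the largest value.
best <- names(which.max(res))
```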

Author(s)

Adi Laurentiu Tarca <atarca@med.wayne.edu>

References

Adi L. Tarca, Mario Lauria, Michael Unger, Erhan Bilal, Stephanie Boue, Kushal Kumar Dey, Julia Hoeng, Heinz Koeppl, Florian Martin, Pablo Meyer, Preetam Nandy, Raquel Norel, Manuel Peitsch, Jeremy J Rice, Roberto Romero, Gustavo Stolovitzky, Marja Talikka, Yang Xiang, Christoph Zechner, and IMPROVER DSC Collaborators, Strengths and limitations of microarray-based phenotype prediction: Lessons learned from the IMPROVER Diagnostic Signature Challenge. Bioinformatics, submitted 2013.

Examples

library(maPredictDSC)
# Assume a 3-class classification problem; gs is the gold standard and pred holds the predictions
gs <- cbind(A = c(1,1,1,1,0,0,0,0,0,0,0,0), B = c(0,0,0,0,1,1,1,1,0,0,0,0), C = c(0,0,0,0,0,0,0,0,1,1,1,1))
rownames(gs) <- paste("sample", 1:12, sep = "")
pred <- cbind(A = c(0.6,0.9,1,0.3,0,0,0,0,0,0,0,0), B = c(0.4,0.1,0,0.7,1,1,0.7,1,0,0,0,0), C = c(0,0,0,0,0,0,0.3,0,1,1,1,1))
rownames(pred) <- paste("sample", 1:12, sep = "")
# Make sure each row sums to 1 in both gs and pred
apply(gs, 1, sum)
apply(pred, 1, sum)
# Compute performance
perfDSC(pred, gs)
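
As a quick sanity check on the same toy data, two simple summaries can be computed in base R without the package: the mean belief assigned to each sample's true class, and the accuracy of hard class calls taken as the largest belief per row. This is only a sketch. The mean true-class belief comes out at 0.875, which coincides with the BCM value reported under Results for this data, but whether `perfDSC` computes BCM this way in general is an assumption that this page does not confirm.

```r
# Same toy data as in the example above.
gs <- cbind(A = c(1,1,1,1,0,0,0,0,0,0,0,0),
            B = c(0,0,0,0,1,1,1,1,0,0,0,0),
            C = c(0,0,0,0,0,0,0,0,1,1,1,1))
pred <- cbind(A = c(0.6,0.9,1,0.3,0,0,0,0,0,0,0,0),
              B = c(0.4,0.1,0,0.7,1,1,0.7,1,0,0,0,0),
              C = c(0,0,0,0,0,0,0.3,0,1,1,1,1))
rownames(gs) <- rownames(pred) <- paste0("sample", 1:12)

# Belief assigned to the true class of each sample (gs is a 0/1 indicator,
# so the element-wise product keeps only the true-class entry per row).
true_class_belief <- rowSums(pred * gs)
mean_true_belief <- mean(true_class_belief)

# Hard class calls: the column holding the largest belief in each row.
called <- colnames(pred)[max.col(pred, ties.method = "first")]
truth  <- colnames(gs)[max.col(gs, ties.method = "first")]

# Fraction of samples whose call matches the gold standard, and the misses.
accuracy <- mean(called == truth)
misclassified <- rownames(pred)[called != truth]
```

On this data only sample4 is called incorrectly (belief 0.7 for class B against a true class of A), so 11 of the 12 samples are classified correctly.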

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(maPredictDSC)
Loading required package: MASS
Loading required package: affy
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: limma

Attaching package: 'limma'

The following object is masked from 'package:BiocGenerics':

    plotMA

Loading required package: gcrma
Loading required package: ROC
Loading required package: class
Loading required package: e1071
Loading required package: caret
Loading required package: lattice
Loading required package: ggplot2
Loading required package: hgu133plus2.db
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums


Attaching package: 'AnnotationDbi'

The following object is masked from 'package:MASS':

    select

Loading required package: org.Hs.eg.db


Loading required package: ROCR
Loading required package: gplots

Attaching package: 'gplots'

The following object is masked from 'package:IRanges':

    space

The following object is masked from 'package:S4Vectors':

    space

The following object is masked from 'package:stats':

    lowess

Loading required package: LungCancerACvsSCCGEO
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/maPredictDSC/perfDSC.Rd_%03d_medium.png", width=480, height=480)
> ### Name: perfDSC
> ### Title: Area Under the Precision-Recall Curve (AUPR), Belief Confusion
> ###   Metric (BCM) and Correct Class Enrichment Metric (CCEM).
> ### Aliases: perfDSC
> ### Keywords: parametric methods
> 
> ### ** Examples
> 
> # Assume a 3-class classification problem; gs is the gold standard and pred holds the predictions
> gs=cbind(A=c(1,1,1,1,0,0,0,0,0,0,0,0),B=c(0,0,0,0,1,1,1,1,0,0,0,0),C=c(0,0,0,0,0,0,0,0,1,1,1,1))
> rownames(gs)<-paste("sample",1:12,sep="")
> pred=cbind(A=c(0.6,0.9,1,0.3,0,0,0,0,0,0,0,0),B=c(0.4,0.1,0,0.7,1,1,0.7,1,0,0,0,0),C=c(0,0,0,0,0,0,0.3,0,1,1,1,1))
> rownames(pred)<-paste("sample",1:12,sep="")
> # Make sure each row sums to 1 in both gs and pred
> apply(gs,1,sum)
 sample1  sample2  sample3  sample4  sample5  sample6  sample7  sample8 
       1        1        1        1        1        1        1        1 
 sample9 sample10 sample11 sample12 
       1        1        1        1 
> apply(pred,1,sum)
 sample1  sample2  sample3  sample4  sample5  sample6  sample7  sample8 
       1        1        1        1        1        1        1        1 
 sample9 sample10 sample11 sample12 
       1        1        1        1 
> # Compute performance
> perfDSC(pred,gs)
NA in cutpts forces recomputation using smallest gap
NA in cutpts forces recomputation using smallest gap
NA in cutpts forces recomputation using smallest gap
      BCM      CCEM      AUPR       AUC 
0.8750000 0.8958333 0.9875000 0.9947917 
> dev.off()
null device 
          1 
>