Last data update: 2014.03.03

perfDSC R Documentation

Area Under the Precision-Recall Curve (AUPR), Belief Confusion Metric (BCM) and Correct Class Enrichment Metric (CCEM).

Description

This function implements the three metrics used in the IMPROVER Diagnostic Signature Challenge.

Usage

perfDSC(pred,gs)

Arguments

pred

A belief matrix, with rows corresponding to samples and columns to classes. The values are between 0 and 1, and each row sums to 1. The matrix must have row names. The belief values are the result of a prediction made by a model.

gs

A matrix, with rows corresponding to samples and columns to classes, that gives the true (gold standard) class membership of the samples.
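
Before calling perfDSC, it can help to confirm that both inputs satisfy the constraints described above. The following is a minimal sketch and not part of maPredictDSC: the helper names check_belief_matrix and labels_to_gs are hypothetical, the tolerance of 1e-8 is an arbitrary choice, and requiring row names on gs as well as pred is an assumption based on the example rather than a documented requirement.

```r
# Hypothetical helper (not part of maPredictDSC): verify the constraints
# stated for 'pred' (and, by assumption, for 'gs').
check_belief_matrix <- function(m, tol = 1e-8) {
  stopifnot(is.matrix(m), is.numeric(m))
  if (is.null(rownames(m))) stop("matrix needs row names (sample IDs)")
  if (anyNA(m)) stop("missing values are not allowed")
  if (any(m < 0 | m > 1)) stop("values must lie between 0 and 1")
  if (any(abs(rowSums(m) - 1) > tol)) stop("each row must sum to 1")
  invisible(TRUE)
}

# Hypothetical helper: build a 0/1 gold-standard matrix from class labels.
labels_to_gs <- function(labels, sample_ids) {
  f <- factor(labels)
  # n x k logical comparison, converted to numeric 0/1
  gs <- outer(as.character(f), levels(f), "==") * 1
  dimnames(gs) <- list(sample_ids, levels(f))
  gs
}

gs_demo <- labels_to_gs(c("A", "A", "B", "C"), paste0("sample", 1:4))
check_belief_matrix(gs_demo)
```

A gold-standard matrix built this way has exactly one 1 per row, so it passes the same checks as a belief matrix.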

Details

See cited documents for more details.

Value

A named vector with the elements BCM, CCEM, AUPR and AUC.

Author(s)

Adi Laurentiu Tarca <atarca@med.wayne.edu>

References

Adi L. Tarca, Mario Lauria, Michael Unger, Erhan Bilal, Stephanie Boue, Kushal Kumar Dey, Julia Hoeng, Heinz Koeppl, Florian Martin, Pablo Meyer, Preetam Nandy, Raquel Norel, Manuel Peitsch, Jeremy J Rice, Roberto Romero, Gustavo Stolovitzky, Marja Talikka, Yang Xiang, Christoph Zechner, and IMPROVER DSC Collaborators, Strengths and limitations of microarray-based phenotype prediction: Lessons learned from the IMPROVER Diagnostic Signature Challenge. Bioinformatics, submitted 2013.

See Also

predictDSC

Examples

#assume a 3-class classification problem; gs is the gold standard and pred holds the predictions
gs=cbind(A=c(1,1,1,1,0,0,0,0,0,0,0,0),B=c(0,0,0,0,1,1,1,1,0,0,0,0),C=c(0,0,0,0,0,0,0,0,1,1,1,1))
rownames(gs)<-paste("sample",1:12,sep="")
pred=cbind(A=c(0.6,0.9,1,0.3,0,0,0,0,0,0,0,0),B=c(0.4,0.1,0,0.7,1,1,0.7,1,0,0,0,0),C=c(0,0,0,0,0,0,0.3,0,1,1,1,1))
rownames(pred)<-paste("sample",1:12,sep="")
#make sure the sum per row is 1 in both gs and pred
apply(gs,1,sum)
apply(pred,1,sum)
#compute performance
perfDSC(pred,gs)
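
As a quick sanity check that does not depend on perfDSC, the toy data can be inspected with base R alone. The sketch below computes the belief each prediction assigns to the sample's true class and the fraction of samples whose highest belief falls on the true class. The variable names true_class, belief_true and top1_agreement are illustrative. In this balanced example the mean belief in the true class happens to equal the BCM value reported under Results; that is an observation about this data set, not a statement of how perfDSC defines or computes BCM.

```r
# Same toy data as in the example above.
gs <- cbind(A = c(1,1,1,1,0,0,0,0,0,0,0,0),
            B = c(0,0,0,0,1,1,1,1,0,0,0,0),
            C = c(0,0,0,0,0,0,0,0,1,1,1,1))
pred <- cbind(A = c(0.6,0.9,1,0.3,0,0,0,0,0,0,0,0),
              B = c(0.4,0.1,0,0.7,1,1,0.7,1,0,0,0,0),
              C = c(0,0,0,0,0,0,0.3,0,1,1,1,1))
rownames(gs) <- rownames(pred) <- paste("sample", 1:12, sep = "")

# Column index of the true class for each sample.
true_class <- max.col(gs, ties.method = "first")

# Belief assigned to the true class, one value per sample.
belief_true <- pred[cbind(seq_len(nrow(pred)), true_class)]
mean(belief_true)   # 0.875

# Fraction of samples whose top-ranked class is the true class.
top1_agreement <- mean(max.col(pred, ties.method = "first") == true_class)
top1_agreement      # 11 of 12 samples; sample4 is assigned to B instead of A
```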



Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(maPredictDSC)
> ### Name: perfDSC
> ### Title: Area Under the Precision-Recall Curve (AUPR), Belief Confusion
> ###   Metric (BCM) and Correct Class Enrichment Metric (CCEM).
> ### Aliases: perfDSC
> ### Keywords: parametric methods
> 
> ### ** Examples
> 
> #assume a 3-class classification problem; gs is the gold standard and pred holds the predictions
> gs=cbind(A=c(1,1,1,1,0,0,0,0,0,0,0,0),B=c(0,0,0,0,1,1,1,1,0,0,0,0),C=c(0,0,0,0,0,0,0,0,1,1,1,1))
> rownames(gs)<-paste("sample",1:12,sep="")
> pred=cbind(A=c(0.6,0.9,1,0.3,0,0,0,0,0,0,0,0),B=c(0.4,0.1,0,0.7,1,1,0.7,1,0,0,0,0),C=c(0,0,0,0,0,0,0.3,0,1,1,1,1))
> rownames(pred)<-paste("sample",1:12,sep="")
> #make sure the sum per row is 1 in both gs and pred
> apply(gs,1,sum)
 sample1  sample2  sample3  sample4  sample5  sample6  sample7  sample8 
       1        1        1        1        1        1        1        1 
 sample9 sample10 sample11 sample12 
       1        1        1        1 
> apply(pred,1,sum)
 sample1  sample2  sample3  sample4  sample5  sample6  sample7  sample8 
       1        1        1        1        1        1        1        1 
 sample9 sample10 sample11 sample12 
       1        1        1        1 
> #compute performance
> perfDSC(pred,gs)
NA in cutpts forces recomputation using smallest gap
NA in cutpts forces recomputation using smallest gap
NA in cutpts forces recomputation using smallest gap
      BCM      CCEM      AUPR       AUC 
0.8750000 0.8958333 0.9875000 0.9947917 