Performance measure to use for the evaluation. A
complete list of the performance measures that are available for measure
and x.measure is given in the 'Details' section.
x.measure
A second performance measure. If different from the
default, a two-dimensional curve is created, with x.measure plotted
along the x axis and measure plotted along the y axis. This curve is
parametrized by the cutoff.
...
Optional arguments (specific to individual performance
measures).
Details
Here is the list of available performance measures. Let Y and
Yhat be random variables representing the class and the prediction for
a randomly drawn sample, respectively. We denote by
+ and - the positive and
negative class, respectively. Further, we use the following
abbreviations for empirical quantities: P (# positive
samples), N (# negative samples), TP (# true positives), TN (# true
negatives), FP (# false positives), FN (# false negatives).
rpp:
Rate of positive predictions. P(Yhat = +). Estimated as: (TP+FP)/(TP+FP+TN+FN).
rnp:
Rate of negative predictions. P(Yhat = -). Estimated as: (TN+FN)/(TP+FP+TN+FN).
phi:
Phi correlation coefficient. (TP*TN -
FP*FN)/(sqrt((TP+FN)*(TN+FP)*(TP+FP)*(TN+FN))). Yields a
number between -1 and 1, with 1 indicating a perfect
prediction, 0 indicating a random prediction. Values below 0
indicate a worse than random prediction.
mat:
Matthews correlation coefficient. Same as phi.
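The phi formula above can be checked by hand. The following is a minimal sketch in plain R, not ROCR's internal code; the counts and the helper name phi_coefficient are hypothetical and chosen only for illustration. It also cross-checks the result against the Pearson correlation of the 0/1 class and prediction vectors, which is what the phi coefficient reduces to for binary data.

```r
# Illustrative confusion-matrix counts at one cutoff (hypothetical values)
TP <- 40; FN <- 10; FP <- 5; TN <- 45

phi_coefficient <- function(TP, TN, FP, FN) {
  # as.numeric() keeps the product of the four margins from overflowing integers
  TP <- as.numeric(TP); TN <- as.numeric(TN)
  FP <- as.numeric(FP); FN <- as.numeric(FN)
  (TP * TN - FP * FN) / sqrt((TP + FN) * (TN + FP) * (TP + FP) * (TN + FN))
}

phi <- phi_coefficient(TP, TN, FP, FN)

# Cross-check: phi is the Pearson correlation of the 0/1 class and prediction vectors
y    <- c(rep(1, TP + FN), rep(0, FP + TN))
yhat <- c(rep(1, TP), rep(0, FN), rep(1, FP), rep(0, TN))
phi_from_cor <- cor(y, yhat)
```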
mi:
Mutual information. I(Yhat, Y) := H(Y) - H(Y | Yhat), where H is the
(conditional) entropy. Entropies are estimated naively (no bias
correction).
chisq:
Chi square test statistic. See ?chisq.test
for details. Note that R might raise a warning if the sample size
is too small.
odds:
Odds ratio. (TP*TN)/(FN*FP). Note that odds ratio produces
Inf or NA values for all cutoffs corresponding to FN=0 or
FP=0. This can substantially decrease the plotted cutoff region.
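To illustrate why such cutoffs drop out of a plot, here is a small sketch using plain R arithmetic on hypothetical counts (odds_ratio is an illustrative helper, not a ROCR function). A zero denominator gives Inf, and a zero numerator over a zero denominator gives NaN, which is.na() treats as missing.

```r
# Odds ratio from confusion-matrix counts, exactly as written in the formula above
odds_ratio <- function(TP, TN, FP, FN) (TP * TN) / (FN * FP)

or_finite <- odds_ratio(TP = 40, TN = 45, FP = 5, FN = 10)  # 1800 / 50
or_inf    <- odds_ratio(TP = 50, TN = 45, FP = 5, FN = 0)   # positive number / 0
or_nan    <- odds_ratio(TP = 0,  TN = 45, FP = 5, FN = 0)   # 0 / 0

# Non-finite values cannot be drawn, so these cutoffs vanish from the curve
is.finite(c(or_finite, or_inf, or_nan))
```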
lift:
Lift value. P(Yhat = + | Y = +)/P(Yhat = +).
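As a rough sketch of how the lift value could be estimated from the empirical quantities defined above: the numerator is estimated by the true positive rate TP/P and the denominator by the rate of positive predictions (TP+FP)/(P+N). The helper name lift_value and the counts are hypothetical, and this is not ROCR's internal code.

```r
lift_value <- function(TP, FP, P, N) {
  tpr <- TP / P               # estimates P(Yhat = + | Y = +)
  rpp <- (TP + FP) / (P + N)  # estimates P(Yhat = +)
  tpr / rpp
}

# Hypothetical counts: 50 positives, 50 negatives, 45 samples predicted positive
lift_example <- lift_value(TP = 40, FP = 5, P = 50, N = 50)

# Predicting every sample as positive gives no lift over the base rate
lift_all_positive <- lift_value(TP = 50, FP = 50, P = 50, N = 50)
```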
f:
Precision-recall F measure (van Rijsbergen, 1979). Weighted
harmonic mean of precision (P) and recall (R); note that P here denotes
precision, not the number of positive samples.
F = 1/(alpha*1/P + (1-alpha)*1/R). If alpha=1/2, the mean is balanced. A
frequent equivalent formulation is
F = (beta^2+1) * P * R / (R + beta^2 * P). In this formulation, the mean is
balanced if beta=1. Currently, ROCR only accepts the alpha version
as input (e.g. alpha=0.5). If no value for alpha is given, the mean is
balanced by default.
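Because only the alpha version is accepted, a beta value has to be converted first. Equating the two formulas above gives alpha = 1/(1 + beta^2). The sketch below checks that conversion numerically in plain R; the helper names f_alpha, f_beta and alpha_from_beta and the precision/recall values are hypothetical.

```r
# The two formulations of the F measure, as given in the text
f_alpha <- function(prec, rec, alpha = 0.5) 1 / (alpha / prec + (1 - alpha) / rec)
f_beta  <- function(prec, rec, beta = 1) (beta^2 + 1) * prec * rec / (rec + beta^2 * prec)

# Conversion obtained by equating the two formulas
alpha_from_beta <- function(beta) 1 / (1 + beta^2)

prec <- 0.8; rec <- 0.5   # illustrative precision and recall

f_balanced <- f_alpha(prec, rec)                                 # alpha = 1/2
f_b2       <- f_beta(prec, rec, beta = 2)                        # recall-weighted
f_a_equiv  <- f_alpha(prec, rec, alpha = alpha_from_beta(2))     # same value via alpha
```

The resulting alpha (0.2 for beta=2) is the value one would then pass as the optional alpha argument.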
rch:
ROC convex hull. A ROC (=tpr vs fpr) curve with concavities
(which represent suboptimal choices of cutoff) removed (Fawcett 2001). Since the
result is already a parametric performance curve, it cannot be
used in combination with other measures.
auc:
Area under the ROC curve. This is equal to the value of the
Wilcoxon-Mann-Whitney test statistic and also the probability that the
classifier will score a randomly drawn positive sample higher than a
randomly drawn negative sample. Since the output of
auc is cutoff-independent, this
measure cannot be combined with other measures into a parametric
curve. The partial area under the ROC curve up to a given false
positive rate can be calculated by passing the optional parameter
fpr.stop=0.5 (or any other value between 0 and 1) to performance.
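The equivalences stated above can be verified on a toy example. The sketch below uses only base R and hypothetical scores and labels; it is not ROCR's implementation. It computes the full AUC three ways: from ranks, as the fraction of positive/negative pairs ranked correctly (ties counted as one half), and from the statistic reported by wilcox.test divided by the number of such pairs.

```r
# Hypothetical classifier scores and true classes (TRUE = positive), no ties
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1)
labels <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)

n_pos <- sum(labels)
n_neg <- sum(!labels)

# 1. Rank-based formula (rank() assigns average ranks to ties)
r <- rank(scores)
auc_rank <- (sum(r[labels]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# 2. Fraction of (positive, negative) pairs in which the positive scores higher
pairs <- outer(scores[labels], scores[!labels],
               function(p, n) (p > n) + 0.5 * (p == n))
auc_pairs <- mean(pairs)

# 3. Wilcoxon-Mann-Whitney statistic divided by the number of pairs
w <- unname(wilcox.test(scores[labels], scores[!labels])$statistic)
auc_wmw <- w / (n_pos * n_neg)
```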
prbe:
Precision-recall break-even point. The cutoff(s) where
precision and recall are equal. At this point, positive and negative
predictions are made at the same rate as their prevalence in the
data. Since the output of
prbe is just a cutoff-independent scalar, this
measure cannot be combined with other measures into a parametric curve.
cal:
Calibration error. The calibration error is the
absolute difference between predicted confidence and actual reliability. This
error is estimated at all cutoffs by sliding a window across the
range of possible cutoffs. The default window size of 100 can be
adjusted by passing the optional parameter window.size=200
to performance. E.g., if for several
positive samples the output of the classifier is around 0.75, you might
expect from a well-calibrated classifier that the fraction of them
which is correctly predicted as positive is also around 0.75. In a
well-calibrated classifier, the probabilistic confidence estimates
are realistic. Only for use with
probabilistic output (i.e. scores between 0 and 1).
mxe:
Mean cross-entropy. Only for use with
probabilistic output. MXE := - 1/(P+N) * ( ∑_{y_i=+}
ln(yhat_i) + ∑_{y_i=-} ln(1-yhat_i) ). Since the output of
mxe is just a cutoff-independent scalar, this
measure cannot be combined with other measures into a parametric curve.
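A minimal sketch of the mean cross-entropy in plain R, assuming the factor -1/(P+N) applies to both sums. The helper name mean_cross_entropy and the example scores are hypothetical, and this is not ROCR's internal code.

```r
mean_cross_entropy <- function(yhat, is_pos) {
  # Sum ln(yhat) over positives and ln(1 - yhat) over negatives,
  # then negate and average over all P + N samples
  -(sum(log(yhat[is_pos])) + sum(log(1 - yhat[!is_pos]))) / length(yhat)
}

yhat   <- c(0.9, 0.8, 0.3, 0.2)          # hypothetical probabilistic scores
is_pos <- c(TRUE, TRUE, FALSE, FALSE)    # true classes

mxe <- mean_cross_entropy(yhat, is_pos)

# A positive sample scored exactly 0 makes the measure infinite
mxe_degenerate <- mean_cross_entropy(c(0, 0.5), c(TRUE, FALSE))
```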
rmse:
Root-mean-squared error. Only for use with
numerical class labels. RMSE := sqrt(1/(P+N) ∑_i (y_i -
yhat_i)^2). Since the output of
rmse is just a cutoff-independent scalar, this
measure cannot be combined with other measures into a parametric curve.
sar:
Score combining performance measures of different
characteristics, in the attempt of creating a more "robust"
measure (cf. Caruana R., ROCAI2004):
SAR = 1/3 * ( Accuracy + Area under the ROC curve + Root
mean-squared error ).
ecost:
Expected cost. For details on cost curves,
cf. Drummond & Holte 2000, 2004. ecost has an obligatory x
axis, the so-called 'probability-cost function'; thus it cannot be
combined with other measures. Although with ecost one is mainly
interested in the lower envelope of a set of lines, it can be
instructive to plot the whole set of lines in addition to the lower
envelope. An example is given in demo(ROCR).
cost:
Cost of a classifier when
class-conditional misclassification costs are explicitly given.
Accepts the optional parameters cost.fp and
cost.fn, by which the costs for false positives and
negatives can be adjusted, respectively. By default, both are set
to 1.
Value
An S4 object of class performance.
Note
Here is how to call 'performance' to create some standard
evaluation plots:
### Name: performance
### Title: Function to create performance objects
### Aliases: performance
### Keywords: classif

### ** Examples

## computing a simple ROC curve (x-axis: fpr, y-axis: tpr)
library(ROCR)
data(ROCR.simple)
pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
perf <- performance(pred, "tpr", "fpr")
plot(perf)

## precision/recall curve (x-axis: recall, y-axis: precision)
perf1 <- performance(pred, "prec", "rec")
plot(perf1)

## sensitivity/specificity curve (x-axis: specificity,
## y-axis: sensitivity)
perf1 <- performance(pred, "sens", "spec")
plot(perf1)