Last data update: 2014.03.03

power.roc.test

R Documentation

Sample size and power computation for ROC curves

Description

Computes sample size, power, significance level or minimum AUC for ROC curves.

Usage

power.roc.test(...)
# One or Two ROC curves test with roc objects:
## S3 method for class 'roc'
power.roc.test(roc1, roc2, sig.level = 0.05, 
power = NULL, alternative = c("two.sided", "one.sided"), 
reuse.auc=TRUE, method = c("delong", "bootstrap", "obuchowski"), ...)
# One ROC curve with a given AUC:
## S3 method for class 'numeric'
power.roc.test(auc = NULL, ncontrols = NULL, 
ncases = NULL, sig.level = 0.05, power = NULL, kappa = 1, 
alternative = c("two.sided", "one.sided"), ...)
# Two ROC curves with the given parameters:
## S3 method for class 'list'
power.roc.test(parslist, ncontrols = NULL, 
ncases = NULL, sig.level = 0.05, power = NULL,  kappa = 1, 
alternative = c("two.sided", "one.sided"), ...)

Arguments

`roc1, roc2`	one or two “roc” objects from the `roc` function.
`auc`	expected AUC.
`parslist`	a `list` of parameters for the two ROC curves test with Obuchowski variance when no empirical ROC curve is known:
	`A1`	binormal A parameter for ROC curve 1
	`B1`	binormal B parameter for ROC curve 1
	`A2`	binormal A parameter for ROC curve 2
	`B2`	binormal B parameter for ROC curve 2
	`rn`	correlation between the variables in control patients
	`ra`	correlation between the variables in case patients
	`delta`	the difference of AUC between the two ROC curves
	For a partial AUC, the following additional parameters must be set:
	`FPR11`	Upper bound of FPR (1 - specificity) of ROC curve 1
	`FPR12`	Lower bound of FPR (1 - specificity) of ROC curve 1
	`FPR21`	Upper bound of FPR (1 - specificity) of ROC curve 2
	`FPR22`	Lower bound of FPR (1 - specificity) of ROC curve 2
`ncontrols, ncases`	number of controls and case observations available.
`sig.level`	expected significance level (probability of type I error).
`power`	expected power of the test (1 - probability of type II error).
`kappa`	expected balance between control and case observations. Must be positive. Only for sample size determination, that is to determine `ncontrols` and `ncases`.
`alternative`	whether a one or two-sided test is performed.
`reuse.auc`	if `TRUE` (default) and the “roc” objects contain an “auc” field, re-use these specifications for the test. See the AUC specification section for more details.
`method`	the method to compute variance and covariance, either “delong”, “bootstrap” or “obuchowski”. The first letter is sufficient. Only for the two ROC curves power calculation. See the `var` and `cov` documentation for more details.
`...`	further arguments passed to or from other methods, especially `auc` (with `reuse.auc=FALSE` or no AUC in the ROC curve), `cov` and `var` (especially arguments `method`, `boot.n` and `boot.stratified`). Ignored (with a warning) with a `parslist`.

Value

An object of class power.htest (such as that given by power.t.test) with the supplied and computed values.
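As a quick illustration of what a power.htest object looks like, the sketch below uses power.t.test from the base stats package, the comparable function named above. The numeric inputs are arbitrary illustration values.

```r
# power.t.test() ships with base R (package 'stats') and returns an object
# of the same class, so it can be used to explore the structure.
res <- power.t.test(n = 20, delta = 1, sd = 1, sig.level = 0.05)

class(res)   # "power.htest"
names(res)   # supplied and computed values, stored as list elements
res$power    # the computed value (power was left as NULL in the call)
```

The object returned by power.roc.test can be inspected in the same way. Its element names are expected to mirror the arguments (ncases, ncontrols, auc, sig.level, power), but this is an assumption worth confirming with names().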

One ROC curve power calculation

If one or no ROC curves are passed to power.roc.test, a one ROC curve power calculation is performed. The function expects either power, sig.level or auc, or both ncontrols and ncases to be missing, so that the parameter is determined from the others with the formula by Obuchowski et al., 2004 (formulas 2 and 3, p. 1123).

For the sample size, ncases is computed directly from formulas 2 and 3 and ncontrols is deduced with kappa. AUC is optimized by uniroot while sig.level and power are solved as quadratic equations.
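The computation described above can be sketched in base R. This is not the package's code: the variance function and its constants are written as they are commonly quoted from Obuchowski et al. (2004) and should be checked against the paper, kappa is assumed to be ncontrols / ncases, and the helper names var_auc and n_cases are hypothetical.

```r
# Binormal variance function of the AUC (ASSUMPTION: form and constants as
# recalled from Obuchowski et al., 2004; verify against the paper).
# kappa is assumed to be ncontrols / ncases.
var_auc <- function(auc, kappa) {
  a <- qnorm(auc) * 1.414
  0.0099 * exp(-a^2 / 2) * ((5 * a^2 + 8) + (a^2 + 8) / kappa)
}

# Number of cases needed to test H0: AUC = 0.5 (hypothetical helper).
n_cases <- function(auc, sig.level = 0.05, power = 0.8, kappa = 1,
                    two.sided = TRUE) {
  alpha   <- if (two.sided) sig.level / 2 else sig.level
  z_alpha <- qnorm(1 - alpha)
  z_beta  <- qnorm(power)
  v0 <- var_auc(0.5, kappa)   # variance under the null hypothesis
  va <- var_auc(auc, kappa)   # variance under the alternative
  (z_alpha * sqrt(v0) + z_beta * sqrt(va))^2 / (auc - 0.5)^2
}

# Sample size: ncases comes from the formula, ncontrols follows from kappa.
nc <- n_cases(auc = 0.73, power = 0.95, kappa = 1.7)
c(ncases = nc, ncontrols = 1.7 * nc)

# The AUC has no closed-form solution, so a root finder searches for the
# AUC whose required ncases equals the number of cases available (41 here).
detectable_auc <- uniroot(
  function(a) n_cases(a, power = 0.95, kappa = 72 / 41) - 41,
  interval = c(0.5 + 1e-6, 1 - 1e-6)
)$root
detectable_auc
```

Results from this sketch may differ from those of power.roc.test, for instance in rounding or in how the one-sided case is handled.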

power.roc.test can also be passed a roc object from the roc function, but the empirical ROC will not be used, only the number of patients and the AUC.

Two paired ROC curves power calculation

If two ROC curves are passed to power.roc.test, the function will compute either the required sample size (if power is supplied), the significance level (if sig.level=NULL and power is supplied) or the power of a test of a difference between two AUCs according to the formula by Obuchowski and McClish, 1997 (formulas 2 and 3, p. 1530–1531). The null hypothesis is that the AUC of roc1 is the same as the AUC of roc2, with roc1 taken as the reference ROC curve.

For the sample size, ncases is computed directly from formula 2 and ncontrols is deduced from the ratio observed in roc1 and roc2. sig.level and power are solved as quadratic equations.
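To make the relation between sample size, power and significance level concrete, here is a minimal sketch. It assumes the sample-size relation has the usual form n = (z_alpha * sqrt(V0) + z_beta * sqrt(VA))^2 / delta^2, and it takes the positive square root directly instead of solving a quadratic explicitly. All function names and numeric inputs are hypothetical.

```r
# v0 and va stand for the variance of the AUC difference under H0 and H1,
# expressed per case; delta is the AUC difference. All values below are
# made up for illustration.
solve_n <- function(z_alpha, z_beta, v0, va, delta) {
  (z_alpha * sqrt(v0) + z_beta * sqrt(va))^2 / delta^2
}

# Invert the same relation for the power ...
solve_power <- function(n, z_alpha, v0, va, delta) {
  pnorm((sqrt(n) * abs(delta) - z_alpha * sqrt(v0)) / sqrt(va))
}

# ... and for the significance level.
solve_sig_level <- function(n, z_beta, v0, va, delta, two.sided = TRUE) {
  z_alpha <- (sqrt(n) * abs(delta) - z_beta * sqrt(va)) / sqrt(v0)
  tail <- 1 - pnorm(z_alpha)
  if (two.sided) 2 * tail else tail
}

z_a <- qnorm(1 - 0.05 / 2)   # two-sided test at sig.level = 0.05
z_b <- qnorm(0.9)            # power = 0.9

n <- solve_n(z_a, z_b, v0 = 0.12, va = 0.10, delta = 0.1)
p <- solve_power(n, z_a, v0 = 0.12, va = 0.10, delta = 0.1)
s <- solve_sig_level(n, z_b, v0 = 0.12, va = 0.10, delta = 0.1)
c(ncases = n, power = p, sig.level = s)   # power and sig.level round-trip
```

In this sketch n plays the role of ncases; ncontrols would then follow by multiplying by the control-to-case ratio observed in the ROC curves.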

The variance and covariance of the ROC curve are computed with the var and cov functions. By default, the DeLong method is used for full AUCs and the bootstrap for partial AUCs. It is possible to force the use of Obuchowski's variance by specifying method="obuchowski".
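The variance and covariance that feed the calculation can be inspected directly. The sketch below assumes the pROC package is installed and uses the aSAH data from the Examples section; combining the terms into the variance of the difference is standard algebra for paired estimates and is not claimed to be the package's exact internal code.

```r
library(pROC)
data(aSAH)

roc1 <- roc(aSAH$outcome, aSAH$ndka)
roc2 <- roc(aSAH$outcome, aSAH$wfns)

# Variance of each AUC and their covariance with the DeLong method
v1  <- var(roc1, method = "delong")
v2  <- var(roc2, method = "delong")
c12 <- cov(roc1, roc2, method = "delong")

# Variance of the AUC difference for two paired curves
v_diff <- v1 + v2 - 2 * c12
v_diff

# The same quantities with Obuchowski's variance
var(roc1, method = "obuchowski")
cov(roc1, roc2, method = "obuchowski")
```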

Alternatively, when no empirical ROC curve is known, or if only one is available, a list can be passed to power.roc.test, with the contents defined in the “Arguments” section. The variance and covariance are computed from Table 1 and Equations 4 and 5 of Obuchowski and McClish (1997), p. 1530–1531.
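For orientation, the binormal A and B parameters determine the full AUC through the standard binormal relation AUC = pnorm(A / sqrt(1 + B^2)). The sketch below applies it to the A and B values used in the Examples section; binormal_auc is a hypothetical helper and not part of the package.

```r
# Full AUC implied by binormal parameters (standard binormal model).
binormal_auc <- function(A, B) pnorm(A / sqrt(1 + B^2))

auc1 <- binormal_auc(A = 2.6, B = 1)
auc2 <- binormal_auc(A = 1.9, B = 1)
c(auc1 = auc1, auc2 = auc2, difference = auc1 - auc2)
```

The delta supplied in the Examples section is given together with an FPR range, so it need not equal this full-AUC difference.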

Power calculation for unpaired ROC curves is not implemented.

AUC specification

The comparison of the AUC of the ROC curves needs a specification of the AUC. The specification is defined by:

the “auc” field in the “roc” objects if reuse.auc is set to TRUE (default)
passing the specification to auc with ... (arguments partial.auc, partial.auc.correct and partial.auc.focus). In this case, you must ensure either that the roc object does not contain an auc field (if you called roc with auc=FALSE), or set reuse.auc=FALSE.

If reuse.auc=FALSE the auc function will always be called with ... to determine the specification, even if the “roc” objects do contain an auc field.

Likewise, if the “roc” objects do not contain an auc field, the auc function will always be called with ... to determine the specification.

Warning: if the roc object passed to power.roc.test contains an auc field and reuse.auc=TRUE, auc is not called and arguments such as partial.auc are silently ignored.
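Before relying on arguments passed through ..., it can help to check whether a roc object already carries an auc field. This is a minimal sketch assuming the pROC package is installed; the object names roc_full and roc_bare are illustrative.

```r
library(pROC)
data(aSAH)

# roc() stores an AUC by default, so this object has an auc field
roc_full <- roc(aSAH$outcome, aSAH$ndka)

# Built with auc = FALSE, so no auc field is stored
roc_bare <- roc(aSAH$outcome, aSAH$ndka, auc = FALSE)

is.null(roc_full$auc)   # FALSE: with reuse.auc=TRUE the stored AUC is reused
is.null(roc_bare$auc)   # TRUE: auc() is called with the ... arguments
```

When the field is present and a different specification is wanted, setting reuse.auc=FALSE is the option described above.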

Acknowledgements

The authors would like to thank Christophe Combescure and Anne-Sophie Jannot for their help with the implementation of this section of the package.

References

Nancy A. Obuchowski, Donna K. McClish (1997). “Sample size determination for diagnostic accuracy studies involving binormal ROC curve indices”. Statistics in Medicine, 16, 1529–1542. DOI: 10.1002/(SICI)1097-0258(19970715)16:13<1529::AID-SIM565>3.0.CO;2-H.

Nancy A. Obuchowski, Michael L. Lieber, Frank H. Wians Jr. (2004). “ROC Curves in Clinical Chemistry: Uses, Misuses, and Possible Solutions”. Clinical Chemistry, 50, 1118–1125. DOI: 10.1373/clinchem.2004.031823.

Examples

library(pROC)
data(aSAH)

#### One ROC curve ####

# Build a roc object:
rocobj <- roc(aSAH$outcome, aSAH$s100b)

# Determine power of one ROC curve:
power.roc.test(rocobj)
# Same as:
power.roc.test(ncases=41, ncontrols=72, auc=0.73, sig.level=0.05)
# sig.level=0.05 is implicit and can be omitted:
power.roc.test(ncases=41, ncontrols=72, auc=0.73)

# Determine ncases & ncontrols:
power.roc.test(auc=rocobj$auc, sig.level=0.05, power=0.95, kappa=1.7)
power.roc.test(auc=0.73, sig.level=0.05, power=0.95, kappa=1.7)

# Determine sig.level:
power.roc.test(ncases=41, ncontrols=72, auc=0.73, power=0.95, sig.level=NULL)

# Determine detectable AUC:
power.roc.test(ncases=41, ncontrols=72, sig.level=0.05, power=0.95)


#### Two ROC curves ####

###  Full AUC
roc1 <- roc(aSAH$outcome, aSAH$ndka)
roc2 <- roc(aSAH$outcome, aSAH$wfns)

## Sample size
# With DeLong variance (default)
power.roc.test(roc1, roc2, power=0.9)
# With Obuchowski variance
power.roc.test(roc1, roc2, power=0.9, method="obuchowski")

## Power test
# With DeLong variance (default)
power.roc.test(roc1, roc2)
# With Obuchowski variance
power.roc.test(roc1, roc2, method="obuchowski")

## Significance level
# With DeLong variance (default)
power.roc.test(roc1, roc2, power=0.9, sig.level=NULL)
# With Obuchowski variance
power.roc.test(roc1, roc2, power=0.9, sig.level=NULL, method="obuchowski")

### Partial AUC
roc3 <- roc(aSAH$outcome, aSAH$ndka, partial.auc=c(1, 0.9))
roc4 <- roc(aSAH$outcome, aSAH$wfns, partial.auc=c(1, 0.9))

## Sample size
# With bootstrap variance (default)
## Not run: 
power.roc.test(roc3, roc4, power=0.9)

## End(Not run)
# With Obuchowski variance
power.roc.test(roc3, roc4, power=0.9, method="obuchowski")

## Power test
# With bootstrap variance (default)
## Not run: 
power.roc.test(roc3, roc4)
# This is exactly equivalent:
power.roc.test(roc1, roc2, reuse.auc=FALSE, partial.auc=c(1, 0.9))

## End(Not run)
# With Obuchowski variance
power.roc.test(roc3, roc4, method="obuchowski")

## Significance level
# With bootstrap variance (default)
## Not run: 
power.roc.test(roc3, roc4, power=0.9, sig.level=NULL)

## End(Not run)
# With Obuchowski variance
power.roc.test(roc3, roc4, power=0.9, sig.level=NULL, method="obuchowski")

## With only binormal parameters given
# From example 2 of Obuchowski and McClish, 1997.
ob.params <- list(A1=2.6, B1=1, A2=1.9, B2=1, rn=0.6, ra=0.6, FPR11=0,
FPR12=0.2, FPR21=0, FPR22=0.2, delta=0.037) 

power.roc.test(ob.params, power=0.8, sig.level=0.05)
power.roc.test(ob.params, power=0.8, sig.level=NULL, ncases=107)
power.roc.test(ob.params, power=NULL, sig.level=0.05, ncases=107)