VUS
R Documentation
A wrapper function for Volume under Surface (VUS) estimate, variance
estimate under normal and nonparametric assumption and sample size calculation
Description
A wrapper function to calculate the Volume under Surface (VUS)
estimate, its variance estimate and optimal cut-point, under normal
and nonparametric assumption, to provide partial VUS estimate with a minimum
requirement on the specificity and sensitivity under normality and to
calculate the sample size under normality to achieve a certain
estimation precision on VUS estimate.
Arguments
x
A numeric vector, a diagnostic test's measurements in the D- group (usually healthy subjects).
y
A numeric vector, a diagnostic test's measurements in the D0 group (usually mildly diseased subjects).
z
A numeric vector, a diagnostic test's measurements in the D+ group (usually severely diseased subjects).
method
A character argument, method=“Normal” or “NonPar”, specifying whether the VUS
is estimated under normality or nonparametrically.
p
A numeric value, the minimum required specificity, 0<=p<1, for calculating
the partial volume under the ROC surface. Default, p=0.
q
A numeric value, the minimum desired sensitivity, 0<=q<1, for calculating
the partial volume under the ROC surface. Default, q=0. p=q=0 gives the
complete VUS estimate; otherwise the function gives the partial VUS estimate
satisfying specificity no less than p and sensitivity no less than q.
alpha
A numeric value, the significance level: a (1-alpha)*100% confidence interval
on the VUS estimate is computed under the normal assumption. Default, alpha=0.05.
NBOOT
A numeric value, the total number of bootstrap samples used to estimate the
variance of the nonparametric estimate of VUS.
subdivisions
A numeric value, the maximum number of subintervals for integration using
adaptive quadrature in the R function integrate. Default, subdivisions=50000.
lam.minus
A numeric value, the expected population proportion of the D_- group, used for
sample size calculation. Default, lam.minus=1/3. The proportions of the three ordinal
groups (lam.minus, lam0, lam.plus) should sum to 1.
lam0
A numeric value, the expected population proportion of the D_0 group, used for
sample size calculation. Default, lam0=1/3. The proportions of the three ordinal
groups (lam.minus, lam0, lam.plus) should sum to 1.
lam.plus
A numeric value, the expected population proportion of the D_+ group, used for
sample size calculation. Default, lam.plus=1/3. The proportions of the three ordinal
groups (lam.minus, lam0, lam.plus) should sum to 1.
typeIerror
A numeric value, (1-typeIerror)*100% confidence interval (CI) in sample size
calculation. Default typeIerror=0.05, i.e., calculate 95% CI.
margin
A numeric value, the margin of error on the VUS estimate in sample size
calculation. Default, margin=0.05. The (1-typeIerror)*100% CI on the VUS
estimate under normality is
(VUS-Z_a*SE(VUS),VUS+Z_a*SE(VUS)), so margin=Z_a*SE(VUS), i.e., half of
the CI's length, where Z_a is the normal quantile; Z_a=1.96 given the
default typeIerror a=0.05.
FisherZ
See the argument of the same name in Normal.VUS.
optimalCut
A logical value of TRUE or FALSE. If TRUE, the
function will return the optimal cut-points from the VUS analysis.
cut.seq
A sequence of numeric values from which the optimal cut-points
will be selected. Default, cut.seq=NULL, in which case the unique values
of the pooled x, y and z are used.
optimize
A logical value of TRUE or FALSE. If FALSE, the
empirical optimal cut-points identified by an empirical search within the given
cut.seq are reported as the final optimal cut-points. If TRUE, the
empirical optimal cut-points are used as the starting point of an optimization
algorithm that yields the final optimal cut-points.
...
Other arguments that can be passed to the R function
integrate, e.g., abs.tol, rel.tol, stop.on.error etc.
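The relation between margin, typeIerror and the standard error described for the margin argument can be checked with base R. This is an illustrative sketch only: the variable names z_a, se_target and lam are assumptions made here and are not part of the package.

```r
# Normal quantile Z_a for a two-sided (1 - typeIerror)*100% CI.
typeIerror <- 0.05
z_a <- qnorm(1 - typeIerror / 2)   # approximately 1.96 for typeIerror = 0.05

# margin = Z_a * SE(VUS), so a given margin implies a target standard error.
margin <- 0.05
se_target <- margin / z_a

# The three group proportions used in sample size calculation should sum to 1.
lam <- c(lam.minus = 1/3, lam0 = 1/3, lam.plus = 1/3)
stopifnot(isTRUE(all.equal(sum(lam), 1)))
```

A smaller margin or a smaller typeIerror lowers the target standard error, which in turn requires a larger sample.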
Details
For a diagnostic test with three ordinal groups, there are two underlying
cut-points t_- and t_+ with t_-<t_+, based on which
patients are divided into the three ordinal groups. Patients
with a test result below t_- are assigned to D^-; those
with a test result above t_+ are assigned to D^+; the remaining patients fall
into D^0. Following the definitions of specificity and sensitivity
for a two-group diagnostic test, we call the probabilities of the first
two events the specificity, x=P_-{T ≤ t_-}=F_-(t_-), and the
sensitivity, y=P_+{T > t_+}=1-F_+(t_+)=G_+(t_+), where
P_i and F_i denote the probability distribution and
cumulative distribution function of the diagnostic test in
D^i, i=-,0,+, respectively. Then the probability that a
patient randomly selected from the D^0 group has a test result
between the two cut-points can be expressed as
z=P_0{t_- ≤ T ≤ t_+}=F_0(t_+)-F_0(t_-)=F_0(G_+^{-1}(y))-F_0(F_-^{-1}(x)),
where the notation H^{-1}(.) denotes the inverse function of H.
Thus z is a function of the specificity and sensitivity, i.e.,
z=z(x,y), which constitutes a ROC surface in the three-dimensional
space (x,y,z). The volume under the ROC surface (VUS) defined by z
can be written as
VUS=∫∫_{D_{00}} z(x,y) dx dy=∫∫_{D_{00}} {F_0(G_+^{-1}(y))-F_0(F_-^{-1}(x))} dx dy.
The integration domain is D_{00}={0≤ x ≤ 1,0≤ y ≤
G_+(F_-^{-1}(x))}. The equation of partial VUS will be similar to the
above but the integration domain is D_{pq}={p≤ x ≤ 1,q ≤ y ≤
G_+(F_-^{-1}(x))}.
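As a rough illustration of the integration domains described above, the sketch below integrates z(x,y) over D_{pq} numerically under normality using nested calls to the base R function integrate. It is not the package's implementation: the function name vus_double_integral and its arguments mu and sigma (length-3 vectors of means and standard deviations for D^-, D^0, D^+) are hypothetical.

```r
# Illustrative sketch only: double integral of z(x, y) over D_pq under normality.
# mu, sigma: length-3 vectors for the D-, D0 and D+ groups (hypothetical names).
vus_double_integral <- function(mu, sigma, p = 0, q = 0) {
  F0         <- function(t) pnorm(t, mu[2], sigma[2])      # CDF in D0
  Fminus_inv <- function(x) qnorm(x, mu[1], sigma[1])      # inverse CDF in D-
  Gplus      <- function(t) 1 - pnorm(t, mu[3], sigma[3])  # survival function in D+
  Gplus_inv  <- function(y) qnorm(1 - y, mu[3], sigma[3])  # its inverse

  # Inner integral over y in [q, G_+(F_-^{-1}(x))] for a fixed specificity x.
  inner <- function(x) {
    upper <- Gplus(Fminus_inv(x))
    if (upper <= q) return(0)  # empty y-range: no contribution
    integrate(function(y) F0(Gplus_inv(y)) - F0(Fminus_inv(x)),
              lower = q, upper = upper)$value
  }

  # Outer integral over x in [p, 1]; integrate() needs a vectorized integrand.
  integrate(function(x) vapply(x, inner, numeric(1)),
            lower = p, upper = 1)$value
}

# Three identical normal distributions: the marker is uninformative.
vus_null <- vus_double_integral(mu = c(0, 0, 0), sigma = c(1, 1, 1))
# Partial VUS with specificity >= 0.2 and sensitivity >= 0.2.
pvus_null <- vus_double_integral(c(0, 0, 0), c(1, 1, 1), p = 0.2, q = 0.2)
```

When the three groups share the same distribution, z(x,y)=1-x-y, so the full integral reduces to 1/6, the chance-level VUS, and the partial integral with p=q=0.2 reduces to 0.6^3/6=0.036.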
The optimal cut-points from VUS analysis are defined as the one
Value
An object of class DiagTest3Grp, a list with the following components.
type
A character value, type=“VUS” for VUS and type=“Youden” for the extended Youden index, indicating which summary measure is output.
method
A character value. For VUS, method can be
“Normal” or “NonPar” (nonparametric); for Youden
index, choices are “Normal/TN/EMP/KS/KS-SJ”, indicating which
method is used to estimate the summary measure.
dat
A list of 3 components named “x”, “y” and “z”, each recording the
input marker measurements (after removing NAs) under D^-, D^0 and D^+, respectively.
dat.summary
A data frame with 3 rows (D-, D0, D+) and 3 columns (number of observations, mean, SD).
estimate
A numeric value. Point estimate of the summary
measure, either VUS or Youden.
variance
A numeric value. Variance of the summary measure
estimate. For the normal method, the normal-theory variance is output; for other methods, the bootstrap variance is output.
CI
A named numeric vector of length 2. Confidence interval on
the summary measure estimate, with names such as 2.5% and 97.5% if the
significance level is set to 5%. For both VUS and the Youden
index, the CI is a normal CI when the normal method is used and a
bootstrap CI under the other methods.
cut.point
A named numeric vector of length 2. Optimal
cut-points, with name “t.minus” for the lower optimal cut-point
and name “t.plus” for the upper optimal cut-point.
classify.prob
A named numeric vector of 3 values. Estimates of the
three group correct classification probabilities: specificity on D^-,
Sp=Pr(x≤ t_-|D^-); sensitivity on D^+,
Se=Pr(z≥ t_+|D^+); and correct classification probability on
D^0, Sm=Pr(t_-<y<t_+|D^0). For VUS, these are empirical
estimates. For the Youden index, the three probabilities are estimated
using the method adopted for the Youden index estimate.
sampleSize
A numeric value. The sample size needed to estimate the summary measure within the given margin of error and type-I error rate. See SampleSize.VUS and SampleSize.Youden3Grp.
alpha
A numeric value. The significance level for the CI
computation, e.g., default=5%.
typeIerror
A numeric value for the type-I error rate, e.g., default=5%.
margin
A numeric value. The margin of error (precision) with which to
estimate the summary measure, such that half the length of the
resulting CI is equal to the given margin.
partialDeriv
A numeric data frame with one row and multiple
columns, containing the relevant
parameters (a,b,c,d) and the partial derivatives of the VUS estimate
with respect to those parameters, which are output for use in
statistical tests on markers under the normal method; NA under the
nonparametric method.
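The empirical correct classification probabilities described for classify.prob can be sketched in base R as follows. The helper classify_prob and the toy data are hypothetical and serve only to illustrate the definitions of Sp, Sm and Se for a given pair of cut-points. This is not the package's code.

```r
# Illustrative sketch only: empirical Sp, Sm, Se for given cut-points.
# classify_prob is a hypothetical helper, not a function of the package.
classify_prob <- function(x, y, z, t.minus, t.plus) {
  stopifnot(t.minus < t.plus)
  c(Sp = mean(x <= t.minus),              # Pr(x <= t_- | D-)
    Sm = mean(y > t.minus & y < t.plus),  # Pr(t_- < y < t_+ | D0)
    Se = mean(z >= t.plus))               # Pr(z >= t_+ | D+)
}

# Toy measurements for the three groups (made-up numbers).
x <- c(1, 2, 3, 4)
y <- c(3, 4, 5, 6)
z <- c(5, 6, 7, 8)
probs <- classify_prob(x, y, z, t.minus = 3, t.plus = 6)
probs
```

With these toy values, three of the four D- measurements are at or below 3, two of the four D0 measurements lie strictly between 3 and 6, and three of the four D+ measurements are at or above 6.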
Warning
The bootstrapping to obtain the variance on the
nonparametric VUS estimate may take a while.
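To see why the bootstrap can be slow, consider the following base R sketch of a nonparametric VUS estimate and its bootstrap variance. It assumes the usual nonparametric estimator, the proportion of triples (x_i, y_j, z_k) that are correctly ordered, and counts ties as misordered. The functions vus_nonpar and vus_boot_var are hypothetical names, and the package's own implementation may differ, for example in how ties are handled.

```r
# Illustrative sketch only: proportion of triples with x_i < y_j < z_k.
# Ties are counted as misordered here, which is an assumption of this sketch.
vus_nonpar <- function(x, y, z) {
  below <- vapply(y, function(v) as.numeric(sum(x < v)), numeric(1))
  above <- vapply(y, function(v) as.numeric(sum(z > v)), numeric(1))
  sum(below * above) / (length(x) * length(y) * length(z))
}

# Bootstrap variance: resample each group with replacement NBOOT times.
vus_boot_var <- function(x, y, z, NBOOT = 100) {
  resample <- function(v) v[sample.int(length(v), replace = TRUE)]
  est <- replicate(NBOOT, vus_nonpar(resample(x), resample(y), resample(z)))
  var(est)
}

# Simulated data (made up for illustration).
set.seed(1)
x <- rnorm(30, mean = 0)
y <- rnorm(30, mean = 1)
z <- rnorm(30, mean = 2)
est <- vus_nonpar(x, y, z)
bvar <- vus_boot_var(x, y, z, NBOOT = 100)
```

Each bootstrap replicate recomputes the estimate over all three groups, so the running time grows linearly with NBOOT and with the cost of a single estimate.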
Note
Bug reports, malfunctioning, or suggestions for further improvements or
contributions can be sent to Jingqin Luo <rosy@wubios.wustl.edu>.
Author(s)
Jingqin Luo
References
Xiong, C., van Belle, G., Miller, J.P. and Morris, J.C. (2006)
Measuring and Estimating Diagnostic Accuracy When There
Are Three Ordinal Diagnostic Groups. Statistics in Medicine 25(7),
1251–1273.
Ferri, C., Hernandez-Orallo, J. and Salido, M.A. (2003) Volume
under the ROC Surface for Multi-class Problems. Lecture Notes in
Computer Science, 108–120.
See Also
Normal.VUS, NonParametric.VUS, NonParametric.VUS.var
Examples
data(AL)
group <- AL$group
table(group)
##take the negated kfront marker measurements
kfront <- -AL$kfront
x <- kfront[group=="D-"]
y <- kfront[group=="D0"]
z <- kfront[group=="D+"]
##normal estimate
normal.res <- VUS(x,y,z,method="Normal",p=0,q=0,alpha=0.05)
normal.res
##nonparametric estimate
## Not run:
nonpar.res <- VUS(x,y,z,method="NonPar",p=0,q=0,alpha=0.05,NBOOT=100)
nonpar.res
## End(Not run)
## S3 method for class 'DiagTest3Grp':
print(normal.res)
## S3 method for class 'DiagTest3Grp':
plot(normal.res)
Results
> library(DiagTest3Grp)
Loading required package: car
Loading required package: KernSmooth
KernSmooth 2.23 loaded
Copyright M. P. Wand 1997-2009
Loading required package: gplots
Attaching package: 'gplots'
The following object is masked from 'package:stats':
lowess
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/DiagTest3Grp/VUS.Rd_%03d_medium.png", width=480, height=480)
> ### Name: VUS
> ### Title: A wrapper function for Volume under Surface (VUS) estimate,
> ### variance estimate under normal and nonparametric assumption and
> ### sample size calculation
> ### Aliases: VUS
> ### Keywords: htest nonparametric design
>
> ### ** Examples
>
>
> data(AL)
> group <- AL$group
> table(group)
group
D- D0 D+
45 44 29
>
> ##take the negated kfront marker measurements
> kfront <- -AL$kfront
>
> x <- kfront[group=="D-"]
> y <- kfront[group=="D0"]
> z <- kfront[group=="D+"]
>
> ##normal estimate
> normal.res <- VUS(x,y,z,method="Normal",p=0,q=0,alpha=0.05)
> normal.res
The DiagTest3Grp summary measure: VUS
Method used for VUS:Normal
Raw Data Summary:
    n         mu       sd
D- 45 -2.8657503 1.776514
D0 43 -0.3725226 2.212393
D+ 21  2.6817010 2.066669
VUS=0.6568,95% CI=0.5491~0.7646
Best cut-points: lower=-1.6826, upper=0.9119
The group correct classification probabilities are:
    Sp     Sm     Se
0.7556 0.5581 0.7619
Sample Size to estimate VUS within specified margin of error=154
>
> ##nonparametric estimate
> ## Not run:
> ##D nonpar.res <- VUS(x,y,z,method="NonPar",p=0,q=0,alpha=0.05,NBOOT=100)
> ##D nonpar.res
> ## End(Not run)
>
> ## S3 method for class 'DiagTest3Grp':
> print(normal.res)
The DiagTest3Grp summary measure: VUS
Method used for VUS:Normal
Raw Data Summary:
    n         mu       sd
D- 45 -2.8657503 1.776514
D0 43 -0.3725226 2.212393
D+ 21  2.6817010 2.066669
VUS=0.6568,95% CI=0.5491~0.7646
Best cut-points: lower=-1.6826, upper=0.9119
The group correct classification probabilities are:
    Sp     Sm     Se
0.7556 0.5581 0.7619
Sample Size to estimate VUS within specified margin of error=154
>
> ## S3 method for class 'DiagTest3Grp':
> plot(normal.res)
> dev.off()
null device
1
>