Last data update: 2014.03.03

cv.cox.main {AIM}    R Documentation

Cross-validation in main effect Cox AIM

Description

Cross-validation for selecting the number of binary rules in the main effect AIM with survival outcomes.

Usage

cv.cox.main(x, y, status, K.cv=5, num.replicate=1, nsteps, mincut=0.1, backfit=F, maxnumcut=1, dirp=0)

Arguments

x

n by p matrix. The covariate matrix

y

n vector. The observed follow-up time

status

n 0/1 vector. The status indicator: 1 = failure, 0 = censored (alive at last follow-up).

K.cv

the number of folds used in the cross-validation. The default is 5.

num.replicate

number of independent replications of K-fold cross validations.

nsteps

the maximum number of binary rules to be included in the index

backfit

T/F. Whether the existing split points are re-adjusted after a new binary rule is included. The default is F.

mincut

the minimum cutting proportion for the binary rule at either end. It is typically between 0 and 0.2; the default is 0.1.

maxnumcut

the maximum number of binary splits per predictor

dirp

p vector. The pre-specified direction of the binary split for each of the p predictors: 0 means "no pre-given direction", 1 means "(x>cut)", and -1 means "(x<cut)". Alternatively, the scalar "dirp=0" means that there is no pre-given direction for any of the predictors.
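
As an illustration of the dirp coding described above, the following minimal sketch builds a direction vector for p = 10 predictors. The choice of predictors 1 and 5 is only an example; the commented call assumes that x, y and status have already been prepared as described in the Arguments.

```r
p <- 10

# Start with "no pre-given direction" for every predictor.
dirp <- rep(0, p)

# Example choice: force predictor 1 to enter as (x < cut)
# and predictor 5 to enter as (x > cut).
dirp[1] <- -1
dirp[5] <- 1

dirp

# The vector would then be passed on, for example:
# cv.cox.main(x, y, status, nsteps = 10, dirp = dirp)
```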

Details

cv.cox.main implements K-fold cross-validation for the main effect Cox AIM. For each number of binary rules, the index is constructed from the training folds, and the partial likelihood score test statistic for the association between survival time and that index is computed in the held-out test fold. The function also returns the pre-validated fit for each observation and the corresponding pre-validated partial likelihood score test statistics. The output can be used to select the optimal number of binary rules.
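
To make the test statistic in the paragraph above concrete, here is a base R sketch of a standardized partial likelihood score statistic for a single index, evaluated at a coefficient of zero with the Breslow convention for ties. The function name cox.score.z is hypothetical and is not part of AIM; the package's internal computation may differ, for example in how ties are handled.

```r
# Hypothetical helper: standardized Cox score statistic at beta = 0.
# time:   follow-up times
# status: 1 = failure, 0 = censored
# index:  numeric index (e.g. number of binary rules satisfied)
cox.score.z <- function(time, status, index) {
  event.times <- sort(unique(time[status == 1]))
  U <- 0  # score: observed minus expected index among failures
  V <- 0  # information: risk-set variance of the index, summed over failures
  for (t in event.times) {
    at.risk <- time >= t
    failed <- time == t & status == 1
    d <- sum(failed)
    zbar <- mean(index[at.risk])
    zvar <- mean((index[at.risk] - zbar)^2)
    U <- U + sum(index[failed]) - d * zbar
    V <- V + d * zvar
  }
  U / sqrt(V)  # NaN if the index never varies within a risk set
}

# Tiny worked example: the two subjects with index 1 fail first.
time <- c(1, 2, 3, 4)
status <- c(1, 1, 1, 1)
index <- c(1, 1, 0, 0)
cox.score.z(time, status, index)  # 7 / sqrt(17), about 1.70
```

A positive value indicates that a larger index goes with earlier failure; values are compared with 1.96 as stated in the Value section.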

Value

cv.cox.main returns

kmax

the optimal number of binary rules based on the cross-validation

meanscore

nsteps-vector. The cross-validated partial likelihood score test statistics for the association between survival time and the index. A statistic is significant at the 0.05 level if it is greater than 1.96.

pvfit.score

nsteps-vector. The pre-validated partial likelihood score test statistics for the association between survival time and the index. A statistic is significant at the 0.05 level if it is greater than 1.96.

preval

nsteps by n matrix. Pre-validated fits for the individual observations
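
The sketch below shows one way the returned score vector can be read. The meanscore values are invented for illustration, and choosing the step with the largest cross-validated statistic is an assumption about how kmax is obtained; this page does not state the exact rule.

```r
# Hypothetical cross-validated score statistics for nsteps = 5.
meanscore <- c(2.1, 3.4, 3.0, 2.7, 2.2)

# Steps whose statistic exceeds the 1.96 threshold quoted above.
significant <- meanscore > 1.96

# Assumed selection rule: the number of rules with the largest statistic.
k.best <- which.max(meanscore)

k.best
significant
```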

References

L Tian and R Tibshirani, Adaptive index models for marker-based risk stratification, Tech Report, available at http://www-stat.stanford.edu/~tibs/AIM.

R Tibshirani and B Efron, Pre-validation and inference in microarrays, Statist. Appl. Genet. Mol. Biol., 1:1-18, 2002.

Author(s)

Lu Tian and Robert Tibshirani

Examples

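## generate data

set.seed(1)

n=200    # number of subjects
p=10     # number of candidate predictors
x=matrix(rnorm(n*p), n, p)
## true index: number of satisfied rules among (x1<0.2) and (x5>0.2)
z=(x[,1]<0.2)+(x[,5]>0.2)
beta=1
## failure times: a larger z gives a higher hazard and shorter survival
fail.time=rexp(n)*exp(-beta*z)
## independent exponential censoring times
cen.time=rexp(n)*1.25
y=pmin(fail.time, cen.time)
## round to one decimal place, which introduces tied times
y=round(y*10)/10
## status indicator: 1=failure observed, 0=censored
delta=1*(fail.time<cen.time)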


## cross-validate the main effect Cox AIM 
a=cv.cox.main(x, y, delta, nsteps=10, K.cv=3, num.replicate=3)
 

## examine the test statistics in the test set 
par(mfrow=c(1,2))
plot(a$meanscore, type="l")
plot(a$pvfit.score, type="l")


## construct the index with the optimal number of binary rules 
k.opt=a$kmax
a=cox.main(x, y, delta, nsteps=k.opt)
print(a)
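
The rule table that print(a) displays in the Results section of this page can be turned into an index by hand. The base R sketch below does so under an assumed reading of the columns: jmax is the predictor column, cutp is the cut point, and maxdir follows the dirp coding (1 for x>cut, -1 for x<cut). The two rules are copied from the printed $res[[2]]; the names rules, rule.hits and index are illustrative only.

```r
# Regenerate the covariates exactly as in the example above.
set.seed(1)
n <- 200
p <- 10
x <- matrix(rnorm(n * p), n, p)

# Rules copied from the printed $res[[2]] table.
rules <- data.frame(jmax = c(5, 1),
                    cutp = c(0.2054208, 0.1887923),
                    maxdir = c(1, -1))

# Assumed reading: maxdir = 1 means (x > cutp), maxdir = -1 means (x < cutp).
rule.hits <- sapply(seq_len(nrow(rules)), function(k) {
  xj <- x[, rules$jmax[k]]
  if (rules$maxdir[k] == 1) xj > rules$cutp[k] else xj < rules$cutp[k]
})

# The index is the number of rules each observation satisfies.
index <- rowSums(rule.hits)
table(index)
```

The two recovered rules are close to the rules (x1<0.2) and (x5>0.2) used to simulate the data, so this index should agree with the simulated z for nearly all observations.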

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"

> library(AIM)
Loading required package: survival
> ### Name: cv.cox.main
> ### Title: Cross-validation in main effect Cox AIM
> ### Aliases: cv.cox.main
> 
> ### ** Examples
> 
> ## generate data
> 
> set.seed(1)
> 
> n=200
> p=10
> x=matrix(rnorm(n*p), n, p)
> z=(x[,1]<0.2)+(x[,5]>0.2)
> beta=1
> fail.time=rexp(n)*exp(-beta*z)
> cen.time=rexp(n)*1.25
> y=pmin(fail.time, cen.time)
> y=round(y*10)/10
> delta=1*(fail.time<cen.time)
> 
> 
> ## cross-validate the main effect Cox AIM 
> a=cv.cox.main(x, y, delta, nsteps=10, K.cv=3, num.replicate=3)
>  
> 
> ## examine the test statistics in the test set 
> par(mfrow=c(1,2))
> plot(a$meanscore, type="l")
> plot(a$pvfit.score, type="l")
> 
> 
> ## construct the index with the optimal number of binary rules 
> k.opt=a$kmax
> a=cox.main(x, y, delta, nsteps=k.opt)
> print(a)
$res
$res[[1]]
     jmax      cutp maxdir    maxsc
[1,]    5 0.2054208      1 6.700279

$res[[2]]
     jmax      cutp maxdir    maxsc
[1,]    5 0.2054208      1 6.700279
[2,]    1 0.1887923     -1 7.961128


$maxsc
[1] 6.700279 7.961128
