RCASPAR-package
R Documentation
A package for survival time prediction based on a piecewise baseline hazard Cox regression model.
Description
RCASPAR is the R version of the C-based software CASPAR (Kaderali, 2006). It is intended to help predict survival times in the presence of high-dimensional explanatory
covariates. The model is a piecewise baseline hazard Cox regression model with an Lq-norm based prior that selects the most important regression coefficients and, in turn,
the most relevant covariates for survival analysis. It was primarily tested on gene expression and aCGH data, but can be applied to any other type of high-dimensional data and in
disciplines other than biology and medicine.
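As a rough illustration of what a piecewise baseline hazard means, the sketch below computes the survival probability S(t | x) = exp(-H0(t) * exp(x'beta)) for a baseline hazard that is constant within each time interval. It is not RCASPAR code: the function names, the interval end points, and the coefficient and covariate values are assumptions made purely for this example.

```r
# Illustrative sketch only (base R); none of these helpers are part of RCASPAR.

# Cumulative baseline hazard H0(t) when the hazard is constant on each interval.
# cuts:    increasing interval end points (the last one may be Inf)
# hazards: one hazard rate per interval
cum_baseline_hazard <- function(t, cuts, hazards) {
  stopifnot(length(cuts) == length(hazards))
  starts <- c(0, head(cuts, -1))
  exposure <- pmax(0, pmin(t, cuts) - starts)  # time spent in each interval up to t
  sum(hazards * exposure)
}

# Cox model survival probability for covariates x and coefficients beta.
surv_prob <- function(t, x, beta, cuts, hazards) {
  exp(-cum_baseline_hazard(t, cuts, hazards) * exp(sum(x * beta)))
}

cuts    <- c(2, 5, Inf)          # assumed interval end points
hazards <- c(0.25, 0.10, 0.10)   # assumed hazard rate per interval
beta    <- c(-0.45, 0.49)        # assumed regression coefficients
x       <- c(1.0, 0.5)           # assumed covariate values for one patient

h0 <- cum_baseline_hazard(3, cuts, hazards)  # 0.25 * 2 + 0.10 * 1
s3 <- surv_prob(3, x, beta, cuts, hazards)
print(c(H0 = h0, S = s3))
```

A larger linear predictor x'beta scales the cumulative hazard up and the survival probability down, which is how the regression coefficients enter the prediction in a model of this type.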
Details
Package: RCASPAR
Type: Package
Version: 1.0
Date: 2010-08-23
License: GPL(>=3)
LazyLoad: yes
Author(s)
Douaa Mugahid
Maintainer: Douaa Mugahid <mugahid@stud.uni-heidelberg.de>, Lars Kaderali <lars.kaderali@bioquant.uni-heidelberg.de>
References
The basic model is the Cox regression model, first introduced by Sir David Cox in: Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical
Society, Series B, 34(2), 187-220.
The extension of the Cox model to its stepwise form was adapted from: Ibrahim, J. G., Chen, M.-H. & Sinha, D. (2005). Bayesian Survival Analysis (second ed.).
NY: Springer,
as well as: Kaderali, L. (2006). A Hierarchical Bayesian Approach to Regression and its Application to Predicting Survival Times in Cancer Patients. Aachen: Shaker.
The prior on the regression coefficients was adopted from: Mazur, J., Ritter, D., Reinelt, G. & Kaderali, L. (2009). Reconstructing Non-Linear Dynamic Models of Gene
Regulation using Stochastic Sampling. BMC Bioinformatics, 10(448).
Examples
## Eg.(1): A simple example performed with a training and validation set:
data(Bergamaschi)
data(survData)
## Generate prediction:
result <- STpredictor_BLH(geDataS=Bergamaschi[1:27, 1:2], survDataS=survData[1:27, 9:10],
                          geDataT=Bergamaschi[28:82, 1:2], survDataT=survData[28:82, 9:10],
                          q = 1, s = 1, a = 1.558, b = 0.179, cut.off = 15, groups = 3,
                          method = "CG", noprior = 1, extras = list(reltol=1))
## Plot a KM plot with both long and short survivors:
kmplt_svrl(long=result$long_survivors, short=result$short_survivors,title="KM plot of long and short survivors")
## Compute the area under the curve of AUROC values over time to assess the performance of the predictor for the chosen parameters and the current partitioning into training
## and validation sets:
survivAURC(Stime=result$predicted_STs$True_STs,status=result$predicted_STs$censored, marker=result$predicted_STs$Predicted_STs, time.max=20)
## Perform a log-rank test to see if the difference between the long and short survivors is significant:
logrnk(dataL=result$long_survivors, dataS=result$short_survivors)
## Eg.(2): A simple example performed with cross validation:
data(Bergamaschi)
data(survData)
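## Generate prediction. The return value is printed but not assigned here, so `result` in the calls below still holds the Eg.(1) output;
## use result <- STpredictor_xvBLH(...) to evaluate the cross-validated predictions instead:
STpredictor_xvBLH(geData=Bergamaschi[1:40,1:2], survData=survData[1:40,9:10], k = 10, cut.off = 10,
                  q = 1, s = 1, a = 1.558, b = 0.179, groups = 3, method = "BFGS", noprior = 1,
                  extras = list(reltol=1))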
## Plot a KM plot with both long and short survivors:
kmplt_svrl(long=result$long_survivors, short=result$short_survivors,title="KM plot of long and short survivors")
## Compute the area under the curve of AUROC values over time to assess the performance of the predictor for the chosen parameters and the current partitioning into training
## and validation sets:
survivAURC(Stime=result$predicted_STs$True_STs,status=result$predicted_STs$censored, marker=result$predicted_STs$Predicted_STs, time.max=20)
## Perform a log-rank test to see if the difference between the long and short survivors is significant:
logrnk(dataL=result$long_survivors, dataS=result$short_survivors)
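To make the last step more concrete, here is a minimal base-R sketch of a two-group log-rank test on toy data. It is not the implementation behind logrnk: the function name logrank_sketch, the toy data and the event coding (1 = event observed, 0 = censored) are assumptions for this illustration; only the output names Xsq and pValue are borrowed from the printed logrnk result.

```r
# Illustrative base-R log-rank test for two groups; not the RCASPAR implementation.
# time:  follow-up times
# event: 1 = event observed, 0 = censored (assumed coding)
# group: grouping vector with exactly two distinct values
logrank_sketch <- function(time, event, group) {
  stopifnot(length(unique(group)) == 2)
  g1 <- group == unique(group)[1]
  event_times <- sort(unique(time[event == 1]))
  observed <- 0
  expected <- 0
  variance <- 0
  for (t in event_times) {
    n  <- sum(time >= t)                    # at risk in both groups
    n1 <- sum(time >= t & g1)               # at risk in the first group
    d  <- sum(time == t & event == 1)       # events in both groups at time t
    d1 <- sum(time == t & event == 1 & g1)  # events in the first group at time t
    observed <- observed + d1
    expected <- expected + d * n1 / n
    # hypergeometric variance; undefined (and skipped) when only one subject is at risk
    if (n > 1) {
      variance <- variance + d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    }
  }
  chisq <- (observed - expected)^2 / variance
  list(Xsq = chisq, pValue = pchisq(chisq, df = 1, lower.tail = FALSE))
}

# Toy data: every subject in group "S" has an event before anyone in group "L".
toy <- data.frame(time  = c(1, 2, 3, 4, 5, 6),
                  event = rep(1, 6),
                  group = rep(c("S", "L"), each = 3))
res <- logrank_sketch(toy$time, toy$event, toy$group)
print(res)
```

The statistic compares the observed number of events in one group with the number expected if both groups shared the same hazard, so a small p-value suggests that the two survival curves differ.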
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(RCASPAR)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/RCASPAR/RCASPAR-package.Rd_%03d_medium.png", width=480, height=480)
> ### Name: RCASPAR-package
> ### Title: A package for survival time prediction based on a piecewise
> ### baseline hazard Cox regression model.
> ### Aliases: RCASPAR-package RCASPAR
> ### Keywords: Piecewise baseline hazard Cox regression model survival
> ### analysis
>
> ### ** Examples
>
> ## Eg.(1): A simple example performed with a training and validation set:
> data(Bergamaschi)
> data(survData)
> ## Generate prediction:
> result <- STpredictor_BLH(geDataS=Bergamaschi[1:27, 1:2], survDataS=survData[1:27, 9:10], geDataT=Bergamaschi[28:82, 1:2], survDataT=survData[28:82, 9:10], q = 1, s = 1, a = 1.558, b = 0.179
+ , cut.off=15, groups = 3, method = "CG", noprior = 1, extras = list(reltol=1))
---------------Optimizing------------------
...........................
> ## Plot a KM plot with both long and short survivors:
> kmplt_svrl(long=result$long_survivors, short=result$short_survivors,title="KM plot of long and short survivors")
> ## Determine the area under the curve of AUROC curves vs. time to see the performance of the predictor given the chosen parameters and the current partitioning into training
> ## and validation sets:
> survivAURC(Stime=result$predicted_STs$True_STs,status=result$predicted_STs$censored, marker=result$predicted_STs$Predicted_STs, time.max=20)
$AUC
[1] 11.66154
$AUeachROC
[1] 0.5076426 0.7127124 0.6375423 0.6095430 0.6095430 0.6095430 0.6095430
[8] 0.6095430 0.6095430 0.6095430 0.6095430 0.6095430 0.6095430 0.6095430
[15] 0.6095430 0.6095430 0.6095430 0.6095430 0.6095430 0.6095430
> ## Perform a log-rank test to see if the difference between the long and short survivors is significant:
> logrnk(dataL=result$long_survivors, dataS=result$short_survivors)
$Xsq
[1] 0.7582287
$pValue
[1] 0.3838834
>
> ## Eg.(2): A simple example performed with cross validation:
> data(Bergamaschi)
> data(survData)
> ## Generate prediction:
> STpredictor_xvBLH(geData=Bergamaschi[1:40,1:2], survData=survData[1:40,9:10], k = 10, cut.off = 10, q = 1, s = 1, a = 1.558, b = 0.179, groups = 3, method = "BFGS", noprior = 1, extras = list(reltol=1))
Progress for group 1
---------------Optimizing------------------
.......
Progress for group 2
---------------Optimizing------------------
.......
Progress for group 3
---------------Optimizing------------------
...............................
Progress for group 4
---------------Optimizing------------------
........
Progress for group 5
---------------Optimizing------------------
.......
Progress for group 6
---------------Optimizing------------------
........
Progress for group 7
---------------Optimizing------------------
.......
Progress for group 8
---------------Optimizing------------------
.........................
Progress for group 9
---------------Optimizing------------------
........
Progress for group 10
---------------Optimizing------------------
.......
$predicted_STs
   PatientOrderValidation  True_STs Predicted_STs Absolute_Error censored
 1                      1 1.4166667     39.030772      37.614106        0
 2                      2 2.7500000     36.001739      33.251739        1
 3                      3 2.4166667     25.560606      23.143939        1
 4                      4 2.5833333     24.695553      22.112220        1
 5                      5 2.1666667     12.251305      10.084638        1
 6                      6 2.5000000     15.373521      12.873521        0
 7                      7 2.5000000     20.606804      18.106804        1
 8                      8 1.8333333     26.136561      24.303228        1
 9                      9 1.2500000     12.255812      11.005812        0
10                     10 0.6666667     12.652276      11.985610        1
11                     11 1.0000000     12.987651      11.987651        0
12                     12 6.5833333     12.484618       5.901284        1
13                     13 6.5000000     13.403160       6.903160        1
14                     14 6.6666667     12.327331       5.660664        1
15                     15 2.7500000     13.853202      11.103202        1
16                     16 1.6666667     11.243977       9.577311        0
17                     17 1.1666667     30.689279      29.522612        0
18                     18 2.8333333     54.970975      52.137642        0
19                     19 3.5833333     34.620992      31.037659        0
20                     20 6.1666667     22.431978      16.265311        1
21                     21 6.1666667     16.295538      10.128871        1
22                     22 3.4166667     10.827320       7.410654        1
23                     23 6.0833333     14.276327       8.192993        1
24                     24 1.8333333      8.800746       6.967413        0
25                     25 5.5833333     38.834027      33.250694        1
26                     26 0.7500000      8.678266       7.928266        0
27                     27 5.7500000     12.938162       7.188162        1
28                     28 5.5000000     15.990608      10.490608        1
29                     29 0.5833333     21.979914      21.396580        0
30                     30 7.6666667     23.486206      15.819540        1
31                     31 5.0000000     30.704349      25.704349        1
32                     32 2.8333333     27.046630      24.213296        0
33                     33 1.3333333     12.677999      11.344665        0
34                     34 5.0833333     13.168304       8.084970        1
35                     35 0.8333333     11.593089      10.759756        0
36                     36 1.5000000     11.072555       9.572555        0
37                     37 4.7500000     23.131348      18.381348        1
38                     38 3.4166667      9.911739       6.495072        0
39                     39 4.6666667     65.929331      61.262664        1
40                     40 1.9166667     29.703152      27.786485        0
$short_survivors
   PatientOrderValidation  True_STs Predicted_STs Absolute_Error censored group
 2                      2 2.7500000     36.001739      33.251739        1     S
 6                      6 2.5000000     15.373521      12.873521        0     S
10                     10 0.6666667     12.652276      11.985610        1     S
14                     14 6.6666667     12.327331       5.660664        1     S
18                     18 2.8333333     54.970975      52.137642        0     S
22                     22 3.4166667     10.827320       7.410654        1     S
26                     26 0.7500000      8.678266       7.928266        0     S
30                     30 7.6666667     23.486206      15.819540        1     S
34                     34 5.0833333     13.168304       8.084970        1     S
38                     38 3.4166667      9.911739       6.495072        0     S
$long_survivors
   PatientOrderValidation  True_STs Predicted_STs Absolute_Error censored group
 1                      1 1.4166667     39.030772      37.614106        0     L
 3                      3 2.4166667     25.560606      23.143939        1     L
 4                      4 2.5833333     24.695553      22.112220        1     L
 5                      5 2.1666667     12.251305      10.084638        1     L
 7                      7 2.5000000     20.606804      18.106804        1     L
 8                      8 1.8333333     26.136561      24.303228        1     L
 9                      9 1.2500000     12.255812      11.005812        0     L
11                     11 1.0000000     12.987651      11.987651        0     L
12                     12 6.5833333     12.484618       5.901284        1     L
13                     13 6.5000000     13.403160       6.903160        1     L
15                     15 2.7500000     13.853202      11.103202        1     L
16                     16 1.6666667     11.243977       9.577311        0     L
17                     17 1.1666667     30.689279      29.522612        0     L
19                     19 3.5833333     34.620992      31.037659        0     L
20                     20 6.1666667     22.431978      16.265311        1     L
21                     21 6.1666667     16.295538      10.128871        1     L
23                     23 6.0833333     14.276327       8.192993        1     L
24                     24 1.8333333      8.800746       6.967413        0     L
25                     25 5.5833333     38.834027      33.250694        1     L
27                     27 5.7500000     12.938162       7.188162        1     L
28                     28 5.5000000     15.990608      10.490608        1     L
29                     29 0.5833333     21.979914      21.396580        0     L
31                     31 5.0000000     30.704349      25.704349        1     L
32                     32 2.8333333     27.046630      24.213296        0     L
33                     33 1.3333333     12.677999      11.344665        0     L
35                     35 0.8333333     11.593089      10.759756        0     L
36                     36 1.5000000     11.072555       9.572555        0     L
37                     37 4.7500000     23.131348      18.381348        1     L
39                     39 4.6666667     65.929331      61.262664        1     L
40                     40 1.9166667     29.703152      27.786485        0     L
$weights
[1] -0.4492131 0.4886046
$baselineHs
[1] 0.2499189 0.0998820 0.0998820
> ## Plot a KM plot with both long and short survivors:
> kmplt_svrl(long=result$long_survivors, short=result$short_survivors,title="KM plot of long and short survivors")
> ## Determine the area under the curve of AUROC curves vs. time to see the performance of the predictor given the chosen parameters and the current partitioning into training
> ## and validation sets:
> survivAURC(Stime=result$predicted_STs$True_STs,status=result$predicted_STs$censored, marker=result$predicted_STs$Predicted_STs, time.max=20)
$AUC
[1] 11.66154
$AUeachROC
[1] 0.5076426 0.7127124 0.6375423 0.6095430 0.6095430 0.6095430 0.6095430
[8] 0.6095430 0.6095430 0.6095430 0.6095430 0.6095430 0.6095430 0.6095430
[15] 0.6095430 0.6095430 0.6095430 0.6095430 0.6095430 0.6095430
> ## Perform a log-rank test to see if the difference between the long and short survivors is significant:
> logrnk(dataL=result$long_survivors, dataS=result$short_survivors)
$Xsq
[1] 0.7582287
$pValue
[1] 0.3838834
> dev.off()
null device
1
>
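The printed numbers can be cross-checked with base R. The reported $AUC agrees with a trapezoidal integration of the 20 $AUeachROC values over unit time steps, and the reported $pValue agrees with the upper tail of a chi-squared distribution with one degree of freedom evaluated at $Xsq. Both are observations about the printed values, not statements about how survivAURC or logrnk compute them internally; the variable names below are made up for this check.

```r
# Cross-check of the printed results using base R only.

# The 20 per-time-point AUROC values printed under $AUeachROC.
au_each_roc <- c(0.5076426, 0.7127124, 0.6375423, rep(0.6095430, 17))

# Trapezoidal rule with unit spacing between consecutive time points.
auc_trap <- sum((head(au_each_roc, -1) + tail(au_each_roc, -1)) / 2)

# Upper-tail chi-squared probability (1 degree of freedom) for the printed $Xsq.
xsq     <- 0.7582287
p_value <- pchisq(xsq, df = 1, lower.tail = FALSE)

print(c(AUC = auc_trap, pValue = p_value))
```

Because $AUC integrates over 20 time points, it is not bounded by 1 the way a single AUROC value is; dividing it by the length of the time range gives an average AUROC that is easier to compare across settings.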