R: Fitting the GORMC model with interval censored data.
GORMC
R Documentation
Fitting the GORMC model with interval censored data.
Description
The Generalized Odds Rate Mixture Cure model is fitted for interval censored survival data. The EM algorithm facilitated
by a gamma-poisson data augmentation is applied for estimating the coefficients in both the cure rate part and the survival
part. The covariance matrix has closed forms based on the Louis method.
Usage
GORMC(survfun = formula(data), curefun = formula(data), data = parent.frame(),
r = 0, n.int = 5, order = 3, max.iter = 1000, cov.rate = 0.001)
Arguments
survfun
A formula for the survival part in the GORMC model, defined using the Surv function and type=“interval2".
curefun
The formula of predictors of the cure rate part in the GORMC model.
data
The interval censored sorvival data, including the left and right end points of the time intervals and the covariates for
the cure rate part and the survival part. If a subject is left(right) censored, the left(right) end point of the subject
should be defined as “NA", see example.
r
The transformation parameter in the GORMC model, should be greater than or equal to 0. r=0 refers to the PHMC model and
r=1 refers to the POMC model. The default is 0.
n.int
Number of interior knots of the splines. Default is 5.
order
Order of the spline basis functions. Default is 3, i.e. the cubic splines.
max.iter
The maximum number of interations for the EM algorithm. Default is 1000.
cov.rate
The bound for convergence of the algorithm, which defined as the difference between the log-likelihood values of two
consecutive iterations smaller than this value. Default is 0.001.
Details
The formula defined for “survfun" is based on the Surv() function, where the left and right end points of the time interval
are included and the type is equal to “interval2". The left(right) end points of left(right) censored individuals should
be defined as “NA" in the data frame before running the function.
The transformation parameter r is a nonnegative number corresponding to a specific model in the GORMC family of models.
Special cases include the PHMC model(r=0) and the POMC model(r=1). Other positive numbers can also be specified. The
grid search method is suggested to find the best model in practice. That is, try a sequence of r values and choose the
one with the greatest log-likelihood value.
Value
ParEst
A list includes the estimated coefficients (Eta,Beta,gl), the whole hessian matrix (Hessian), AIC, and the
log-likelihood value(loglik).
ParVcov
The estimated covariance matrix of the coefficients Eta and Beta.
Note
The estimated hessian matrix can be very large and sometimes not invertable. In which case, we try the QR decomposition,
g-inverse or even numerical methods to get the covariance matrix. Different values of hess in the ParVcov indicating the
different cases.
hess=0:the hessian matrix is invertable;
hess=1:the QR decomposition is applied to solve the hessian matrix;
hess=2:the g-inverce is applied to the hessian matrix;
hess=3:the hessian matrix is obtained from numerical methods.
The variance estimates may be unreliable for the cases when hess>0.
References
Zhou, J., Zhang, J. and Lu, W. (2016+). EM Algorithm for the Generalized Odds Rate Mixture Cure Model with Interval
Censored Data. Submitted to Journal of Computational and Graphical Statistics.
Examples
data(Hemophilia)
head(Hemophilia)
# Set Left/Right Interval End Points as NA
Hemophilia$L[Hemophilia$d1==1]<-Hemophilia$R[Hemophilia$d3==1]<-NA
# Fit PHMC Model (r=0)
fit<-GORMC(survfun=Surv(L,R)~Low+Medium+High,curefun=~Low+Medium+High,
data=Hemophilia,r=0)
summary(fit)
# Fit POMC Model (r=1)
# fit<-GORMC(survfun=Surv(L,R)~Low+Medium+High,curefun=~Low+Medium+High,
# data=Hemophilia,r=1)
# summary(fit)
# Predict Cure Rate and Survival Curve for a New Individual
# Specify coveriate vectors for new.z and new.x
pred1<-predict(fit,new.z=c(1,0,0,0),new.x=c(0,0,0))
pred2<-predict(fit,new.z=c(1,1,0,0),new.x=c(1,0,0))
pred3<-predict(fit,new.z=c(1,0,1,0),new.x=c(0,1,0))
pred4<-predict(fit,new.z=c(1,0,0,1),new.x=c(0,0,1))
# Obtain Cure Rates
pred1$CureRate
pred2$CureRate
pred3$CureRate
pred4$CureRate
# Plot the Survival Curves
plot(pred1,xlab="Time",ylab="Survival Probability",ylim=c(0,1))
lines(pred2$Survival,col=2)
lines(pred3$Survival,col=3)
lines(pred4$Survival,col=4)
legend(0,0.3,c("None","Low","Medium","High"),lty=1,col=1:4)
# Not run: Grid Search r
# rr<-seq(0,2,0.2)
# logl<-numeric()
# for(i in 1:length(rr)){
# fit<-GORMC(survfun=Surv(L,R)~Low+Medium+High,curefun=~Low+Medium+High,
# data=Hemophilia,r=rr[i])
# logl[i]<-fit$ParEst$loglik
# }
# plot(rr,logl,type="l",xlab="r",ylab="Log-likelihood")
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(GORCure)
Loading required package: survival
Loading required package: ICsurv
Loading required package: pracma
Loading required package: MASS
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/GORCure/GORMC.Rd_%03d_medium.png", width=480, height=480)
> ### Name: GORMC
> ### Title: Fitting the GORMC model with interval censored data.
> ### Aliases: GORMC
> ### Keywords: GORMC model Interval censoring
>
> ### ** Examples
>
> data(Hemophilia)
> head(Hemophilia)
d1 d2 d3 L R Low Medium High
1 0 1 0 7 20 1 0 0
2 0 1 0 9 12 0 0 1
3 0 1 0 9 20 1 0 0
4 1 0 0 0 25 1 0 0
5 0 0 1 57 0 1 0 0
6 0 1 0 23 26 1 0 0
> # Set Left/Right Interval End Points as NA
> Hemophilia$L[Hemophilia$d1==1]<-Hemophilia$R[Hemophilia$d3==1]<-NA
>
> # Fit PHMC Model (r=0)
> fit<-GORMC(survfun=Surv(L,R)~Low+Medium+High,curefun=~Low+Medium+High,
+ data=Hemophilia,r=0)
> summary(fit)
Call:
GORMC(survfun = Surv(L, R) ~ Low + Medium + High, curefun = ~Low +
Medium + High, data = Hemophilia, r = 0)
Estimate StdErr t.value p.value
INC: Intercept -1.93510 0.19706 -9.8197 < 2.2e-16 ***
INC: Low 2.18331 0.26462 8.2509 < 2.2e-16 ***
INC: Medium 4.37121 0.41752 10.4696 < 2.2e-16 ***
INC: High 5.09916 0.62149 8.2047 2.311e-16 ***
LAT: Low 0.61367 0.29006 2.1156 0.03438 *
LAT: Medium 1.19327 0.27945 4.2701 1.953e-05 ***
LAT: High 1.62453 0.30118 5.3939 6.895e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[1] "Loglik= -514.6"
>
> # Fit POMC Model (r=1)
> # fit<-GORMC(survfun=Surv(L,R)~Low+Medium+High,curefun=~Low+Medium+High,
> # data=Hemophilia,r=1)
> # summary(fit)
>
> # Predict Cure Rate and Survival Curve for a New Individual
> # Specify coveriate vectors for new.z and new.x
> pred1<-predict(fit,new.z=c(1,0,0,0),new.x=c(0,0,0))
> pred2<-predict(fit,new.z=c(1,1,0,0),new.x=c(1,0,0))
> pred3<-predict(fit,new.z=c(1,0,1,0),new.x=c(0,1,0))
> pred4<-predict(fit,new.z=c(1,0,0,1),new.x=c(0,0,1))
>
> # Obtain Cure Rates
> pred1$CureRate
[1] 0.8738124
> pred2$CureRate
[1] 0.4382626
> pred3$CureRate
[1] 0.0804603
> pred4$CureRate
[1] 0.04054054
>
> # Plot the Survival Curves
> plot(pred1,xlab="Time",ylab="Survival Probability",ylim=c(0,1))
> lines(pred2$Survival,col=2)
> lines(pred3$Survival,col=3)
> lines(pred4$Survival,col=4)
> legend(0,0.3,c("None","Low","Medium","High"),lty=1,col=1:4)
>
> # Not run: Grid Search r
> # rr<-seq(0,2,0.2)
> # logl<-numeric()
> # for(i in 1:length(rr)){
> # fit<-GORMC(survfun=Surv(L,R)~Low+Medium+High,curefun=~Low+Medium+High,
> # data=Hemophilia,r=rr[i])
> # logl[i]<-fit$ParEst$loglik
> # }
> # plot(rr,logl,type="l",xlab="r",ylab="Log-likelihood")
>
>
>
>
>
> dev.off()
null device
1
>