tdSim.clst
R Documentation
Simulate 1 dataframe (1 simulation) of time-dependent exposure
under method 1 with a clustering data frame
Description
This function allows the user to input a data frame with clustering
parameters and generates a simulated dataset with time-dependent exposure.
In particular, the output dataset includes a column with the physician site ID, which serves as the clustering variable in the Cox regression model used for power calculation.
Arguments
duration
Length of the study in months. The default value is 24 (months)
lambda
Scale parameter of the Weibull distribution, which is calculated as log(2) / median time to event for the control group
rho
Shape parameter of the Weibull distribution. It defaults to 1, in which case survival times are generated from the exponential distribution
beta
A numeric value that represents the exposure effect, which is the
regression coefficient (log hazard ratio) that represents the magnitude of
the relationship between the exposure covariate and the risk of an event
rateC
Rate of the exponential distribution to generate censoring times, which is calculated as log(2) / median time to censoring
df
A user-specified n by 3 clustering data frame (n ≥ 3) with columns corresponding to cat_id (category ID, which is the physician site ID and can be either text strings or integers), cat_prop (category proportion, which is the proportion of subjects in the corresponding category), and cat_exprate (category exposure rate, which is the exposure proportion in the corresponding category). The n rows correspond to n different physician sites
prop.fullexp
A numeric value in interval [0, 1) that represents the proportion of exposed subjects that are fully exposed from the beginning to the end of the study. The default value is 0, which means all exposed subjects have an exposure status transition at some point during the study
maxrelexptime
A numeric value in interval (0, 1] that represents the maximum relative exposure time. If this value is p, the exposure time for each subject is uniformly distributed from 0 to p * (the subject's time in the study). The default value is 1, which means an exposed subject's exposure status transition can occur at any point during that subject's time in the study.
min.futime
A numeric value that represents the minimum follow-up time (in months). The default value is 0, which means no minimum follow-up time is considered. A positive value helps exclude subjects who spend only a short amount of time in the study
min.postexp.futime
A numeric value that represents the minimum post-exposure follow-up time (in months). The default value is 0, which means no minimum post-exposure follow-up time is considered. A positive value helps exclude subjects who spend only a short amount of time in the study after their exposure
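Two of the conventions described above can be checked numerically in base R. The sketch below converts median times into the rates expected by lambda and rateC, and draws exposure times the way the maxrelexptime description states. The variable names (median_event, time_in_study, and so on) are illustrative assumptions rather than SimHaz internals, and the package's own code may differ.

```r
# Illustrative values; these names are hypothetical, not part of SimHaz.
median_event <- 24   # median time to event for the control group (months)
median_cens  <- 14   # median time to censoring (months)

# An exponential distribution with a given median has rate log(2) / median.
lambda <- log(2) / median_event
rateC  <- log(2) / median_cens

# Sanity check: the median of each exponential recovers the input.
qexp(0.5, rate = lambda)
qexp(0.5, rate = rateC)

# With rho = 1 the Weibull reduces to the exponential. In base R's
# parameterization the matching Weibull scale is 1 / lambda.
qweibull(0.5, shape = 1, scale = 1 / lambda)

# Exposure time under maxrelexptime = p: uniform on (0, p * time in study).
set.seed(1)
p <- 1 / 6
time_in_study <- c(6, 12, 24)   # hypothetical follow-up times (months)
exp_time <- runif(length(time_in_study), min = 0, max = p * time_in_study)
exp_time
```

A small p therefore forces exposure to start early in each subject's follow-up, which is the "quick exposure" scenario.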
Details
The current version of this function requires the input data frame to have at
least 3 categories of physician sites, because the function uses a multinomial
distribution to assign subjects to each category according to the corresponding category proportion.
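The multinomial assignment mentioned above can be sketched in base R as follows. This is only an illustration of the idea using stats::rmultinom; the object names (clusters, counts, site) are assumptions, and the package's internal implementation is not shown on this page.

```r
set.seed(123)
N <- 600   # number of subjects to assign

# Hypothetical site table with the category proportions.
clusters <- data.frame(cat_id   = c("lo", "med", "hi"),
                       cat_prop = c(0.65, 0.2, 0.15),
                       stringsAsFactors = FALSE)

# A single multinomial draw splits the N subjects across the sites.
counts <- as.vector(rmultinom(1, size = N, prob = clusters$cat_prop))
names(counts) <- clusters$cat_id
counts

# Expand the counts into one site label per subject.
site <- rep(clusters$cat_id, times = counts)
table(site)
```

The realized site sizes vary around N * cat_prop from one simulation to the next, rather than matching the proportions exactly.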
Value
A data.frame object with columns corresponding to
id
Integer that represents a subject's identification number
start
For counting process formulation. Represents the start of each time interval
stop
For counting process formulation. Represents the end of each time interval
status
Indicator of event. status = 1 when event occurs and 0 otherwise
x
Indicator of exposure. x = 1 when exposed and 0 otherwise
clst_id
For clustering in the Cox proportional hazards model. Represents the label of each subject's corresponding physician site
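To show how these columns fit together, the sketch below builds a toy dataset in the same counting-process layout and fits a Cox model with a cluster term using the survival package. The toy generator is an assumption made for illustration only: it is not the method used by tdSim.clst, and its event times do not depend on exposure.

```r
library(survival)

set.seed(42)
n <- 200
site_rate <- c(lo = 0.1, med = 0.3, hi = 0.5)   # exposure proportion per site
clst_id <- sample(names(site_rate), n, replace = TRUE,
                  prob = c(0.65, 0.2, 0.15))
exposed <- rbinom(n, 1, site_rate[clst_id]) == 1

# Toy follow-up: exponential event and censoring times, capped at 24 months.
event_time  <- rexp(n, rate = log(2) / 24)
cens_time   <- pmin(rexp(n, rate = log(2) / 14), 24)
end_time    <- pmin(event_time, cens_time)
status      <- as.integer(event_time <= cens_time)
switch_time <- runif(n, min = 0, max = end_time)  # exposure start, if exposed

ids <- seq_len(n)
toy <- rbind(
  # Never-exposed subjects: a single interval with x = 0.
  data.frame(id = ids[!exposed], start = 0, stop = end_time[!exposed],
             status = status[!exposed], x = 0, clst_id = clst_id[!exposed]),
  # Exposed subjects before exposure: the interval ends without an event.
  data.frame(id = ids[exposed], start = 0, stop = switch_time[exposed],
             status = 0, x = 0, clst_id = clst_id[exposed]),
  # Exposed subjects after exposure: carries the subject's final status.
  data.frame(id = ids[exposed], start = switch_time[exposed],
             stop = end_time[exposed], status = status[exposed], x = 1,
             clst_id = clst_id[exposed])
)
toy <- toy[order(toy$id, toy$start), ]

# Counting-process Cox model with robust variance clustered on site.
fit <- coxph(Surv(start, stop, status) ~ x + cluster(clst_id), data = toy)
summary(fit)
```

Each exposed subject contributes two rows (before and after the exposure transition), while each unexposed subject contributes one.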
Examples
# Create a clustering data frame as input with 3 categories and a 20% weighted
# exposure proportion.
input_df <- data.frame(cat_id = c('lo', 'med', 'hi'),
cat_prop = c(0.65, 0.2, 0.15), cat_exp.prop = c(0.1, 0.3, 0.5))
# Simulate a dataset of 600 subjects with time-dependent exposure. Consider
# both minimum follow-up time (4 months) and minimum post-exposure follow-up
# time (4 months). Also consider a quick exposure after entering the study for
# each exposed subject. Set the maximum relative exposure time to be 1/6.
# Set the duration of the study to be 24 months; the median time to event for
# control group to be 24 months; exposure effect to be 0.3; median time to
# censoring to be 14 months.
df_tdclst <- tdSim.clst(N = 600, duration = 24, lambda = log(2)/24, rho = 1,
beta = 0.3, rateC = log(2)/14, df = input_df, prop.fullexp = 0,
maxrelexptime = 1/6, min.futime = 4, min.postexp.futime = 4)
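The figures quoted in the comments of this example can be verified with a few lines of base R. The sketch below recomputes the weighted exposure proportion from the input data frame and the largest possible exposure start time implied by maxrelexptime; the names weighted_exp and max_switch are illustrative and not part of the package.

```r
input_df <- data.frame(cat_id = c("lo", "med", "hi"),
                       cat_prop = c(0.65, 0.2, 0.15),
                       cat_exp.prop = c(0.1, 0.3, 0.5))

# Category proportions should sum to 1.
sum(input_df$cat_prop)

# Weighted exposure proportion: 0.65*0.1 + 0.2*0.3 + 0.15*0.5.
weighted_exp <- sum(input_df$cat_prop * input_df$cat_exp.prop)
weighted_exp

# With maxrelexptime = 1/6, a subject followed for the full 24 months
# switches to exposed no later than this many months after entry.
max_switch <- (1 / 6) * 24
max_switch
```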
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
> library(SimHaz)
> input_df <- data.frame(cat_id = c('lo', 'med', 'hi'),
+ cat_prop = c(0.65, 0.2, 0.15), cat_exp.prop = c(0.1, 0.3, 0.5))
> df_tdclst <- tdSim.clst(N = 600, duration = 24, lambda = log(2)/24, rho = 1,
+ beta = 0.3, rateC = log(2)/14, df = input_df, prop.fullexp = 0,
+ maxrelexptime = 1/6, min.futime = 4, min.postexp.futime = 4)