drgee performs O-estimation,
E-estimation or DR-estimation
given symbolic representations of an outcome nuisance model and
an exposure nuisance model. For clustered data the nuisance models may
have cluster-specific intercepts.
Arguments
outcome
The outcome as a variable or as a character string naming a variable
in the data argument. If missing, the outcome is assumed
to be the response of oformula.
exposure
The exposure as a variable or as a character string naming a variable
in the data argument. If missing, the exposure is assumed
to be the response of eformula.
oformula
An expression or formula for the outcome nuisance model.
eformula
An expression or formula for the exposure nuisance model.
iaformula
An expression or formula where the RHS should contain the variables
that "interact" with (i.e. are supposed to be multiplied with) the
exposure in the main model. "1" will always be added. Default value is no
interactions, i.e. iaformula = formula(~1).
olink
A character string naming the link function in the outcome nuisance
model. Has to be "identity", "log" or
"logit". Default is "identity".
elink
A character string naming the link function in the exposure nuisance
model. Has to be "identity", "log" or
"logit". Default is "identity".
data
A data frame or environment containing the variables used.
If missing, variables are expected to be found in the
calling environment of the calling environment.
estimation.method
A character string naming the desired estimation method. Choose
"o" for O-estimation,
"e" for E-estimation or
"dr" for DR-estimation. Default is "dr".
cond
A logical value indicating whether the nuisance models should have
cluster-specific intercepts. Requires a clusterid argument.
rootFinder
A function to solve a system of non-linear equations. Default
is findRoots.
clusterid
A cluster-defining variable or a character string naming a
cluster-defining variable in the data argument. If it is not
found in the data argument, it will be searched for in the
calling frame. If missing, each observation will be considered to be
a separate cluster. This argument is required when cond = TRUE.
...
Further arguments to be passed to the function rootFinder.
Details
drgee estimates the parameter beta in a main
model g{E(Y|A,L)}-g{E(Y|A=0,L)}=beta^T {A * X(L)},
where g is a link function, Y is the outcome of interest, A is the exposure of
interest, and L is a vector of covariates that we wish to
adjust for. X(L) is a vector-valued function of L. Note that A*X(L) should be interpreted as a columnwise
multiplication and that X(L) will always contain a column of 1's.
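The columnwise product A*X(L) can be illustrated with base R alone. The sketch below uses a toy data frame with hypothetical variable names (a, l1) and shows the design that an interaction formula such as ~ l1 implies; it is an illustration of the notation, not necessarily how drgee builds its matrices internally.

```r
## Toy data; the names 'a' and 'l1' are assumptions made for this sketch
dat <- data.frame(a  = c(0, 1, 1, 0),
                  l1 = c(0.5, -1, 2, 0))

## X(L): model.matrix() adds the column of 1's automatically
X <- model.matrix(~ l1, data = dat)

## A * X(L): row i of X(L) is multiplied by the exposure A_i
AX <- dat$a * X
colnames(AX) <- c("a", "a:l1")
AX
```

The two columns of AX correspond to the terms beta_0 A and beta_1 A L_1 of a main model with one interaction.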
Given a specification of an outcome nuisance model g{E(Y|A=0,L)}=gamma^T V(L) (where V(L) is a function of L),
O-estimation is performed. Alternatively, leaving g{E(Y|A=0,L)}
unspecified and using an exposure nuisance model h{E(A|L)}=alpha^T Z(L) (where h is a link
function and Z(L) is a function of L), E-estimation
is performed. When g is logit, the exposure nuisance
model is required to be of the form
logit{E(A|Y=0,L)}=alpha^T Z(L).
In this case the exposure needs to be binary.
Given both an outcome and an exposure nuisance model, DR-estimation can be
performed. DR-estimation gives a consistent estimate of the parameter
beta when either the outcome nuisance model or
the exposure nuisance model
is correctly specified, not necessarily both.
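For the identity link, the idea behind E-estimation and DR-estimation can be sketched in a few lines of base R. The code below solves the estimating equations sum_i X(L_i){A_i - E(A|L_i)}{Y_i - beta^T A_i X(L_i)} = 0 (E-estimation) and the same equation with a fitted outcome nuisance term subtracted from Y_i (DR-estimation). The simulated data, parameter values and variable names are assumptions made for this illustration, and the code is not claimed to reproduce drgee's internal computations.

```r
set.seed(1)
expit <- function(x) 1 / (1 + exp(-x))
n  <- 5000
l1 <- rnorm(n)
l2 <- rnorm(n)
a  <- rbinom(n, 1, expit(0.5 + l1 - l2))
## True main-model parameters: beta0 = 1.5, beta1 = 1
y  <- rnorm(n, mean = 1.5 * a + 1 * a * l1 - 1 - l1 + l2, sd = 1)

X <- cbind(1, l1)                  # X(L), including the column of 1's

## Exposure nuisance model, correctly specified here
e.fit <- glm(a ~ l1 + l2, family = binomial)
a.res <- a - fitted(e.fit)         # A - E(A|L)

## sum_i (A_i - e_i) A_i X_i X_i^T, shared by both estimators
M <- crossprod(X * a.res, X * a)

## E-estimation: no outcome nuisance model is used
beta.e <- solve(M, crossprod(X * a.res, y))

## DR-estimation: subtract a fitted outcome nuisance term gamma^T V(L);
## the outcome model below is deliberately misspecified (it omits l2)
o.fit     <- lm(y ~ a + a:l1 + l1)
gamma.hat <- coef(o.fit)[c("(Intercept)", "l1")]
y.adj     <- y - drop(cbind(1, l1) %*% gamma.hat)
beta.dr   <- solve(M, crossprod(X * a.res, y.adj))

cbind(E = drop(beta.e), DR = drop(beta.dr))
```

Because the exposure nuisance model is correct in this simulation, both estimates should land close to (1.5, 1) even though the outcome nuisance model is wrong.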
Usage is best explained through an example. Suppose that we are
interested in the parameter vector (beta_0,beta_1) in a main model
logit{E(Y|A,L_1,L_2)}-logit{E(Y|A=0,L_1,L_2)}=beta_0 A + beta_1 A *
L_1 where L_1 and L_2 are the covariates that we wish
to adjust for. To adjust for L_1 and L_2, we can use an outcome
nuisance model logit{E(Y|A=0,L_1,L_2;gamma_0, gamma_1)}=gamma_0 + gamma_1 L_1 or an
exposure nuisance model logit{E(A|Y=0,L_1,L_2)}=alpha_0+alpha_1
L_1+alpha_2 L_2 to calculate estimates of beta_0 and beta_1
in the main model. We specify the outcome nuisance model as oformula=Y~L_1
and olink = "logit". The exposure nuisance model is specified as
eformula = A~L_1+L_2 and elink = "logit".
Since the outcome Y and the exposure A are
identified as the LHS of oformula and eformula
respectively and since the outcome link is specified in the
olink argument,
the only thing left to specify for the main model is the
(multiplicative) interactions A X(L)=A (1,L_1)^T. This
is done by specifying X(L) as
iaformula = ~L_1, since 1 is always included in X(L).
We can then perform O-estimation, E-estimation or DR-estimation by
setting estimation.method to "o",
"e" or "dr" respectively. O-estimation uses only the
outcome nuisance model, and E-estimation uses only the exposure
nuisance model. DR-estimation uses both nuisance models, and gives a
consistent estimate of (beta_0,beta_1) if either nuisance model is correct, not necessarily both.
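The logit example described above translates into a call along the following lines. The simulated data and all parameter values are arbitrary choices made for this sketch; only the argument names are taken from this page. The call is wrapped in requireNamespace() so that the snippet still runs when drgee is not installed.

```r
set.seed(2)
expit <- function(x) 1 / (1 + exp(-x))
n  <- 2000
l1 <- rnorm(n)
l2 <- rnorm(n)
## Binary exposure and binary outcome (illustrative parameter values)
a  <- rbinom(n, 1, expit(0.5 * l1 + 0.5 * l2))
y  <- rbinom(n, 1, expit(0.5 * a + 0.5 * a * l1 - 1 + 0.5 * l1))
simdata <- data.frame(y, a, l1, l2)

if (requireNamespace("drgee", quietly = TRUE)) {
  ## Main model: logit link, interactions A * (1, l1)
  dr.logit <- drgee::drgee(oformula = y ~ l1,
                           eformula = a ~ l1 + l2,
                           iaformula = ~ l1,
                           olink = "logit", elink = "logit",
                           data = simdata,
                           estimation.method = "dr")
  print(summary(dr.logit))
}
```

Changing estimation.method to "o" or "e" in the same call would request O-estimation or E-estimation instead.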
When estimation.method = "o", the RHS of eformula will be
ignored. The eformula argument can also be replaced by an exposure
argument specifying what the exposure of interest is.
When estimation.method = "e", the RHS of oformula will be
ignored. The oformula argument can also be replaced by an outcome
argument specifying what the outcome of interest is.
When cond = TRUE the nuisance models will be assumed to have
cluster-specific intercepts. These intercepts will not be estimated.
When E-estimation or DR-estimation is chosen with
olink = "logit", the exposure link will be
changed to "logit". Note that this choice
of outcome link does not work for DR-estimation
when cond = TRUE.
Robust variance for the estimated parameter is calculated
using the function robVcov. A cluster robust variance is calculated when
a character string naming a cluster variable is
supplied in the clusterid argument.
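A clustered analysis with cluster-specific intercepts might be requested as sketched below. The data are simulated with a shared cluster-level effect u, and the identity link is used for both nuisance models because, as noted above, the logit outcome link does not work for DR-estimation when cond = TRUE. Variable names and parameter values are assumptions for illustration, and the call itself has not been checked against a particular drgee version.

```r
set.seed(3)
n.clust <- 500                        # number of clusters
m       <- 4                          # observations per cluster
N       <- n.clust * m
id <- rep(seq_len(n.clust), each = m)
u  <- rep(rnorm(n.clust), each = m)   # unobserved cluster-level effect
l1 <- rnorm(N)
a  <- rnorm(N, mean = u + l1)         # continuous exposure
y  <- rnorm(N, mean = 1.5 * a + u + l1)
clusdata <- data.frame(id, y, a, l1)

if (requireNamespace("drgee", quietly = TRUE)) {
  ## cond = TRUE requires clusterid; variance is then cluster robust
  cond.fit <- drgee::drgee(oformula = y ~ l1,
                           eformula = a ~ l1,
                           olink = "identity", elink = "identity",
                           data = clusdata,
                           clusterid = "id", cond = TRUE,
                           estimation.method = "dr")
  print(summary(cond.fit))
}
```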
For E-estimation when cond = FALSE and g is the identity
or log link, see Robins et al. (1992).
For DR-estimation when cond = TRUE and g is the identity
or log link, see Robins (1999). For DR-estimation when
g is the logit link, see Tchetgen et al. (2010).
O-estimation can also be performed using the gee function.
Value
drgee returns an object of class drgee containing:
coefficients
Estimates of the parameters in the main model.
vcov
Robust variance for all main model parameters.
coefficients.all
Estimates of all estimated parameters.
vcov.all
Robust variance of all parameter estimates.
optim.object
An estimation object returned from the function specified
in the rootFinder argument, if this function is called for the
estimation of the main model parameters.
optim.object.o
An estimation object returned from the function specified
in the rootFinder argument, if this function is called for the
estimation of the outcome nuisance parameters.
optim.object.e
An estimation object returned from the function specified
in the rootFinder argument, if this function is called for the
estimation of the exposure nuisance parameters.
call
The matched call.
estimation.method
The value of the input argument estimation.method.
The class methods coef and vcov can be used to extract
the estimated parameters and their covariance matrix from a
drgee object. summary.drgee produces a summary of the
calculations.
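Since coef and vcov are available, Wald-type confidence intervals can be computed with a small helper. The function wald_ci below is a hypothetical convenience written for this sketch, not part of drgee; it assumes that coef() returns a named vector and vcov() a matching covariance matrix. It is demonstrated on an lm fit so that it runs without drgee.

```r
## Hypothetical helper: normal-approximation intervals from coef() and vcov()
wald_ci <- function(fit, level = 0.95) {
  est <- coef(fit)
  se  <- sqrt(diag(vcov(fit)))
  z   <- qnorm(1 - (1 - level) / 2)
  cbind(estimate = est, lower = est - z * se, upper = est + z * se)
}

## Demonstration on a built-in data set; for a drgee object the
## intervals would be based on the robust variance returned by vcov()
toy <- lm(dist ~ speed, data = cars)
wald_ci(toy)
```

Applied to a fitted drgee object, for example wald_ci(dr.est), the helper should give intervals for the main model parameters under the assumptions stated above.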
Author(s)
Johan Zetterqvist, Arvid Sjölander
References
Orsini N., Bellocco R., Sjölander A. (2013), Doubly
Robust Estimation in Generalized Linear Models, Stata Journal,
13, 1, pp. 185–205
Robins J.M., Mark S.D., Newey W.K. (1992), Estimating Exposure
Effects by Modelling the Expectation of Exposure Conditional
on Confounders, Biometrics, 48, pp. 479–495
Robins J.M. (1999), Robust Estimation in Sequentially Ignorable
Missing Data and Causal Inference Models, Proceedings of the
American Statistical Association Section on Bayesian Statistical
Science, pp. 6–10
Tchetgen E.J.T., Robins J.M., Rotnitzky A. (2010), On Doubly Robust
Estimation in a Semiparametric Odds Ratio Model, Biometrika,
97, 1, pp. 171–180
See Also
gee for O-estimation, findRoots for
nonlinear equation solving and robVcov for
estimation of variance.
Examples
## DR-estimation when
## the main model is
## E(Y|A,L1,L2)-E(Y|A=0,L1,L2)=beta0*A+beta1*A*L1
## and the outcome nuisance model is
## E(Y|A=0,L1,L2)=gamma0+gamma1*L1+gamma2*L2
## and the exposure nuisance model is
## E(A|L1,L2)=expit(alpha0+alpha1*L1+alpha2*L2)
library(drgee)
expit<-function(x) exp(x)/(1+exp(x))
n<-5000
## Covariates and nuisance parameters
l1<-rnorm(n, mean = 0, sd = 1)
l2<-rnorm(n, mean = 0, sd = 1)
beta0<-1.5
beta1<-1
gamma0<--1
gamma1<--2
gamma2<-2
alpha0<-1
alpha1<-5
alpha2<-3
## Exposure generated from the exposure nuisance model
a<-rbinom(n,1,expit(alpha0 + alpha1*l1 + alpha2*l2))
## Outcome generated from the main model and the
## outcome nuisance model
y<-rnorm(n,
         mean = beta0 * a + beta1 * a * l1 + gamma0 + gamma1 * l1 + gamma2 * l2,
         sd = 1)
simdata<-data.frame(y,a,l1,l2)
## outcome nuisance model misspecified and
## exposure nuisance model correctly specified
## DR-estimation
dr.est <- drgee(oformula = formula(y~l1),
                eformula = formula(a~l1+l2),
                iaformula = formula(~l1),
                olink = "identity", elink = "logit",
                data = simdata, estimation.method = "dr")
summary(dr.est)
## O-estimation
o.est <- drgee(exposure = "a", oformula = formula(y~l1),
               iaformula = formula(~l1), olink = "identity",
               data = simdata, estimation.method = "o")
summary(o.est)
## E-estimation
e.est <- drgee(outcome = "y", eformula = formula(a~l1+l2),
               iaformula = formula(~l1), elink = "logit",
               data = simdata, estimation.method = "e")
summary(e.est)