R: fit a GLM with fusion penalty for data integration
metafuse.l
R Documentation
fit a GLM with fusion penalty for data integration
Description
Fit a GLM with a fusion penalty on the study-specific coefficients within each covariate, at a given value of lambda.
Usage
metafuse.l(X = X, y = y, sid = sid, fuse.which = c(0:ncol(X)),
family = "gaussian", intercept = TRUE, alpha = 0, lambda = lambda)
Arguments
X
a matrix (or vector) of predictor(s), with dimensions of N*p, where N is the total sample size of all studies
y
a response vector of length N, the total sample size of all studies
sid
study id, numbered from 1 to K
fuse.which
a vector containing a subset of the integers from 0 to p, indicating which covariates are considered for fusion; 0 corresponds to the intercept
family
"gaussian" for continuous response, "binomial" for binary response, "poisson" for count response
intercept
if TRUE, an intercept is included in the model
alpha
the ratio of sparsity penalty to fusion penalty, default is 0 (no penalty on sparsity)
lambda
tuning parameter for fusion penalty
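The shape requirements listed above can be checked before fitting. The following is a minimal sketch in base R; check_inputs is a hypothetical helper written for illustration and is not part of the package, and the toy data are made up.

```r
# Hypothetical helper (not part of the package): checks that the inputs
# match the shapes described in the Arguments section.
check_inputs <- function(X, y, sid, fuse.which) {
  X <- as.matrix(X)              # a vector is treated as a one-column matrix
  N <- nrow(X)                   # total sample size over all studies
  p <- ncol(X)                   # number of covariates (intercept excluded)
  K <- length(unique(sid))       # number of studies
  stopifnot(
    length(y) == N,                          # one response per row of X
    length(sid) == N,                        # one study id per row of X
    all(sort(unique(sid)) == seq_len(K)),    # ids numbered 1 to K
    all(fuse.which %in% 0:p)                 # 0 = intercept, 1..p = covariates
  )
  list(N = N, p = p, K = K)
}

# Toy inputs (made up): 4 studies of 5 observations each, 3 covariates
set.seed(1)
X <- matrix(rnorm(60), nrow = 20, ncol = 3)
y <- rnorm(20)
sid <- rep(1:4, each = 5)
dims <- check_inputs(X, y, sid, fuse.which = 0:3)
```

The returned list simply echoes N, p and K so that they can be compared with what is expected for the data at hand.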
Details
The adaptive lasso penalty is used; see Zou (2006) for details.
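As a rough illustration of the adaptive lasso idea applied to fusion, the sketch below weights each penalized difference by the inverse of its initial estimate, so that differences that start out small are shrunk harder. This page does not state which differences the package penalizes or how the weights are built; the use of adjacent differences after sorting, the exponent gamma, and the values in beta_init are all assumptions made for illustration.

```r
# Made-up initial (unpenalized) estimates of one covariate in K = 5 studies
beta_init <- c(0.02, -0.01, 0.48, 0.52, 1.03)
gamma <- 1                        # weight exponent (assumed)

# Assumption: penalize adjacent differences after sorting the initial estimates
ord <- order(beta_init)
d_init <- diff(beta_init[ord])    # K - 1 adjacent differences

# Adaptive weights in the style of Zou (2006): inverse of the initial magnitude
w <- 1 / abs(d_init)^gamma

# Weighted L1 fusion penalty evaluated at a candidate coefficient vector
fusion_penalty <- function(beta, lambda) {
  lambda * sum(w * abs(diff(beta[ord])))
}

pen_fused <- fusion_penalty(rep(1, 5), lambda = 0.5)   # all studies share one value
pen_init  <- fusion_penalty(beta_init, lambda = 0.5)   # penalty at the initial fit
```

A coefficient vector that is constant across studies incurs no fusion penalty, which is why a large lambda pushes the fit toward a single shared coefficient.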
Value
a list containing the following items will be returned:
family
the model type
alpha
the ratio of sparsity penalty to fusion penalty
if.fuse
whether the covariate is fused (1) or not (0)
betahat
the estimated coefficients
betainfo
additional information about the fit, including degrees of freedom
Examples
n <- 200 # sample size in each study
K <- 10 # number of studies
p <- 3 # number of coefficients per study (intercept + 2 covariates)
N <- n*K # total sample size
# the coefficient matrix; use this to set the desired heterogeneity pattern (depends on p and K)
beta0 <- matrix(c(0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0, # intercept
0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0, # beta_1, etc.
0.0,0.0,0.0,0.0,0.5,0.5,0.5,1.0,1.0,1.0), K, p)
# generate a data set, family=c("gaussian", "binomial", "poisson")
data <- datagenerator(n=n, beta0=beta0, family="gaussian", seed=123)
# prepare the input (y, X, studyID)
y <- data$y
sid <- data$group
X <- data[,-c(1,ncol(data))]
# fit metafuse at a given lambda
metafuse.l(X=X, y=y, sid=sid, fuse.which=c(0,1,2), family="gaussian", intercept=TRUE,
alpha=1, lambda=0.5)
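To see which heterogeneity pattern the coefficient matrix in this example encodes, the number of distinct values in each column of beta0 can be counted. This is a small base R sketch that needs only beta0; the column names and the object n_groups are added here for readability and are not used by the package.

```r
K <- 10   # number of studies
p <- 3    # number of coefficients per study (intercept + 2 covariates)

# Same coefficient matrix as in the example: one row per study, one column per coefficient
beta0 <- matrix(c(0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,   # intercept
                  0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,   # beta_1
                  0.0,0.0,0.0,0.0,0.5,0.5,0.5,1.0,1.0,1.0),  # beta_2
                K, p)
colnames(beta0) <- c("intercept", "beta_1", "beta_2")   # labels for readability only

# Number of distinct true values across studies, per coefficient
n_groups <- apply(beta0, 2, function(b) length(unique(b)))
print(n_groups)
```

Under this setup the intercept is common to all studies, beta_1 takes two distinct values and beta_2 takes three.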