Last data update: 2014.03.03

R: Fitting Generalized Linear Models (using S4)
glm4R Documentation

Fitting Generalized Linear Models (using S4)

Description

glm4, very similarly as standard R's glm() is used to fit generalized linear models, specified by giving a symbolic description of the linear predictor and a description of the error distribution.

It is more general, as it fits linear, generalized linear, non-linear and generalized nonlinear models.

Usage

glm4(formula, family, data, weights, subset, na.action,
     start = NULL, etastart, mustart, offset,
     sparse = FALSE, drop.unused.levels = FALSE, doFit = TRUE,
     control = list(...),
     model = TRUE, x = FALSE, y = TRUE, contrasts = NULL, ...)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’.

family

a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (See family for details of family functions.)

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which glm is called.

weights

an optional vector of ‘prior weights’ to be used in the fitting process. Should be NULL or a numeric vector.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The ‘factory-fresh’ default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.

start, etastart, mustart

starting values for the parameters in the linear predictor, the predictor itself and for the vector of means.

offset

this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. One or more offset terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See model.offset.

sparse

logical indicating if the model matrix should be sparse or not.

drop.unused.levels

used only when sparse is TRUE: Should factors have unused levels dropped? (This used to be true, implicitly in the first versions up to July 2010; the default has been changed for compatibility with R's standard (dense) model.matrix().

doFit

logical indicating if the model should be fitted (or just returned unfitted).

control

a list with options on fitting; currently passed unchanged to (hidden) function IRLS().

model, x, y

currently ignored; here for back compatibility with glm.

contrasts

currently ignored

...

potentially arguments passed on to fitter functions; not used currently.

Value

an object of class glpModel.

See Also

glm() the standard R function;
lm.fit.sparse() a sparse least squares fitter.

The resulting class glpModel documentation.

Examples

### All the following is very experimental -- and probably will change: -------

data(CO2, package="datasets")
## dense linear model
str(glm4(uptake ~ 0 + Type*Treatment, data=CO2, doFit = FALSE), 4)
## sparse linear model
str(glm4(uptake ~ 0 + Type*Treatment, data=CO2, doFit = FALSE,
                  sparse = TRUE), 4)

## From example(glm): -----------------

## Dobson (1990) Page 93: Randomized Controlled Trial :
str(trial <- data.frame(counts=c(18,17,15,20,10,20,25,13,12),
                        outcome=gl(3,1,9,labels=LETTERS[1:3]),
                        treatment=gl(3,3,labels=letters[1:3])))
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson, data=trial)
summary(glm.D93)
c.glm <- unname(coef(glm.D93))
glmM  <- glm4(counts ~ outcome + treatment, family = poisson, data=trial)
glmM2 <- update(glmM, quick = FALSE) # slightly more accurate
glmM3 <- update(glmM, quick = FALSE, finalUpdate = TRUE)
                 # finalUpdate has no effect on 'coef'
stopifnot( identical(glmM2@pred@coef, glmM3@pred@coef),
           all.equal(glmM @pred@coef, c.glm, tolerance=1e-7),
           all.equal(glmM2@pred@coef, c.glm, tolerance=1e-12))


## Watch the iterations --- and use no intercept --> more sparse X
## 1) dense generalized linear model
glmM <- glm4(counts ~ 0+outcome + treatment, poisson, trial,
                      verbose = TRUE)
## 2) sparse generalized linear model
glmS <- glm4(counts ~ 0+outcome + treatment, poisson, trial,
                      verbose = TRUE, sparse = TRUE)
str(glmS, max.lev = 4)
stopifnot( all.equal(glmM@pred@coef, glmS@pred@coef),
           all.equal(glmM@pred@Vtr,  glmS@pred@Vtr) )


## A Gamma example, from McCullagh & Nelder (1989, pp. 300-2)
clotting <- data.frame(u = c(5,10,15,20,30,40,60,80,100),
                       lot1 = c(118,58,42,35,27,25,21,19,18),
                       lot2 = c(69,35,26,21,18,16,13,12,12))
str(gMN <- glm4(lot1 ~ log(u), data=clotting, family=Gamma, verbose=TRUE))
glm. <- glm(lot1 ~ log(u), data=clotting, family=Gamma)
stopifnot( all.equal(gMN@pred@coef, unname(coef(glm.)), tolerance=1e-7) )

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(MatrixModels)
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/MatrixModels/glm4.Rd_%03d_medium.png", width=480, height=480)
> ### Name: glm4
> ### Title: Fitting Generalized Linear Models (using S4)
> ### Aliases: glm4
> ### Keywords: models regression
> 
> ### ** Examples
> 
> ### All the following is very experimental -- and probably will change: -------
> 
> data(CO2, package="datasets")
> ## dense linear model
> str(glm4(uptake ~ 0 + Type*Treatment, data=CO2, doFit = FALSE), 4)
Formal class 'glpModel' [package "MatrixModels"] with 4 slots
  ..@ resp    :Formal class 'respModule' [package "MatrixModels"] with 7 slots
  .. .. ..@ mu     : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..@ offset : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..@ sqrtXwt: num [1:84, 1] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..@ sqrtrwt: num [1:84] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..@ weights: num [1:84] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..@ wtres  : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..@ y      : num [1:84] 16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
  ..@ pred    :Formal class 'dPredModule' [package "MatrixModels"] with 4 slots
  .. .. ..@ X   :Formal class 'ddenseModelMatrix' [package "MatrixModels"] with 6 slots
  .. .. ..@ fac :Formal class 'Cholesky' [package "Matrix"] with 5 slots
  .. .. ..@ coef: num [1:4] 0 0 0 0
  .. .. ..@ Vtr : num [1:4] 0 0 0 0
  ..@ call    : language glm4(formula = uptake ~ 0 + Type * Treatment, data = CO2, doFit = FALSE)
  ..@ fitProps: list()
> ## sparse linear model
> str(glm4(uptake ~ 0 + Type*Treatment, data=CO2, doFit = FALSE,
+                   sparse = TRUE), 4)
Formal class 'glpModel' [package "MatrixModels"] with 4 slots
  ..@ resp    :Formal class 'respModule' [package "MatrixModels"] with 7 slots
  .. .. ..@ mu     : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..@ offset : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..@ sqrtXwt: num [1:84, 1] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..@ sqrtrwt: num [1:84] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..@ weights: num [1:84] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..@ wtres  : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..@ y      : num [1:84] 16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
  ..@ pred    :Formal class 'sPredModule' [package "MatrixModels"] with 4 slots
  .. .. ..@ X   :Formal class 'dsparseModelMatrix' [package "MatrixModels"] with 8 slots
  .. .. ..@ fac :Formal class 'dCHMsimpl' [package "Matrix"] with 10 slots
  .. .. ..@ coef: num [1:4] 0 0 0 0
  .. .. ..@ Vtr : num [1:4] 0 0 0 0
  ..@ call    : language glm4(formula = uptake ~ 0 + Type * Treatment, data = CO2, sparse = TRUE,      doFit = FALSE)
  ..@ fitProps: list()
> 
> ## From example(glm): -----------------
> 
> ## Dobson (1990) Page 93: Randomized Controlled Trial :
> str(trial <- data.frame(counts=c(18,17,15,20,10,20,25,13,12),
+                         outcome=gl(3,1,9,labels=LETTERS[1:3]),
+                         treatment=gl(3,3,labels=letters[1:3])))
'data.frame':	9 obs. of  3 variables:
 $ counts   : num  18 17 15 20 10 20 25 13 12
 $ outcome  : Factor w/ 3 levels "A","B","C": 1 2 3 1 2 3 1 2 3
 $ treatment: Factor w/ 3 levels "a","b","c": 1 1 1 2 2 2 3 3 3
> glm.D93 <- glm(counts ~ outcome + treatment, family=poisson, data=trial)
> summary(glm.D93)

Call:
glm(formula = counts ~ outcome + treatment, family = poisson, 
    data = trial)

Deviance Residuals: 
       1         2         3         4         5         6         7         8  
-0.67125   0.96272  -0.16965  -0.21999  -0.95552   1.04939   0.84715  -0.09167  
       9  
-0.96656  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)  3.045e+00  1.709e-01  17.815   <2e-16 ***
outcomeB    -4.543e-01  2.022e-01  -2.247   0.0246 *  
outcomeC    -2.930e-01  1.927e-01  -1.520   0.1285    
treatmentb   1.338e-15  2.000e-01   0.000   1.0000    
treatmentc   1.421e-15  2.000e-01   0.000   1.0000    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 10.5814  on 8  degrees of freedom
Residual deviance:  5.1291  on 4  degrees of freedom
AIC: 56.761

Number of Fisher Scoring iterations: 4

> c.glm <- unname(coef(glm.D93))
> glmM  <- glm4(counts ~ outcome + treatment, family = poisson, data=trial)
> glmM2 <- update(glmM, quick = FALSE) # slightly more accurate
> glmM3 <- update(glmM, quick = FALSE, finalUpdate = TRUE)
>                  # finalUpdate has no effect on 'coef'
> stopifnot( identical(glmM2@pred@coef, glmM3@pred@coef),
+            all.equal(glmM @pred@coef, c.glm, tolerance=1e-7),
+            all.equal(glmM2@pred@coef, c.glm, tolerance=1e-12))
> ## Don't show: 
> All.eq <- function(x,y, ...) all.equal(x,y, tolerance= 1e-12, ...)
> stopifnot( ## ensure typos are *caught* :
+   inherits(try(glm4(counts ~ outcome + treatment, family=poisson, data=trial,
+                        fooBar = FALSE)), "try-error"),
+   ## check formula(.): {environments differ - FIXME?}
+   formula(glmM) == formula(glm.D93),
+   identical(coef(glmM2), coefficients(glmM3)),
+   All.eq   (coef(glmM2), coefficients(glm.D93)),
+   identical(fitted.values(glmM2), fitted(glmM3)),
+   All.eq   (residuals(glmM2), resid(glm.D93), check.attributes=FALSE),# names()% FIXME ??
+   identical(residuals(glmM2), resid(glmM3))
+ )
Error : The following control arguments did not match any default's names:
   "fooBar"
> ## End(Don't show)
> 
> ## Watch the iterations --- and use no intercept --> more sparse X
> ## 1) dense generalized linear model
> glmM <- glm4(counts ~ 0+outcome + treatment, poisson, trial,
+                       verbose = TRUE)
_1_ convergence criterion: 0.0990093
step = 1.00000, new wrss = 5.0726703, Delta(wrss)= 0.0499074, coef =
[1] 3.044512e+00 2.590484e+00 2.751707e+00 1.163981e-04 6.095497e-05
_2_ convergence criterion: 0.00107995
step = 1.00000, new wrss = 5.1721149, Delta(wrss)= 5.94911e-06, coef =
[1] 3.044522e+00 2.590267e+00 2.751535e+00 1.923493e-08 8.383392e-09
_3_ convergence criterion: 1.43931e-07
> ## 2) sparse generalized linear model
> glmS <- glm4(counts ~ 0+outcome + treatment, poisson, trial,
+                       verbose = TRUE, sparse = TRUE)
_1_ convergence criterion: 0.0990093
step = 1.00000, new wrss = 5.0726703, Delta(wrss)= 0.0499074, coef =
[1] 3.044512e+00 2.590484e+00 2.751707e+00 1.163981e-04 6.095497e-05
_2_ convergence criterion: 0.00107995
step = 1.00000, new wrss = 5.1721149, Delta(wrss)= 5.94911e-06, coef =
[1] 3.044522e+00 2.590267e+00 2.751535e+00 1.923493e-08 8.383392e-09
_3_ convergence criterion: 1.43931e-07
> str(glmS, max.lev = 4)
Formal class 'glpModel' [package "MatrixModels"] with 4 slots
  ..@ resp    :Formal class 'glmRespMod' [package "MatrixModels"] with 10 slots
  .. .. ..@ family :List of 11
  .. .. .. ..- attr(*, "class")= chr "family"
  .. .. ..@ eta    : num [1:9] 3.04 2.59 2.75 3.04 2.59 ...
  .. .. ..@ n      : num [1:9] 1 1 1 1 1 1 1 1 1
  .. .. ..@ mu     : num [1:9] 21 13.3 15.7 21 13.3 ...
  .. .. ..@ offset : num [1:9] 0 0 0 0 0 0 0 0 0
  .. .. ..@ sqrtXwt: num [1:9, 1] 4.58 3.65 3.96 4.58 3.65 ...
  .. .. ..@ sqrtrwt: num [1:9] 0.218 0.274 0.253 0.218 0.274 ...
  .. .. ..@ weights: num [1:9] 1 1 1 1 1 1 1 1 1
  .. .. ..@ wtres  : num [1:9] -0.655 1.004 -0.168 -0.218 -0.913 ...
  .. .. ..@ y      : num [1:9] 18 17 15 20 10 20 25 13 12
  ..@ pred    :Formal class 'sPredModule' [package "MatrixModels"] with 4 slots
  .. .. ..@ X   :Formal class 'dsparseModelMatrix' [package "MatrixModels"] with 8 slots
  .. .. ..@ fac :Formal class 'dCHMsimpl' [package "Matrix"] with 10 slots
  .. .. ..@ coef: num [1:5] 3.04 2.59 2.75 1.92e-08 8.38e-09
  .. .. ..@ Vtr : num [1:5] -1.45e-07 -1.57e-06 -1.31e-06 -1.51e-06 -9.64e-07
  ..@ call    : language glm4(formula = counts ~ 0 + outcome + treatment, family = poisson, data = trial,      sparse = TRUE, verbose = TRUE)
  ..@ fitProps:List of 3
  .. ..$ convcrit : num 1.44e-07
  .. ..$ iter     : num 3
  .. ..$ nHalvings: num 0
> stopifnot( all.equal(glmM@pred@coef, glmS@pred@coef),
+            all.equal(glmM@pred@Vtr,  glmS@pred@Vtr) )
> 
> 
> ## A Gamma example, from McCullagh & Nelder (1989, pp. 300-2)
> clotting <- data.frame(u = c(5,10,15,20,30,40,60,80,100),
+                        lot1 = c(118,58,42,35,27,25,21,19,18),
+                        lot2 = c(69,35,26,21,18,16,13,12,12))
> str(gMN <- glm4(lot1 ~ log(u), data=clotting, family=Gamma, verbose=TRUE))
_1_ convergence criterion: 0.0455001
step = 1.00000, new wrss = 0.017051829, Delta(wrss)= 3.5175e-05, coef =
[1] -0.01655447  0.01534311
_2_ convergence criterion: 0.000110842
step = 1.00000, new wrss = 0.017122092, Delta(wrss)= 2.0547e-10, coef =
[1] -0.01655438  0.01534311
_3_ convergence criterion: 1.02e-09
Formal class 'glpModel' [package "MatrixModels"] with 4 slots
  ..@ resp    :Formal class 'glmRespMod' [package "MatrixModels"] with 10 slots
  .. .. ..@ family :List of 11
  .. .. .. ..$ family    : chr "Gamma"
  .. .. .. ..$ link      : chr "inverse"
  .. .. .. ..$ linkfun   :function (mu)  
  .. .. .. ..$ linkinv   :function (eta)  
  .. .. .. ..$ variance  :function (mu)  
  .. .. .. ..$ dev.resids:function (y, mu, wt)  
  .. .. .. ..$ aic       :function (y, n, mu, wt, dev)  
  .. .. .. ..$ mu.eta    :function (eta)  
  .. .. .. ..$ validmu   :function (mu)  
  .. .. .. ..$ valideta  :function (eta)  
  .. .. .. ..$ simulate  :function (object, nsim)  
  .. .. .. ..- attr(*, "class")= chr "family"
  .. .. ..@ eta    : num [1:9] 0.00814 0.01877 0.025 0.02941 0.03563 ...
  .. .. ..@ n      : num [1:9] 1 1 1 1 1 1 1 1 1
  .. .. ..@ mu     : num [1:9] 122.9 53.3 40 34 28.1 ...
  .. .. ..@ offset : num [1:9] 0 0 0 0 0 0 0 0 0
  .. .. ..@ sqrtXwt: num [1:9, 1] -122.9 -53.3 -40 -34 -28.1 ...
  .. .. ..@ sqrtrwt: num [1:9] 0.00814 0.01877 0.025 0.02941 0.03563 ...
  .. .. ..@ weights: num [1:9] 1 1 1 1 1 1 1 1 1
  .. .. ..@ wtres  : num [1:9] -0.0395 0.0889 0.0498 0.0293 -0.038 ...
  .. .. ..@ y      : num [1:9] 118 58 42 35 27 25 21 19 18
  ..@ pred    :Formal class 'dPredModule' [package "MatrixModels"] with 4 slots
  .. .. ..@ X   :Formal class 'ddenseModelMatrix' [package "MatrixModels"] with 6 slots
  .. .. .. .. ..@ x        : num [1:18] 1 1 1 1 1 ...
  .. .. .. .. ..@ Dim      : int [1:2] 9 2
  .. .. .. .. ..@ Dimnames :List of 2
  .. .. .. .. .. ..$ : chr [1:9] "1" "2" "3" "4" ...
  .. .. .. .. .. ..$ : chr [1:2] "(Intercept)" "log(u)"
  .. .. .. .. ..@ factors  : list()
  .. .. .. .. ..@ assign   : int [1:2] 0 1
  .. .. .. .. ..@ contrasts: list()
  .. .. ..@ fac :Formal class 'Cholesky' [package "Matrix"] with 5 slots
  .. .. .. .. ..@ x       : num [1:4] 153 0 320 119
  .. .. .. .. ..@ Dim     : int [1:2] 2 2
  .. .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. .. ..$ : NULL
  .. .. .. .. .. ..$ : NULL
  .. .. .. .. ..@ uplo    : chr "U"
  .. .. .. .. ..@ diag    : chr "N"
  .. .. ..@ coef: num [1:2] -0.0166 0.0153
  .. .. ..@ Vtr : num [1:2] 1.91e-08 3.43e-08
  ..@ call    : language glm4(formula = lot1 ~ log(u), family = Gamma, data = clotting, verbose = TRUE)
  ..@ fitProps:List of 3
  .. ..$ convcrit : num 1.02e-09
  .. ..$ iter     : num 3
  .. ..$ nHalvings: num 0
> glm. <- glm(lot1 ~ log(u), data=clotting, family=Gamma)
> stopifnot( all.equal(gMN@pred@coef, unname(coef(glm.)), tolerance=1e-7) )
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>