R Graphical Manual

Browse All

Last data update: 2014.03.03

R: Fitting Generalized Linear Models (using S4)

glm4	R Documentation

Fitting Generalized Linear Models (using S4)

Description

glm4, very similarly as standard R's glm() is used to fit generalized linear models, specified by giving a symbolic description of the linear predictor and a description of the error distribution.

It is more general, as it fits linear, generalized linear, non-linear and generalized nonlinear models.

Usage

glm4(formula, family, data, weights, subset, na.action,
     start = NULL, etastart, mustart, offset,
     sparse = FALSE, drop.unused.levels = FALSE, doFit = TRUE,
     control = list(...),
     model = TRUE, x = FALSE, y = TRUE, contrasts = NULL, ...)

Arguments

`formula`	an object of class `"formula"` (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’.
`family`	a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (See `family` for details of family functions.)
`data`	an optional data frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in `data`, the variables are taken from `environment(formula)`, typically the environment from which `glm` is called.
`weights`	an optional vector of ‘prior weights’ to be used in the fitting process. Should be `NULL` or a numeric vector.
`subset`	an optional vector specifying a subset of observations to be used in the fitting process.
`na.action`	a function which indicates what should happen when the data contain `NA`s. The default is set by the `na.action` setting of `options`, and is `na.fail` if that is unset. The ‘factory-fresh’ default is `na.omit`. Another possible value is `NULL`, no action. Value `na.exclude` can be useful.
`start, etastart, mustart`	starting values for the parameters in the linear predictor, the predictor itself and for the vector of means.
`offset`	this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be `NULL` or a numeric vector of length equal to the number of cases. One or more `offset` terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See `model.offset`.
`sparse`	logical indicating if the model matrix should be sparse or not.
`drop.unused.levels`	used only when `sparse` is TRUE: Should factors have unused levels dropped? (This used to be true, implicitly in the first versions up to July 2010; the default has been changed for compatibility with R's standard (dense) `model.matrix()`.
`doFit`	logical indicating if the model should be fitted (or just returned unfitted).
`control`	a list with options on fitting; currently passed unchanged to (hidden) function `IRLS()`.
`model, x, y`	currently ignored; here for back compatibility with `glm`.
`contrasts`	currently ignored

...

potentially arguments passed on to fitter functions; not used currently.

Value

an object of class glpModel.

Examples

### All the following is very experimental -- and probably will change: -------

data(CO2, package="datasets")
## dense linear model
str(glm4(uptake ~ 0 + Type*Treatment, data=CO2, doFit = FALSE), 4)
## sparse linear model
str(glm4(uptake ~ 0 + Type*Treatment, data=CO2, doFit = FALSE,
                  sparse = TRUE), 4)

## From example(glm): -----------------

## Dobson (1990) Page 93: Randomized Controlled Trial :
str(trial <- data.frame(counts=c(18,17,15,20,10,20,25,13,12),
                        outcome=gl(3,1,9,labels=LETTERS[1:3]),
                        treatment=gl(3,3,labels=letters[1:3])))
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson, data=trial)
summary(glm.D93)
c.glm <- unname(coef(glm.D93))
glmM  <- glm4(counts ~ outcome + treatment, family = poisson, data=trial)
glmM2 <- update(glmM, quick = FALSE) # slightly more accurate
glmM3 <- update(glmM, quick = FALSE, finalUpdate = TRUE)
                 # finalUpdate has no effect on 'coef'
stopifnot( identical(glmM2@pred@coef, glmM3@pred@coef),
           all.equal(glmM @pred@coef, c.glm, tolerance=1e-7),
           all.equal(glmM2@pred@coef, c.glm, tolerance=1e-12))


## Watch the iterations --- and use no intercept --> more sparse X
## 1) dense generalized linear model
glmM <- glm4(counts ~ 0+outcome + treatment, poisson, trial,
                      verbose = TRUE)
## 2) sparse generalized linear model
glmS <- glm4(counts ~ 0+outcome + treatment, poisson, trial,
                      verbose = TRUE, sparse = TRUE)
str(glmS, max.lev = 4)
stopifnot( all.equal(glmM@pred@coef, glmS@pred@coef),
           all.equal(glmM@pred@Vtr,  glmS@pred@Vtr) )


## A Gamma example, from McCullagh & Nelder (1989, pp. 300-2)
clotting <- data.frame(u = c(5,10,15,20,30,40,60,80,100),
                       lot1 = c(118,58,42,35,27,25,21,19,18),
                       lot2 = c(69,35,26,21,18,16,13,12,12))
str(gMN <- glm4(lot1 ~ log(u), data=clotting, family=Gamma, verbose=TRUE))
glm. <- glm(lot1 ~ log(u), data=clotting, family=Gamma)
stopifnot( all.equal(gMN@pred@coef, unname(coef(glm.)), tolerance=1e-7) )

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(MatrixModels)
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/MatrixModels/glm4.Rd_%03d_medium.png", width=480, height=480)
> ### Name: glm4
> ### Title: Fitting Generalized Linear Models (using S4)
> ### Aliases: glm4
> ### Keywords: models regression
> 
> ### ** Examples
> 
> ### All the following is very experimental -- and probably will change: -------
> 
> data(CO2, package="datasets")
> ## dense linear model
> str(glm4(uptake ~ 0 + Type*Treatment, data=CO2, doFit = FALSE), 4)
Formal class 'glpModel' [package "MatrixModels"] with 4 slots
  ..@ resp    :Formal class 'respModule' [package "MatrixModels"] with 7 slots
  .. .. ..@ mu     : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..@ offset : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..@ sqrtXwt: num [1:84, 1] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..@ sqrtrwt: num [1:84] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..@ weights: num [1:84] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..@ wtres  : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..@ y      : num [1:84] 16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
  ..@ pred    :Formal class 'dPredModule' [package "MatrixModels"] with 4 slots
  .. .. ..@ X   :Formal class 'ddenseModelMatrix' [package "MatrixModels"] with 6 slots
  .. .. ..@ fac :Formal class 'Cholesky' [package "Matrix"] with 5 slots
  .. .. ..@ coef: num [1:4] 0 0 0 0
  .. .. ..@ Vtr : num [1:4] 0 0 0 0
  ..@ call    : language glm4(formula = uptake ~ 0 + Type * Treatment, data = CO2, doFit = FALSE)
  ..@ fitProps: list()
> ## sparse linear model
> str(glm4(uptake ~ 0 + Type*Treatment, data=CO2, doFit = FALSE,
+                   sparse = TRUE), 4)
Formal class 'glpModel' [package "MatrixModels"] with 4 slots
  ..@ resp    :Formal class 'respModule' [package "MatrixModels"] with 7 slots
  .. .. ..@ mu     : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..@ offset : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..@ sqrtXwt: num [1:84, 1] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..@ sqrtrwt: num [1:84] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..@ weights: num [1:84] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..@ wtres  : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..@ y      : num [1:84] 16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
  ..@ pred    :Formal class 'sPredModule' [package "MatrixModels"] with 4 slots
  .. .. ..@ X   :Formal class 'dsparseModelMatrix' [package "MatrixModels"] with 8 slots
  .. .. ..@ fac :Formal class 'dCHMsimpl' [package "Matrix"] with 10 slots
  .. .. ..@ coef: num [1:4] 0 0 0 0
  .. .. ..@ Vtr : num [1:4] 0 0 0 0
  ..@ call    : language glm4(formula = uptake ~ 0 + Type * Treatment, data = CO2, sparse = TRUE,      doFit = FALSE)
  ..@ fitProps: list()
> 
> ## From example(glm): -----------------
> 
> ## Dobson (1990) Page 93: Randomized Controlled Trial :
> str(trial <- data.frame(counts=c(18,17,15,20,10,20,25,13,12),
+                         outcome=gl(3,1,9,labels=LETTERS[1:3]),
+                         treatment=gl(3,3,labels=letters[1:3])))
'data.frame':	9 obs. of  3 variables:
 $ counts   : num  18 17 15 20 10 20 25 13 12
 $ outcome  : Factor w/ 3 levels "A","B","C": 1 2 3 1 2 3 1 2 3
 $ treatment: Factor w/ 3 levels "a","b","c": 1 1 1 2 2 2 3 3 3
> glm.D93 <- glm(counts ~ outcome + treatment, family=poisson, data=trial)
> summary(glm.D93)

Call:
glm(formula = counts ~ outcome + treatment, family = poisson, 
    data = trial)

Deviance Residuals: 
       1         2         3         4         5         6         7         8  
-0.67125   0.96272  -0.16965  -0.21999  -0.95552   1.04939   0.84715  -0.09167  
       9  
-0.96656  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)  3.045e+00  1.709e-01  17.815   <2e-16 ***
outcomeB    -4.543e-01  2.022e-01  -2.247   0.0246 *  
outcomeC    -2.930e-01  1.927e-01  -1.520   0.1285    
treatmentb   1.338e-15  2.000e-01   0.000   1.0000    
treatmentc   1.421e-15  2.000e-01   0.000   1.0000    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 10.5814  on 8  degrees of freedom
Residual deviance:  5.1291  on 4  degrees of freedom
AIC: 56.761

Number of Fisher Scoring iterations: 4

> c.glm <- unname(coef(glm.D93))
> glmM  <- glm4(counts ~ outcome + treatment, family = poisson, data=trial)
> glmM2 <- update(glmM, quick = FALSE) # slightly more accurate
> glmM3 <- update(glmM, quick = FALSE, finalUpdate = TRUE)
>                  # finalUpdate has no effect on 'coef'
> stopifnot( identical(glmM2@pred@coef, glmM3@pred@coef),
+            all.equal(glmM @pred@coef, c.glm, tolerance=1e-7),
+            all.equal(glmM2@pred@coef, c.glm, tolerance=1e-12))
> ## Don't show: 
> All.eq <- function(x,y, ...) all.equal(x,y, tolerance= 1e-12, ...)
> stopifnot( ## ensure typos are *caught* :
+   inherits(try(glm4(counts ~ outcome + treatment, family=poisson, data=trial,
+                        fooBar = FALSE)), "try-error"),
+   ## check formula(.): {environments differ - FIXME?}
+   formula(glmM) == formula(glm.D93),
+   identical(coef(glmM2), coefficients(glmM3)),
+   All.eq   (coef(glmM2), coefficients(glm.D93)),
+   identical(fitted.values(glmM2), fitted(glmM3)),
+   All.eq   (residuals(glmM2), resid(glm.D93), check.attributes=FALSE),# names()% FIXME ??
+   identical(residuals(glmM2), resid(glmM3))
+ )
Error : The following control arguments did not match any default's names:
   "fooBar"
> ## End(Don't show)
> 
> ## Watch the iterations --- and use no intercept --> more sparse X
> ## 1) dense generalized linear model
> glmM <- glm4(counts ~ 0+outcome + treatment, poisson, trial,
+                       verbose = TRUE)
_1_ convergence criterion: 0.0990093
step = 1.00000, new wrss = 5.0726703, Delta(wrss)= 0.0499074, coef =
[1] 3.044512e+00 2.590484e+00 2.751707e+00 1.163981e-04 6.095497e-05
_2_ convergence criterion: 0.00107995
step = 1.00000, new wrss = 5.1721149, Delta(wrss)= 5.94911e-06, coef =
[1] 3.044522e+00 2.590267e+00 2.751535e+00 1.923493e-08 8.383392e-09
_3_ convergence criterion: 1.43931e-07
> ## 2) sparse generalized linear model
> glmS <- glm4(counts ~ 0+outcome + treatment, poisson, trial,
+                       verbose = TRUE, sparse = TRUE)
_1_ convergence criterion: 0.0990093
step = 1.00000, new wrss = 5.0726703, Delta(wrss)= 0.0499074, coef =
[1] 3.044512e+00 2.590484e+00 2.751707e+00 1.163981e-04 6.095497e-05
_2_ convergence criterion: 0.00107995
step = 1.00000, new wrss = 5.1721149, Delta(wrss)= 5.94911e-06, coef =
[1] 3.044522e+00 2.590267e+00 2.751535e+00 1.923493e-08 8.383392e-09
_3_ convergence criterion: 1.43931e-07
> str(glmS, max.lev = 4)
Formal class 'glpModel' [package "MatrixModels"] with 4 slots
  ..@ resp    :Formal class 'glmRespMod' [package "MatrixModels"] with 10 slots
  .. .. ..@ family :List of 11
  .. .. .. ..- attr(*, "class")= chr "family"
  .. .. ..@ eta    : num [1:9] 3.04 2.59 2.75 3.04 2.59 ...
  .. .. ..@ n      : num [1:9] 1 1 1 1 1 1 1 1 1
  .. .. ..@ mu     : num [1:9] 21 13.3 15.7 21 13.3 ...
  .. .. ..@ offset : num [1:9] 0 0 0 0 0 0 0 0 0
  .. .. ..@ sqrtXwt: num [1:9, 1] 4.58 3.65 3.96 4.58 3.65 ...
  .. .. ..@ sqrtrwt: num [1:9] 0.218 0.274 0.253 0.218 0.274 ...
  .. .. ..@ weights: num [1:9] 1 1 1 1 1 1 1 1 1
  .. .. ..@ wtres  : num [1:9] -0.655 1.004 -0.168 -0.218 -0.913 ...
  .. .. ..@ y      : num [1:9] 18 17 15 20 10 20 25 13 12
  ..@ pred    :Formal class 'sPredModule' [package "MatrixModels"] with 4 slots
  .. .. ..@ X   :Formal class 'dsparseModelMatrix' [package "MatrixModels"] with 8 slots
  .. .. ..@ fac :Formal class 'dCHMsimpl' [package "Matrix"] with 10 slots
  .. .. ..@ coef: num [1:5] 3.04 2.59 2.75 1.92e-08 8.38e-09
  .. .. ..@ Vtr : num [1:5] -1.45e-07 -1.57e-06 -1.31e-06 -1.51e-06 -9.64e-07
  ..@ call    : language glm4(formula = counts ~ 0 + outcome + treatment, family = poisson, data = trial,      sparse = TRUE, verbose = TRUE)
  ..@ fitProps:List of 3
  .. ..$ convcrit : num 1.44e-07
  .. ..$ iter     : num 3
  .. ..$ nHalvings: num 0
> stopifnot( all.equal(glmM@pred@coef, glmS@pred@coef),
+            all.equal(glmM@pred@Vtr,  glmS@pred@Vtr) )
> 
> 
> ## A Gamma example, from McCullagh & Nelder (1989, pp. 300-2)
> clotting <- data.frame(u = c(5,10,15,20,30,40,60,80,100),
+                        lot1 = c(118,58,42,35,27,25,21,19,18),
+                        lot2 = c(69,35,26,21,18,16,13,12,12))
> str(gMN <- glm4(lot1 ~ log(u), data=clotting, family=Gamma, verbose=TRUE))
_1_ convergence criterion: 0.0455001
step = 1.00000, new wrss = 0.017051829, Delta(wrss)= 3.5175e-05, coef =
[1] -0.01655447  0.01534311
_2_ convergence criterion: 0.000110842
step = 1.00000, new wrss = 0.017122092, Delta(wrss)= 2.0547e-10, coef =
[1] -0.01655438  0.01534311
_3_ convergence criterion: 1.02e-09
Formal class 'glpModel' [package "MatrixModels"] with 4 slots
  ..@ resp    :Formal class 'glmRespMod' [package "MatrixModels"] with 10 slots
  .. .. ..@ family :List of 11
  .. .. .. ..$ family    : chr "Gamma"
  .. .. .. ..$ link      : chr "inverse"
  .. .. .. ..$ linkfun   :function (mu)  
  .. .. .. ..$ linkinv   :function (eta)  
  .. .. .. ..$ variance  :function (mu)  
  .. .. .. ..$ dev.resids:function (y, mu, wt)  
  .. .. .. ..$ aic       :function (y, n, mu, wt, dev)  
  .. .. .. ..$ mu.eta    :function (eta)  
  .. .. .. ..$ validmu   :function (mu)  
  .. .. .. ..$ valideta  :function (eta)  
  .. .. .. ..$ simulate  :function (object, nsim)  
  .. .. .. ..- attr(*, "class")= chr "family"
  .. .. ..@ eta    : num [1:9] 0.00814 0.01877 0.025 0.02941 0.03563 ...
  .. .. ..@ n      : num [1:9] 1 1 1 1 1 1 1 1 1
  .. .. ..@ mu     : num [1:9] 122.9 53.3 40 34 28.1 ...
  .. .. ..@ offset : num [1:9] 0 0 0 0 0 0 0 0 0
  .. .. ..@ sqrtXwt: num [1:9, 1] -122.9 -53.3 -40 -34 -28.1 ...
  .. .. ..@ sqrtrwt: num [1:9] 0.00814 0.01877 0.025 0.02941 0.03563 ...
  .. .. ..@ weights: num [1:9] 1 1 1 1 1 1 1 1 1
  .. .. ..@ wtres  : num [1:9] -0.0395 0.0889 0.0498 0.0293 -0.038 ...
  .. .. ..@ y      : num [1:9] 118 58 42 35 27 25 21 19 18
  ..@ pred    :Formal class 'dPredModule' [package "MatrixModels"] with 4 slots
  .. .. ..@ X   :Formal class 'ddenseModelMatrix' [package "MatrixModels"] with 6 slots
  .. .. .. .. ..@ x        : num [1:18] 1 1 1 1 1 ...
  .. .. .. .. ..@ Dim      : int [1:2] 9 2
  .. .. .. .. ..@ Dimnames :List of 2
  .. .. .. .. .. ..$ : chr [1:9] "1" "2" "3" "4" ...
  .. .. .. .. .. ..$ : chr [1:2] "(Intercept)" "log(u)"
  .. .. .. .. ..@ factors  : list()
  .. .. .. .. ..@ assign   : int [1:2] 0 1
  .. .. .. .. ..@ contrasts: list()
  .. .. ..@ fac :Formal class 'Cholesky' [package "Matrix"] with 5 slots
  .. .. .. .. ..@ x       : num [1:4] 153 0 320 119
  .. .. .. .. ..@ Dim     : int [1:2] 2 2
  .. .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. .. ..$ : NULL
  .. .. .. .. .. ..$ : NULL
  .. .. .. .. ..@ uplo    : chr "U"
  .. .. .. .. ..@ diag    : chr "N"
  .. .. ..@ coef: num [1:2] -0.0166 0.0153
  .. .. ..@ Vtr : num [1:2] 1.91e-08 3.43e-08
  ..@ call    : language glm4(formula = lot1 ~ log(u), family = Gamma, data = clotting, verbose = TRUE)
  ..@ fitProps:List of 3
  .. ..$ convcrit : num 1.02e-09
  .. ..$ iter     : num 3
  .. ..$ nHalvings: num 0
> glm. <- glm(lot1 ~ log(u), data=clotting, family=Gamma)
> stopifnot( all.equal(gMN@pred@coef, unname(coef(glm.)), tolerance=1e-7) )
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>