R: Display Hosmer-Lemeshow statistic and table of probabilities...
hlGOF.test
R Documentation
Display Hosmer-Lemeshow statistic and table of probabilities following logistic
regression using glm with binomial family.
Description
Provides a Hosmer-Lemeshow statistic and table following logistic regression.
Usage
hlGOF.test(observed, predicted, breaks = 15)
Arguments
observed
response variable
predicted
predicted statistic
breaks
breaks or groups
Format
x
The function has three arguments: observed term, predicted values, # groups
Details
hlGOF.test is a post-estimation function for logistic regression, following the use
of glm(). Usage displays a table of observed vs predicted groups and an overall
H-L goodness-of-fit statistic. The test is originally from Hilbe (2009).
Value
numeric
Note
hlGOF.test must be loaded into memory in order to be effectve. As a function in LOGIT,
it is immediately available to a user. My thanks to Prof. Robert LaBudde for the initial
version of this function.
Author(s)
Joseph M. Hilbe, Arizona State University, Robert LaBudde,
Institute for Statisical Education (Statistics.com), provided initial code
for this function for Hilbe, Logistic Regression Models, text.
References
Hilbe, J. M. (2015), Practical Guide to Logistic Regression, Chapman & Hall/CRC.
Hilbe, J. M. (2009), Logistic Regression Models, Chapman & Hall/CRC.
See Also
glm
Examples
library(MASS)
library(LOGIT)
data(medpar)
mylogit <- glm( died ~ los + white + hmo, family=binomial, data=medpar)
summary(mylogit)
medpar2 <- na.omit(medpar)
hlGOF.test(medpar2$died, predict(mylogit,medpar2, type="response"), breaks=12)
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(LOGIT)
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/LOGIT/hlGOF.test.Rd_%03d_medium.png", width=480, height=480)
> ### Name: hlGOF.test
> ### Title: Display Hosmer-Lemeshow statistic and table of probabilities
> ### following logistic regression using glm with binomial family.
> ### Aliases: hlGOF.test
> ### Keywords: models
>
> ### ** Examples
>
> library(MASS)
> library(LOGIT)
> data(medpar)
> mylogit <- glm( died ~ los + white + hmo, family=binomial, data=medpar)
> summary(mylogit)
Call:
glm(formula = died ~ los + white + hmo, family = binomial, data = medpar)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.0258 -0.9436 -0.8655 1.3637 2.5948
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.593328 0.214017 -2.772 0.00557 **
los -0.030088 0.007711 -3.902 9.54e-05 ***
white 0.255677 0.206801 1.236 0.21633
hmo -0.044626 0.149650 -0.298 0.76555
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1922.9 on 1494 degrees of freedom
Residual deviance: 1902.9 on 1491 degrees of freedom
AIC: 1910.9
Number of Fisher Scoring iterations: 4
> medpar2 <- na.omit(medpar)
> hlGOF.test(medpar2$died, predict(mylogit,medpar2, type="response"), breaks=12)
Hosmer-Lemeshow GOF test
For # Cuts = 9 # Data = 1495
Cut # Total #Patterns # Resp. # Pred. Mean Resp. Mean Pred.
1 166 63 58 38.95 0.34940 0.23465
2 166 20 50 49.69 0.30120 0.29936
3 166 12 42 53.44 0.25301 0.32194
4 166 8 41 56.27 0.24699 0.33900
5 167 6 42 58.97 0.25150 0.35313
6 166 5 47 60.77 0.28313 0.36609
7 166 3 45 62.68 0.27108 0.37761
8 166 6 67 64.81 0.40361 0.39044
9 166 3 121 67.39 0.72892 0.40599
Total # Data: 1495 Total over cuts: 1495
Chisq: 114.4468 d.f.: 7 P-value: 0.00000
For # Cuts = 12 # Data = 1495
Cut # Total #Patterns # Resp. # Pred. Mean Resp. Mean Pred.
1 125 57 39 27.60 0.31200 0.22079
2 124 18 42 35.58 0.33871 0.28696
3 125 12 37 38.64 0.29600 0.30916
4 124 9 32 40.26 0.25806 0.32470
5 125 7 30 42.11 0.24000 0.33690
6 125 5 36 43.49 0.28800 0.34789
7 124 4 30 44.42 0.24194 0.35826
8 125 4 34 46.00 0.27200 0.36797
9 124 3 30 46.71 0.24194 0.37668
10 125 3 39 48.08 0.31200 0.38463
11 124 5 68 49.15 0.54839 0.39637
12 125 2 96 50.95 0.76800 0.40764
Total # Data: 1495 Total over cuts: 1495
Chisq: 121.3332 d.f.: 10 P-value: 0.00000
For # Cuts = 15 # Data = 1495
Cut # Total #Patterns # Resp. # Pred. Mean Resp. Mean Pred.
1 100 51 32 20.98 0.32000 0.20979
2 99 18 35 27.44 0.35354 0.27715
3 100 12 27 30.00 0.27000 0.29998
4 100 6 33 31.38 0.33000 0.31385
5 99 9 23 32.29 0.23232 0.32617
6 100 4 25 33.57 0.25000 0.33573
7 100 5 26 34.45 0.26000 0.34451
8 99 5 24 35.01 0.24242 0.35360
9 100 3 29 36.14 0.29000 0.36138
10 100 3 26 36.85 0.26000 0.36850
11 99 3 23 37.20 0.23232 0.37576
12 100 3 32 38.21 0.32000 0.38214
13 100 3 36 39.02 0.36000 0.39019
14 99 3 62 39.55 0.62626 0.39949
15 100 1 80 40.91 0.80000 0.40909
Total # Data: 1495 Total over cuts: 1495
Chisq: 128.9439 d.f.: 13 P-value: 0.00000
Minimum P-value: 0.00000
>
>
>
>
>
> dev.off()
null device
1
>