Last data update: 2014.03.03

R: Frequency Table from a Copenhagen Housing Conditions Survey
housingR Documentation

Frequency Table from a Copenhagen Housing Conditions Survey

Description

The housing data frame has 72 rows and 5 variables.

Usage

housing

Format

Sat

Satisfaction of householders with their present housing circumstances, (High, Medium or Low, ordered factor).

Infl

Perceived degree of influence householders have on the management of the property (High, Medium, Low).

Type

Type of rental accommodation, (Tower, Atrium, Apartment, Terrace).

Cont

Contact residents are afforded with other residents, (Low, High).

Freq

Frequencies: the numbers of residents in each class.

Source

Madsen, M. (1976) Statistical analysis of multiple contingency tables. Two examples. Scand. J. Statist. 3, 97–106.

Cox, D. R. and Snell, E. J. (1984) Applied Statistics, Principles and Examples. Chapman & Hall.

References

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.

Examples

options(contrasts = c("contr.treatment", "contr.poly"))

# Surrogate Poisson models
house.glm0 <- glm(Freq ~ Infl*Type*Cont + Sat, family = poisson,
                  data = housing)
summary(house.glm0, cor = FALSE)

addterm(house.glm0, ~. + Sat:(Infl+Type+Cont), test = "Chisq")

house.glm1 <- update(house.glm0, . ~ . + Sat*(Infl+Type+Cont))
summary(house.glm1, cor = FALSE)

1 - pchisq(deviance(house.glm1), house.glm1$df.residual)

dropterm(house.glm1, test = "Chisq")

addterm(house.glm1, ~. + Sat:(Infl+Type+Cont)^2, test  =  "Chisq")

hnames <- lapply(housing[, -5], levels) # omit Freq
newData <- expand.grid(hnames)
newData$Sat <- ordered(newData$Sat)
house.pm <- predict(house.glm1, newData,
                    type = "response")  # poisson means
house.pm <- matrix(house.pm, ncol = 3, byrow = TRUE,
                   dimnames = list(NULL, hnames[[1]]))
house.pr <- house.pm/drop(house.pm %*% rep(1, 3))
cbind(expand.grid(hnames[-1]), round(house.pr, 2))

# Iterative proportional scaling
loglm(Freq ~ Infl*Type*Cont + Sat*(Infl+Type+Cont), data = housing)


# multinomial model
library(nnet)
(house.mult<- multinom(Sat ~ Infl + Type + Cont, weights = Freq,
                       data = housing))
house.mult2 <- multinom(Sat ~ Infl*Type*Cont, weights = Freq,
                        data = housing)
anova(house.mult, house.mult2)

house.pm <- predict(house.mult, expand.grid(hnames[-1]), type = "probs")
cbind(expand.grid(hnames[-1]), round(house.pm, 2))

# proportional odds model
house.cpr <- apply(house.pr, 1, cumsum)
logit <- function(x) log(x/(1-x))
house.ld <- logit(house.cpr[2, ]) - logit(house.cpr[1, ])
(ratio <- sort(drop(house.ld)))
mean(ratio)

(house.plr <- polr(Sat ~ Infl + Type + Cont,
                   data = housing, weights = Freq))

house.pr1 <- predict(house.plr, expand.grid(hnames[-1]), type = "probs")
cbind(expand.grid(hnames[-1]), round(house.pr1, 2))

Fr <- matrix(housing$Freq, ncol  =  3, byrow = TRUE)
2*sum(Fr*log(house.pr/house.pr1))

house.plr2 <- stepAIC(house.plr, ~.^2)
house.plr2$anova

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(MASS)
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/MASS/housing.Rd_%03d_medium.png", width=480, height=480)
> ### Name: housing
> ### Title: Frequency Table from a Copenhagen Housing Conditions Survey
> ### Aliases: housing
> ### Keywords: datasets
> 
> ### ** Examples
> 
> options(contrasts = c("contr.treatment", "contr.poly"))
> 
> # Surrogate Poisson models
> house.glm0 <- glm(Freq ~ Infl*Type*Cont + Sat, family = poisson,
+                   data = housing)
> summary(house.glm0, cor = FALSE)

Call:
glm(formula = Freq ~ Infl * Type * Cont + Sat, family = poisson, 
    data = housing)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-4.5551  -1.0612  -0.0593   0.6483   4.1478  

Coefficients:
                                    Estimate Std. Error z value Pr(>|z|)    
(Intercept)                        3.136e+00  1.196e-01  26.225  < 2e-16 ***
InflMedium                         2.733e-01  1.586e-01   1.723 0.084868 .  
InflHigh                          -2.054e-01  1.784e-01  -1.152 0.249511    
TypeApartment                      3.666e-01  1.555e-01   2.357 0.018403 *  
TypeAtrium                        -7.828e-01  2.134e-01  -3.668 0.000244 ***
TypeTerrace                       -8.145e-01  2.157e-01  -3.775 0.000160 ***
ContHigh                          -1.490e-15  1.690e-01   0.000 1.000000    
Sat.L                              1.159e-01  4.038e-02   2.871 0.004094 ** 
Sat.Q                              2.629e-01  4.515e-02   5.824 5.76e-09 ***
InflMedium:TypeApartment          -1.177e-01  2.086e-01  -0.564 0.572571    
InflHigh:TypeApartment             1.753e-01  2.279e-01   0.769 0.441783    
InflMedium:TypeAtrium             -4.068e-01  3.035e-01  -1.340 0.180118    
InflHigh:TypeAtrium               -1.692e-01  3.294e-01  -0.514 0.607433    
InflMedium:TypeTerrace             6.292e-03  2.860e-01   0.022 0.982450    
InflHigh:TypeTerrace              -9.305e-02  3.280e-01  -0.284 0.776633    
InflMedium:ContHigh               -1.398e-01  2.279e-01  -0.613 0.539715    
InflHigh:ContHigh                 -6.091e-01  2.800e-01  -2.176 0.029585 *  
TypeApartment:ContHigh             5.029e-01  2.109e-01   2.385 0.017083 *  
TypeAtrium:ContHigh                6.774e-01  2.751e-01   2.462 0.013811 *  
TypeTerrace:ContHigh               1.099e+00  2.675e-01   4.106 4.02e-05 ***
InflMedium:TypeApartment:ContHigh  5.359e-02  2.862e-01   0.187 0.851450    
InflHigh:TypeApartment:ContHigh    1.462e-01  3.380e-01   0.432 0.665390    
InflMedium:TypeAtrium:ContHigh     1.555e-01  3.907e-01   0.398 0.690597    
InflHigh:TypeAtrium:ContHigh       4.782e-01  4.441e-01   1.077 0.281619    
InflMedium:TypeTerrace:ContHigh   -4.980e-01  3.671e-01  -1.357 0.174827    
InflHigh:TypeTerrace:ContHigh     -4.470e-01  4.545e-01  -0.984 0.325326    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 833.66  on 71  degrees of freedom
Residual deviance: 217.46  on 46  degrees of freedom
AIC: 610.43

Number of Fisher Scoring iterations: 5

> 
> addterm(house.glm0, ~. + Sat:(Infl+Type+Cont), test = "Chisq")
Single term additions

Model:
Freq ~ Infl * Type * Cont + Sat
         Df Deviance    AIC     LRT   Pr(Chi)    
<none>        217.46 610.43                      
Infl:Sat  4   111.08 512.05 106.371 < 2.2e-16 ***
Type:Sat  6   156.79 561.76  60.669 3.292e-11 ***
Cont:Sat  2   212.33 609.30   5.126   0.07708 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> house.glm1 <- update(house.glm0, . ~ . + Sat*(Infl+Type+Cont))
> summary(house.glm1, cor = FALSE)

Call:
glm(formula = Freq ~ Infl + Type + Cont + Sat + Infl:Type + Infl:Cont + 
    Type:Cont + Infl:Sat + Type:Sat + Cont:Sat + Infl:Type:Cont, 
    family = poisson, data = housing)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.6022  -0.5282  -0.0641   0.5757   1.9322  

Coefficients:
                                   Estimate Std. Error z value Pr(>|z|)    
(Intercept)                        3.135074   0.120112  26.101  < 2e-16 ***
InflMedium                         0.248327   0.159979   1.552 0.120602    
InflHigh                          -0.412645   0.184947  -2.231 0.025671 *  
TypeApartment                      0.292524   0.157477   1.858 0.063231 .  
TypeAtrium                        -0.792847   0.214413  -3.698 0.000218 ***
TypeTerrace                       -1.018074   0.221263  -4.601 4.20e-06 ***
ContHigh                          -0.001407   0.169711  -0.008 0.993385    
Sat.L                             -0.098106   0.112592  -0.871 0.383570    
Sat.Q                              0.285657   0.122283   2.336 0.019489 *  
InflMedium:TypeApartment          -0.017882   0.210496  -0.085 0.932302    
InflHigh:TypeApartment             0.386869   0.233297   1.658 0.097263 .  
InflMedium:TypeAtrium             -0.360311   0.304979  -1.181 0.237432    
InflHigh:TypeAtrium               -0.036788   0.334793  -0.110 0.912503    
InflMedium:TypeTerrace             0.185154   0.288892   0.641 0.521580    
InflHigh:TypeTerrace               0.310749   0.334815   0.928 0.353345    
InflMedium:ContHigh               -0.200060   0.228748  -0.875 0.381799    
InflHigh:ContHigh                 -0.725790   0.282352  -2.571 0.010155 *  
TypeApartment:ContHigh             0.569691   0.212152   2.685 0.007247 ** 
TypeAtrium:ContHigh                0.702115   0.276056   2.543 0.010979 *  
TypeTerrace:ContHigh               1.215930   0.269968   4.504 6.67e-06 ***
InflMedium:Sat.L                   0.519627   0.096830   5.366 8.03e-08 ***
InflHigh:Sat.L                     1.140302   0.118180   9.649  < 2e-16 ***
InflMedium:Sat.Q                  -0.064474   0.102666  -0.628 0.530004    
InflHigh:Sat.Q                     0.115436   0.127798   0.903 0.366380    
TypeApartment:Sat.L               -0.520170   0.109793  -4.738 2.16e-06 ***
TypeAtrium:Sat.L                  -0.288484   0.149551  -1.929 0.053730 .  
TypeTerrace:Sat.L                 -0.998666   0.141527  -7.056 1.71e-12 ***
TypeApartment:Sat.Q                0.055418   0.118515   0.468 0.640068    
TypeAtrium:Sat.Q                  -0.273820   0.149713  -1.829 0.067405 .  
TypeTerrace:Sat.Q                 -0.032328   0.149251  -0.217 0.828520    
ContHigh:Sat.L                     0.340703   0.087778   3.881 0.000104 ***
ContHigh:Sat.Q                    -0.097929   0.094068  -1.041 0.297851    
InflMedium:TypeApartment:ContHigh  0.046900   0.286212   0.164 0.869837    
InflHigh:TypeApartment:ContHigh    0.126229   0.338208   0.373 0.708979    
InflMedium:TypeAtrium:ContHigh     0.157239   0.390719   0.402 0.687364    
InflHigh:TypeAtrium:ContHigh       0.478611   0.444244   1.077 0.281320    
InflMedium:TypeTerrace:ContHigh   -0.500162   0.367135  -1.362 0.173091    
InflHigh:TypeTerrace:ContHigh     -0.463099   0.454713  -1.018 0.308467    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 833.657  on 71  degrees of freedom
Residual deviance:  38.662  on 34  degrees of freedom
AIC: 455.63

Number of Fisher Scoring iterations: 4

> 
> 1 - pchisq(deviance(house.glm1), house.glm1$df.residual)
[1] 0.2671363
> 
> dropterm(house.glm1, test = "Chisq")
Single term deletions

Model:
Freq ~ Infl + Type + Cont + Sat + Infl:Type + Infl:Cont + Type:Cont + 
    Infl:Sat + Type:Sat + Cont:Sat + Infl:Type:Cont
               Df Deviance    AIC     LRT   Pr(Chi)    
<none>              38.662 455.63                      
Infl:Sat        4  147.780 556.75 109.117 < 2.2e-16 ***
Type:Sat        6  100.889 505.86  62.227 1.586e-11 ***
Cont:Sat        2   54.722 467.69  16.060 0.0003256 ***
Infl:Type:Cont  6   43.952 448.92   5.290 0.5072454    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> addterm(house.glm1, ~. + Sat:(Infl+Type+Cont)^2, test  =  "Chisq")
Single term additions

Model:
Freq ~ Infl + Type + Cont + Sat + Infl:Type + Infl:Cont + Type:Cont + 
    Infl:Sat + Type:Sat + Cont:Sat + Infl:Type:Cont
              Df Deviance    AIC     LRT Pr(Chi)  
<none>             38.662 455.63                  
Infl:Type:Sat 12   16.107 457.08 22.5550 0.03175 *
Infl:Cont:Sat  4   37.472 462.44  1.1901 0.87973  
Type:Cont:Sat  6   28.256 457.23 10.4064 0.10855  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> hnames <- lapply(housing[, -5], levels) # omit Freq
> newData <- expand.grid(hnames)
> newData$Sat <- ordered(newData$Sat)
> house.pm <- predict(house.glm1, newData,
+                     type = "response")  # poisson means
> house.pm <- matrix(house.pm, ncol = 3, byrow = TRUE,
+                    dimnames = list(NULL, hnames[[1]]))
> house.pr <- house.pm/drop(house.pm %*% rep(1, 3))
> cbind(expand.grid(hnames[-1]), round(house.pr, 2))
     Infl      Type Cont  Low Medium High
1     Low     Tower  Low 0.40   0.26 0.34
2  Medium     Tower  Low 0.26   0.27 0.47
3    High     Tower  Low 0.15   0.19 0.66
4     Low Apartment  Low 0.54   0.23 0.23
5  Medium Apartment  Low 0.39   0.26 0.34
6    High Apartment  Low 0.26   0.21 0.53
7     Low    Atrium  Low 0.43   0.32 0.25
8  Medium    Atrium  Low 0.30   0.35 0.36
9    High    Atrium  Low 0.19   0.27 0.54
10    Low   Terrace  Low 0.65   0.22 0.14
11 Medium   Terrace  Low 0.51   0.27 0.22
12   High   Terrace  Low 0.37   0.24 0.39
13    Low     Tower High 0.30   0.28 0.42
14 Medium     Tower High 0.18   0.27 0.54
15   High     Tower High 0.10   0.19 0.71
16    Low Apartment High 0.44   0.27 0.30
17 Medium Apartment High 0.30   0.28 0.42
18   High Apartment High 0.18   0.21 0.61
19    Low    Atrium High 0.33   0.36 0.31
20 Medium    Atrium High 0.22   0.36 0.42
21   High    Atrium High 0.13   0.27 0.60
22    Low   Terrace High 0.55   0.27 0.19
23 Medium   Terrace High 0.40   0.31 0.29
24   High   Terrace High 0.27   0.26 0.47
> 
> # Iterative proportional scaling
> loglm(Freq ~ Infl*Type*Cont + Sat*(Infl+Type+Cont), data = housing)
Call:
loglm(formula = Freq ~ Infl * Type * Cont + Sat * (Infl + Type + 
    Cont), data = housing)

Statistics:
                      X^2 df  P(> X^2)
Likelihood Ratio 38.66222 34 0.2671359
Pearson          38.90831 34 0.2582333
> 
> 
> # multinomial model
> library(nnet)
> (house.mult<- multinom(Sat ~ Infl + Type + Cont, weights = Freq,
+                        data = housing))
# weights:  24 (14 variable)
initial  value 1846.767257 
iter  10 value 1747.045232
final  value 1735.041933 
converged
Call:
multinom(formula = Sat ~ Infl + Type + Cont, data = housing, 
    weights = Freq)

Coefficients:
       (Intercept) InflMedium  InflHigh TypeApartment TypeAtrium TypeTerrace
Medium  -0.4192316  0.4464003 0.6649367    -0.4356851  0.1313663  -0.6665728
High    -0.1387453  0.7348626 1.6126294    -0.7356261 -0.4079808  -1.4123333
        ContHigh
Medium 0.3608513
High   0.4818236

Residual Deviance: 3470.084 
AIC: 3498.084 
> house.mult2 <- multinom(Sat ~ Infl*Type*Cont, weights = Freq,
+                         data = housing)
# weights:  75 (48 variable)
initial  value 1846.767257 
iter  10 value 1734.465581
iter  20 value 1717.220153
iter  30 value 1715.760679
iter  40 value 1715.713306
final  value 1715.710836 
converged
> anova(house.mult, house.mult2)
Likelihood ratio tests of Multinomial Models

Response: Sat
               Model Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
1 Infl + Type + Cont       130   3470.084                                
2 Infl * Type * Cont        96   3431.422 1 vs 2    34 38.66219 0.2671367
> 
> house.pm <- predict(house.mult, expand.grid(hnames[-1]), type = "probs")
> cbind(expand.grid(hnames[-1]), round(house.pm, 2))
     Infl      Type Cont  Low Medium High
1     Low     Tower  Low 0.40   0.26 0.34
2  Medium     Tower  Low 0.26   0.27 0.47
3    High     Tower  Low 0.15   0.19 0.66
4     Low Apartment  Low 0.54   0.23 0.23
5  Medium Apartment  Low 0.39   0.26 0.34
6    High Apartment  Low 0.26   0.21 0.53
7     Low    Atrium  Low 0.43   0.32 0.25
8  Medium    Atrium  Low 0.30   0.35 0.36
9    High    Atrium  Low 0.19   0.27 0.54
10    Low   Terrace  Low 0.65   0.22 0.14
11 Medium   Terrace  Low 0.51   0.27 0.22
12   High   Terrace  Low 0.37   0.24 0.39
13    Low     Tower High 0.30   0.28 0.42
14 Medium     Tower High 0.18   0.27 0.54
15   High     Tower High 0.10   0.19 0.71
16    Low Apartment High 0.44   0.27 0.30
17 Medium Apartment High 0.30   0.28 0.42
18   High Apartment High 0.18   0.21 0.61
19    Low    Atrium High 0.33   0.36 0.31
20 Medium    Atrium High 0.22   0.36 0.42
21   High    Atrium High 0.13   0.27 0.60
22    Low   Terrace High 0.55   0.27 0.19
23 Medium   Terrace High 0.40   0.31 0.29
24   High   Terrace High 0.27   0.26 0.47
> 
> # proportional odds model
> house.cpr <- apply(house.pr, 1, cumsum)
> logit <- function(x) log(x/(1-x))
> house.ld <- logit(house.cpr[2, ]) - logit(house.cpr[1, ])
> (ratio <- sort(drop(house.ld)))
 [1] 0.9357341 0.9854433 1.0573182 1.0680491 1.0772649 1.0803574 1.0824895
 [8] 1.0998759 1.1199975 1.1554228 1.1768138 1.1866427 1.2091541 1.2435026
[15] 1.2724096 1.2750171 1.2849903 1.3062598 1.3123988 1.3904715 1.4540087
[22] 1.4947753 1.4967585 1.6068789
> mean(ratio)
[1] 1.223835
> 
> (house.plr <- polr(Sat ~ Infl + Type + Cont,
+                    data = housing, weights = Freq))
Call:
polr(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)

Coefficients:
   InflMedium      InflHigh TypeApartment    TypeAtrium   TypeTerrace 
    0.5663937     1.2888191    -0.5723501    -0.3661866    -1.0910149 
     ContHigh 
    0.3602841 

Intercepts:
 Low|Medium Medium|High 
 -0.4961353   0.6907083 

Residual Deviance: 3479.149 
AIC: 3495.149 
> 
> house.pr1 <- predict(house.plr, expand.grid(hnames[-1]), type = "probs")
> cbind(expand.grid(hnames[-1]), round(house.pr1, 2))
     Infl      Type Cont  Low Medium High
1     Low     Tower  Low 0.38   0.29 0.33
2  Medium     Tower  Low 0.26   0.27 0.47
3    High     Tower  Low 0.14   0.21 0.65
4     Low Apartment  Low 0.52   0.26 0.22
5  Medium Apartment  Low 0.38   0.29 0.33
6    High Apartment  Low 0.23   0.26 0.51
7     Low    Atrium  Low 0.47   0.27 0.26
8  Medium    Atrium  Low 0.33   0.29 0.38
9    High    Atrium  Low 0.19   0.25 0.56
10    Low   Terrace  Low 0.64   0.21 0.14
11 Medium   Terrace  Low 0.51   0.26 0.23
12   High   Terrace  Low 0.33   0.29 0.38
13    Low     Tower High 0.30   0.28 0.42
14 Medium     Tower High 0.19   0.25 0.56
15   High     Tower High 0.10   0.17 0.72
16    Low Apartment High 0.43   0.28 0.29
17 Medium Apartment High 0.30   0.28 0.42
18   High Apartment High 0.17   0.23 0.60
19    Low    Atrium High 0.38   0.29 0.33
20 Medium    Atrium High 0.26   0.27 0.47
21   High    Atrium High 0.14   0.21 0.64
22    Low   Terrace High 0.56   0.25 0.19
23 Medium   Terrace High 0.42   0.28 0.30
24   High   Terrace High 0.26   0.27 0.47
> 
> Fr <- matrix(housing$Freq, ncol  =  3, byrow = TRUE)
> 2*sum(Fr*log(house.pr/house.pr1))
[1] 9.065433
> 
> house.plr2 <- stepAIC(house.plr, ~.^2)
Start:  AIC=3495.15
Sat ~ Infl + Type + Cont

            Df    AIC
+ Infl:Type  6 3484.6
+ Type:Cont  3 3492.5
<none>         3495.1
+ Infl:Cont  2 3498.9
- Cont       1 3507.5
- Type       3 3545.1
- Infl       2 3599.4

Step:  AIC=3484.64
Sat ~ Infl + Type + Cont + Infl:Type

            Df    AIC
+ Type:Cont  3 3482.7
<none>         3484.6
+ Infl:Cont  2 3488.5
- Infl:Type  6 3495.1
- Cont       1 3497.8

Step:  AIC=3482.69
Sat ~ Infl + Type + Cont + Infl:Type + Type:Cont

            Df    AIC
<none>         3482.7
- Type:Cont  3 3484.6
+ Infl:Cont  2 3486.6
- Infl:Type  6 3492.5
> house.plr2$anova
Stepwise Model Path 
Analysis of Deviance Table

Initial Model:
Sat ~ Infl + Type + Cont

Final Model:
Sat ~ Infl + Type + Cont + Infl:Type + Type:Cont


         Step Df  Deviance Resid. Df Resid. Dev      AIC
1                               1673   3479.149 3495.149
2 + Infl:Type  6 22.509347      1667   3456.640 3484.640
3 + Type:Cont  3  7.945029      1664   3448.695 3482.695
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>