Last data update: 2014.03.03

probit_syn    R Documentation

Probit regression : generic synthetic binary/binomial probit data and model

Description

probit_syn is a generic function for developing synthetic probit regression data and a model given user-defined specifications.

Usage

probit_syn(nobs = 50000, d = 1, xv = c(1, 0.5, -1.5))

Arguments

nobs

Number of observations in the model. Default is 50000.

d

Binomial denominator. Default is 1, which gives a binary probit model. A variable containing different denominator values may be used instead.

xv

Predictor coefficient values. The first value is the intercept. Use as xv = c(intercept, x1_coef, x2_coef, ...)

Details

Create a synthetic probit regression model using the appropriate arguments. The binomial denominator must be declared: for a binary probit model, d=1; when denominator values differ across observations, a variable holding them may be used instead. See examples.
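
The generating process described above can be sketched in base R. This is a minimal illustration only, not the COUNT source: the helper name sketch_probit_data is hypothetical, drawing the predictors from a standard normal distribution is an assumption, and dpy is taken to be d - py, consistent with its use in cbind(py, dpy) in the examples.

```r
# Hypothetical helper for illustration; NOT the COUNT implementation.
sketch_probit_data <- function(nobs = 1000, d = 1, xv = c(1, 0.5, -1.5)) {
  p  <- length(xv) - 1                               # number of predictors (needs p >= 1)
  X  <- cbind(1, matrix(rnorm(nobs * p), ncol = p))  # assumed N(0, 1) predictors
  pr <- pnorm(drop(X %*% xv))                        # inverse probit link
  py <- rbinom(nobs, size = d, prob = pr)            # successes out of d trials
  out <- data.frame(py = py, dpy = d - py, X[, -1, drop = FALSE])
  names(out) <- c("py", "dpy", paste0("x", seq_len(p)))
  out
}

set.seed(1)
den <- rep(c(10, 20), each = 500)   # one denominator per observation
sim <- sketch_probit_data(nobs = 1000, d = den)
head(sim)
```

Passing a scalar such as d = 1 gives 0/1 responses, while a vector d supplies one denominator per observation.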

Value

py

binomial probit numerator; number of successes

sim.data

synthetic data set

Author(s)

Joseph M. Hilbe, Arizona State University, and Jet Propulsion Laboratory, California Institute of Technology; Andrew Robinson, University of Melbourne, Australia.

References

Hilbe, J.M. (2011), Negative Binomial Regression, second edition, Cambridge University Press.

Hilbe, J.M. (2009), Logistic Regression Models, Chapman & Hall/CRC.

See Also

logit_syn

Examples


# Binary probit regression (denominator=1)
sim.data <- probit_syn(nobs = 5000, d = 1, xv = c(1, .5, -1.5))
myprobit <- glm(cbind(py, dpy) ~ ., family = binomial(link = "probit"), data = sim.data)
summary(myprobit)
confint(myprobit)

# Binary probit regression with 3 predictors (denominator=1)
sim.data <- probit_syn(nobs = 5000, d = 1, xv = c(1, .75, -1.5, 1.15))
myprobit <- glm(cbind(py, dpy) ~ ., family = binomial(link = "probit"), data = sim.data)
summary(myprobit)
confint(myprobit)

# Binomial or grouped probit regression with defined denominator, den
den <- rep(1:5, each = 1000, times = 1) * 100
sim.data <- probit_syn(nobs = 5000, d = den, xv = c(1, .5, -1.5))
gpy <- glm(cbind(py, dpy) ~ ., family = binomial(link = "probit"), data = sim.data)
summary(gpy)

## Not run: 
# default
sim.data <- probit_syn()
dprobit <- glm(cbind(py, dpy) ~ ., family = binomial(link = "probit"), data = sim.data)
summary(dprobit)

## End(Not run)
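
Because the coefficients are chosen by the user, a fit can be checked against them directly. The following sketch does not call probit_syn; it simulates grouped data in base R, assuming standard normal predictors, so that it runs without the COUNT package. The object names (sim, fit, recovery) are illustrative, and confint.default gives Wald intervals rather than the profile-likelihood intervals that confint produces.

```r
set.seed(42)
xv   <- c(1, 0.5, -1.5)                    # true intercept and slopes
nobs <- 2000
den  <- rep(c(100, 200), each = nobs / 2)  # binomial denominators
x1   <- rnorm(nobs)                        # assumed N(0, 1) predictors
x2   <- rnorm(nobs)
py   <- rbinom(nobs, size = den,
               prob = pnorm(xv[1] + xv[2] * x1 + xv[3] * x2))
sim  <- data.frame(py = py, dpy = den - py, x1 = x1, x2 = x2)

fit <- glm(cbind(py, dpy) ~ ., family = binomial(link = "probit"), data = sim)

# Side by side: true values, estimates, and Wald 95% limits
recovery <- cbind(true = xv, estimate = coef(fit), confint.default(fit))
print(round(recovery, 4))
```

With this many trials the estimates should sit close to the values in xv; larger gaps would point to a mismatch between the simulated link and the fitted one.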

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(COUNT)
Loading required package: msme
Loading required package: MASS
Loading required package: lattice
Loading required package: sandwich
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/COUNT/probit_syn.Rd_%03d_medium.png", width=480, height=480)
> ### Name: probit_syn
> ### Title: Probit regression : generic synthetic binary/binomial probit
> ###   data and model
> ### Aliases: probit_syn
> ### Keywords: models probit binomial
> 
> ### ** Examples
> 
> 
> # Binary probit regression (denominator=1)
> sim.data <-probit_syn(nobs = 5000, d = 1, xv = c(1, .5, -1.5))
> myprobit <- glm(cbind(py,dpy) ~ ., family=binomial(link="probit"), data = sim.data)
> summary(myprobit)

Call:
glm(formula = cbind(py, dpy) ~ ., family = binomial(link = "probit"), 
    data = sim.data)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.3332  -0.2948   0.1462   0.5343   2.6474  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  1.01650    0.03076   33.05   <2e-16 ***
x1           0.47726    0.02640   18.08   <2e-16 ***
x2          -1.47297    0.04107  -35.86   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 6016.1  on 4999  degrees of freedom
Residual deviance: 3324.8  on 4997  degrees of freedom
AIC: 3330.8

Number of Fisher Scoring iterations: 6

> confint(myprobit)
Waiting for profiling to be done...
                 2.5 %     97.5 %
(Intercept)  0.9567291  1.0775849
x1           0.4257506  0.5294949
x2          -1.5547389 -1.3934760
> 
> # Binary probit regression with 3 predictors (denominator=1)
> sim.data <-probit_syn(nobs = 5000, d = 1, xv = c(1, .75, -1.5, 1.15))
> myprobit <- glm(cbind(py,dpy) ~ ., family=binomial(link="probit"), data = sim.data)
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred 
> summary(myprobit)

Call:
glm(formula = cbind(py, dpy) ~ ., family = binomial(link = "probit"), 
    data = sim.data)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-3.05871  -0.29767   0.06173   0.43598   3.06971  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  0.95128    0.03301   28.82   <2e-16 ***
x1           0.72537    0.03171   22.87   <2e-16 ***
x2          -1.48371    0.04366  -33.98   <2e-16 ***
x3           1.16489    0.03812   30.56   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 6398.3  on 4999  degrees of freedom
Residual deviance: 2955.6  on 4996  degrees of freedom
AIC: 2963.6

Number of Fisher Scoring iterations: 7

> confint(myprobit)
Waiting for profiling to be done...
                 2.5 %     97.5 %
(Intercept)  0.8874648  1.0165626
x1           0.6638559  0.7881303
x2          -1.5708089 -1.3991936
x3           1.0908618  1.2409113
There were 24 warnings (use warnings() to see them)
> 
> # Binomial or grouped probit regression with defined denominator, den
> den <- rep(1:5, each=1000, times=1)*100
> sim.data <- probit_syn(nobs = 5000, d = den, xv = c(1, .5, -1.5))
> gpy <- glm(cbind(py,dpy) ~ ., family=binomial(link="probit"), data = sim.data)
> summary(gpy)

Call:
glm(formula = cbind(py, dpy) ~ ., family = binomial(link = "probit"), 
    data = sim.data)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.6266  -0.6203   0.0643   0.6594   3.3104  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)  0.999403   0.001785   560.0   <2e-16 ***
x1           0.499959   0.001571   318.2   <2e-16 ***
x2          -1.501480   0.002369  -633.7   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 870025.3  on 4999  degrees of freedom
Residual deviance:   4730.2  on 4997  degrees of freedom
AIC: 25374

Number of Fisher Scoring iterations: 4

> 
> ## Not run: 
> ##D # default
> ##D sim.data <- probit_syn()
> ##D dprobit <- glm(cbind(py,dpy) ~ . , family=binomial(link="probit"), data = sim.data)
> ##D summary(dprobit)
> ## End(Not run)
> 
> dev.off()
null device 
          1 
>
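
The standard errors in the grouped fit are roughly 17 times smaller than those in the first binary fit, although both use 5000 rows and the same xv. A plausible reading, offered here as a rule of thumb rather than an exact result, is that precision scales with the square root of the total number of binomial trials. The sketch below checks that arithmetic using the standard errors printed in the two summary tables above; the variable names are illustrative.

```r
den <- rep(1:5, each = 1000, times = 1) * 100  # denominators from the grouped example
trials_grouped <- sum(den)                     # total trials in the grouped data
trials_binary  <- 5000                         # nobs * d in the binary example

# Ratio expected if standard errors shrink like 1 / sqrt(total trials)
expected_ratio <- sqrt(trials_grouped / trials_binary)

# Standard errors copied from the two summary() tables above
se_binary  <- c(intercept = 0.03076,  x1 = 0.02640,  x2 = 0.04107)
se_grouped <- c(intercept = 0.001785, x1 = 0.001571, x2 = 0.002369)
observed_ratio <- se_binary / se_grouped

print(expected_ratio)
print(round(observed_ratio, 2))
```

The observed ratios are close to, but not exactly, the expected one, since the two fits use different random draws of the predictors.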