Probit regression: generic synthetic binary/binomial probit data and model
Description
probit_syn is a generic function for developing synthetic probit regression data and
a model, given user-defined specifications.
Usage
probit_syn(nobs=50000, d=1, xv = c(1, 0.5, -1.5))
Arguments
nobs
Number of observations in the model. Default is 50000.
d
Binomial denominator. Default is 1, which gives a binary probit model. A
variable containing different denominator values may be used instead.
xv
Predictor coefficient values. The first element is the intercept. Use as
xv = c(intercept, x1_coef, x2_coef, ...)
Details
Create a synthetic probit regression model using the appropriate arguments.
The binomial denominator must be declared. For a binary probit model, use d=1. A
variable may be used as the denominator when its values differ across observations. See Examples.
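The data-generating process described above can be sketched in base R. This is a minimal sketch under stated assumptions, not the COUNT source: the function name sim_probit_sketch is hypothetical, the predictors are assumed to be independent standard normal draws, and the dpy column that appears in the examples is assumed to hold the failures, d - py.

```r
# Hypothetical helper illustrating one way to simulate probit data.
# Assumptions (not confirmed by the help page): predictors are N(0, 1)
# and dpy = d - py is the number of failures.
sim_probit_sketch <- function(nobs = 1000, d = 1, xv = c(1, 0.5, -1.5)) {
  p <- length(xv) - 1                                 # number of predictors
  stopifnot(p >= 1)                                   # need at least one slope
  X <- cbind(1, matrix(rnorm(nobs * p), ncol = p))    # intercept column + predictors
  eta <- as.vector(X %*% xv)                          # linear predictor
  mu <- pnorm(eta)                                    # inverse probit link
  py <- rbinom(nobs, size = d, prob = mu)             # successes out of d trials
  out <- data.frame(py, d - py, X[, -1, drop = FALSE])
  names(out) <- c("py", "dpy", paste0("x", seq_len(p)))
  out
}

# Usage: a small binary data set (d = 1)
set.seed(42)
head(sim_probit_sketch(nobs = 5, d = 1))
```

Passing a vector of length nobs as d gives each observation its own denominator, which mirrors the grouped case mentioned above.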
Value
py
binomial probit numerator; number of successes
sim.data
synthetic data set
Author(s)
Joseph M. Hilbe, Arizona State University, and
Jet Propulsion Laboratory, California Institute of Technology
Andrew Robinson, University of Melbourne, Australia.
References
Hilbe, J.M. (2011), Negative Binomial Regression, second edition, Cambridge University Press.
Hilbe, J.M. (2009), Logistic Regression Models, Chapman & Hall/CRC.
See Also
logit_syn
Examples
# Binary probit regression (denominator=1)
sim.data <- probit_syn(nobs = 5000, d = 1, xv = c(1, .5, -1.5))
myprobit <- glm(cbind(py,dpy) ~ ., family=binomial(link="probit"), data = sim.data)
summary(myprobit)
confint(myprobit)
# Binary probit regression with 3 predictors (denominator=1)
sim.data <- probit_syn(nobs = 5000, d = 1, xv = c(1, .75, -1.5, 1.15))
myprobit <- glm(cbind(py,dpy) ~ ., family=binomial(link="probit"), data = sim.data)
summary(myprobit)
confint(myprobit)
# Binomial or grouped probit regression with defined denominator, den
den <- rep(1:5, each=1000, times=1)*100
sim.data <- probit_syn(nobs = 5000, d = den, xv = c(1, .5, -1.5))
gpy <- glm(cbind(py,dpy) ~ ., family=binomial(link="probit"), data = sim.data)
summary(gpy)
## Not run:
# default
sim.data <- probit_syn()
dprobit <- glm(cbind(py,dpy) ~ . , family=binomial(link="probit"), data = sim.data)
summary(dprobit)
## End(Not run)
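A quick way to check a fit of this kind is to compare the estimates with the coefficients used in the simulation. The sketch below uses only base R and simulates its own data inline, so it does not call probit_syn. The seed, the standard normal predictors, and the object names are assumptions made for illustration. It uses Wald intervals from confint.default rather than the profile-likelihood intervals shown by confint in the examples.

```r
# Sketch: simulate binary probit data with known coefficients, fit the
# model as in the examples, and compare estimates with the truth.
set.seed(2024)                         # arbitrary seed, for reproducibility
xv   <- c(1, 0.5, -1.5)                # intercept, x1 and x2 coefficients
nobs <- 5000
x1 <- rnorm(nobs)                      # assumed N(0, 1) predictors
x2 <- rnorm(nobs)
py <- rbinom(nobs, size = 1, prob = pnorm(xv[1] + xv[2] * x1 + xv[3] * x2))
sim <- data.frame(py = py, dpy = 1 - py, x1 = x1, x2 = x2)

fit <- glm(cbind(py, dpy) ~ ., family = binomial(link = "probit"), data = sim)

est  <- coef(fit)
wald <- confint.default(fit)           # Wald 95% intervals
check <- cbind(truth = xv, estimate = est, wald)
print(round(check, 3))
```

With 5000 observations the estimates should land close to the values in xv, although any single interval can miss its target by chance.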
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(COUNT)
Loading required package: msme
Loading required package: MASS
Loading required package: lattice
Loading required package: sandwich
> ### Name: probit_syn
> ### Title: Probit regression : generic synthetic binary/binomial probit
> ### data and model
> ### Aliases: probit_syn
> ### Keywords: models probit binomial
>
> ### ** Examples
>
>
> # Binary probit regression (denominator=1)
> sim.data <-probit_syn(nobs = 5000, d = 1, xv = c(1, .5, -1.5))
> myprobit <- glm(cbind(py,dpy) ~ ., family=binomial(link="probit"), data = sim.data)
> summary(myprobit)
Call:
glm(formula = cbind(py, dpy) ~ ., family = binomial(link = "probit"),
data = sim.data)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.3332  -0.2948   0.1462   0.5343   2.6474
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.01650    0.03076   33.05   <2e-16 ***
x1           0.47726    0.02640   18.08   <2e-16 ***
x2          -1.47297    0.04107  -35.86   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 6016.1 on 4999 degrees of freedom
Residual deviance: 3324.8 on 4997 degrees of freedom
AIC: 3330.8
Number of Fisher Scoring iterations: 6
> confint(myprobit)
Waiting for profiling to be done...
                 2.5 %     97.5 %
(Intercept)  0.9567291  1.0775849
x1           0.4257506  0.5294949
x2          -1.5547389 -1.3934760
>
> # Binary probit regression with 3 predictors (denominator=1)
> sim.data <-probit_syn(nobs = 5000, d = 1, xv = c(1, .75, -1.5, 1.15))
> myprobit <- glm(cbind(py,dpy) ~ ., family=binomial(link="probit"), data = sim.data)
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred
> summary(myprobit)
Call:
glm(formula = cbind(py, dpy) ~ ., family = binomial(link = "probit"),
data = sim.data)
Deviance Residuals:
     Min        1Q    Median        3Q       Max
-3.05871  -0.29767   0.06173   0.43598   3.06971
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.95128    0.03301   28.82   <2e-16 ***
x1           0.72537    0.03171   22.87   <2e-16 ***
x2          -1.48371    0.04366  -33.98   <2e-16 ***
x3           1.16489    0.03812   30.56   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 6398.3 on 4999 degrees of freedom
Residual deviance: 2955.6 on 4996 degrees of freedom
AIC: 2963.6
Number of Fisher Scoring iterations: 7
> confint(myprobit)
Waiting for profiling to be done...
                 2.5 %     97.5 %
(Intercept)  0.8874648  1.0165626
x1           0.6638559  0.7881303
x2          -1.5708089 -1.3991936
x3           1.0908618  1.2409113
There were 24 warnings (use warnings() to see them)
>
> # Binomial or grouped probit regression with defined denominator, den
> den <- rep(1:5, each=1000, times=1)*100
> sim.data <- probit_syn(nobs = 5000, d = den, xv = c(1, .5, -1.5))
> gpy <- glm(cbind(py,dpy) ~ ., family=binomial(link="probit"), data = sim.data)
> summary(gpy)
Call:
glm(formula = cbind(py, dpy) ~ ., family = binomial(link = "probit"),
data = sim.data)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.6266  -0.6203   0.0643   0.6594   3.3104
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.999403   0.001785   560.0   <2e-16 ***
x1           0.499959   0.001571   318.2   <2e-16 ***
x2          -1.501480   0.002369  -633.7   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 870025.3 on 4999 degrees of freedom
Residual deviance: 4730.2 on 4997 degrees of freedom
AIC: 25374
Number of Fisher Scoring iterations: 4
>
> ## Not run:
> ##D # default
> ##D sim.data <- probit_syn()
> ##D dprobit <- glm(cbind(py,dpy) ~ . , family=binomial(link="probit"), data = sim.data)
> ##D summary(dprobit)
> ## End(Not run)
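The grouped fit reports much smaller standard errors than the first binary fit, although both use 5000 rows and the same xv. One rough way to read this is that each grouped row carries den Bernoulli trials, so precision should scale with the total number of trials. The sketch below checks that heuristic against the standard errors printed above. The square-root scaling is an approximation that assumes comparable designs, and the object names are illustrative.

```r
# Sketch: compare printed standard errors with a sqrt(total trials) heuristic.
den <- rep(1:5, each = 1000, times = 1) * 100   # denominators from the grouped example

total_binary  <- 5000 * 1                       # 5000 rows, one trial each
total_grouped <- sum(den)                       # total trials in the grouped data

# Standard errors copied from the two summary() outputs above
se_binary  <- c(intercept = 0.03076,  x1 = 0.02640,  x2 = 0.04107)
se_grouped <- c(intercept = 0.001785, x1 = 0.001571, x2 = 0.002369)

observed_ratio <- se_binary / se_grouped
expected_ratio <- sqrt(total_grouped / total_binary)

print(total_grouped)
print(round(observed_ratio, 2))
print(round(expected_ratio, 2))
```

The observed ratios sit close to the heuristic value, which suggests the gain in precision comes from the larger number of trials rather than from any change in the model.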