The data as a list. All the elements required to compute the likelihood function
at a given parameter value should be included in this list.
thetainit
A numerical vector containing the initial value for the parameter of the model. It will be used
as starting point in the numerical optimization of the likelihood function.
floglik
A function which returns the log likelihood function at a given parameter value.
In particular, for a certain parameter value contained in a numerical vector theta,
a call floglik(theta, data) should return a scalar numerical value, the log likelihood
function at theta.
fscore
An optional function which returns the score function at a given parameter value. It must return a
numerical vector of the same length of thetainit. For a certain parameter value contained in a
numerical vector theta, a call fscore(theta, data) should return the gradient of the
log likelihood function at theta. Default is NULL, implying that numerical differentiation
will be employed.
fpsi
A function which specifies the parameter of interest. A call fpsi(theta) should return a scalar
value.
datagen
A function which simulates a data set. A call datagen(theta, data) will generate a copy of the
data list, with the values of the response variable replaced by a set of values simulated from
the parametric statistical model assumed for the response variable.
R
The number of Monte Carlo replicates used for computing the r* statistic. A positive integer, default is
1000.
seed
Optional positive integer, the random seed for the Monte Carlo computation. Default is NULL.
ronly
Logical. If set to TRUE the computation of the r* statistic will be skipped, and only the value
of the signed likelihood ratio test statistic r will be returned by the procedure, without any Monte
Carlo computation. Default is FALSE.
psidesc
An optional character string describing the nature of the parameter of interest. Default is NULL.
constr.opt
Constrained optimizer used for maximizing the log likelihood function under the null hypothesis. Possible
values are "solnp" or "alabama", with the former employing the solnp function from
package Rsolnp and the latter the constrOptim.nl from the package alabama. Defauls is
"solnp".
lower, upper
Optional numeric values defining the lower/upper limit of a grid of points for the parameter of interest,
where the r* statistic will be evaluated. Default is NULL.
control
A list of parameters for controlling the computation of confidence intervals. See rstar.ci.control.
...
Arguments to be used to form the default control argument if it is not supplied directly.
Details
The function obtains 90%, 95% and 99% two-sided confidence intervals for the scalar function of
interest based on the r* statistic.
The function requires the user to provide three functions defining the log likelihood function,
the scalar parametric function of interest, and a function for generating
a data set from the assumed statistical model. A further function returning the gradient of the
log likelihood is not required, but if provided it will speed up the computation.
When ronly = TRUE the function literally returns the value of the signed likelihood ratio test
statistic r only. The function handles also one-parameter models.
The function provides two different strategies to obtain the various confidence intervals. The default
strategy, invoked by leaving either lower or upper to NULL, starts from the MLE
and moves away in a stepwise fashion, until the r* statistic crosses the standard normal quantiles
corresponding to the 99% two-sided confidence interval. It is crucial to start the search a bit away
from the MLE, where the r* is singular, and this is regulated by the away argument of the
rstar.ci.control function. The first strategy may fail to cross the target normal quantiles when
the profile likelihood has an upper asymptote. For such cases, and for any other instances when the
output of the default strategy is deemed not satisfactory, it is possible to specify the
range of a grid of values where the r* statistic will be evaluated. The lower and upper
argument specify the lower and upper limit of such grid, whereas the number of points is controlled by the
npoints of the rstar.ci.control function.
Value
The returned value is an object of class "rstarci", containing the following components:
psivals
A list of values for the parameter of interest for which the r and r* statistics have been evaluated.
rvals
A numerical list containing the values of the r statistic evaluated at each element of psivals.
NPvals, INFvals
Numerical lists containing the values of the Nuisance Parameter adjustment (NP) and the Information
adjustment (INF) from the decomposition of the r*-r adjustment, for each of the psivals values.
Not computed when ronly = TRUE.
rsvals
The observed value of the r* statistic at each element of psivals. Not computed when ronly = TRUE.
CIr
A 3 x 2 matrix containing the 90%, 95% and 99% confidence intervals for the parameter of interest (first,
second and third row respectively) based on the first-order r statistic.
CIrs
A 3 x 2 matrix containing the 90%, 95% and 99% confidence intervals for the parameter of interest (first,
second and third row respectively) absed on the r* statistic. Not computed when ronly = TRUE.
seed
Random seed used for Monte Carlo replicates used for computing the r* statistic. Not returned when ronly = TRUE.
psidesc
A character string describing the nature of the parameter of interest.
R
Number of Monte Carlo replicates used for computing the r* statistic. Not returned when ronly = TRUE.
There are print, summary and plot methods for this class.
References
The method implemented in this function was proposed in
Skovgaard, I.M. (1996). An explicit large-deviation approximation to one-parameter tests.
Bernoulli, 2, 145–165.
For a general review
Severini, T.A. (2000). Likelihood Methods in Statistics. Oxford University Press.
See Also
rstar, rstar.ci.control.
Examples
# A negative binomial example, taken from Venables and Ripley (2002, MASS4 book)
library(MASS)
# The quine data are analysed in Section 7.4
data(quine)
# We fit a model with just the main effects
quine.nb1 <- glm.nb(Days ~ Eth + Sex + Age + Lrn, data = quine)
# The data list includes the design matrix and the response vector
quinedata <- list(X=model.matrix(quine.nb1), y=quine$Days)
# Let us define the required functions
# Log likelihood, log link
logLikNbin <- function(theta,data)
{
y <- data$y
X <- data$X
eta <- X %*% theta[1:ncol(X)]
mu <- exp(eta)
alpha <- theta[ncol(X)+1]
l <- sum(lgamma(y + alpha) + y * log(mu) - (alpha + y) * log(alpha + mu)
- lgamma(alpha) + alpha * log(alpha))
return(l)
}
# Score function
gradLikNbin <- function(theta,data)
{
y <- data$y
X <- data$X
eta <- X %*% theta[1:ncol(X)]
mu <- exp(eta)
alpha <- theta[ncol(X)+1]
g <-rep(0,ncol(X)+1)
g[1:ncol(X)] <- t(y - (alpha+y)*mu / (alpha+mu)) %*% X
g[ncol(X)+1] <- sum(digamma( y + alpha) - log(alpha + mu) - (alpha + y) / (alpha + mu)
- digamma(alpha) + 1 + log(alpha))
return(g)
}
# Data generator
genDataNbin <- function(theta,data)
{
out <- data
X <- data$X
eta<- X %*% theta[1:ncol(X)]
mu <- exp(eta)
out$y <- rnegbin(length(data$y), mu=mu, theta=theta[ncol(X)+1])
return(out)
}
# Confidence intervals for the coefficient of EthN
## Not run:
obj <- rstar.ci(quinedata, thetainit=c(coef(quine.nb1),quine.nb1$theta), floglik=logLikNbin,
datagen=genDataNbin, fscore=gradLikNbin, fpsi=function(theta) theta[2], R=1000,
psidesc="Coefficient of EthN")
print(obj)
summary(obj)
plot(obj)
# Confidence intervals for the overdispersion parameter
obj <- rstar.ci(quinedata, thetainit=c(coef(quine.nb1),quine.nb1$theta), floglik=logLikNbin,
datagen=genDataNbin, fscore=gradLikNbin, fpsi=function(theta) theta[8], R=1000,
psidesc="Overdispersion parameter")
summary(obj)
plot(obj)
## End(Not run)