R Graphical Manual

Browse All

Last data update: 2014.03.03

R: Function to fit penalised regression

penReg

R Documentation

Function to fit penalised regression

Description

The function penReg() can be used to fit a P-spline. It can be used as demonstration of how the penalised B-splines can be fitted to one explanatory variable. For more that one explanatory variables use the function pb() in gamlss. The function penRegQ() is similar to the function penReg() but it estimates the "random effect" sigmas using the Q-function (marginal likelihood). The Q-function estimation takes longer but it has the advantage that standard errors are provided for log σ_e and log σ_b, where the sigmas are the standard errors for the response and the random effects respectively. The function pbq() is a smoother within GAMLSS and should give identical results to the additive function pb(). The function gamlss.pbq is not for use.

Usage

penReg(y, x, weights = rep(1, length(y)), df = NULL, lambda = NULL, start = 10, 
      inter = 20, order = 2, degree = 3,  plot = FALSE,
      method = c("ML", "ML-1", "GAIC", "GCV", "EM"), k = 2, ...)
penRegQ(y, x, weights = rep(1, length(y)), order = 2, start = 10, 
       plot = FALSE, lambda = NULL, inter = 20, degree = 3, 
       optim.proc = c("nlminb", "optim"), 
       optim.control = NULL)        
pbq(x, control = pbq.control(...), ...)
gamlss.pbq(x, y, w, xeval = NULL, ...)

Arguments

`y`	the response variable
`x`	the unique explanatory variable
`weights`	prior weights
`w`	weights in the iretation withing GAMLSS
`df`	effective degrees of freedom
`lambda`	the smoothing parameter
`start`	the lambda starting value if the local methods are used
`inter`	the no of break points (knots) in the x-axis
`order`	the required difference in the vector of coefficients
`degree`	the degree of the piecewise polynomial
`plot`	whether to plot the data and the fitted function
`method`	The method used in the (local) performance iterations. Available methods are "ML", "ML-1", "EM", "GAIC" and "GCV"
`k`	the penalty used in "GAIC" and "GCV"
`optim.proc`	which function to be use to optimise the Q-function, options are `c("nlminb", "optim")`
`optim.control`	options for the optimisation procedures
`control`	arguments for the fitting function. It takes one two: i) `order` the order of the B-spline and `plot` whether to plot the data and fit.
`xeval`	this is use for prediction
`...`	for extra arguments

Value

Returns a fitted object of class penReg. The object contains 1) the fitted coefficients 2) the fitted.values 3) the response variable y, 4) the label of the response variable ylabel 5) the explanatory variable x, 6) the lebel of the explanatory variable 7) the smoothing parameter lambda, 8) the effective degrees of freedom df, 9) the estimete for sigma sigma, 10) the residual sum of squares rss, 11) the Akaike information criterion aic, 12) the Bayesian information criterion sbc and 13) the deviance

Author(s)

Mikis Stasinopoulos d.stasinopoulos@londonmet.ac.uk, Bob Rigby r.rigby@londonmet.ac.uk and Paul Eilers

References

Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties (with comments and rejoinder). Statist. Sci, 11, 89-121.

Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.

Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, http://www.jstatsoft.org/v23/i07.

Examples

set.seed(1234)
x <- seq(0,10,length=200); y<-(yt<-1+2*x+.6*x^2-.1*x^3)+rnorm(200, 4)
library(gamlss)
#------------------ 
# df fixed
g1<-gamlss(y~pb(x, df=4))
m1<-penReg(y,x, df=4) 
cbind(g1$mu.coefSmo[[1]]$lambda, m1$lambda)
cbind(g1$mu.df, m1$edf)
cbind(g1$aic, m1$aic)
cbind(fitted(g1), fitted(m1))[1:10,]
# identical
#------------------
# estimate lambda using ML
g2<-gamlss(y~pb(x))
m2<-penReg(y,x) 
cbind(g2$mu.df, m2$edf)
cbind(g2$mu.lambda, m2$lambda) 
cbind(g2$aic, m2$aic) # different lambda
cbind(fitted(g2), fitted(m2))[1:10,]
# identical
#------------------
#  estimate lambda using GCV
g3 <- gamlss(y~pb(x, method="GCV"))
m3 <- penReg(y,x, method="GCV") 
cbind(g3$mu.df, m3$edf)
cbind(g3$mu.lambda, m3$lambda)
cbind(g3$aic, m3$aic)
cbind(fitted(g3), fitted(m3))[1:10,]
# almost identical
#------------------
#  estimate lambda using  GAIC(#=3)
g4<-gamlss(y~pb(x, method="GAIC", k=3))
m4<-penReg(y,x, method="GAIC", k=3) 
cbind(g4$mu.df, m4$edf )
cbind(g4$mu.lambda, m4$lambda)
cbind(g4$aic, m4$aic)
cbind(g4$mu.df, m4$df)
cbind(g4$mu.lambda, m4$lambda)
cbind(fitted(g4), fitted(m4))[1:10,]

#-------------------
plot(y~x)
lines(fitted(m1)~x, col="green")
lines(fitted(m2)~x, col="red")
lines(fitted(m3)~x, col="blue")
lines(fitted(m4)~x, col="yellow")
lines(fitted(m4)~x, col="grey")
# using the Q function

# the Q-function takes longer
system.time(g6<-gamlss(y~pbq(x)))
system.time(g61<-gamlss(y~pb(x)))
AIC(g6, g61)
#
system.time(m6<-penRegQ(y,x))
system.time(m61<-penReg(y,x)) 
AIC(m6, m61)

cbind(g6$mu.df, g61$mu.df,m6$edf, m61$edf)
cbind(g6$mu.lambda,g61$mu.lambda, m6$lambda, m61$lambda) 
cbind(g6$aic, AIC(g6), m6$aic, AIC(m6), m61$aic, AIC(m61)) 
cbind(fitted(g6), fitted(m6))[1:10,]