a formula expression as for regression models, of
the form response ~ predictors; see formula.
"LL", "LT" or "TL" which stand for line-line,
line-threshold or threshold-line, defined below.
an optional data-frame that assigns values in 'formula'.
expression saying which subset of the data to use.
vector or matrix.
if TRUE then 'weights' specifies the inverse of the
weights vector or matrix, as for a covariance matrix.
is the variance known?
a function to filter missing data.
an optional list; see 'contrasts.arg' in model.matrix.
a constant vector to be subtracted from the response.
other arguments to lm.fit or lm.wfit.
A broken-line model consists of two straight lines joined at a
changepoint. Three versions are available:
LL y = alpha + B * min(x - theta, 0) + Bp * max(x - theta, 0) + e
LT y = alpha + B * min(x - theta, 0) + e
TL y = alpha + Bp * max(x - theta, 0) + e
where e ~ Normal( 0, var * inv(weights) ). The LT and TL versions
omit 'alpha' if the formula is without intercept, such as 'y~x+0'.
Parameters 'theta', 'alpha', 'B', 'Bp', 'var' are unknown, but
'weights' is known.
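The three mean functions above can be written out in base R as a quick check of what each version looks like. This is only an illustrative sketch: the helper names 'mean_LL', 'mean_LT' and 'mean_TL' and all parameter values are made up for the example and are not part of the package.

```r
# Mean functions of the three broken-line versions (hypothetical helper names).
mean_LL <- function(x, theta, alpha, B, Bp) {
  alpha + B * pmin(x - theta, 0) + Bp * pmax(x - theta, 0)
}
mean_LT <- function(x, theta, alpha, B) {
  alpha + B * pmin(x - theta, 0)          # sloped line, then flat threshold
}
mean_TL <- function(x, theta, alpha, Bp) {
  alpha + Bp * pmax(x - theta, 0)         # flat threshold, then sloped line
}

x <- 1:10
mu_LL <- mean_LL(x, theta = 4, alpha = 2, B = 1, Bp = -0.5)
mu_LT <- mean_LT(x, theta = 4, alpha = 2, B = 1)
mu_TL <- mean_TL(x, theta = 4, alpha = 2, Bp = -0.5)

# Synthetic responses with unit weights, so that e ~ Normal(0, var), var = 0.25.
set.seed(1)
y <- mu_LL + rnorm(length(x), sd = sqrt(0.25))
```

In every version 'alpha' is the height of the model at the changepoint 'x = theta', which is why LT is flat to the right of 'theta' and TL is flat to the left.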
The same models apply for a multiple-regression formula such as 'y ~ x1 +
x2 + ... + xn' where 'alpha' becomes the coefficient of the
"1"-vector and 'theta' the changepoint for the coefficient of the
first predictor term, 'x1'.
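For a multiple-regression formula, the design matrix at a fixed changepoint can be sketched as follows. The data are synthetic, 'theta' is held fixed purely for illustration, and the column names are chosen here to match the parameter names above; the package itself estimates 'theta' rather than taking it as given.

```r
set.seed(3)
n  <- 20
x1 <- seq(1, 10, length.out = n)            # predictor that carries the changepoint
x2 <- rnorm(n)                              # synthetic second predictor
theta <- 5                                  # changepoint fixed for illustration only

# Columns: "1"-vector, the two broken-line terms in x1, then the other predictor.
X <- cbind(alpha = 1,
           B     = pmin(x1 - theta, 0),
           Bp    = pmax(x1 - theta, 0),
           x2    = x2)
y <- drop(X %*% c(2, 1, -0.5, 0.8)) + rnorm(n, sd = 0.2)

fit <- lm.fit(X, y)                         # least squares given theta
round(fit$coefficients, 2)
```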
To test for the presence of a changepoint, postulate a changepoint value
outside the range of 'x'-values. Thus, in the
LL model 'sl( min(x1) - 1 )' would give the exact significance
level of the null hypothesis "single line" versus the alternative
hypothesis "broken line".
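A short base-R check shows why a postulated changepoint outside the range of 'x' corresponds to the "single line" hypothesis: one of the two broken-line terms vanishes and the other is just a shifted copy of 'x'. The data below are synthetic, and the sketch does not compute the significance level itself.

```r
set.seed(2)
x <- 1:10
y <- 3 + 0.7 * x + rnorm(10, sd = 0.3)     # synthetic data without a changepoint

theta0 <- min(x) - 1                        # postulated changepoint left of all x
lower <- pmin(x - theta0, 0)                # all zero: no data on the first line
upper <- pmax(x - theta0, 0)                # equals x - theta0 for every point

fit_broken <- lm(y ~ upper)                 # LL model with theta fixed at theta0
fit_single <- lm(y ~ x)                     # ordinary single-line fit

all(lower == 0)
all.equal(unname(fitted(fit_broken)), unname(fitted(fit_single)))
```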
Exact inferences about the changepoint
'theta' or '(theta,alpha)' are based on the distribution of its
likelihood-ratio statistic, conditional on sufficient statistics
for the other parameters. This method is called conditional likelihood-ratio (CLR) for short.
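For orientation, the ordinary (unconditional) likelihood-ratio statistic for a postulated changepoint can be sketched in base R by profiling over a grid. This is not the CLR method: it omits the conditioning step, and the grid search, the helper names 'rss_at' and 'lr_stat', and the synthetic data are all assumptions made for the example.

```r
set.seed(4)
x <- 1:20
y <- 2 + 1.0 * pmin(x - 8, 0) - 0.5 * pmax(x - 8, 0) + rnorm(20, sd = 0.3)

# Residual sum of squares of the LL model with the changepoint fixed at theta.
rss_at <- function(theta) {
  fit <- lm(y ~ pmin(x - theta, 0) + pmax(x - theta, 0))
  sum(resid(fit)^2)
}

grid <- seq(2, 19, by = 0.05)               # interior grid keeps both slopes estimable
rss  <- vapply(grid, rss_at, numeric(1))
theta_hat <- grid[which.min(rss)]           # grid approximation to the MLE of theta

# Likelihood-ratio statistic for a postulated theta0, with unknown variance
# and unit weights: n * log( RSS(theta0) / RSS(theta_hat) ).
lr_stat <- function(theta0) length(y) * log(rss_at(theta0) / min(rss))
lr_stat(8)
```

Small values of the statistic indicate postulated changepoints that are well supported by the data; the exact inference described above additionally conditions on the sufficient statistics for the other parameters.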
'lm.br' returns a list that includes a C++ object with accessor
functions. Functions 'sl', 'ci' and 'cr' get significance levels, confidence intervals,
and confidence regions for the changepoint's x-coordinate or
(x,y)-coordinates. Other functions are 'mle' to get maximum likelihood estimates and 'sety' to set new y-values.
The returned object also lists 'coefficients', 'fitted.values' and 'residuals', the same as for an 'lm' output list.
Data can include more than one 'y' value for the same 'x' value. The 'weights' matrix must be positive-definite.
If the variance is known, then 'var' = 1 and 'weights' is the inverse of the variances
vector or variance-covariance matrix.
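The relation between a known covariance matrix and 'weights' can be sketched in base R as below. The covariance pattern, the variances and the helper 'is_pd' are hypothetical and serve only to illustrate the inversion and the positive-definiteness requirement.

```r
# Hypothetical known covariance matrix of the errors (AR(1)-like pattern).
n <- 5
Sigma <- 0.5 ^ abs(outer(1:n, 1:n, "-"))

w <- solve(Sigma)                           # 'weights' as the inverse covariance

# chol() raises an error when the matrix is not positive-definite; it reads only
# the upper triangle, hence the separate symmetry check.
is_pd <- function(m) {
  isSymmetric(m) && !inherits(try(chol(m), silent = TRUE), "try-error")
}
is_pd(w)

# For independent errors with known variances, 'weights' is the vector 1 / variances.
variances <- c(0.1, 0.2, 0.2, 0.4, 0.1)
w_vec <- 1 / variances
```

Going by the description of the 'inverse' argument above, the covariance matrix itself could presumably be supplied as 'weights' with 'inverse' set to TRUE instead of inverting it by hand.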
Knowles, M., Siegmund, D. and Zhang, H.P. (1991) Confidence regions
in semilinear regression, _Biometrika_, *78*, 15-31.
Siegmund, D. and Zhang, H.P. (1994), Confidence regions in
broken line regression, in "Change-point Problems", _IMS
Lecture Notes – Monograph Series_, *23*, eds. E. Carlstein, H.
Muller and D. Siegmund, Hayward, CA: Institute of Mathematical