R: Change point detection and estimation in genomic sequences
cumSeg-package
R Documentation
Change point detection and estimation in genomic sequences
Description
Estimation of the number and location of change points in ‘mean-shift’
(‘piecewise constant’ or ‘step-function’) models.
Particularly useful for modelling genomic sequences of continuous measurements.
Details
Package: cumSeg
Type: Package
Version: 1.1
Date: 2011-10-14
License: GPL
LazyLoad: yes
Package cumSeg estimates the number and location of change points in ‘mean-shift’
(also called ‘piecewise constant’ or ‘step-function’) models. These models are particularly useful in biology, where
it is of interest to locate particular genomic sequences (e.g. in array comparative genomic hybridization analysis).
The algorithm works by first estimating a high number of change points (via the efficient ‘segmented’ algorithm of Muggeo (2003))
and then applying the lars algorithm of Efron et al. (2004) to select some of them via a generalized BIC criterion.
The procedure appears to be robust to model misspecification and,
from a computational standpoint, its cost is essentially independent of the number of change points to be estimated.
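The two-stage idea described above (over-estimate the change points, then prune them with an information criterion) can be illustrated with a minimal base-R sketch. It is not the package's implementation. The candidate grid stands in for the ‘segmented’ step, greedy forward selection stands in for lars, and the ordinary BIC stands in for the generalized BIC. All names used here (`rss_steps`, `candidates`, `path`, `psi_hat`) are hypothetical, and the data are simulated.

```r
set.seed(1)

# Simulate a mean-shift series: three segments with means 0, 2 and -1,
# so the true change points fall after observations 100 and 200.
n  <- 300
mu <- rep(c(0, 2, -1), times = c(100, 100, 100))
y  <- mu + rnorm(n, sd = 0.5)

# Stage 1 (stand-in): a deliberately over-complete set of candidates.
# An evenly spaced grid is assumed here purely for illustration.
candidates <- seq(20, 280, by = 20)

# Residual sum of squares of a step function that jumps after each
# position in `psi` (an empty `psi` gives the constant-mean model).
rss_steps <- function(y, psi) {
  seg <- findInterval(seq_along(y), c(-Inf, sort(psi) + 1))
  sum((y - ave(y, seg))^2)
}

# Stage 2 (stand-in): greedy forward selection builds a nested path of
# models, adding at each step the candidate that lowers the RSS most.
path      <- list(numeric(0))
rss       <- rss_steps(y, numeric(0))
remaining <- candidates
for (k in seq_along(candidates)) {
  trial <- sapply(remaining, function(p) rss_steps(y, c(path[[k]], p)))
  best  <- which.min(trial)
  path[[k + 1]] <- sort(c(path[[k]], remaining[best]))
  rss[k + 1]    <- trial[best]
  remaining     <- remaining[-best]
}

# Ordinary BIC along the path: model i has i segment means.
bic     <- n * log(rss / n) + log(n) * seq_along(rss)
psi_hat <- path[[which.min(bic)]]
psi_hat
```

With jumps this large relative to the noise, the selected set should contain the true change points 100 and 200. The ordinary BIC may occasionally keep a spurious extra point, which is one motivation for a stronger, generalized penalty.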
References
Muggeo, V.M.R., Adelfio, G. (2011) Efficient change point detection for genomic sequences
of continuous measurements, Bioinformatics 27, 161-166.
Efron, B., Hastie, T., Johnstone, I., Tibshirani, R. (2004) Least angle regression,
Annals of Statistics 32, 407-499.
Muggeo, V.M.R. (2003) Estimating regression models with unknown break-points,
Statistics in Medicine 22, 3055-3071.
See Also
DNAcopy, tilingArray
Examples
## Not run:
library(cumSeg)
data(fibroblast)
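# select chromosome 1 (the same steps apply to chromosomes 3, 9 and 11)
z <- na.omit(fibroblast$gm03563[fibroblast$Chromosome == 1])
# start from k = 30 change points and let the procedure select among them
o <- jumpoints(z, k = 30, output = "3")
plot(z)
# add the fitted model to the existing plot
plot(o, add = TRUE, y = FALSE, col = 4)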
## End(Not run)