Last data update: 2014.03.03

R: Fits a Bayesian mixture model using Markov Chain Monte Carlo...
bayesMixModelR Documentation

Fits a Bayesian mixture model using Markov Chain Monte Carlo (MCMC) methods

Description

This method estimates the posterior distribution of a Bayesian mixture model using Markov Chain Monte Carlo (MCMC) methods and calculates measures of this distribution. The mixture model may consist of normal components (with a fixed expectation of 0), exponential components and gamma components, which may be mirrored in order to model negative values.

Usage

bayesMixModel(z, normNull=c(), expNeg=c(), expPos=c(), gamNeg=c(), gamPos=c(), sdNormNullInit=c(), rateExpNegInit=c(), rateExpPosInit=c(), shapeGamNegInit=c(), scaleGamNegInit=c(), shapeGamPosInit=c(), scaleGamPosInit=c(), piInit, classificationsInit, dirichletParInit=1, shapeDir=1, scaleDir=1, weightsPrior="FDD", sdAlpha, shapeNorm0=c(), scaleNorm0=c(), shapeExpNeg0=c(), scaleExpNeg0=c(), shapeExpPos0=c(), scaleExpPos0=c(), shapeGamNegAlpha0=c(), shapeGamNegBeta0=c(), scaleGamNegAlpha0=c(), scaleGamNegBeta0=c(), shapeGamPosAlpha0=c(), shapeGamPosBeta0=c(), scaleGamPosAlpha0=c(), scaleGamPosBeta0=c(), itb, nmc, thin, average="mean",sdShape)

Arguments

z

Observed values

normNull

Indices of the normal components (that have mu=0).

expNeg

Indices of the mirrored exponential components.

expPos

Indices of the exponential components.

gamNeg

Indices of the mirrored gamma components.

gamPos

Indices of the gamma components.

sdNormNullInit

Initial standard deviations of the normal components.

rateExpNegInit

Initial rates of the mirrored exponential components. Only relevant if mirrored exponential components are specified.

rateExpPosInit

Initial rates of the exponential components. Only relevant if exponential components are specified.

shapeGamNegInit

Initial shape parameters of the mirrored gamma components. Only relevant if mirrored Gamma components are specified.

scaleGamNegInit

Initial scale parameters of the mirrored gamma components. Only relevant if mirrored Gamma components are specified.

shapeGamPosInit

Initial shape parameters of the gamma components. Only relevant if Gamma components are specified.

scaleGamPosInit

Initial scale parameters of the gamma components. Only relevant if Gamma components are specified.

piInit

Initial weights of the components. If missing, all k components get the same initial weight 1/k.

classificationsInit

Initial classifications of the data points. If missing, all data points are assigned to class floor(k/2) with k = number of components.

dirichletParInit

Initial concentration parameter of prior distribution assigned to the mixture weights.

shapeDir

Prior shape parameter of Gamma distribution for concentration parameter of prior distribution assigned to the mixture weights.

scaleDir

Prior scale parameter of Gamma distribution for concentration parameter of prior distribution assigned to the mixture weights.

weightsPrior

Prior distribution assigned to mixture weights. Available are the Finite-dimensional Dirichlet prior ("FDD"), also known as Dirichlet-multinomial process, and the Truncated Dirichlet process ("TDP"). Both are approximations to the Dirichlet process for a large number of components, while the Finite-dimensional Dirichlet prior is also suited for a small number of components as a special case of the Dirichlet distribution.

sdAlpha

Standard deviation of proposal distribution for concentration parameter of the prior distribution assigned to the mixture weights in the Metropolis-Hastings step incorporated in the Gibbs sampler. Only relevant if weightsPrior="FDD".

shapeNorm0

Prior shape parameter of Gamma distribution for precision of normal components.

scaleNorm0

Prior scale parameter of Gamma distribution for precision of normal components.

shapeExpNeg0

Prior shape parameter of Gamma distribution for parameter of mirrored exponential components. Only relevant if mirrored exponential components are specified.

scaleExpNeg0

Prior scale parameter of Gamma distribution for parameter of mirrored exponential components. Only relevant if mirrored exponential components are specified.

shapeExpPos0

Prior shape parameter of Gamma distribution for parameter of exponential components. Only relevant if exponential components are specified.

scaleExpPos0

Prior scale parameter of Gamma distribution for parameter of exponential components. Only relevant if exponential components are specified.

shapeGamNegAlpha0

Prior shape parameter of Gamma distribution for shape parameter of mirrored Gamma components. Only relevant if mirrored Gamma components are specified.

shapeGamNegBeta0

Prior scale parameter of Gamma distribution for shape parameter of mirrored Gamma components. Only relevant if mirrored Gamma components are specified.

scaleGamNegAlpha0

Prior shape parameter of Gamma distribution for scale parameter of mirrored Gamma components. Only relevant if mirrored Gamma components are specified.

scaleGamNegBeta0

Prior scale parameter of Gamma distribution for scale parameter of mirrored Gamma components. Only relevant if mirrored Gamma components are specified.

shapeGamPosAlpha0

Prior shape parameter of Gamma distribution for shape parameter of Gamma components. Only relevant if Gamma components are specified.

shapeGamPosBeta0

Prior scale parameter of Gamma distribution for shape parameter of Gamma components. Only relevant if Gamma components are specified.

scaleGamPosAlpha0

Prior shape parameter of Gamma distribution for scale parameter of Gamma components. Only relevant if Gamma components are specified.

scaleGamPosBeta0

Prior scale parameter of Gamma distribution for scale parameter of Gamma components. Only relevant if Gamma components are specified.

itb

Number of iterations used for burn-in.

nmc

Number of iterations after burn-in used for analysis.

thin

Thinning value for the iterations after burn-in.

average

Way of averaging across the posterior distribution to obtain estimates of model parameters. Either average="mean" or average="median". Note: For the allocation to components, results are given for posterior mean, median and maximum density regardless of the specification.

sdShape

Standard deviation of proposal distribution for shape parameter of Gamma components in the Metropolis-Hastings step incorporated in the Gibbs sampler. Only relevant if Gamma components are specified.

Details

The convergence of Markov chains must be assessed prior to an interpretation of results. Inspection of trace plots via plotChains is therefore urgently recommended. Iterations during which one of the chains has not yet reached stationarity should not be taken into account for analysis and can be excluded by setting an appropriate burn-in value itb. Autocorrelation between subsequent chain values can be reduced by thinning the chain, setting an appropriate value for thin. To ensure a sufficient number of iterations for the chains after the burn-in, nmc should be increased when the thinning is increased. The standard deviations of the proposal distribution in a Metropolis-Hastings step should be tuned to achieve a medium-level acceptance rate (e.g., 0.3-0.7): A very low acceptance rate would cause a long running time of the algorithm, while a very high acceptance rate typically leads to autocorrelation between the values of the chain. Acceptance is documented for each iteration in the chains slot of objects of class MixModelBayes-class.

Value

An object of class MixModelBayes-class storing results, data, priors, initial values and information about convergence.

Author(s)

Martin Schaefer (martin.schaefer@udo.edu)

See Also

plotChains, MixModelBayes-class

Examples

set.seed(1000)
z <- c(rnorm(1000, 0, 0.5), rnorm(1000, 0, 1))
mm <- bayesMixModel(z, normNull=1:2, sdNormNullInit=c(0.1, 0.2),
  piInit=c(1/2, 1/2), shapeNorm0=c(1, 1), scaleNorm0=c(1, 1),
  shapeExpNeg0=c(), scaleExpNeg0=c(), 
  shapeExpPos0=c(), scaleExpPos0=c(), sdAlpha=1, itb=100, nmc=1000, thin=10)
mm
plotComponents(mm)
plotChains(mm, chain="pi")

z <- c(rnorm(200, 0, 1), rnorm(200, 0, 5), rexp(200, 0.1), -rexp(200, 0.1))
mm <- bayesMixModel(z, normNull=1:2, gamNeg=3, gamPos=4,
  sdNormNullInit=c(1, 1),
  shapeGamNegInit=1, scaleGamNegInit=1, shapeGamPosInit=1, scaleGamPosInit=1,
  shapeNorm0=c(1,3), scaleNorm0=c(1,3), sdAlpha=1,
  shapeGamNegAlpha0=1, shapeGamNegBeta0=1,
  scaleGamNegAlpha0=1, scaleGamNegBeta0=1,
  shapeGamPosAlpha0=1, shapeGamPosBeta0=1,
  scaleGamPosAlpha0=1, scaleGamPosBeta0=1, sdShape=0.025, 
  itb=100, nmc=1000, thin=10)
mm
plotComponents(mm)
plotChains(mm, chain="pi")

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(epigenomix)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Loading required package: SummarizedExperiment
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/epigenomix/bayesMixModel.Rd_%03d_medium.png", width=480, height=480)
> ### Name: bayesMixModel
> ### Title: Fits a Bayesian mixture model using Markov Chain Monte Carlo
> ###   (MCMC) methods
> ### Aliases: bayesMixModel bayesMixModel,numeric-method
> 
> ### ** Examples
> 
> set.seed(1000)
> z <- c(rnorm(1000, 0, 0.5), rnorm(1000, 0, 1))
> mm <- bayesMixModel(z, normNull=1:2, sdNormNullInit=c(0.1, 0.2),
+   piInit=c(1/2, 1/2), shapeNorm0=c(1, 1), scaleNorm0=c(1, 1),
+   shapeExpNeg0=c(), scaleExpNeg0=c(), 
+   shapeExpPos0=c(), scaleExpPos0=c(), sdAlpha=1, itb=100, nmc=1000, thin=10)
[1] "Iteration: 50"
[1] "Iteration: 100"
[1] "Iteration: 150"
[1] "Iteration: 200"
[1] "Iteration: 250"
[1] "Iteration: 300"
[1] "Iteration: 350"
[1] "Iteration: 400"
[1] "Iteration: 450"
[1] "Iteration: 500"
[1] "Iteration: 550"
[1] "Iteration: 600"
[1] "Iteration: 650"
[1] "Iteration: 700"
[1] "Iteration: 750"
[1] "Iteration: 800"
[1] "Iteration: 850"
[1] "Iteration: 900"
[1] "Iteration: 950"
[1] "Iteration: 1000"
[1] "Iteration: 1050"
[1] "Iteration: 1100"
> mm

MixModel object
    Number of data points:  2000
    Number of components:   2
        1: NormNull
             mean = 0
             sd = 0.5501342
           weight pi = 0.6329708
           classified data points: 1666
        2: NormNull
             mean = 0
             sd = 1.081251
           weight pi = 0.3670292
           classified data points: 334

> plotComponents(mm)
> plotChains(mm, chain="pi")
> 
> z <- c(rnorm(200, 0, 1), rnorm(200, 0, 5), rexp(200, 0.1), -rexp(200, 0.1))
> mm <- bayesMixModel(z, normNull=1:2, gamNeg=3, gamPos=4,
+   sdNormNullInit=c(1, 1),
+   shapeGamNegInit=1, scaleGamNegInit=1, shapeGamPosInit=1, scaleGamPosInit=1,
+   shapeNorm0=c(1,3), scaleNorm0=c(1,3), sdAlpha=1,
+   shapeGamNegAlpha0=1, shapeGamNegBeta0=1,
+   scaleGamNegAlpha0=1, scaleGamNegBeta0=1,
+   shapeGamPosAlpha0=1, shapeGamPosBeta0=1,
+   scaleGamPosAlpha0=1, scaleGamPosBeta0=1, sdShape=0.025, 
+   itb=100, nmc=1000, thin=10)
[1] "Iteration: 50"
[1] "Iteration: 100"
[1] "Iteration: 150"
[1] "Iteration: 200"
[1] "Iteration: 250"
[1] "Iteration: 300"
[1] "Iteration: 350"
[1] "Iteration: 400"
[1] "Iteration: 450"
[1] "Iteration: 500"
[1] "Iteration: 550"
[1] "Iteration: 600"
[1] "Iteration: 650"
[1] "Iteration: 700"
[1] "Iteration: 750"
[1] "Iteration: 800"
[1] "Iteration: 850"
[1] "Iteration: 900"
[1] "Iteration: 950"
[1] "Iteration: 1000"
[1] "Iteration: 1050"
[1] "Iteration: 1100"
> mm

MixModel object
    Number of data points:  800
    Number of components:   4
        1: GamNeg
             shape = 0.8763034
             scale = 10.08081
           weight pi = 0.3723671
           classified data points: 266
        2: NormNull
             mean = 0
             sd = 1.885232
           weight pi = 0.001049356
           classified data points: 4
        3: NormNull
             mean = 0
             sd = 0.7871551
           weight pi = 0.2194311
           classified data points: 224
        4: GamPos
             shape = 1.142207
             scale = 6.474989
           weight pi = 0.4071524
           classified data points: 306

> plotComponents(mm)
> plotChains(mm, chain="pi")
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>