Last data update: 2014.03.03

glcTree    R Documentation

Gaussian RPMM Tree

Description

Performs Gaussian latent class modeling using a recursively partitioned mixture model (RPMM).

Usage

glcTree(x, initFunctions = list(glcInitializeSplitFanny(nu=1.5)), 
   weight = NULL, index = NULL, wthresh = 1e-08, 
   nodename = "root", maxlevel = Inf, verbose = 2, nthresh = 5, level = 0, 
   env = NULL, unsplit = NULL, splitCriterion = glcSplitCriterionBIC)

Arguments

x

Data matrix (n x j) on which to perform clustering. Missing values are supported.

initFunctions

List of functions of type “glcInitialize...” for initializing latent class model. See glcInitializeFanny for an example of arguments and return values.

weight

Weight corresponding to the indices passed (see index). Defaults to 1 for all indices.

index

Row indices of data matrix to include. Defaults to all (1 to n).

wthresh

Weight threshold for filtering data to children. Indices having weight less than this value will not be passed to child nodes. Default=1E-8.

nodename

Name of object that will represent node in tree data object. Defaults to “root”. USER SHOULD NOT SET THIS.

maxlevel

Maximum depth to recurse. Default=Inf.

verbose

Level of verbosity. Default=2 (very verbose); use 0 for quiet.

nthresh

Total weight in node required for node to be a candidate for splitting. Nodes with weight less than this value will never split. Defaults to 5.

level

Current level. Defaults to 0. USER SHOULD NOT SET THIS.

env

Object of class “glcTree” to store tree data. Defaults to a new object. USER SHOULD NOT SET THIS.

unsplit

Latent class parameters from parent, to store in current node. Defaults to NULL for root. This is used in plotting functions. USER SHOULD NOT SET THIS.

splitCriterion

Function of type “glcSplitCriterion...” for determining whether a node should be split. See glcSplitCriterionBIC for an example of arguments and return values.
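
The roles of weight, index, wthresh and nthresh can be pictured with a small base-R sketch. It does not call RPMM: filter_for_child and can_split are hypothetical helpers written only for illustration, the weights are made up, and the exact rules applied inside glcTree may differ in detail.

```r
# Hypothetical helpers illustrating wthresh and nthresh; not part of RPMM.

# Keep only the rows whose weight for a child reaches wthresh.
filter_for_child <- function(index, weight, wthresh = 1e-08) {
  keep <- weight >= wthresh
  list(index = index[keep], weight = weight[keep])
}

# A node is a candidate for splitting only if its total weight reaches nthresh.
can_split <- function(weight, nthresh = 5) {
  sum(weight) >= nthresh
}

index  <- 1:6
weight <- c(1, 0.9, 1e-12, 0.5, 0, 0.7)   # made-up weights for one child

child <- filter_for_child(index, weight)
child$index               # rows that would be passed on to the child
can_split(child$weight)   # is the child heavy enough to be split again?
```

Rows 3 and 5 fall below wthresh and are dropped, and the remaining total weight is below the default nthresh, so in this sketch the child would become a terminal node.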

Details

This function is called recursively by itself. Upon each recursion, certain arguments (e.g. nodename) are reset. Do not attempt to set these arguments yourself.
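
The recursion pattern described above can be sketched with a toy partitioner. This is not the RPMM algorithm: toy_tree is a hypothetical function that splits a numeric vector at its median instead of fitting a mixture model, and the rule for naming children is an assumption based on the node names shown in the Note. It only shows why nodename, level and env are managed by the function itself.

```r
# Toy recursive partitioner (not RPMM): splits at the median and records
# every node in one shared environment.
toy_tree <- function(x, index = seq_along(x), nodename = "root",
                     level = 0, maxlevel = Inf, nthresh = 5, env = NULL) {
  if (is.null(env)) env <- new.env()          # created once, at the top call
  assign(nodename, list(index = index, level = level), envir = env)

  # Stop when the node is too small or the depth limit is reached.
  if (length(index) < nthresh || level >= maxlevel) return(invisible(env))

  cut   <- median(x[index])
  left  <- index[x[index] <= cut]
  right <- index[x[index] >  cut]
  if (length(left) == 0 || length(right) == 0) return(invisible(env))

  # Children receive a new name and level; the caller never sets these.
  prefix <- if (nodename == "root") "r" else nodename
  toy_tree(x, left,  paste0(prefix, "L"), level + 1, maxlevel, nthresh, env)
  toy_tree(x, right, paste0(prefix, "R"), level + 1, maxlevel, nthresh, env)
  invisible(env)
}

tree <- toy_tree(c(1, 2, 3, 4, 10, 11, 12, 13), maxlevel = 2, nthresh = 2)
ls(tree)                  # all node names stored in the environment
get("rLR", envir = tree)  # indices and level of one node
```

Supplying nodename or level by hand in the top-level call would mislabel every node created below it, which is why those arguments are reserved for the recursion.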

Value

An object of class “glcTree”. This is an environment, each of whose component objects represents a node in the tree.

Note

The class “glcTree” is currently implemented as an environment object with nodes represented flatly, with the name indicating position in the hierarchy (e.g. “rLLR” = “right child of left child of left child of root”). This implementation makes certain plotting and update functions simpler than they would be if the data were stored in a more natural “list of lists” format.
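
The flat naming scheme can be explored with a few lines of base R. The helpers below, describe_node and terminal_nodes, are hypothetical and not part of RPMM; they assume only the convention stated above, namely a leading “r” for the root followed by one “L” or “R” per level.

```r
# Translate a node name such as "rLLR" into words. The letters are read
# from the last one backwards, so the deepest step comes first.
describe_node <- function(name) {
  if (name %in% c("root", "r")) return("root")
  steps <- strsplit(substring(name, 2), "")[[1]]
  stopifnot(substr(name, 1, 1) == "r", all(steps %in% c("L", "R")))
  words <- ifelse(rev(steps) == "L", "left", "right")
  paste0(paste(words, "child of", collapse = " "), " root")
}

# In a flat scheme, a node is terminal when no other name extends it.
terminal_nodes <- function(node_names) {
  stem <- ifelse(node_names == "root", "r", node_names)
  has_child <- vapply(stem,
                      function(s) any(paste0(s, c("L", "R")) %in% node_names),
                      logical(1))
  node_names[!has_child]
}

describe_node("rLLR")
terminal_nodes(c("root", "rL", "rR", "rLL", "rLR"))
```

Because every node lives at the top level of one environment, a lookup by name needs no tree traversal, which is the simplification the Note refers to.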

The following error may appear during the course of the algorithm:

      Error in optim(logab, betaObjf, ydata = y, wdata = w, weights = weights,  : 
           non-finite value supplied by optim
      

This is merely an indication that the node being split is too small, in which case the splitting will terminate at that node; in other words, it is nothing to worry about.
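
How such a message can be harmless is shown by the general base-R pattern below: an optimizer failure is caught and treated as “do not split”. This is an illustration only, not RPMM's source code. fit_or_null is a hypothetical helper, and the failure is provoked artificially by passing a non-finite starting value.

```r
# Hypothetical wrapper: return the fit, or NULL if the optimizer fails.
fit_or_null <- function(start, objective) {
  fit <- try(optim(start, objective), silent = TRUE)
  if (inherits(fit, "try-error")) return(NULL)   # treat failure as "no split"
  fit
}

# Simple quadratic objective with its minimum at c(1, 2).
objective <- function(p) sum((p - c(1, 2))^2)

ok  <- fit_or_null(c(0, 0), objective)     # ordinary fit
bad <- fit_or_null(c(NaN, 0), objective)   # optimizer raises an error

is.null(bad)   # the failure was caught, so the caller can simply stop here
ok$par
```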

Author(s)

E. Andres Houseman

References

Houseman et al., Model-based clustering of DNA methylation array data: a recursive-partitioning algorithm for high-dimensional data arising as a mixture of beta distributions. BMC Bioinformatics 9:365, 2008.

See Also

blcTree

Examples


data(IlluminaMethylation)

## Not run: 
heatmap(IllumBeta, scale="n",
  col=colorRampPalette(c("yellow","black","blue"),space="Lab")(128))

## End(Not run)

# Fit Gaussian RPMM
rpmm <- glcTree(IllumBeta, verbose=0)
rpmm

# Get weight matrix and show first few rows
rpmmWeightMatrix <- glcTreeLeafMatrix(rpmm)
rpmmWeightMatrix[1:3,]

# Get class assignments and compare with tissue
rpmmClass <- glcTreeLeafClasses(rpmm)
table(rpmmClass,tissue)

## Not run: 
# Plot fit
par(mfrow=c(2,2))
plot(rpmm) ; title("Image of RPMM Profile")
plotTree.glcTree(rpmm) ; title("Dendrogram with Labels")
plotTree.glcTree(rpmm, 
  labelFunction=function(u,digits) table(as.character(tissue[u$index])))
title("Dendrogram with Tissue Counts")

# Alternate initialization
rpmm2 <- glcTree(IllumBeta, verbose=0, 
  initFunctions=list(glcInitializeSplitEigen(),
                     glcInitializeSplitFanny(nu=2.5)))
rpmm2

# Alternate split criterion
rpmm3 <- glcTree(IllumBeta, verbose=0, maxlevel=3,
  splitCriterion=glcSplitCriterionLevelWtdBIC)
rpmm3

rpmm4 <- glcTree(IllumBeta, verbose=0, maxlevel=3,
  splitCriterion=glcSplitCriterionJustRecordEverything)
rpmm4$rLL$splitInfo$llike1
rpmm4$rLL$splitInfo$llike2

## End(Not run)
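
The relation between a leaf weight matrix and hard class assignments can be mimicked in base R without the package. The sketch below uses a small made-up weight matrix and made-up labels (neither is RPMM output) and assigns each row to the leaf with the largest weight. Whether glcTreeLeafClasses applies exactly this rule is an assumption.

```r
# Made-up weights: one row per observation, one column per terminal node.
w <- matrix(c(0.0, 1.0, 0.0,
              0.2, 0.1, 0.7,
              0.9, 0.1, 0.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("rL", "rRL", "rRR")))

# Assign each row to the column holding its largest weight.
leaf <- factor(colnames(w)[max.col(w, ties.method = "first")],
               levels = colnames(w))

label <- c("blood", "lung", "lung")   # made-up labels for the three rows
table(leaf, label)                    # cross-tabulate classes against labels
```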

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(RPMM)
Loading required package: cluster
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/RPMM/glcTree.Rd_%03d_medium.png", width=480, height=480)
> ### Name: glcTree
> ### Title: Gaussian RPMM Tree
> ### Aliases: glcTree
> ### Keywords: tree cluster
> 
> ### ** Examples
> 
> 
> data(IlluminaMethylation)
> 
> ## Not run: 
> ##D heatmap(IllumBeta, scale="n",
> ##D   col=colorRampPalette(c("yellow","black","blue"),space="Lab")(128))
> ## End(Not run)
> 
> # Fit Gaussian RPMM
> rpmm <- glcTree(IllumBeta, verbose=0)
> rpmm
Recursively partitioned beta mixture model:	 73 nodes, 37 terminal nodes.
> 
> # Get weight matrix and show first few rows
> rpmmWeightMatrix <- glcTreeLeafMatrix(rpmm)
> rpmmWeightMatrix[1:3,]
     rLLLLLLL rLLLLLLR rLLLLLRLL rLLLLLRLR rLLLLLRR rLLLLRL rLLLLRR rLLLRLL
[1,]        0        1         0         0        0       0       0       0
[2,]        0        0         1         0        0       0       0       0
[3,]        0        0         0         0        0       0       0       0
     rLLLRLR rLLLRRL rLLLRRR rLLRLL rLLRLR rLLRRL rLLRRRL rLLRRRR rLRLLLL
[1,]       0       0       0      0      0      0       0       0       0
[2,]       0       0       0      0      0      0       0       0       0
[3,]       0       0       0      1      0      0       0       0       0
     rLRLLLR rLRLLRLL rLRLLRLR rLRLLRR rLRLRLL rLRLRLR rLRLRRLLL rLRLRRLLR
[1,]       0        0        0       0       0       0         0         0
[2,]       0        0        0       0       0       0         0         0
[3,]       0        0        0       0       0       0         0         0
     rLRLRRLR rLRLRRR rLRRL rLRRR rRL rRRLL rRRLRLL rRRLRLR rRRLRR rRRRLL
[1,]        0       0     0     0   0     0       0       0      0      0
[2,]        0       0     0     0   0     0       0       0      0      0
[3,]        0       0     0     0   0     0       0       0      0      0
     rRRRLR rRRRR
[1,]      0     0
[2,]      0     0
[3,]      0     0
> 
> # Get class assignments and compare with tissue
> rpmmClass <- glcTreeLeafClasses(rpmm)
> table(rpmmClass,tissue)
           tissue
rpmmClass   bladder blood brain cervical H & N kidney lung placenta pleura
  rLLLLLLL        0     0     0        0     0      0    2        0      0
  rLLLLLLR        0     0     0        0     0      0    4        0      0
  rLLLLLRLL       0     0     0        0     0      0    3        0      0
  rLLLLLRLR       0     0     0        0     0      0    2        0      0
  rLLLLLRR        0     0     0        0     0      0    2        0      0
  rLLLLRL         0     0     0        0     1      0    2        0      0
  rLLLLRR         0     0     0        0     0      0    4        0      0
  rLLLRLL         0     0     0        1     2      0    1        0      1
  rLLLRLR         3     0     0        0     0      0    0        0      0
  rLLLRRL         1     0     0        0     0      0    0        0      0
  rLLLRRR         0     0     0        0     0      0    0        0      3
  rLLRLL          0     0     0        0     0      0    8        0      4
  rLLRLR          0     0     0        0     0      0    2        0      5
  rLLRRL          0     0     0        0     0      0   15        0      2
  rLLRRRL         0     0     0        0     0      0    2        0      0
  rLLRRRR         0     0     0        0     0      0    4        0      0
  rLRLLLL         0     0     0        0     0      0    2        0      1
  rLRLLLR         0     0     1        0     0      0    0        0      2
  rLRLLRLL        0     0     0        0     3      0    0        0      0
  rLRLLRLR        0     0     0        0     2      0    0        0      0
  rLRLLRR         0     0     0        2     2      0    0        0      0
  rLRLRLL         1     0     1        0     1      1    0        0      0
  rLRLRLR         0     0     0        0     0      5    0        0      0
  rLRLRRLLL       0     0     3        0     0      0    0        0      0
  rLRLRRLLR       0     0     3        0     0      0    0        0      0
  rLRLRRLR        0     0     1        0     0      0    0        0      0
  rLRLRRR         0     0     3        0     0      0    0        0      0
  rLRRL           0     0     0        0     0      0    0        5      0
  rLRRR           0     0     0        0     0      0    0       14      0
  rRL             0    53     0        0     0      0    0        0      0
  rRRLL           0     2     0        0     0      0    0        0      0
  rRRLRLL         0     5     0        0     0      0    0        0      0
  rRRLRLR         0     2     0        0     0      0    0        0      0
  rRRLRR          0     4     0        0     0      0    0        0      0
  rRRRLL          0     3     0        0     0      0    0        0      0
  rRRRLR          0     3     0        0     0      0    0        0      0
  rRRRR           0    13     0        0     0      0    0        0      0
           tissue
rpmmClass   sm intestine
  rLLLLLLL             0
  rLLLLLLR             0
  rLLLLLRLL            0
  rLLLLLRLR            0
  rLLLLLRR             0
  rLLLLRL              0
  rLLLLRR              0
  rLLLRLL              0
  rLLLRLR              2
  rLLLRRL              2
  rLLLRRR              0
  rLLRLL               0
  rLLRLR               0
  rLLRRL               0
  rLLRRRL              0
  rLLRRRR              0
  rLRLLLL              1
  rLRLLLR              0
  rLRLLRLL             0
  rLRLLRLR             0
  rLRLLRR              0
  rLRLRLL              0
  rLRLRLR              0
  rLRLRRLLL            0
  rLRLRRLLR            0
  rLRLRRLR             0
  rLRLRRR              0
  rLRRL                0
  rLRRR                0
  rRL                  0
  rRRLL                0
  rRRLRLL              0
  rRRLRLR              0
  rRRLRR               0
  rRRRLL               0
  rRRRLR               0
  rRRRR                0
> 
> ## Not run: 
> ##D # Plot fit
> ##D par(mfrow=c(2,2))
> ##D plot(rpmm) ; title("Image of RPMM Profile")
> ##D plotTree.glcTree(rpmm) ; title("Dendrogram with Labels")
> ##D plotTree.glcTree(rpmm, 
> ##D   labelFunction=function(u,digits) table(as.character(tissue[u$index])))
> ##D title("Dendrogram with Tissue Counts")
> ##D 
> ##D # Alternate initialization
> ##D rpmm2 <- glcTree(IllumBeta, verbose=0, 
> ##D   initFunctions=list(glcInitializeSplitEigen(),
> ##D                      glcInitializeSplitFanny(nu=2.5)))
> ##D rpmm2
> ##D 
> ##D # Alternate split criterion
> ##D rpmm3 <- glcTree(IllumBeta, verbose=0, maxlev=3,
> ##D   splitCriterion=glcSplitCriterionLevelWtdBIC)
> ##D rpmm3
> ##D 
> ##D rpmm4 <- glcTree(IllumBeta, verbose=0, maxlev=3,
> ##D   splitCriterion=glcSplitCriterionJustRecordEverything)
> ##D rpmm4$rLL$splitInfo$llike1
> ##D rpmm4$rLL$splitInfo$llike2
> ## End(Not run)
> 
> dev.off()
null device 
          1 
>