Formatted TitanCNA results output from outputTitanResults. See Example.
centroid.method
Method used to compute cluster centroids during internal cluster validation: median or mean.
data.type
Compute the S_Dbw validity index based on copy number (use ‘LogRatio’), allelic ratio (use ‘AllelicRatio’), or both (use ‘Both’). See Details.
symmetric
TRUE if the TITAN analysis was carried out using symmetric genotypes. See loadAlleleCounts.
S_Dbw.method
Compute the S_Dbw validity index using the Halkidi or Tong method. See Details and References.
Details
The S_Dbw validity index is an internal clustering evaluation measure used for model selection (Halkidi et al. 2002). It favours the model that minimizes within-cluster variance (scat) and maximizes density-based cluster separation (Dens). The index is the sum of the two terms: S_Dbw(|c_T| × z) = Dens(|c_T| × z) + scat(|c_T| × z).
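For reference, the two terms are commonly written as follows in Halkidi's formulation. The notation is not taken from the TitanCNA documentation and is only a sketch: c is the number of clusters, v_i the centroid of cluster i, u_ij the midpoint between centroids v_i and v_j, σ(·) a variance vector, S the full data set, and density(·) the number of points within a neighbourhood of the given location.

```latex
\begin{align}
\mathrm{scat}(c) &= \frac{1}{c}\sum_{i=1}^{c}
    \frac{\lVert\sigma(v_i)\rVert}{\lVert\sigma(S)\rVert},\\
\mathrm{Dens}(c) &= \frac{1}{c(c-1)}\sum_{i=1}^{c}
    \sum_{\substack{j=1\\ j\neq i}}^{c}
    \frac{\mathrm{density}(u_{ij})}
         {\max\{\mathrm{density}(v_i),\,\mathrm{density}(v_j)\}},\\
\mathrm{S\_Dbw}(c) &= \mathrm{Dens}(c) + \mathrm{scat}(c).
\end{align}
```

Both terms decrease as clusters become more compact and better separated, which is why a smaller index indicates a better clustering.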
In the context of TitanCNA, if data.type=‘LogRatio’, then the S_Dbw internal data consist of copy number log ratios, and the joint states of copy number (c_T, for all c_T in {0, ..., max.copy.number}) and clonal cluster (z) make up the clusters in the internal evaluation. If data.type=‘AllelicRatio’, then the S_Dbw internal data consist of the allelic ratios. If data.type=‘Both’, then the S_Dbw values for ‘LogRatio’ and ‘AllelicRatio’ are added together, which accounts for both data types when choosing the optimal solution. The optimal TitanCNA run is the run with the minimum S_Dbw.
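The selection rule can be sketched in a few lines of R. This is only an illustration: the three runs, their names, and all numbers are made up, and make_result is a hypothetical helper that mimics the list returned by computeSDbwIndex (components dens.bw, scat and S_DbwIndex) so that the snippet runs without TitanCNA.

```r
# Hypothetical helper: builds a stand-in for a computeSDbwIndex result.
make_result <- function(dens.bw, scat) {
  list(dens.bw = dens.bw, scat = scat, S_DbwIndex = dens.bw + scat)
}

# Made-up results for three runs, one list per data type.
logratio <- list(
  run1 = make_result(0.30, 0.12),
  run2 = make_result(0.15, 0.10),
  run3 = make_result(0.20, 0.08)
)
allelic <- list(
  run1 = make_result(0.25, 0.06),
  run2 = make_result(0.22, 0.08),
  run3 = make_result(0.35, 0.10)
)

# Adding the two indices per run mirrors what the text describes for 'Both'.
both <- sapply(logratio, `[[`, "S_DbwIndex") +
        sapply(allelic,  `[[`, "S_DbwIndex")

# The optimal run is the one with the minimum index.
best_run <- names(which.min(both))
print(both)
print(best_run)
```

In practice the per-run values would come from calling computeSDbwIndex on the formatted results of each TitanCNA run rather than from hand-written numbers.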
Note that for S_Dbw.method, the Tong method as published has an incorrect formulation of the scat(c) function: the function should be a weighted sum, but that is not the formulation shown in the publication. computeSDbwIndex uses the weight n_i/N instead of (N - n_i)/N in the scat formula, where n_i is the number of data points in cluster i and N is the total number of data points.
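The difference between the two weightings can be checked on toy data. The sketch below is not the internal code of computeSDbwIndex: the data are simulated, the variable names are hypothetical, and the one-dimensional scat shown here (sample variances, no additional normalization) is an assumption made only to illustrate the weights.

```r
# Toy 1-D data: three clusters of unequal size (simulated, not TitanCNA output).
set.seed(1)
x <- c(rnorm(50, mean = -1, sd = 0.1),
       rnorm(30, mean =  0, sd = 0.2),
       rnorm(20, mean =  1, sd = 0.3))
cluster <- rep(1:3, times = c(50, 30, 20))

ni    <- as.numeric(table(cluster))            # points per cluster
N     <- length(x)                             # total number of points
var_i <- as.numeric(tapply(x, cluster, var))   # within-cluster variances
var_S <- var(x)                                # variance of the whole data set

w_used <- ni / N         # weights described above as used by computeSDbwIndex
w_pub  <- (N - ni) / N   # weights described above as printed in the publication

# ni/N sums to 1, so it defines a proper weighted average;
# (N - ni)/N sums to (number of clusters - 1) instead.
print(sum(w_used))
print(sum(w_pub))

# Weighted scatter: size-weighted within-cluster variance relative to the total.
scat_weighted <- sum(w_used * var_i) / var_S
print(scat_weighted)
```

With n_i/N, larger clusters contribute more to the scatter term, whereas (N - n_i)/N would give the most weight to the smallest clusters.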
Value
list with components:
dens.bw
density component of S_Dbw index
scat
scatter component of S_Dbw index
S_DbwIndex
Sum of dens.bw and scat.
Author(s)
Gavin Ha <gavinha@gmail.com>
References
Halkidi, M., Batistakis, Y., and Vazirgiannis, M. (2002). Clustering validity checking methods: part II. SIGMOD Rec., 31(3):19–27.
Tong, J. and Tan, H. (2009). Clustering validity based on the improved S_Dbw* index. Journal of Electronics (China), 26(2):258–264.
Ha, G., Roth, A., Khattra, J., Ho, J., Yap, D., Prentice, L. M., Melnyk, N., McPherson, A., Bashashati, A., Laks, E., Biele, J., Ding, J., Le, A., Rosner, J., Shumansky, K., Marra, M. A., Huntsman, D. G., McAlpine, J. N., Aparicio, S. A. J. R., and Shah, S. P. (2014). TITAN: Inference of copy number architectures in clonal cell populations from tumour whole genome sequence data. Genome Research, 24: 1881-1893. (PMID: 25060187)
See Also
outputModelParameters, loadAlleleCounts
Examples
data(EMresults)
#### COMPUTE OPTIMAL STATE PATH USING VITERBI ####
#options(cores=1)
optimalPath <- viterbiClonalCN(data, convergeParams)
#### FORMAT RESULTS ####
results <- outputTitanResults(data, convergeParams, optimalPath,
filename = NULL, posteriorProbs = FALSE)
#### COMPUTE S_Dbw Validity Index FOR MODEL SELECTION ####
s_dbw <- computeSDbwIndex(results, data.type = "LogRatio",
centroid.method = "median", S_Dbw.method = "Tong")
Results
> library(TitanCNA)
Loading required package: foreach
Loading required package: IRanges
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Loading required package: Rsamtools
Loading required package: Biostrings
Loading required package: XVector
> ### Name: computeSDbwIndex
> ### Title: Compute the S_Dbw Validity Index for 'TitanCNA' model selection
> ### Aliases: computeSDbwIndex
> ### Keywords: manip
>
> ### ** Examples
>
> data(EMresults)
>
> #### COMPUTE OPTIMAL STATE PATH USING VITERBI ####
> #options(cores=1)
> optimalPath <- viterbiClonalCN(data, convergeParams)
Warning message:
executing %dopar% sequentially: no parallel backend registered
>
> #### FORMAT RESULTS ####
> results <- outputTitanResults(data, convergeParams, optimalPath,
+ filename = NULL, posteriorProbs = FALSE)
>
> #### COMPUTE S_Dbw Validity Index FOR MODEL SELECTION ####
> s_dbw <- computeSDbwIndex(results, data.type = "LogRatio",
+ centroid.method = "median", S_Dbw.method = "Tong")
>