The matrix of intensity measurements. The
row names must match the individual IDs in the
fam file.
N
Number of clusters to fit to the data.
N must be larger than 1; a value of 1 returns an
error. If N is missing, the default values 2, 3, ..., 6
are used.
varSelection
Factor. Specifies how to handle
the intensity values. It must be one of 'RAW',
'PC.9', 'PC1' and 'MEAN'. If the value is 'RAW', the
raw intensity values are used. If it is 'PC.9', the
leading PCA scores that together account for 90% of
the total variance are used. If the value is 'PC1',
only the first PCA score is used. If the value is
'MEAN', the mean over all probes is used. The
default method is 'PC1'.
threshold
Optional. The convergence
threshold. Iteration stops when the absolute difference
in log-likelihood between successive iterations is less
than this value. If missing, the default threshold 1e-05
is used.
itermax
Optional. Iteration stops when the number
of iterations exceeds this value. If missing, the default
of 8 is used.
adjust
Logical. If TRUE (default), the result
is adjusted by the silhouette score. See Details.
thresMAF
The minor allele frequency threshold.
thresSil
The discard threshold. Individuals
whose silhouette score is smaller than this value are
discarded.
scale
Logical. If TRUE, the signal is scaled
column-wise using the sample mean and sample variance
before further processing.
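The effect of scale and of the four varSelection choices can be sketched in base R. This is an illustration under stated assumptions, not PedCNV's internal code: the matrix signal_demo and the names x_raw, x_pc9, x_pc1 and x_mean are hypothetical, and it is assumed that scaling means standardizing each column and that the PCA is run on the scaled matrix.

```r
# Illustrative sketch only; PedCNV may implement these steps differently.
set.seed(1)

# Hypothetical intensity matrix: 20 individuals (rows) x 6 probes (columns)
signal_demo <- matrix(rnorm(120), nrow = 20,
                      dimnames = list(paste0("ID", 1:20),
                                      paste0("probe", 1:6)))

# scale = TRUE (assumed meaning): centre each column on its sample mean
# and divide by its sample standard deviation
scaled <- scale(signal_demo, center = TRUE, scale = TRUE)

# Principal components of the scaled matrix
pca <- prcomp(scaled)
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# 'RAW': use the intensity values directly
x_raw <- scaled

# 'PC.9': smallest number of leading components whose cumulative
# share of the variance reaches 90%
k <- which(cum_var >= 0.9)[1]
x_pc9 <- pca$x[, seq_len(k), drop = FALSE]

# 'PC1': the first principal component score only
x_pc1 <- pca$x[, 1, drop = FALSE]

# 'MEAN': the mean over all probes for each individual
x_mean <- rowMeans(scaled)
```

Here k plays the role of the count reported in messages such as "The first 5 principal components are used."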
Details
adjust
If adjust is TRUE, the result
is adjusted by the silhouette score according to the
following criterion. For each individual, the silhouette
score is calculated for each group. The individual is then
reassigned to the group that maximizes the silhouette
score.
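The adjustment criterion can be sketched as follows. This is a minimal base R illustration, not the package's implementation: the helper reassign_by_silhouette is hypothetical, Euclidean distance is an assumption, and the sketch assumes at least two groups and that no individual ends up with every candidate group empty.

```r
# Illustrative sketch only; reassign_by_silhouette is a hypothetical helper.
reassign_by_silhouette <- function(x, cluster) {
  d <- as.matrix(dist(x))          # pairwise Euclidean distances (assumed)
  groups <- sort(unique(cluster))
  n <- nrow(d)

  # mean_dist[i, g]: mean distance from individual i to the members of
  # group g, leaving out i itself
  mean_dist <- sapply(groups, function(g) {
    sapply(seq_len(n), function(i) {
      members <- setdiff(which(cluster == g), i)
      if (length(members) == 0) NA_real_ else mean(d[i, members])
    })
  })

  # Silhouette of i as if it belonged to group g:
  # a = mean distance to g, b = smallest mean distance to any other group
  sil <- sapply(seq_along(groups), function(j) {
    a <- mean_dist[, j]
    b <- apply(mean_dist[, -j, drop = FALSE], 1, min, na.rm = TRUE)
    (b - a) / pmax(a, b)
  })

  # Assign each individual to the group with the largest silhouette score
  groups[apply(sil, 1, which.max)]
}

# Usage: two well-separated groups with one deliberately mislabelled individual
set.seed(2)
x <- rbind(matrix(rnorm(20, mean = 0), ncol = 2),
           matrix(rnorm(20, mean = 5), ncol = 2))
init <- rep(1:2, each = 10)
init[1] <- 2
adjusted <- reassign_by_silhouette(x, init)
```

In this toy data the mislabelled individual sits among the first group, so the silhouette criterion moves it back there while leaving the other assignments unchanged.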
Value
Returns an object of class 'clust'. 'clust' is a list
containing the following components:
clusNum
The optimal number of clusters among the values given in N.
silWidth
Silhouette related results.
Author(s)
Meiling Liu
Examples
# Fit the data under the given clustering numbers
clus.fit <- ClusProc(signal=signal,N=2:6,varSelection='PC.9')
Results
> library(PedCNV)
Loading required package: Rcpp
Loading required package: RcppArmadillo
Loading required package: ggplot2
> ### Name: ClusProc
> ### Title: CNV clustering Procedure
> ### Aliases: ClusProc
>
> ### ** Examples
>
> # Fit the data under the given clustering numbers
> clus.fit <- ClusProc(signal=signal,N=2:6,varSelection='PC.9')
The first 5 principal components are used.
The logliklihood for signal model is -1663.629 when clustering number is 2.
The logliklihood for signal model is -1477.954 when clustering number is 3.
The logliklihood for signal model is -1394.682 when clustering number is 4.
The logliklihood for signal model is -1338.013 when clustering number is 5.
The logliklihood for signal model is -1283.297 when clustering number is 6.