R: Calculate Dissimilarity Matrix for Mixed Attributes.
calcDissimMat
R Documentation
Calculate Dissimilarity Matrix for Mixed Attributes.
Description
Takes two data frames: the first contains only qualitative attributes and the second
contains only quantitative attributes. The function calculates the dissimilarity matrix
using the method proposed by Ahmad & Dey (2007).
Usage
calcDissimMat(myDataQuali, myDataQuant)
Arguments
myDataQuali
A data frame which includes only qualitative variables in columns.
myDataQuant
A data frame which includes only quantitative variables in columns.
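When the data start out as a single mixed data frame, the two arguments can be prepared by splitting the columns by type. The following is a minimal sketch in base R; the data frame mixedData and its column names are hypothetical and not part of the package, and the call to calcDissimMat is left commented out so the snippet runs without the package installed.

```r
# Hypothetical mixed data frame (illustrative only, not from the package).
# stringsAsFactors = TRUE is set explicitly: it was the default in R 3.x
# but is no longer the default from R 4.0.0 onwards.
mixedData <- data.frame(
  colour = c("red", "blue", "red", "green"),
  size   = c(1.2, 3.4, 2.2, 5.0),
  shape  = c("round", "square", "round", "round"),
  weight = c(10, 12, 9, 15),
  stringsAsFactors = TRUE
)

# Logical vector marking which columns are numeric.
isQuant <- vapply(mixedData, is.numeric, logical(1))

# drop = FALSE keeps a data frame even when only one column is selected.
myDataQuali <- mixedData[, !isQuant, drop = FALSE]
myDataQuant <- mixedData[,  isQuant, drop = FALSE]

# With DisimForMixed installed, the two pieces can then be passed on:
# dissim <- DisimForMixed::calcDissimMat(myDataQuali, myDataQuant)
```

Both pieces keep the original row order, so row i of myDataQuali and row i of myDataQuant describe the same object.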
Details
calcDissimMat is an implementation of the method proposed by Ahmad & Dey (2007)
for calculating a dissimilarity matrix in the presence of both qualitative and quantitative
attributes. The approach computes the dissimilarity of the qualitative and the quantitative
attributes separately, then combines the two to form the final dissimilarity matrix. See
Ahmad & Dey (2007) for more details.
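The "compute separately, then combine" structure can be illustrated with a deliberately simplified stand-in. The sketch below is not the Ahmad & Dey (2007) computation and not the package's implementation: it assumes simple matching for the qualitative part and squared Euclidean distance on range-scaled columns for the quantitative part, purely to show how two partial dissimilarities add up to one matrix. All object names are hypothetical.

```r
# Illustrative stand-in, NOT the Ahmad & Dey (2007) computation:
# qualitative part   = number of mismatching attributes (simple matching),
# quantitative part  = squared Euclidean distance on columns scaled to [0, 1].
quali <- data.frame(q1 = c("A", "B", "A"), q2 = c("Q", "Q", "R"),
                    stringsAsFactors = TRUE)
quant <- data.frame(n1 = c(1.5, 3.2, 4.9), n2 = c(4.8, 2.0, 1.1))

n <- nrow(quali)

# Qualitative part: count the attributes on which two rows disagree.
qualiChar <- as.matrix(quali)  # character matrix, one row per object
dQuali <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    dQuali[i, j] <- sum(qualiChar[i, ] != qualiChar[j, ])
  }
}

# Quantitative part: scale each column to [0, 1], then take squared distances.
rangeScale <- function(x) (x - min(x)) / (max(x) - min(x))
quantScaled <- as.matrix(as.data.frame(lapply(quant, rangeScale)))
dQuant <- as.matrix(dist(quantScaled))^2

# Combine the two parts into a single dissimilarity matrix.
dCombined <- dQuali + dQuant
```

The result is a square, symmetric matrix with a zero diagonal, which is the shape of input that clustering functions taking diss = TRUE expect.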
Value
A dissimilarity matrix. It can be passed as input to the pam, fanny, agnes and diana functions of the cluster package.
References
Ahmad, A., & Dey, L. (2007). A k-mean clustering algorithm for mixed numeric and categorical data. Data & Knowledge Engineering, 63(2), 503-527.
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
> library(DisimForMixed)
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/DisimForMixed/calcDissimMat.Rd_%03d_medium.png", width=480, height=480)
> ### Name: calcDissimMat
> ### Title: Calculate Dissimilarity Matrix for Mixed Attributes.
> ### Aliases: calcDissimMat
>
> ### ** Examples
>
> QualiVars <- data.frame(Qlvar1 = c("A","B","A","C","C","A"), Qlvar2 = c("Q","Q","R","Q","R","Q"))
> QuantVars <- data.frame(Qnvar1 = c(1.5,3.2,4.9,5,2.8,3.1), Qnvar2 = c(4.8,2,1.1,5.8,3.1,2.2))
> DisSimMatCalcd <- calcDissimMat(QualiVars, QuantVars)
>
> agnesClustering <- cluster::agnes(DisSimMatCalcd, diss = TRUE, method = "ward")
> silWidths <- cluster::silhouette(cutree(agnesClustering, k = 2), DisSimMatCalcd)
> mean(silWidths[,3])
[1] 0.641198
> plot(agnesClustering)
>
> PAMClustering <- cluster::pam(DisSimMatCalcd, k=2, diss = TRUE)
> silWidths <- cluster::silhouette(PAMClustering, DisSimMatCalcd)
> plot(silWidths)
>
> dev.off()
null device
1
>