gowdis measures the Gower (1971) dissimilarity for mixed variables, including asymmetric binary variables. Variable weights can be specified. gowdis implements Podani's (1999) extension to ordinal variables.
Usage
gowdis(x, w, asym.bin = NULL, ord = c("podani", "metric", "classic"))
Arguments
x
matrix or data frame containing the variables. Variables can be numeric, ordered, or factor. Symmetric or asymmetric binary variables should be numeric and contain only 0 and 1. Character variables are converted to factors. NAs are tolerated.
w
vector listing the weights for the variables in x. Can be missing, in which case all variables have equal weights.
asym.bin
vector listing the asymmetric binary variables in x.
ord
character string specifying the method to be used for ordinal variables (i.e. ordered). "podani" refers to Eqs. 2a-b of Podani (1999), while "metric" refers to his Eq. 3 (see ‘details’); both options convert ordinal variables to ranks. "classic" simply treats ordinal variables as continuous variables. Can be abbreviated.
Details
gowdis computes the Gower (1971) similarity coefficient exactly as described by Podani (1999), then converts it to a dissimilarity coefficient by using D = 1 - S. It integrates variable weights as described by Legendre and Legendre (1998).
Let X = {Xij} be a matrix containing n rows (objects) and m columns (variables), where Xij is the value of variable i for object j. The similarity Gjk between objects j and k is computed as
Gjk = sum(Wijk * Sijk) / sum(Wijk),
where the sums run over the m variables, Wijk is the weight of variable i for the j-k pair, and Sijk is the partial similarity of variable i for the j-k pair. Wijk = 0 if objects j and k cannot be compared because Xij or Xik is unknown (i.e. NA).
For binary variables, Sijk = 0 if Xij is not equal to Xik, and Sijk = 1 if Xij = Xik = 1 or if Xij = Xik = 0.
For asymmetric binary variables, same as above except that Wijk = 0 if Xij = Xik = 0.
For nominal variables, Sijk = 0 if Xij is not equal to Xik and Sijk = 1 if Xij = Xik.
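The binary rules above can be sketched in base R as follows. The helper binary_part is hypothetical and is not part of FD; it only illustrates how the partial similarity Sijk and the weight Wijk behave for a single binary variable, with the variable's own weight assumed to be 1.

```r
# Hypothetical helper (not part of FD): partial similarity s and weight w
# for one binary variable and one pair of objects.
binary_part <- function(xj, xk, asym = FALSE) {
  # Pairs with a missing value cannot be compared, so the weight is 0
  if (is.na(xj) || is.na(xk)) return(c(s = NA_real_, w = 0))
  s <- as.numeric(xj == xk)                      # 1 on a match, 0 otherwise
  # Asymmetric case: a joint absence (0, 0) is dropped from the comparison
  w <- if (asym && xj == 0 && xk == 0) 0 else 1
  c(s = s, w = w)
}

binary_part(0, 0)               # symmetric: joint absence counts as a match
binary_part(0, 0, asym = TRUE)  # asymmetric: joint absence gets weight 0
binary_part(1, 0, asym = TRUE)  # mismatch: s = 0, still counted (w = 1)
```

For a nominal variable the same equality test applies, with the levels compared instead of 0/1 values.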
For continuous variables,
Sijk = 1 - [ |Xij - Xik| / (Xi.max - Xi.min) ]
where Xi.max and Xi.min are the maximum and minimum values of variable i, respectively.
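As a rough illustration of the continuous case combined with the weighted average Gjk, the following base R sketch computes the dissimilarity for a single pair of objects. The helper gower_pair and the toy matrix are hypothetical and are not part of FD; gowdis itself may differ in implementation details.

```r
# Hypothetical helper (not part of FD): Gower dissimilarity between objects
# j and k, assuming every column of X is a continuous variable.
gower_pair <- function(X, j, k, w = rep(1, ncol(X))) {
  # Range of each variable (Xi.max - Xi.min), ignoring missing values
  rng <- apply(X, 2, function(v) diff(range(v, na.rm = TRUE)))
  s <- 1 - abs(X[j, ] - X[k, ]) / rng    # partial similarities Sijk
  ok <- !is.na(s)                        # Wijk = 0 where Xij or Xik is NA
  G <- sum(w[ok] * s[ok]) / sum(w[ok])   # weighted similarity Gjk
  1 - G                                  # dissimilarity D = 1 - S
}

# Toy data: rows are objects, columns are variables
X <- rbind(a = c(1, 10, NA),
           b = c(3, 20, 5),
           c = c(5, 40, 9))

gower_pair(X, "a", "b")                   # 0.4166667 (third variable skipped)
gower_pair(X, "a", "b", w = c(2, 1, 1))   # 0.4444444
```

The third variable drops out of the a-b comparison because its value is missing for object a, so only the first two variables contribute to both sums.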
For ordinal variables, when ord = "podani" or ord = "metric", all Xij are replaced by their ranks Rij determined over all objects (such that ties are also considered), and then
if ord = "podani"
Sijk = 1 if Rij = Rik, otherwise
Sijk = 1 - [ |Rij - Rik| - (Tij - 1)/2 - (Tik - 1)/2 ] / [ Ri.max - Ri.min - (Ti.max - 1)/2 - (Ti.min - 1)/2 ]
where Tij is the number of objects which have the same rank score for variable i as object j (including j itself), Tik is the number of objects which have the same rank score for variable i as object k (including k itself), Ri.max and Ri.min are the maximum and minimum ranks for variable i, respectively, Ti.max is the number of objects with the maximum rank, and Ti.min is the number of objects with the minimum rank.
if ord = "metric"
Sijk = 1 - [ |Rij - Rik| / (Ri.max - Ri.min) ]
When ord = "classic", ordinal variables are simply treated as continuous variables.
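The two rank-based options can be sketched in base R for a single ordinal variable. This is a minimal sketch, not the FD implementation: the helper ordinal_sim is hypothetical, it assumes no missing values and at least two distinct levels, it assumes tied objects receive their average rank, and it assumes that Eqs. 2a-b of Podani (1999) take the form Sijk = 1 for equal ranks and otherwise Sijk = 1 - [ |Rij - Rik| - (Tij - 1)/2 - (Tik - 1)/2 ] / [ Ri.max - Ri.min - (Ti.max - 1)/2 - (Ti.min - 1)/2 ].

```r
# Hypothetical helper (not part of FD): matrix of partial similarities for
# one ordinal variable, assuming no NAs and at least two distinct levels.
ordinal_sim <- function(x, ord = c("podani", "metric")) {
  ord <- match.arg(ord)
  r <- rank(as.numeric(x))            # ranks Rij; ties get the average rank
  d <- abs(outer(r, r, "-"))          # |Rij - Rik| for every pair
  if (ord == "metric") {
    return(1 - d / (max(r) - min(r)))
  }
  ties <- sapply(r, function(v) sum(r == v))   # Tij: objects sharing j's rank
  t_max <- sum(r == max(r))                    # Ti.max
  t_min <- sum(r == min(r))                    # Ti.min
  num <- d - outer((ties - 1) / 2, (ties - 1) / 2, "+")
  den <- max(r) - min(r) - (t_max - 1) / 2 - (t_min - 1) / 2
  S <- 1 - num / den
  S[d == 0] <- 1                      # equal ranks are fully similar
  S
}

# Toy ordinal scores for four objects; the first two are tied
x <- c(1, 1, 2, 3)
ordinal_sim(x, "podani")
ordinal_sim(x, "metric")
```

With these scores the ranks are 1.5, 1.5, 3 and 4. Under "podani" the similarity between objects 1 and 3 is 0.5, whereas under "metric" it is 0.4, which shows how the tie correction changes the result.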
Value
an object of class dist with the following attributes: Labels, Types (the variable types, where 'C' is continuous/numeric, 'O' is ordinal, 'B' is symmetric binary, 'A' is asymmetric binary, and 'N' is nominal), Size, Metric.
References
Gower, J. C. (1971) A general coefficient of similarity and some of its properties. Biometrics 27:857-871.
Legendre, P. and L. Legendre (1998) Numerical Ecology. 2nd English edition. Amsterdam: Elsevier.
Podani, J. (1999) Extending Gower's general coefficient of similarity to ordinal characters. Taxon 48:331-340.
See Also
daisy is similar but less flexible, since it does not include variable weights and does not treat ordinal variables as described by Podani (1999). Using ord = "classic" reproduces the behaviour of daisy.
Examples
# load FD, which provides gowdis and the dummy and tussock data sets
library(FD)
ex1 <- gowdis(dummy$trait)
ex1
# check attributes
attributes(ex1)
# to include weights
w <- c(4,3,5,1,2,8,3,6)
ex2 <- gowdis(dummy$trait, w)
ex2
# variable 7 as asymmetric binary
ex3 <- gowdis(dummy$trait, asym.bin = 7)
ex3
# example with trait data from New Zealand vascular plant species
ex4 <- gowdis(tussock$trait)
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)
> library(FD)
Loading required package: ade4
Loading required package: ape
Loading required package: geometry
Loading required package: magic
Loading required package: abind
Loading required package: vegan
Loading required package: permute
Loading required package: lattice
This is vegan 2.4-0
Attaching package: 'vegan'
The following object is masked from 'package:ade4':
cca
> ### Name: gowdis
> ### Title: Gower Dissimilarity
> ### Aliases: gowdis
> ### Keywords: multivariate
>
> ### ** Examples
>
> ex1 <- gowdis(dummy$trait)
> ex1
          sp1       sp2       sp3       sp4       sp5       sp6       sp7
sp2 0.2181884
sp3 0.5240052 0.6678082
sp4 0.6737443 0.5610028 0.8225701
sp5 0.5291113 0.8145699 0.4862253 0.4843264
sp6 0.6100161 0.5932587 0.2784736 0.7073925 0.6067323
sp7 0.4484235 0.6863374 0.4848663 0.5575126 0.3023416 0.6187844
sp8 0.4072834 0.2039443 0.5958904 0.2390962 0.5585525 0.4470207 0.7030186
>
> # check attributes
> attributes(ex1)
$Labels
[1] "sp1" "sp2" "sp3" "sp4" "sp5" "sp6" "sp7" "sp8"
$Size
[1] 8
$Metric
[1] "Gower"
$Types
[1] "C" "C" "N" "N" "O" "O" "B" "B"
$class
[1] "dist"
>
> # to include weights
> w <- c(4,3,5,1,2,8,3,6)
> ex2 <- gowdis(dummy$trait, w)
> ex2
          sp1       sp2       sp3       sp4       sp5       sp6       sp7
sp2 0.1190154
sp3 0.4156230 0.5584826
sp4 0.7157249 0.7541962 0.7478800
sp5 0.6538987 0.8231658 0.3994155 0.3880160
sp6 0.5074762 0.4422926 0.2767123 0.7753720 0.6317876
sp7 0.5006495 0.6116622 0.4934116 0.4642192 0.3773199 0.5997205
sp8 0.2813567 0.2730468 0.3359142 0.3658275 0.4977458 0.3645410 0.6129055
>
> # variable 7 as asymmetric binary
> ex3 <- gowdis(dummy$trait, asym.bin = 7)
> ex3
          sp1       sp2       sp3       sp4       sp5       sp6       sp7
sp2 0.2545531
sp3 0.5240052 0.6678082
sp4 0.7699935 0.6545032 0.8225701
sp5 0.5291113 0.8145699 0.4862253 0.4843264
sp6 0.6100161 0.5932587 0.2784736 0.7073925 0.6067323
sp7 0.4484235 0.6863374 0.4848663 0.5575126 0.3023416 0.6187844
sp8 0.4751640 0.2447331 0.5958904 0.2789456 0.5585525 0.4470207 0.7030186
>
> # example with trait data from New Zealand vascular plant species
> ex4 <- gowdis(tussock$trait)