R: Transform data into a set of linguistic fuzzy attributes
lcut
R Documentation
Transform data into a set of linguistic fuzzy attributes
Description
This function creates a set of linguistic fuzzy attributes from crisp data. Numeric vectors,
matrix or data frame columns are transformed into a set of fuzzy attributes, i.e. columns with
membership degrees. Factors and other data types are transformed to fuzzy attributes by calling
the fcut function.
Usage
lcut3(x, ...)
## S3 method for class 'matrix'
lcut3(x, ...)
## S3 method for class 'data.frame'
lcut3(x,
      context=NULL,
      name=NULL,
      parallel=FALSE,
      ...)
## S3 method for class 'numeric'
lcut3(x,
      context=NULL,
      defaultCenter=0.5,
      atomic=c("sm", "me", "bi"),
      hedges=c("ex", "si", "ve", "ml", "ro", "qr", "vr"),
      name=NULL,
      parallel=FALSE,
      ...)
lcut5(x, ...)
## S3 method for class 'matrix'
lcut5(x, ...)
## S3 method for class 'data.frame'
lcut5(x,
      context=NULL,
      name=NULL,
      parallel=FALSE,
      ...)
## S3 method for class 'numeric'
lcut5(x,
      context=NULL,
      defaultCenter=0.5,
      atomic=c('sm', 'lm', 'me', 'um', 'bi'),
      hedges=c("ex", "ve", "ml", "ro", "ty"),
      name=NULL,
      parallel=FALSE,
      ...)
Arguments
x
Data to be transformed: if it is a numeric vector, matrix, or data frame, then the
creation of linguistic fuzzy attributes takes place. For other data types the
fcut function is called.
context
A definition of context of a numeric attribute. Context determines how people
understand the notions "small", "medium", or "big" with respect to that attribute.
If x is a numeric vector then context should be a vector of 3 numbers:
typical small, medium, and big value. If the context is set to NULL, these values
are taken directly from x as follows:
small = min(x);
medium = (max(x) - min(x)) * defaultCenter + min(x);
big = max(x).
If x is a matrix or data frame then context should be a named list
of contexts for each x's column. If some context is omitted, it will be determined
directly from data as explained above.
Regardless of the value of the atomic argument, all 3 numbers of the context
must always be provided.
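The fallback rule for per-column contexts can be sketched in base R. This is only an illustration, not the package's implementation: default_context and resolve_contexts are hypothetical helpers, and the sketch assumes that the typical small and big values default to min(x) and max(x).

```r
# Hypothetical helper: derive a (small, medium, big) context from data.
# Assumption: small = min(x) and big = max(x); medium follows the
# defaultCenter formula given in this documentation.
default_context <- function(x, defaultCenter = 0.5) {
  c(min(x), (max(x) - min(x)) * defaultCenter + min(x), max(x))
}

# Hypothetical helper: use the user-supplied context where one is given,
# otherwise fall back to the context derived from the column itself.
resolve_contexts <- function(data, context = NULL, defaultCenter = 0.5) {
  result <- lapply(names(data), function(col) {
    ctx <- context[[col]]
    if (is.null(ctx)) {
      ctx <- default_context(data[[col]], defaultCenter)
    }
    # all 3 numbers of a context are always required
    stopifnot(is.numeric(ctx), length(ctx) == 3)
    ctx
  })
  names(result) <- names(data)
  result
}

data <- data.frame(conc = c(100, 400, 900), uptake = c(10, 20, 40))
# "conc" gets an explicit context, "uptake" is derived from the data
contexts <- resolve_contexts(data, context = list(conc = c(0, 500, 1000)))
print(contexts)
```

In this sketch the omitted uptake context is filled in from the column's own range, while the conc context is taken as supplied.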
defaultCenter
A value used to determine a typical "medium" value from data (see context
above). If context is not specified then typical "medium" is determined as
(max(x) - min(x)) * defaultCenter + min(x).
Default value of defaultCenter is 0.5, however, some literature specifies
0.42 as another sensible value with proper linguistic interpretation.
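As a small worked example of the formula above, the following base R snippet compares the two values of defaultCenter on data ranging from 0 to 100. The helper name typical_medium is hypothetical and exists only for this illustration.

```r
# Hypothetical helper implementing the formula from the text:
# (max(x) - min(x)) * defaultCenter + min(x)
typical_medium <- function(x, defaultCenter = 0.5) {
  (max(x) - min(x)) * defaultCenter + min(x)
}

x <- c(0, 30, 100)
m_default <- typical_medium(x)        # midpoint of the range: 50
m_shifted <- typical_medium(x, 0.42)  # shifted towards the minimum: 42
print(c(m_default, m_shifted))
```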
atomic
A vector of atomic linguistic expressions to be used for creation of fuzzy attributes. The
possible values for lcut3 are:
"sm": small;
"me": medium;
"bi": big.
For lcut5, the following values are possible:
"sm": small;
"lm": lower medium;
"me": medium;
"um": upper medium;
"bi": big.
Several values are allowed in this argument.
hedges
A vector of linguistic hedges to be used for creation of fuzzy attributes.
For lcut3 variant, the following hedges are allowed:
"ex": extremely (sm, bi);
"si": significantly (sm, bi);
"ve": very (sm, bi);
"ml": more or less (sm, me, bi);
"ro": roughly (sm, me, bi);
"qr": quite roughly (sm, me, bi);
"vr": very roughly (sm, me, bi).
For lcut5 variant, the following hedges are allowed:
"ex": extremely (sm, bi);
"ve": very (sm, bi);
"ml": more or less (sm, me, bi);
"ro": roughly (sm, me, bi);
"ty": typically (me).
By default, a fuzzy attribute is created for each atomic expression (i.e. "small", "medium",
"big") with an empty hedge. Further fuzzy attributes are created for the hedges selected with
this argument. Not every hedge can be combined with every atomic expression: the lists above
give the allowed atomic expressions in parentheses.
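To make the combination rule concrete, the following base R sketch enumerates the hedge and atomic expression pairs that would be produced for the lcut3 variant. The table allowed3 is transcribed from the list above. The helper expand_expressions is hypothetical, and the sketch says nothing about how the package names or computes the resulting columns.

```r
# Allowed atomic expressions per hedge for the lcut3 variant,
# transcribed from the documentation above.
allowed3 <- list(
  ex = c("sm", "bi"),
  si = c("sm", "bi"),
  ve = c("sm", "bi"),
  ml = c("sm", "me", "bi"),
  ro = c("sm", "me", "bi"),
  qr = c("sm", "me", "bi"),
  vr = c("sm", "me", "bi")
)

# Hypothetical helper: list all (hedge, atomic) pairs to be created.
expand_expressions <- function(atomic, hedges, allowed) {
  # the empty hedge is always produced for every atomic expression
  out <- data.frame(hedge = rep("", length(atomic)), atomic = atomic,
                    stringsAsFactors = FALSE)
  for (h in hedges) {
    # keep only the atomic expressions this hedge may be combined with
    usable <- intersect(atomic, allowed[[h]])
    if (length(usable) > 0) {
      out <- rbind(out, data.frame(hedge = h, atomic = usable,
                                   stringsAsFactors = FALSE))
    }
  }
  out
}

combos <- expand_expressions(c("sm", "me", "bi"), c("ve", "ro"), allowed3)
print(combos)
```

With hedges "ve" and "ro", the sketch yields eight pairs: three with the empty hedge, two for "ve" (which does not apply to "me"), and three for "ro". Passing hedges=NULL leaves only the three pairs with the empty hedge.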
name
A name to be added as a suffix to the created fuzzy attribute names. This parameter
can be used only if x is a numeric vector. If x is a matrix or data frame,
name should be NULL because the fuzzy attribute names are taken from column names
of parameter x.
parallel
Whether the processing should be run in parallel or not. Parallelization is
implemented using the foreach package. The parallel environment must be
set properly in advance, e.g. with the registerDoMC function.
...
Other parameters to some methods.
Details
The aim of this function is to transform numeric data into a set of fuzzy attributes.
The resulting fuzzy attributes have direct
linguistic interpretation. This is a unique variant of fuzzification that is suitable for
the inference mechanism based on Perception-based Linguistic Description (PbLD) – see
pbld.
A numeric vector is transformed into a set of fuzzy attributes according to the following
scheme:
<hedge> <atomic expression>
where <atomic expression> is a linguistic expression "small" ("sm"), "lower medium"
("lm"), "medium" ("me"), "upper medium" ("um") or
"big" ("bi") – see the atomic argument. A <hedge> is a modifier that further
concretizes the atomic expression. It can be empty ("") or some value of:
"ty": typically;
"ex": extremely;
"si": significantly;
"ve": very;
"ml": more or less;
"ro": roughly;
"qr": quite roughly;
"vr": very roughly.
According to the theory developed by Novak (2008), not every hedge is suitable for every atomic
expression (see the description of the hedges argument).
The hedges to be used can be selected with the hedges argument. The function itself ensures
that a hedge is never combined with an atomic expression to which it does not apply.
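The <hedge> <atomic expression> scheme can be illustrated by spelling out the abbreviations as full linguistic expressions. The following base R sketch uses the abbreviation tables from this documentation. The helper describe is hypothetical and is not part of the package.

```r
# Abbreviations and their meanings, as listed in this documentation.
hedge_names <- c(ty = "typically", ex = "extremely", si = "significantly",
                 ve = "very", ml = "more or less", ro = "roughly",
                 qr = "quite roughly", vr = "very roughly")
atomic_names <- c(sm = "small", lm = "lower medium", me = "medium",
                  um = "upper medium", bi = "big")

# Hypothetical helper: build the linguistic expression for one
# hedge/atomic pair; an empty hedge leaves the atomic expression as is.
describe <- function(hedge, atomic) {
  if (hedge == "") {
    return(unname(atomic_names[atomic]))
  }
  unname(paste(hedge_names[hedge], atomic_names[atomic]))
}

describe("ve", "sm")   # "very small"
describe("", "me")     # "medium"
describe("ml", "bi")   # "more or less big"
```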
What counts as "small", "medium", or "big" differs from one data set to another.
Therefore, a context has to be set that specifies sensible values for these linguistic
expressions.
If a matrix or data frame is provided to this function instead of a single vector, all columns are
processed the same way.
The function also properly sets up the vars and specs properties of
the result.
Value
An object of class "fsets" is returned, which is a numeric matrix with columns representing the
fuzzy attributes. Each source column
of the x argument corresponds to multiple columns in the resulting matrix.
Columns will have names derived from used hedges, atomic expression, and name
specified as the optional parameter.
The resulting object also has the vars and specs properties set. The former is
created from the original column names (if x is a matrix or data frame) or from
the name argument (if x is a numeric vector). The specs incidence matrix reflects
the following order of the hedges: "ex" < "si" < "ve" < "" < "ml" < "ro" < "qr" < "vr" and "ty" < "".
Fuzzy attributes created from the same source numeric vector (or column) are
ordered that way, while fuzzy attributes created from different sources are
incomparable.
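The hedge order above can be written down as a small incidence matrix in base R. This is a sketch of the ordering only, for hedges of a single source attribute. It does not show how the package lays out the actual specs matrix, and it assumes that "ty" is comparable only with the empty hedge, since the text states no other relation for it. The helper precedes is hypothetical.

```r
# The two chains of hedges stated in the text.
chains <- list(
  c("ex", "si", "ve", "", "ml", "ro", "qr", "vr"),
  c("ty", "")
)

# Hypothetical helper: TRUE if hedge a comes before hedge b
# in at least one of the chains.
precedes <- function(a, b) {
  any(vapply(chains, function(ch) {
    ia <- match(a, ch)
    ib <- match(b, ch)
    !is.na(ia) && !is.na(ib) && ia < ib
  }, logical(1)))
}

hedges <- c("ex", "si", "ve", "ty", "", "ml", "ro", "qr", "vr")
# entry [i, j] is 1 when hedge i precedes hedge j in the stated order
incidence <- outer(hedges, hedges, Vectorize(precedes)) * 1
dimnames(incidence) <- list(hedges, hedges)
print(incidence)
```

Rows and columns labelled with the empty string correspond to the empty hedge. Because R does not match empty strings in character subscripts, such entries have to be addressed by position.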
Author(s)
Michal Burda
References
V. Novak, A comprehensive theory of trichotomous evaluative linguistic expressions, Fuzzy Sets
and Systems 159 (22) (2008) 2939–2969.
See Also
fcut,
farules,
pbld,
vars,
specs,
cbind.fsets
Examples
# transform a single vector
x <- runif(10)
lcut3(x, name='age')
lcut5(x, name='age')
# transform single vector with custom context
lcut3(x, context=c(0, 0.2, 0.5), name='age')
lcut5(x, context=c(0, 0.2, 0.5), name='age')
# transform all columns of a data frame
# and do not use any hedges
data <- CO2[, c('conc', 'uptake')]
lcut3(data, hedges=NULL)
lcut5(data, hedges=NULL)
# definition of custom contexts for different columns
# of a data frame while selecting only "ve" and "ro" hedges.
lcut3(data,
      context=list(conc=c(0, 500, 1000),
                   uptake=c(0, 25, 50)),
      hedges=c('ve', 'ro'))
# lcut on non-numeric data is the same as fcut()
ff <- factor(substring("statistics", 1:10, 1:10), levels = letters)
lcut3(ff)
lcut5(ff)