coordinates for the background map; see details below
onlyout
if TRUE only the outliers are shown in the plot
bw
if TRUE symbold will be in grey scale rather than in color
symb
if TRUE special symbols are used according to outlyingness
symbtxt
if TRUE text labels are used for plotting
col
define colors to be used for outliers and non-outliers
pch
define plotting symbols to be used for outliers and non-outliers
obj.cex
define symbol size for outliers and non-outliers
transp
define transparancy for parallel coordinate plot
Details
The function mvoutlier.CoDa prepares the information needed for this plot function:
In a first step, the raw compositional data set in transformed by the isometric logratio
(ilr) transformation to the usual Euclidean space. Then adaptive outlier detection is
perfomed: Starting from a quantile 1-alpha of the chisquare distribution, one looks for the
supremum of the differences between the chisquare distribution and the empirical distribution
of the squared Mahalanobis distances. The latter are derived from the MCD estimator using
the proportion quan of the data. The supremum is the outlier cutoff, and certain colors
and symbols for the outliers are computed: The colors should reflect the magnitude of the
median element concentration of the observations, which is done by computing for each observation
along the single ilr variables the distances to the medians. The mediab of all distances
determines the color (or grey scale): a high value, resulting in a red (or dark) symbol,
means that most univariate parts have higher values than the average, and a low value (blue
or light symbol) refers to an observation with mainly low values. The symbols are according
to the cut-points from the quantiles 0.25, 0.5, 0.75, and the outlier cutoff of the
squared Mahalanobis distances. This plot function then allows to visualize the information.
The optional background map for the representation of the outliers in a map can be
included using the argument map. This should consist of one or more polygons
representing the geographical x- and y-coordinates of the background map. Of course,
this map should be represented in the same coordinate system as the coordinates for
the sample locations provided by coord. The structure of map is as follows:
It should consist of 2 columns, one for the x-, one for the y-coordinates. If a polygon
ends, a row with 2 entries NA should follow. At the end two NA rows are needed.
See also examples below.
P. Filzmoser, K. Hron, and C. Reimann.
Interpretation of multivariate outliers for compositional data.
Submitted to Computers and Geosciences.
See Also
mvoutlier.CoDa, arw, map.plot, uni.plot
Examples
data(humus)
el=c("As","Cd","Co","Cu","Mg","Pb","Zn")
dsel <- humus[,el]
data(kola.background) # contains different information (coast, borders, etc.)
coo <- rbind(kola.background$coast,kola.background$boundary,kola.background$borders)
XY <- humus[,c("XCOO","YCOO")]
set.seed(123)
res <- mvoutlier.CoDa(dsel)
par(ask=TRUE)
### Parallel coordinate plot:
## show for all obvervations (transp is only useful when generating e.g. a pdf):
# plot(res,onlyout=FALSE,bw=TRUE,which="parallel",symb=FALSE,symbtxt=FALSE,transp=0.3)
## show only outliers with special colors and labels in the margins:
plot(res,onlyout=TRUE,bw=FALSE,which="parallel",symb=TRUE,symbtxt=TRUE,transp=0.3)
### Biplot:
## show all data points, outliers are in different color and have different symbol:
# plot(res,onlyout=FALSE,which="biplot",bw=FALSE,symb=FALSE,symbtxt=FALSE)
## show only the outliers with special symbols and colors:
plot(res,onlyout=TRUE,which="biplot",bw=FALSE,symb=TRUE,symbtxt=TRUE)
### Map:
## show all data points, outliers are in different color and have different symbol:
# plot(res,coord=XY,map=coo,onlyout=FALSE,which="map",bw=FALSE,symb=FALSE,symbtxt=FALSE)
## show only the outliers with special symbols and colors:
plot(res,coord=XY,map=coo,onlyout=TRUE,which="map",bw=FALSE,symb=TRUE,symbtxt=TRUE)
### Univariate scatterplot:
## show all data points, outliers are in different color and have different symbol:
# plot(res,onlyout=FALSE,which="uni",symb=FALSE,symbtxt=FALSE)
## show only the outliers with special symbols and colors:
plot(res,onlyout=TRUE,which="uni",symb=TRUE,symbtxt=TRUE)