Gaussianize is probably the most useful function in this package. It
works the same way as scale, but instead of only
centering and scaling the data, it transforms each variable so that it is
approximately Gaussian (this works well for unimodal data). See Goerg
(2011, 2016) and Examples.
Important: For multivariate input X it performs a column-wise
Gaussianization (by simply calling apply(X, 2, Gaussianize)),
which is only a marginal Gaussianization. The transformed data is in
general not jointly Gaussian.
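The column-wise behaviour can be checked directly. The following is a minimal sketch, assuming this page documents Gaussianize from the LambertW package and that the package is installed; the simulated data and the variable names (X, G.all, G.col) are illustrative only.

```r
library(LambertW)  # assumption: Gaussianize comes from the LambertW package

set.seed(1)
# Two heavy-tailed columns (illustrative data, not taken from the package)
X <- cbind(a = rt(500, df = 3), b = 2 + 0.5 * rt(500, df = 4))

# Matrix input: each column is Gaussianized on its own
G.all <- Gaussianize(X, type = "h")

# Explicit column-wise call, as described in the text above
G.col <- apply(X, 2, Gaussianize, type = "h")

# Both routes should agree up to numerical error
all.equal(as.numeric(G.all), as.numeric(G.col))

# Only the margins are transformed; the dependence between columns
# is not modelled, so joint Gaussianity is not implied.
cor(G.all)
```

Comparing the two results with as.numeric() drops the attributes that Gaussianize attaches, so only the transformed values are compared.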
By default Gaussianize returns the X ~ N(μ_x, σ_x^2)
input, not the zero-mean, unit-variance U ~ N(0, 1) input. Use
return.u = TRUE to obtain U.
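The difference between the two return values can be seen on a simulated sample. This is a sketch under the assumption that the LambertW package is installed; the sample y and the names x and u are illustrative.

```r
library(LambertW)  # assumption: Gaussianize comes from the LambertW package

set.seed(2)
y <- rt(1000, df = 3)  # heavy-tailed sample (illustrative)

# Default: Gaussianized data X on the estimated location and scale
x <- Gaussianize(y, type = "h")

# return.u = TRUE: the standardized input U
u <- Gaussianize(y, type = "h", return.u = TRUE)

# Summary statistics of both versions
c(mean = mean(x), sd = sd(x))
c(mean = mean(u), sd = sd(u))
```

With the default "IGMM" estimator the mean and standard deviation of u should be close to 0 and 1, while x keeps the estimated location and scale of the data.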
data
a numeric matrix-like object; either the data that should be
Gaussianized, or the data that should be “DeGaussianized” (inverse =
TRUE), i.e., converted back to the original space.
type
what type of non-normality: symmetric heavy-tails "h"
(default), skewed heavy-tails "hh", or just skewed "s".
method
what estimator should be used: "MLE" or "IGMM".
"IGMM" gives exactly Gaussian characteristics (kurtosis
≡ 3 for "h" or skewness ≡ 0 for "s"),
"MLE" comes close to this. Default: "IGMM" since it is much
faster than "MLE".
return.tau.mat
logical; if TRUE it also returns the estimated
τ parameters as a matrix (same number of columns as
data). This matrix can then be used to Gaussianize new
data with pre-estimated τ. It can also be used to
“DeGaussianize” data by passing it as an argument (tau.mat) to
Gaussianize() and setting inverse = TRUE.
inverse
logical; if TRUE it performs the inverse transformation
using tau.mat to "DeGaussianize" the data back to the original
space again.
tau.mat
instead of estimating τ from the data you can pass it
as a matrix (usually obtained via Gaussianize(..., return.tau.mat =
TRUE)). If inverse = TRUE it uses this tau matrix to
“DeGaussianize” the data again. This is useful to back-transform new
data in the Gaussianized space, e.g., predictions or fits, back to the
original space.
verbose
logical; if TRUE, it prints out progress information in
the console. Default: FALSE.
return.u
logical; if TRUE it returns the zero-mean, unit-variance
Gaussian input U. If FALSE (default) it returns the input
X.
input.u
optional; if you used return.u = TRUE in a previous
step, and now you want to convert the data back to original space, then
you have to pass it as input.u. If you pass numeric data as
data, Gaussianize assumes that data is the input
corresponding to X, not U.
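The interplay of return.tau.mat, tau.mat, and inverse can be sketched as a round trip. The example assumes the LambertW package is installed; the simulated matrix YY and the name YY.hat are illustrative.

```r
library(LambertW)  # assumption: Gaussianize comes from the LambertW package

set.seed(3)
y1 <- rt(500, df = 3)
YY <- cbind(y1 = y1, y2 = 0.5 * y1 + rnorm(500))

# Forward step: keep the estimated tau alongside the Gaussianized data
out <- Gaussianize(YY, type = "h", method = "IGMM", return.tau.mat = TRUE)

# Inverse step: map the Gaussianized data back with the same tau
YY.hat <- Gaussianize(data = out$input, tau.mat = out$tau.mat, inverse = TRUE)

# Largest absolute round-trip error
max(abs(as.numeric(YY.hat) - as.numeric(YY)))
```

The same pattern applies to predictions or fits computed in the Gaussianized space: pass them as data together with the stored tau.mat and inverse = TRUE.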
Value
numeric matrix-like object with same dimension/size as input data.
If inverse = FALSE it is the Gaussianized matrix / vector;
if TRUE it is the “DeGaussianized” matrix / vector.
The mean, scale, and skewness/heavy-tail parameters
that were used in the Gaussianizing transformation are returned as
attributes of the output matrix: 'Gaussianized:mu',
'Gaussianized:sigma', and for
type = "h":
'Gaussianized:delta' & 'Gaussianized:alpha',
type = "hh":
'Gaussianized:delta_l' and 'Gaussianized:delta_r' &
'Gaussianized:alpha_l' and 'Gaussianized:alpha_r',
type = "s":
'Gaussianized:gamma'.
They can also be returned as a separate matrix using return.tau.mat =
TRUE. In this case Gaussianize returns a list with elements:
input
Gaussianized input data x (or
u if return.u = TRUE),
tau.mat
matrix with τ estimates that were used to get x; has the same number
of columns as x, and 3, 5, or 6 rows (depending on
type = 's', 'h', or 'hh').
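The list returned with return.tau.mat = TRUE can be inspected as follows. This is a sketch assuming the LambertW package is installed; the simulated matrix YY is illustrative.

```r
library(LambertW)  # assumption: Gaussianize comes from the LambertW package

set.seed(4)
YY <- cbind(a = rt(300, df = 3), b = rt(300, df = 5))

out <- Gaussianize(YY, type = "h", return.tau.mat = TRUE)

names(out)       # names of the list elements
dim(out$input)   # dimensions of the Gaussianized data
out$tau.mat      # tau estimates, one column per column of YY
```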