Visual and statistical Gaussianity check


Graphical and statistical checks of whether data are Gaussian (three common normality tests, QQ-plots, histograms, etc.).

test_normality does not show the autocorrelation function (ACF) estimate at lag 0, since it always equals 1. Omitting it loses no information and greatly improves the y-axis scale for the higher-order lags, whose estimates are usually very small compared to 1.
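The effect of dropping lag 0 can be illustrated with acf() from the stats package. This is a minimal sketch, not the code test_normality itself runs; the simulated series and all variable names are assumptions made for illustration.

```r
# Sketch: the lag-0 autocorrelation is 1 by construction and only
# stretches the y-axis of an ACF plot.
set.seed(1)
x <- rnorm(500)                      # illustrative white-noise series

acf.est <- acf(x, lag.max = 20, plot = FALSE)
acf.all <- as.numeric(acf.est$acf)   # estimates for lags 0, 1, ..., 20
lag0 <- acf.all[1]                   # lag 0: correlation of x with itself
higher.lags <- acf.all[-1]           # lags 1, ..., 20

# Vertical range a plot has to cover with and without lag 0
range.with.lag0 <- range(acf.all)
range.without.lag0 <- range(higher.lags)
```

Comparing range.with.lag0 and range.without.lag0 shows how much of the plotting area lag 0 would otherwise occupy.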

test_norm is a shortcut for test_normality.


test_normality(data, show.volatility = FALSE, plot = TRUE, pch = 1,
  add.legend = TRUE, seed = sample(1e+06, 1))




data
a numeric vector of data values.


show.volatility
logical; if TRUE, the squared (centered) data and their ACF are also shown. Useful for time series data to see whether the squares exhibit dependence (for financial data they typically do); default: FALSE.
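To see why the squared (centered) data are informative, consider a series that is serially uncorrelated in levels but dependent in squares. The sketch below simulates a simple ARCH(1)-type process with base R only; the process, its parameter values, and the variable names are illustrative assumptions and are not part of test_normality.

```r
# Sketch: dependence that appears only in the squares of a series.
set.seed(42)
n <- 2000
eps <- rnorm(n)
y <- numeric(n)
y[1] <- eps[1]
for (i in 2:n) {
  # conditional variance depends on the previous squared observation
  sigma2 <- 0.2 + 0.5 * y[i - 1]^2
  y[i] <- sqrt(sigma2) * eps[i]
}

y.sq <- (y - mean(y))^2              # squared centered data

# Lag-1 autocorrelation of the levels and of the squares
acf.y <- acf(y, lag.max = 1, plot = FALSE)$acf[2]
acf.y.sq <- acf(y.sq, lag.max = 1, plot = FALSE)$acf[2]
```

For a series of this kind the lag-1 autocorrelation of the squares (acf.y.sq) is typically much larger than that of the levels (acf.y), which is the pattern the volatility panel is meant to reveal.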


plot
logical; should the visual checks (histogram, densities, QQ-plot, ACF) be plotted? Default: TRUE; otherwise only the hypothesis test results are returned.


pch
a vector of plotting characters or symbols; default: pch = 1.


add.legend
logical; if TRUE (default) a legend is placed in the histogram/density plot.


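seed
optional; if the sample size exceeds 5,000, some normality tests fail to run (shapiro.test in the stats package, for example, accepts at most 5,000 observations). In that case the tests are run on a random subsample of size 5,000. For reproducibility the user can specify the seed; by default a random seed is used.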
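The subsampling idea can be sketched with base R alone. The helper subsample_for_test() below is hypothetical and only illustrates the principle (fix the seed, then draw 5,000 values); it is not claimed to match the package's internal implementation.

```r
# Sketch: seeded subsampling before a test that cannot handle n > 5000.
set.seed(123)
x <- rnorm(10000)                    # illustrative large sample
seed <- 2024                         # hypothetical user-chosen seed

# shapiro.test() refuses samples larger than 5000
full.attempt <- try(shapiro.test(x), silent = TRUE)

# Hypothetical helper: return the data unchanged if small enough,
# otherwise a reproducible random subsample of size max.n.
subsample_for_test <- function(data, seed, max.n = 5000) {
  if (length(data) <= max.n) return(data)
  set.seed(seed)
  sample(data, size = max.n)
}

x.test <- subsample_for_test(x, seed)
sw <- shapiro.test(x.test)           # now runs on 5000 values

# The same seed reproduces the same subsample
x.test.again <- subsample_for_test(x, seed)
```

Reusing the reported seed therefore lets a later run test exactly the same 5,000 observations.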


...
arguments as in test_normality.


A list with the results of three normality tests (each of class htest) and the seed used for subsampling:


Anderson-Darling (if nortest package is available),


Shapiro-Francia (if nortest package is available),


Shapiro-Wilk (shapiro.test in the stats package),


seed for subsampling (only used if sample size > 5,000).
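Objects of class htest are ordinary lists, so individual quantities can be extracted by name. The sketch below builds a small stand-in list with shapiro.test from the stats package; the element names shapiro.wilk and seed, and the seed value, are assumptions made for illustration and may differ from what test_normality returns.

```r
# Sketch: working with a list of htest results.
set.seed(7)
x <- rnorm(200)

# Stand-in for a result list; element names are hypothetical
res <- list(shapiro.wilk = shapiro.test(x), seed = 12345L)

sw <- res$shapiro.wilk
class(sw)                            # "htest"

test.stat <- sw$statistic            # named numeric, here named "W"
p.val <- sw$p.value                  # p-value of the test
test.name <- sw$method               # description of the test

# Example decision rule at the 5% level
reject.at.5pct <- p.val < 0.05
```

The same components (statistic, p.value, method) are available for any test result of class htest.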


Thode Jr., H.C. (2002): “Testing for Normality”. Marcel Dekker, New York.

See Also

shapiro.test in the stats package; ad.test, sf.test in the nortest package.


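library(LambertW)  # provides rLambertW() and test_normality()

# skewed Lambert W x Gaussian sample
y <- rLambertW(n = 1000, theta = list(beta = c(3, 4), gamma = 0.3),
               distname = "normal")
test_normality(y)

# Gaussian sample
x <- rnorm(n = 1000)
test_normality(x)

# mixture of exponential and normal
test_normality(c(rexp(100), rnorm(100, mean = -5)))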


> y <- rLambertW(n = 1000, theta = list(beta = c(3, 4), gamma = 0.3),
+                distname = "normal")
> test_normality(y)
[1] 887720


	Shapiro-Wilk normality test

data:  data.test
W = 0.87294, p-value < 2.2e-16


	Shapiro-Francia normality test

data:  data.test
W = 0.87256, p-value < 2.2e-16


	Anderson-Darling normality test

data:  data
A = 33.294, p-value < 2.2e-16

> x <- rnorm(n = 1000)
> test_normality(x)
[1] 801073


	Shapiro-Wilk normality test

data:  data.test
W = 0.99694, p-value = 0.05206


	Shapiro-Francia normality test

data:  data.test
W = 0.99698, p-value = 0.05304


	Anderson-Darling normality test

data:  data
A = 0.64439, p-value = 0.09253

> # mixture of exponential and normal
> test_normality(c(rexp(100), rnorm(100, mean = -5)))
[1] 899389


	Shapiro-Wilk normality test

data:  data.test
W = 0.88261, p-value = 2.242e-11


	Shapiro-Francia normality test

data:  data.test
W = 0.88611, p-value = 7.808e-10


	Anderson-Darling normality test

data:  data
A = 11.329, p-value < 2.2e-16
