Compute the correlation matrix between two or more variables (for example,
between all columns of a matrix or data frame).
Usage
correlation(x, ...)
## S3 method for class 'formula'
correlation(formula, data = NULL, subset, na.action, ...)
## Default S3 method:
correlation(x, y = NULL, use = "everything",
method = c("pearson", "kendall", "spearman"), ...)
is.correlation(x)
as.correlation(x)
## S3 method for class 'correlation'
print(x, digits = 3, cutoff = 0, ...)
## S3 method for class 'correlation'
summary(object, cutpoints = c(0.3, 0.6, 0.8, 0.9, 0.95),
symbols = c(" ", ".", ",", "+", "*", "B"), ...)
## S3 method for class 'summary.correlation'
print(x, ...)
## S3 method for class 'correlation'
plot(x, y = NULL, outline = TRUE,
cutpoints = c(0.3, 0.6, 0.8, 0.9, 0.95), palette = rwb.colors, col = NULL,
numbers = TRUE, digits = 2, type = c("full", "lower", "upper"),
diag = (type == "full"), cex.lab = par("cex.lab"), cex = 0.75 * par("cex"),
...)
Arguments
x
a numeric vector, matrix or data frame (or any object for
is.correlation() and as.correlation()).
formula
a formula with no response variable, referring only to numeric
variables.
data
an optional data frame (or similar: see model.frame)
containing the variables in formula. By default the
variables are taken from environment(formula).
subset
an optional vector used to select rows (observations) of the
data matrix x.
na.action
a function which indicates what should happen when the data
contain NAs. The default is set by the na.action setting of
options, and is na.fail if that is unset. The
'factory-fresh' default is na.omit.
method
a character string indicating which correlation coefficient is
to be computed. One of "pearson" (default), "kendall", or
"spearman"; the string can be abbreviated.
y
NULL (default), or a vector, matrix or data frame with
dimensions compatible with x for correlation(). The default is
equivalent to y = x, but more efficient. For plot.correlation(),
a second 'correlation' object provided in y is meant to produce a visual
comparison of the two correlation matrices (not implemented yet).
use
an optional character string giving a method for computing
correlations in the presence of missing values. This must be (an abbreviation
of) one of the strings "everything", "all.obs",
"complete.obs", "na.or.complete", or "pairwise.complete.obs".
digits
number of digits to print after the decimal separator.
cutoff
correlation coefficients lower than this (in absolute value) are
suppressed.
object
a 'correlation' object.
cutpoints
the cut points to use for categories. Specify only positive
values: absolute values of the correlation coefficients are summarized, and
negative equivalents are automatically computed for the graph. Do not include
0 or 1 in the cutpoints.
symbols
the symbols to use to summarize the correlation matrix.
outline
do we draw the outline of the ellipse?
palette
a function that can produce a palette of colors.
col
color of the ellipse. If NULL (default), the colors will be
computed using cutpoints and palette.
numbers
do we print correlation values in the center of the ellipses?
type
do we plot a complete matrix, or only lower or upper triangle?
diag
do we plot items on the diagonal? They always have a correlation
of one.
cex.lab
the expansion factor for labels.
cex
the expansion factor for text.
...
further arguments passed to functions.
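The use and method arguments have the same names and accepted values as those of stats::cor(). Assuming correlation() treats them the same way (this page does not say so explicitly), the following base R sketch illustrates what they control. The mtcars subset and the injected NA are purely illustrative.

```r
# Base R only: stats::cor() stands in for correlation() here (an assumption).
d <- mtcars[, c("mpg", "cyl", "disp", "hp")]

# method: Spearman's rho is Pearson's r computed on the ranks
rho <- cor(d$mpg, d$cyl, method = "spearman")
rho_ranks <- cor(rank(d$mpg), rank(d$cyl), method = "pearson")

# use: inject one missing value to see how each setting reacts
d_na <- d
d_na$hp[1] <- NA
r_everything <- cor(d_na, use = "everything")           # NA propagates to hp pairs
r_complete <- cor(d_na, use = "complete.obs")           # row 1 dropped for all pairs
r_pairwise <- cor(d_na, use = "pairwise.complete.obs")  # row 1 dropped for hp pairs only
```

With "complete.obs" every coefficient is computed on the same 31 rows, whereas "pairwise.complete.obs" keeps all 32 rows for pairs that do not involve hp.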
Value
correlation() and as.correlation() create a 'correlation' object,
while is.correlation() tests for it.
There are print() and summary() methods for 'correlation' objects.
They differ in that summary() encodes the correlations symbolically,
using symnum, which makes large correlation
matrices more readable.
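As a rough illustration of this symbolic encoding, the sketch below maps a few coefficients from the longley matrix to symbols, using the default cutpoints and symbols. It relies on findInterval() from base R as a stand-in; summary() itself uses symnum, whose handling of values that fall exactly on a cut point may differ. The names given to the coefficients are only labels for this example.

```r
cutpoints <- c(0.3, 0.6, 0.8, 0.9, 0.95)
symbols <- c(" ", ".", ",", "+", "*", "B")

# A few coefficients from the longley correlation matrix
r <- c(GNP = 0.992, Unemployed = 0.621, Armed.Forces = 0.465,
       Unemp.vs.Armed = -0.177)

# The sign is ignored: each |r| falls into one of six bins, one symbol per bin
encoded <- symbols[findInterval(abs(r), cutpoints) + 1]
names(encoded) <- names(r)
print(encoded, quote = FALSE)
```

Five cut points define six bins, which is why symbols has one more element than cutpoints.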
The plot() method returns nothing, but it draws ellipses on a graph that
represent the correlation matrix visually. This is essentially the
plotcorr() function from package ellipse, with slightly different
default arguments and with default cutpoints equivalent to those used
in the summary method.
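How colors could be derived from cutpoints and a palette function can be sketched as follows. This is not the code of plot.correlation(): colorRampPalette() from grDevices stands in for rwb.colors, and the helper name bin_colors is hypothetical. The only behavior taken from this page is that the positive cutpoints are mirrored to negative values for the graph.

```r
cutpoints <- c(0.3, 0.6, 0.8, 0.9, 0.95)

# Mirror the positive cutpoints so that the breaks cover [-1, 1]
breaks <- c(-1, -rev(cutpoints), cutpoints, 1)

# Stand-in palette function (assumption): red for negative, blue for positive
pal <- grDevices::colorRampPalette(c("red", "white", "blue"))

# Hypothetical helper: one color per bin, looked up by the bin each r falls in
bin_colors <- function(r, breaks, palette) {
  cols <- palette(length(breaks) - 1)
  cols[cut(r, breaks = breaks, labels = FALSE, include.lowest = TRUE)]
}

bin_colors(c(-0.911, 0, 0.992), breaks, pal)
```

With five cutpoints this gives eleven bins, the middle one covering weak correlations between -0.3 and 0.3.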
Author(s)
Philippe Grosjean <phgrosjean@sciviews.org>. The plot.correlation() method
wraps code from the plotcorr() function in package ellipse.
See Also
cov, cov2cor, cov.wt,
symnum, plotcorr, and also
panel.cor
Examples
## This is a simple correlation coefficient
cor(rnorm(10), runif(10))
## but this is a 'correlation' object containing a correlation matrix
correlation(rnorm(10), runif(10))
## 'correlation' objects allow better inspection of the correlation matrices
## than the output of default R cor() function
(longley.cor <- correlation(longley))
summary(longley.cor) # Synthetic view of the correlation matrix
plot(longley.cor) # Graphical representation
## Use of the formula interface
(mtcars.cor <- correlation(~ mpg + cyl + disp + hp, data = mtcars,
method = "spearman", na.action = "na.omit"))
mtcars.cor2 <- correlation(mtcars, method = "spearman")
print(mtcars.cor2, cutoff = 0.6)
summary(mtcars.cor2)
plot(mtcars.cor2, type = "lower")
mtcars.cor2["mpg", "cyl"] # Extract one correlation from the correlation matrix
## TODO: a plot comparing two correlation matrices
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(SciViews)
Loading required package: MASS
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/SciViews/correlation.Rd_%03d_medium.png", width=480, height=480)
> ### Name: correlation
> ### Title: Correlation matrices
> ### Aliases: correlation correlation.formula correlation.default
> ### is.correlation as.correlation print.correlation summary.correlation
> ### print.summary.correlation plot.correlation
> ### Keywords: distribution
>
> ### ** Examples
>
> ## This is a simple correlation coefficient
> cor(rnorm(10), runif(10))
[1] 0.719619
> ## but this is a 'correlation' object containing a correlation matrix
> correlation(rnorm(10), runif(10))
Matrix of Pearson's product-moment correlation:
(calculation uses everything)
      x     y
x 1.000 0.244
y 0.244 1.000
>
> ## 'correlation' objects allow better inspection of the correlation matrices
> ## than the output of default R cor() function
> (longley.cor <- correlation(longley))
Matrix of Pearson's product-moment correlation:
(calculation uses everything)
             GNP.deflator   GNP Unemployed Armed.Forces Population  Year
GNP.deflator        1.000 0.992      0.621        0.465      0.979 0.991
GNP                 0.992 1.000      0.604        0.446      0.991 0.995
Unemployed          0.621 0.604      1.000       -0.177      0.687 0.668
Armed.Forces        0.465 0.446     -0.177        1.000      0.364 0.417
Population          0.979 0.991      0.687        0.364      1.000 0.994
Year                0.991 0.995      0.668        0.417      0.994 1.000
Employed            0.971 0.984      0.502        0.457      0.960 0.971
             Employed
GNP.deflator    0.971
GNP             0.984
Unemployed      0.502
Armed.Forces    0.457
Population      0.960
Year            0.971
Employed        1.000
> summary(longley.cor) # Synthetic view of the correlation matrix
Matrix of Pearson's product-moment correlation:
(calculation uses everything)
             GNP. GNP U A P Y E
GNP.deflator 1
GNP          B    1
Unemployed   ,    ,   1
Armed.Forces .    .     1
Population   B    B   , . 1
Year         B    B   , . B 1
Employed     B    B   . . B B 1
attr(,"legend")
[1] 0 ' ' 0.3 '.' 0.6 ',' 0.8 '+' 0.9 '*' 0.95 'B' 1
> plot(longley.cor) # Graphical representation
>
> ## Use of the formula interface
> (mtcars.cor <- correlation(~ mpg + cyl + disp + hp, data = mtcars,
+ method = "spearman", na.action = "na.omit"))
Matrix of Spearman's rank correlation rho:
(missing values are managed with na.omit)
        mpg    cyl   disp     hp
mpg   1.000 -0.911 -0.909 -0.895
cyl  -0.911  1.000  0.928  0.902
disp -0.909  0.928  1.000  0.851
hp   -0.895  0.902  0.851  1.000
>
> mtcars.cor2 <- correlation(mtcars, method = "spearman")
> print(mtcars.cor2, cutoff = 0.6)
Matrix of Spearman's rank correlation rho:
(calculation uses everything)
        mpg    cyl   disp     hp   drat     wt   qsec     vs     am   gear
mpg   1.000 -0.911 -0.909 -0.895  0.651 -0.886         0.707
cyl  -0.911  1.000  0.928  0.902 -0.679  0.858        -0.814
disp -0.909  0.928  1.000  0.851 -0.684  0.898        -0.724 -0.624
hp   -0.895  0.902  0.851  1.000         0.775 -0.667 -0.752
drat  0.651 -0.679 -0.684         1.000 -0.750                0.687  0.745
wt   -0.886  0.858  0.898  0.775 -0.750  1.000               -0.738 -0.676
qsec                      -0.667                1.000  0.792
vs    0.707 -0.814 -0.724 -0.752                0.792  1.000
am                 -0.624         0.687 -0.738                1.000  0.808
gear                              0.745 -0.676                0.808  1.000
carb -0.657                0.733               -0.659 -0.634
       carb
mpg  -0.657
cyl
disp
hp    0.733
drat
wt
qsec -0.659
vs   -0.634
am
gear
carb  1.000
> summary(mtcars.cor2)
Matrix of Spearman's rank correlation rho:
(calculation uses everything)
     m cy ds h dr w q v a g cr
mpg  1
cyl  * 1
disp * *  1
hp   + *  +  1
drat , ,  ,  . 1
wt   + +  +  , ,  1
qsec . .  .  ,      1
vs   , +  ,  , .  . , 1
am   . .  ,  . ,  ,     1
gear . .  .  . ,  ,     + 1
carb , .  .  ,    . , ,     1
attr(,"legend")
[1] 0 ' ' 0.3 '.' 0.6 ',' 0.8 '+' 0.9 '*' 0.95 'B' 1
> plot(mtcars.cor2, type = "lower")
>
> mtcars.cor2["mpg", "cyl"] # Extract one correlation from the correlation matrix
[1] -0.9108013
> ## TODO: a plot comparing two correlation matrices
> dev.off()
null device
1
>