cor.wt
R Documentation
The sample size weighted correlation may be used in correlating aggregated data
Description
When correlating aggregated data, the plain correlation of the means ignores the sample size behind each mean. cov.wt in core R applies weights and returns a covariance matrix or, optionally, the correlation matrix. The cor.wt function weights by sample size or by standard errors and by default returns correlations.
Usage
cor.wt(data,vars=NULL, w=NULL,sds=NULL, cor=TRUE)
Arguments
data
A matrix or data frame of aggregated data (the Examples also pass the output of statsBy directly)
vars
Variables to analyze
w
A set of weights (e.g., the sample sizes)
sds
Standard deviations of the samples (used if weighting by standard errors)
cor
If TRUE (the default), report correlations; if FALSE, report covariances
Details
A weighted correlation is just r_ij = ∑_k (wt_k * x_ik * x_jk) / sqrt[∑_k (wt_k * x^2_ik) * ∑_k (wt_k * x^2_jk)], where x_ik is the deviation of variable i in group k from its weighted mean and wt_k is the weight for group k.
The weighted correlation is appropriate for correlating aggregated data, where individual data points might reflect the means of a number of observations. In this case, each point is weighted by its sample size (or alternatively, by the standard error). If the weights are all equal, the correlation is just a normal Pearson correlation.
It is used when finding the correlations of group means produced by statsBy.
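The formula above can be sketched in base R for a single pair of variables. This is only an illustration, not the cor.wt source: the helper weighted_cor and the small data vectors are hypothetical, and cov.wt from the stats package is used purely as a cross-check.

```r
# Hypothetical group means for two variables and the group sizes behind them
x <- c(10, 12, 15, 11, 18)
y <- c(3.1, 3.9, 4.2, 3.0, 5.5)
n <- c(5, 40, 12, 80, 8)

# Weighted correlation of two vectors (hypothetical helper, not part of psych)
weighted_cor <- function(x, y, w) {
  w  <- w / sum(w)        # normalize the weights to sum to 1
  dx <- x - sum(w * x)    # deviations from the weighted mean of x
  dy <- y - sum(w * y)    # deviations from the weighted mean of y
  sum(w * dx * dy) / sqrt(sum(w * dx^2) * sum(w * dy^2))
}

r_w  <- weighted_cor(x, y, n)                   # weighted by sample size
r_eq <- weighted_cor(x, y, rep(1, length(x)))   # all weights equal

# Cross-check against cov.wt() from the stats package
r_ref <- cov.wt(cbind(x, y), wt = n / sum(n), cor = TRUE)$cor[1, 2]

all.equal(r_w, r_ref)        # compares the sketch with cov.wt
all.equal(r_eq, cor(x, y))   # compares equal weights with the Pearson correlation
```

With equal weights the weights cancel out of the numerator and denominator, which is why the second comparison reduces to the ordinary Pearson correlation.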
Value
cor
The weighted correlation matrix, or the weighted covariance matrix if cor=FALSE (the Examples access this element as r)
xwt
The data as weighted deviations from the weighted mean
wt
The weights used (calculated from the sample sizes).
mean
The weighted means
xc
Unweighted, centered deviation scores from the weighted mean
xs
Deviation scores weighted by the standard error of each sample mean
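As a rough guide to how the weights, weighted means, and deviation scores listed above relate to one another, the following base R sketch builds similar quantities by hand. It is an assumption about one plausible construction, not a statement of what cor.wt does internally; the data are hypothetical, the variable names merely echo the Value entries, and the standard-error weighted scores (xs) are not covered.

```r
# Hypothetical matrix of group means (rows = groups) and the group sizes
x <- cbind(a = c(10, 12, 15, 11, 18),
           b = c(3.1, 3.9, 4.2, 3.0, 5.5),
           c = c(7, 6, 9, 5, 4))
n <- c(5, 40, 12, 80, 8)

wt    <- n / sum(n)            # weights scaled to sum to 1
means <- colSums(wt * x)       # weighted means (row i is multiplied by wt[i])
xc    <- sweep(x, 2, means)    # unweighted deviations from the weighted means
xwt   <- sqrt(wt) * xc         # deviations scaled by the square root of the weights
cov_w <- crossprod(xwt)        # weighted covariance matrix (no bias correction)
r_w   <- cov2cor(cov_w)        # weighted correlation matrix

# Cross-check against cov.wt() from the stats package
ref <- cov.wt(x, wt = wt, cor = TRUE, method = "ML")
all.equal(means, ref$center)   # compares the weighted means
all.equal(cov_w, ref$cov)      # compares the weighted covariances
all.equal(r_w, ref$cor)        # compares the weighted correlations
```

Scaling the deviations by the square root of the weights means a single cross-product yields the weighted sums of squares and cross-products, so the correlation follows from cov2cor without further weighting.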
Note
A generalization of cov.wt in core R
Author(s)
William Revelle
See Also
cov.wt, statsBy
Examples
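library(psych) #provides cor.wt, statsBy, the helper functions, and the sat.act data
means.by.age <- statsBy(sat.act,"age") #group statistics for each age
wt.cors <- cor.wt(means.by.age) #correlate the age means, weighting by sample size
lowerMat(wt.cors$r) #show the weighted correlations
unwt <- lowerCor(means.by.age$mean) #unweighted correlations of the same means
mixed <- lowerUpper(unwt,wt.cors$r) #combine both results
cor.plot(mixed,numbers=TRUE,main="weighted versus unweighted correlations")
diffs <- lowerUpper(unwt,wt.cors$r,diff=TRUE) #renamed from diff to avoid masking base::diff
cor.plot(diffs,numbers=TRUE,main="differences of weighted versus unweighted correlations")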