Last data update: 2014.03.03

R: Plot correlation of random pairs of genes

plot.corr.sample

R Documentation

Plot correlation of random pairs of genes

Description

plot.corr.sample provides the main functionality of package maCorrPlot: it plots the correlation of random pairs of genes against their variability. Systematic deviations of the plot from a constant zero indicate lack of normalization of the underlying expression matrix.
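The quantity being plotted can be illustrated with a small base R sketch that does not use maCorrPlot at all. It simulates independent noise, draws random gene pairs, and computes one correlation and one variability value per pair. The pooled standard deviation used here is an assumption made for illustration; the definition used by CorrSample may differ. The names ndx1 and ndx2 mirror the index components used in the Examples.

```r
# Sketch only: simulate an expression matrix (genes in rows, arrays in columns).
set.seed(1)
n_genes  <- 200
n_arrays <- 30
expr <- matrix(rnorm(n_genes * n_arrays), nrow = n_genes)

# Draw random pairs of genes and drop pairs that pick the same gene twice.
n_pairs <- 100
ndx1 <- sample(n_genes, n_pairs, replace = TRUE)
ndx2 <- sample(n_genes, n_pairs, replace = TRUE)
keep <- ndx1 != ndx2
ndx1 <- ndx1[keep]
ndx2 <- ndx2[keep]

# Correlation of each pair across arrays.
pair_cor <- vapply(seq_along(ndx1),
                   function(i) cor(expr[ndx1[i], ], expr[ndx2[i], ]),
                   numeric(1))

# Assumed variability measure: pooled standard deviation of the two genes.
gene_sd <- apply(expr, 1, sd)
pair_sd <- sqrt((gene_sd[ndx1]^2 + gene_sd[ndx2]^2) / 2)

# For independent noise the average correlation should be close to zero.
mean(pair_cor)
```

With real expression data, an average correlation that drifts away from zero as pair_sd changes is the kind of systematic deviation the plot is meant to reveal.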

Formally, plot.corr.sample is the plotting method for objects of class corr.sample generated by CorrSample.

panel.corr.sample is the panel function that does the actual plotting work.

Usage

plot.corr.sample(x, ..., cond, groups, grid = TRUE, refline = TRUE, xlog = TRUE,
                 scatter = FALSE, curve = FALSE, ci = TRUE, nint = 10,
                 alpha = 0.95, length = 0.1, xlab = "Standard Deviation")

panel.corr.sample(x, y, grid = TRUE, refline = TRUE, xlog = TRUE,
                  scatter = FALSE, curve = FALSE, ci = TRUE, nint = 10,
                  alpha = 0.95, length = 0.1, col.line, col.symbol, ...)

Arguments

`x, y`	for `plot.corr.sample`, `x` is an object of class `corr.sample`, generated by function `CorrSample` that contains the pre-computed correlations and standard deviations for the random pairs of genes; for `panel.corr.sample`, `x` and `y` are the x- and y-components (or standard deviation and correlation) of the pairs of genes to be plotted in a specific panel.
`...`	either more objects of class `corr.sample` or plotting arguments passed to the underlying `xyplot`.
`cond`	either a vector or a list of vectors describing multiple objects of class `corr.sample`; ignored if only one such object (`x`) is specified. See Details and Examples.
`groups`	a vector or a list of vectors giving group membership for the random pairs of genes in the `corr.sample` objects to be plotted, resulting in multiple overlaid plots for each object. See Details and Examples.
`grid`	logical value indicating whether to draw a reference grid.
`refline`	logical value indicating whether to draw a horizontal reference line at zero.
`xlog`	logical value indicating whether to use log-scale on the horizontal axis.
`scatter`	logical value indicating whether to plot the individual pairwise correlations.
`curve`	logical value indicating whether to fit a simple model for lack of fit to the correlations.
`ci`	logical value indicating whether to add confidence intervals.
`nint`	number of intervals into which to divide the horizontal axis for calculating average correlations.
`alpha`	the level of confidence to be plotted.
`length`	the length of the horizontal ticks indicating the ends of the confidence intervals (in inches).
`xlab`	the label for the horizontal axis.
`col.line, col.symbol`	graphical parameters that control the color of the correlation lines and the scatter plotting symbols.
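The roles of nint and alpha can be sketched in base R as follows. The data are simulated stand-ins, and two choices are assumptions made for illustration rather than the package's documented method: the horizontal axis is split at quantiles so that every interval holds the same number of pairs, and the confidence interval is a t-based interval for the mean of each interval.

```r
# Sketch only: simulated stand-ins for pairwise standard deviations and correlations.
set.seed(2)
pair_sd  <- exp(rnorm(500, sd = 0.5))
pair_cor <- rnorm(500, sd = 0.2)

nint  <- 10     # number of intervals along the horizontal axis
alpha <- 0.95   # confidence level

# Assumption: quantile-based breaks, so each interval holds the same number of pairs.
breaks <- quantile(pair_sd, probs = seq(0, 1, length.out = nint + 1))
bins   <- cut(pair_sd, breaks = breaks, include.lowest = TRUE)

# Average correlation, count and standard error per interval.
bin_mean <- tapply(pair_cor, bins, mean)
bin_n    <- tapply(pair_cor, bins, length)
bin_se   <- tapply(pair_cor, bins, sd) / sqrt(bin_n)

# Assumption: t-based confidence interval for each interval mean.
half_width <- qt(1 - (1 - alpha) / 2, df = bin_n - 1) * bin_se
ci <- cbind(lower = bin_mean - half_width,
            mean  = bin_mean,
            upper = bin_mean + half_width)
ci
```

A larger nint gives a finer curve but fewer pairs per interval, and therefore wider intervals; the Examples reduce nint to 5 when the pairs are additionally split into groups.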

Details

The underlying plotting engine is xyplot, with panel.corr.sample as the panel function; the panel function also interprets most of the graphical parameters. Two kinds of arguments can be passed via ...: First, any number of additional corr.sample objects, to display different expression measures for the same expression matrix, to compare different expression matrices, or both; this is somewhat similar to the behaviour of boxplot.default. Second, everything that does not inherit from corr.sample is passed on to xyplot, so in principle the full range of lattice control options is available, as long as they do not conflict with named arguments of plot.corr.sample such as xlog or xlab.
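The separation of the two kinds of arguments can be sketched with inherits. The helper split_dots below is hypothetical and is not part of maCorrPlot; the mock objects only carry the class attribute, whereas real objects come from CorrSample.

```r
# Hypothetical helper: separate corr.sample objects from all other arguments.
split_dots <- function(...) {
  dots  <- list(...)
  is_cs <- vapply(dots, inherits, logical(1), what = "corr.sample")
  list(objects = dots[is_cs], plot_args = dots[!is_cs])
}

# Mock objects that only carry the class attribute.
cs_a <- structure(list(), class = "corr.sample")
cs_b <- structure(list(), class = "corr.sample")

res <- split_dots(cs_a, cs_b, ylim = c(-0.3, 0.5), auto.key = TRUE)
length(res$objects)     # 2
names(res$plot_args)    # "ylim" "auto.key"
```

In this sketch, anything in plot_args would be a candidate for forwarding to xyplot, which is why arguments such as ylim and auto.key can be given directly in the plot call.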

Two mechanisms for comparisons within the same plot are available. First, as mentioned above, multiple corr.sample objects can be shown in the same graph, each in its own panel. If cond is not specified, the panels are simply numbered in the order in which the objects appear in the argument list. Alternatively, one or two factors can be associated with each object. With one factor, cond is a vector with as many entries as there are corr.sample objects in the argument list; these entries label the panels of the corresponding objects. With two factors, cond is a list of two such vectors; the objects are cross-classified according to both categories, and the panels are arranged in a row-column pattern reflecting this cross-classification; see Examples.
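The cross-classification implied by a two-vector cond can be inspected in base R before plotting. This is only an illustration of the labelling logic, using the labels from the four-object call in the Examples; the column names method and dataset are chosen here for readability.

```r
# Labels for four objects: processing method and data set.
cond <- list(c("RMA", "MAS5", "RMA", "MAS5"),
             c("A", "A", "B", "B"))

# One row per corr.sample object, one column per conditioning factor.
panels <- data.frame(object  = seq_along(cond[[1]]),
                     method  = factor(cond[[1]]),
                     dataset = factor(cond[[2]]))

# Number of objects falling into each method / data set cell.
cell_counts <- table(panels$method, panels$dataset)
cell_counts
```

Every cell holding exactly one object means each object gets its own panel in the row-column layout; a count above one would mean that two objects share the same pair of labels.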

The other mechanism for graphical comparisons within the same plot is groups, which draws separate correlation curves for different sub-groups of pairs of genes; the standard example is to classify pairs of genes according to their common or average score on a quality control measure such as the MAS5 presence calls, see Examples. If there is only one corr.sample object in the function call (x), groups is a vector with as many entries as there are random pairs of genes in x. If several objects of class corr.sample have been specified in the function call, groups is a list of as many vectors as objects, where each vector has as many entries as the corresponding object has pairs of genes.
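The length requirements on groups can be checked before plotting. The helper check_groups below is hypothetical and not part of maCorrPlot; it takes the groups argument and the number of gene pairs in each object, and the present-call counts are simulated.

```r
# Hypothetical helper: does groups match the number of pairs in each object?
check_groups <- function(groups, n_pairs) {
  # A single vector or factor is treated as a one-element list.
  if (!is.list(groups)) groups <- list(groups)
  length(groups) == length(n_pairs) && all(lengths(groups) == n_pairs)
}

# Simulated present-call counts for 500 pairs, grouped as in the Examples.
set.seed(3)
present <- sample(0:60, 500, replace = TRUE)
grp <- cut(present, c(0, 20, 40, 60), include.lowest = TRUE)

check_groups(grp, 500)                      # one object: TRUE
check_groups(list(grp, grp), c(500, 500))   # two objects: TRUE
check_groups(grp[-1], 500)                  # one entry short: FALSE
```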

Value

A plot created by xyplot.

Warning

cond is translated into conditioning variables for xyplot, which will not hesitate to average correlations across different corr.sample objects. Since it is hard to see when this would be a good idea, plot.corr.sample generates a warning.

Author(s)

Alexander Ploner Alexander.Ploner@ki.se

References

Ploner A, Miller LD, Hall P, Bergh J, Pawitan Y. Correlation test to assess low-level processing of high-density oligonucleotide microarray data. BMC Bioinformatics, 2005, 6(1):80 http://www.pubmedcentral.gov/articlerender.fcgi?tool=pubmed&pubmedid=15799785

Examples

# Get small example data
data(oligodata)
dim(datA.rma)
dim(datB.rma)

# Compute the correlations for 500 random pairs;
# larger numbers are reasonable for larger data sets
cs1.rma = CorrSample(datA.rma, 500, seed=210)
plot(cs1.rma)

# Change the plot
plot(cs1.rma, scatter=TRUE, curve=TRUE, alpha=0.99)

# Compare with MAS5 values for the same data set
cs1.mas5 = CorrSample(datA.mas5, 500, seed=210)
plot(cs1.rma, cs1.mas5, cond=c("RMA","MAS5"))

# Group pairs of genes by their combined number of MAS5 present calls
pcntA = rowSums(datA.amp[cs1.mas5$ndx1, ]=="P") +
        rowSums(datA.amp[cs1.mas5$ndx2, ]=="P")
hist(pcntA)
pgrpA = cut(pcntA, c(0, 20, 40, 60), include.lowest=TRUE)
table(pgrpA)

# Plot the RMA values according to their MAS5 status 
# The artificial correlation is due to gene pairs with few present calls
plot(cs1.rma, groups=pgrpA, nint=5, auto.key=TRUE, ylim=c(-0.3, 0.5))

# Combine grouping and multiple conditions
plot(cs1.rma, cs1.mas5, cond=c("RMA","MAS5"), groups=list(pgrpA, pgrpA), 
     nint=5, auto.key=TRUE, ylim=c(-0.3, 0.5))

# Compare with second data set
# Specify more than one condition
cs2.rma  = CorrSample(datB.rma, 500, seed=391)
cs2.mas5 = CorrSample(datB.mas5, 500, seed=391)
plot(cs1.rma, cs1.mas5, cs2.rma, cs2.mas5,     
     cond=list(c("RMA","MAS5","RMA","MAS5"), c("A","A","B","B")))

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(maCorrPlot)
Loading required package: lattice
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/maCorrPlot/plot.corr.sample.Rd_%03d_medium.png", width=480, height=480)
> ### Name: plot.corr.sample
> ### Title: Plot correlation of random pairs of genes
> ### Aliases: plot.corr.sample panel.corr.sample
> ### Keywords: hplot
> 
> ### ** Examples
> 
> # Get small example data
> data(oligodata)
> dim(datA.rma)
[1] 1000   30
> dim(datB.rma)
[1] 1000   30
> 
> # Compute the correlations for 500 random pairs, 
> # Larger numbers are reasonable for larger data sets
> cs1.rma = CorrSample(datA.rma, 500, seed=210)
> plot(cs1.rma)
> 
> # Change the plot
> plot(cs1.rma, scatter=TRUE, curve=TRUE, alpha=0.99)
> 
> # Compare with MAS5 values for the same data set
> cs1.mas5 = CorrSample(datA.mas5, 500, seed=210)
> plot(cs1.rma, cs1.mas5, cond=c("RMA","MAS5"))
> 
> # We group pairs of gene by their average number of MAS5 present calls
> pcntA = rowSums(datA.amp[cs1.mas5$ndx1, ]=="P") +
+         rowSums(datA.amp[cs1.mas5$ndx2, ]=="P")
> hist(pcntA)
> pgrpA = cut(pcntA, c(0, 20, 40, 60), include.lowest=TRUE)
> table(pgrpA)
pgrpA
 [0,20] (20,40] (40,60] 
    157     213     130 
> 
> # Plot the RMA values according to their MAS5 status 
> # The artificial correlation is due to gene pairs with few present calls
> plot(cs1.rma, groups=pgrpA, nint=5, auto.key=TRUE, ylim=c(-0.3, 0.5))
> 
> # Combine grouping and multiple conditions
> plot(cs1.rma, cs1.mas5, cond=c("RMA","MAS5"), groups=list(pgrpA, pgrpA), 
+      nint=5, auto.key=TRUE, ylim=c(-0.3, 0.5))
> 
> # Compare with second data set
> # Specify more than one condition
> cs2.rma  = CorrSample(datB.rma, 500, seed=391)
> cs2.mas5 = CorrSample(datB.mas5, 500, seed=391)
> plot(cs1.rma, cs1.mas5, cs2.rma, cs2.mas5,     
+      cond=list(c("RMA","MAS5","RMA","MAS5"), c("A","A","B","B")))
> 
> dev.off()
null device 
          1 
>