Last data update: 2014.03.03

R: Use Procrustes to adjust an MDS map containing samples...
procrustesAdj    R Documentation

Use Procrustes to adjust an MDS map containing samples obtained under different conditions, e.g. technology or genetic backgrounds.

Description

The function adjusts a previous MDS to account for samples having been obtained under different conditions, e.g. technological or genetic. Pairwise adjustments are performed with Procrustes, using the samples present in both conditions. When there are more than two conditions, sequential pairwise adjustments are applied (in the order that maximizes the number of common samples in each pairwise adjustment).

Usage

procrustesAdj(mds1, d, adjust, sampleid)

Arguments

mds1

Object of class mds with a multidimensional scaling (MDS) analysis of a distance matrix, typically obtained by a previous call to mds.

d

Object of class distGPS with the distance matrix used to create the MDS object, usually through a call to mds.

adjust

Vector indicating the adjustment factor, i.e. the condition under which each sample has been obtained.

sampleid

Vector containing the sample identifiers. sampleid should take the same value for a given sample obtained under different conditions, as this is used to detect the samples to be used for the Procrustes adjustment.
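The following base R sketch illustrates how adjust and sampleid jointly determine which samples are common to a pair of conditions. The input vectors mirror the ones used in the Examples section; the variable names present, ncommon and common are hypothetical and are not part of the package.

```r
# Illustrative inputs: six samples, two conditions (same layout as the Examples)
adjust   <- rep(c('seq', 'chip'), each = 3)
sampleid <- c('s1', 's2', 's3', 's1', 's2', 's5')

# Presence/absence of each sample identifier under each condition
present <- unclass(table(sampleid, adjust)) > 0

# Number of sample identifiers shared by each pair of conditions
# (the diagonal holds the number of distinct identifiers per condition)
ncommon <- crossprod(present * 1)

# Identifiers observed under both conditions
common <- intersect(sampleid[adjust == 'seq'], sampleid[adjust == 'chip'])
```

With more than two conditions, a matrix such as ncommon could be used to pick the pair of conditions sharing the most samples first; how the package itself orders the pairwise adjustments is not shown here.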

Details

We implement the Procrustes adjustment as follows. First, we identify common samples, i.e. those obtained under both conditions A and B. Second, we use Procrustes to estimate the shift, scale and rotation that best match the positions of the samples in B to those in A. If only 1 sample was obtained under both conditions, only the shift is estimated. Last, we apply the estimated shift, scale and rotation to all B samples. That is, the Procrustes parameters are estimated using the common samples only and are then applied to all B samples to perform the adjustment.
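The steps above can be sketched in base R as follows. This is a minimal illustration of a standard least-squares Procrustes fit via the singular value decomposition, not the package's internal code. The function names procrustes_fit and procrustes_apply are hypothetical, and the estimated orthogonal matrix may include a reflection as well as a rotation.

```r
# Estimate shift, scale and rotation mapping B onto A from common samples only.
# A, B: numeric matrices with matching rows (same samples) and columns (dimensions).
procrustes_fit <- function(A, B) {
  if (nrow(A) == 1) {
    # A single common sample: only the shift can be estimated
    return(list(scale = 1, rotation = diag(ncol(A)), shift = as.vector(A - B)))
  }
  muA <- colMeans(A)
  muB <- colMeans(B)
  Ac <- sweep(A, 2, muA)                 # centered A
  Bc <- sweep(B, 2, muB)                 # centered B
  sv <- svd(crossprod(Bc, Ac))           # t(Bc) %*% Ac = U D V'
  R  <- sv$u %*% t(sv$v)                 # orthogonal matrix (may include a reflection)
  s  <- sum(sv$d) / sum(Bc^2)            # least-squares scale
  list(scale = s, rotation = R, shift = as.vector(muA - s * muB %*% R))
}

# Apply the estimated transformation to all samples of condition B
procrustes_apply <- function(fit, X) {
  sweep(fit$scale * X %*% fit$rotation, 2, fit$shift, '+')
}

# Toy check: B is a rotated, scaled and shifted copy of A
theta <- pi / 6
rot   <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
A_all <- matrix(c(0, 0, 1, 0, 0, 1, 2, 3), ncol = 2, byrow = TRUE)
B_all <- sweep(2 * A_all %*% rot, 2, c(5, -1), '+')

fit   <- procrustes_fit(A_all[1:3, ], B_all[1:3, ])   # first three rows act as common samples
B_adj <- procrustes_apply(fit, B_all)                 # adjustment applied to every B sample
```

In this toy setting the fourth row of B_all plays the role of a sample not used for estimation; because the transformation is exact, B_adj matches A_all up to numerical error.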

Notice that the R-squared of the adjusted MDS typically improves after Procrustes adjustment, since distances between samples obtained under different conditions are set to NA and the MDS therefore needs to approximate distances between fewer points.
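As a rough illustration of why masking cross-condition distances affects this statistic, the sketch below computes a squared correlation between original distances and the Euclidean distances in a map, using within-condition pairs only. Treating R-squared as a squared correlation is an assumption made for this example, and the function name r2_within and the toy data are hypothetical.

```r
# Squared correlation between original and map distances, within conditions only.
# D: symmetric distance matrix; pts: map coordinates; adjust: condition labels.
r2_within <- function(D, pts, adjust) {
  same <- outer(adjust, adjust, '==')    # TRUE for pairs from the same condition
  D[!same] <- NA                         # mask cross-condition distances
  dhat <- as.matrix(dist(pts))           # Euclidean distances in the map
  keep <- lower.tri(D) & !is.na(D)       # each remaining pair counted once
  cor(D[keep], dhat[keep])^2
}

# Toy data: six points, three per condition
pts    <- cbind(c(0, 1, 0, 3, 4, 3), c(0, 0, 1, 3, 3, 5))
adjust <- rep(c('seq', 'chip'), each = 3)
D      <- as.matrix(dist(pts))

# Distort only the cross-condition distances
D2 <- D
D2[1:3, 4:6] <- 10
D2[4:6, 1:3] <- 10

r2_exact     <- r2_within(D, pts, adjust)
r2_distorted <- r2_within(D2, pts, adjust)
```

Both values are equal here because the distorted entries are masked before the correlation is computed.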

When several replicates are available for a given sampleid under the same condition (adjust), the average position of all replicates is used.
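A minimal base R sketch of averaging replicate positions is given below. The coordinates and identifiers are made up for illustration, and the package may compute the averages differently.

```r
# Hypothetical map coordinates for one condition; 's1' has two replicates
pts <- matrix(c(0.0, 1.0,
                0.2, 1.2,
                1.0, 0.0), ncol = 2, byrow = TRUE)
ids <- c('s1', 's1', 's2')

# Average position per sample identifier: group sums divided by group sizes
# (rowsum() and table() both order the groups by sorted identifier)
avg <- rowsum(pts, ids) / as.vector(table(ids))
```

Each row of avg then provides a single position per sample identifier, which can be matched across conditions.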

Value

Adjusted mds object. Keep in mind that only the original distances between samples obtained under the same condition should be expected to be conserved. Once Procrustes has moved the points, the original distances between samples from different conditions no longer correlate with the distances between the corresponding points in the adjusted MDS.

Methods

signature(mds1='mds', d='distGPS')

mds1 is an mds object with the results of an MDS analysis, and d is the distGPS object with the distances used to create it.

See Also

distGPS for computing distances, mds to create MDS-oriented objects.

Examples

st1 <- runif(100,1,1000); st2 <- runif(100,500,1500)  #Peak starts
st3 <- runif(100,1000,2000); st4 <- runif(100,1500,2000)
#cond1: more precise technology
cond1 <- RangedDataList(s1=RangedData(IRanges(st1,st1+100)),s2=RangedData(IRanges(st2,st2+100)),s3=RangedData(IRanges(st3,st3+100)))
#cond2: less precise
cond2 <- RangedDataList(s1=RangedData(IRanges(st1-200,st1+300)),s2=RangedData(IRanges(st2-200,st2+300)),s5=RangedData(IRanges(st4-200,st4+300)))
x <- c(cond1,cond2)
d <- distGPS(x,metric='tanimoto')  #compute distances
mds1 <- mds(d)  #MDS
#Adjust via Procrustes
mds2 <- procrustesAdj(mds1,d,adjust=rep(c('seq','chip'),each=3),sampleid=names(x))
plot(mds1)
plot(mds2)
#Adjust via peak width
xadj <- adjustPeaks(x,adjust=rep(c('seq','chip'),each=3),sampleid=names(x))
dadj <- distGPS(xadj)
mds3 <- mds(dadj)
plot(mds3)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(chroGPS)
Loading required package: IRanges
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: MASS
Loading required package: changepoint
Loading required package: zoo

Attaching package: 'zoo'

The following objects are masked from 'package:base':

    as.Date, as.Date.numeric

Successfully loaded changepoint package version 2.2.1
 NOTE: Predefined penalty values have changed.  Previous penalty values with a postfix 1 i.e. SIC1 are now without i.e. SIC and previous penalties without a postfix i.e. SIC are now with a postfix 0 i.e. SIC0. See NEWS and help files for further details.
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/chroGPS/procrustesAdj.Rd_%03d_medium.png", width=480, height=480)
> ### Name: procrustesAdj
> ### Title: Use Procrustes to adjust an MDS map containing samples obtained
> ###   under different conditions, e.g. technology or genetic backgrounds.
> ### Aliases: procrustesAdj procrustesAdj-methods
> ###   procrustesAdj,mds,distGPS-method
> ### Keywords: multivariate,cluster
> 
> ### ** Examples
> 
> st1 <- runif(100,1,1000); st2 <- runif(100,500,1500)  #Peak starts
> st3 <- runif(100,1000,2000); st4 <- runif(100,1500,2000)
> #cond1: more precise technology
> cond1 <- RangedDataList(s1=RangedData(IRanges(st1,st1+100)),s2=RangedData(IRanges(st2,st2+100)),s3=RangedData(IRanges(st3,st3+100)))
Warning message:
RangedDataList objects are deprecated in favor of GRangesList objects
  (the GRangesList class is defined in the GenomicRanges package). 
> #cond2: less precise
> cond2 <- RangedDataList(s1=RangedData(IRanges(st1-200,st1+300)),s2=RangedData(IRanges(st2-200,st2+300)),s5=RangedData(IRanges(st4-200,st4+300)))
Warning message:
RangedDataList objects are deprecated in favor of GRangesList objects
  (the GRangesList class is defined in the GenomicRanges package). 
> x <- c(cond1,cond2)
> d <- distGPS(x,metric='tanimoto')  #compute distances
> mds1 <- mds(d)  #MDS
> #Adjust via Procrustes
> mds2 <- procrustesAdj(mds1,d,adjust=rep(c('seq','chip'),each=3),sampleid=names(x))
> plot(mds1)
> plot(mds2)
> #Adjust via peak width
> xadj <- adjustPeaks(x,adjust=rep(c('seq','chip'),each=3),sampleid=names(x))
> dadj <- distGPS(xadj)
> mds3 <- mds(dadj)
> plot(mds3)
> 
> dev.off()
null device 
          1 
>