R: Calculate pairwise similarities of phenoData between samples...
phenoFinder
R Documentation
Calculate pairwise similarities of phenoData between samples for a list containing two ExpressionSets
Description
This function acts as a wrapper to phenoDist to handle cases of
one ExpressionSet, a list of two identical ExpressionSets, or a
list of two different ExpressionSets.
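The three input cases named above can be told apart with a simple check. The following is only a sketch in base R: `classify_input` is a hypothetical helper that is not part of doppelgangR, and data frames stand in for ExpressionSets so that it runs without Bioconductor. It is not the package's actual implementation.

```r
# Hypothetical helper (not part of doppelgangR): label the input according
# to the three cases the description distinguishes.
classify_input <- function(eset.pair) {
  # Anything that is not a plain list is treated as a single dataset.
  # Data frames are lists internally, so they are checked explicitly.
  if (!is.list(eset.pair) || is.data.frame(eset.pair)) {
    return("single")
  }
  stopifnot(length(eset.pair) == 2)
  if (identical(eset.pair[[1]], eset.pair[[2]])) {
    "identical pair"
  } else {
    "different pair"
  }
}

# Stand-ins for ExpressionSets (assumption made for this sketch only).
a <- data.frame(age = c(50, 61), stage = c("III", "IV"))
b <- data.frame(age = c(47, 61), stage = c("II", "IV"))

classify_input(a)           # "single"
classify_input(list(a, a))  # "identical pair"
classify_input(list(a, b))  # "different pair"
```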
Usage
phenoFinder(eset.pair, separator = ":", ...)
Arguments
eset.pair
a list of two ExpressionSets, or a single ExpressionSet. If the two
list elements are identical, the similarity matrix is returned for
pairs of samples within the first element. If they are not
identical, it is returned for pairs of samples between the two elements.
separator
a separator placed between the dataset name (taken from the list
names) and the sample name (taken from sampleNames(eset)), to keep
track of which samples come from which dataset.
...
Extra arguments passed on to phenoDist
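As an illustration of the labeling convention described for separator, the base R sketch below joins dataset names and sample names with ":" and then splits them again. The variable names are made up for this sketch. Exactly where phenoFinder applies such labels is not documented on this page, so treat this as an assumption about the convention rather than a description of the function's internals.

```r
# Dataset names as they would appear in names(eset.pair); sample names as
# they would come from sampleNames() on each element (values from the
# example on this page).
dataset_names <- c("JapaneseB", "Yoshihara2010")
sample_names <- list(c("GSM795125", "GSM795126"),
                     c("GSM432220", "GSM432221"))
separator <- ":"

# Build "dataset<separator>sample" labels.
labels <- unlist(
  Map(function(d, s) paste(d, s, sep = separator), dataset_names, sample_names),
  use.names = FALSE
)
labels
# "JapaneseB:GSM795125" "JapaneseB:GSM795126"
# "Yoshihara2010:GSM432220" "Yoshihara2010:GSM432221"

# Recover the dataset of origin from each label.
parts <- strsplit(labels, separator, fixed = TRUE)
origin <- vapply(parts, function(p) p[1], character(1))
origin
# "JapaneseB" "JapaneseB" "Yoshihara2010" "Yoshihara2010"
```

A separator that never occurs in the dataset or sample names keeps the split unambiguous.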
Value
A matrix of similarities between the phenotypes of pairs of samples.
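One way to use such a matrix is to look up the pair of samples with the highest similarity. The sketch below uses base R only. The matrix sim is a small stand-in built from the first three rows and columns of the example output on this page, not a fresh call to phenoFinder.

```r
# Stand-in similarity matrix: rows are samples from one dataset, columns
# are samples from the other (values copied from the example output).
sim <- matrix(
  c(0.2351904, 0.1014047, 0.3525417,
    0.5404524, 0.2588727, 0.4083015,
    0.3791279, 0.5008562, 0.4983502),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GSM795125", "GSM795126", "GSM795127"),
                  c("GSM432220", "GSM432221", "GSM432222"))
)

# Locate the largest entry; arr.ind = TRUE returns row and column indices.
best <- which(sim == max(sim), arr.ind = TRUE)
best_row <- rownames(sim)[best[1, "row"]]
best_col <- colnames(sim)[best[1, "col"]]

best_row  # "GSM795126"
best_col  # "GSM432220"
```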
Author(s)
Levi Waldron, Markus Riester, Marcel Ramos
Examples
library(doppelgangR) ## provides phenoFinder
library(curatedOvarianData)
data(GSE32063_eset)
data(GSE17260_eset)
esets2 <- list(JapaneseB=GSE32063_eset,
               Yoshihara2010=GSE17260_eset)
## standardize the sample ids to improve matching based on clinical annotation
esets2 <- lapply(esets2, function(X){
    X$alt_sample_name <- paste(X$sample_type, gsub("[^0-9]", "", X$alt_sample_name), sep="_")
    ## Removal of columns that cannot possibly match also helps duplicated patients to stand out
    pData(X) <- pData(X)[, !grepl("uncurated_author_metadata", colnames(pData(X)))]
    X <- X[, 1:20] ## keep the first 20 samples to speed up computations
    return(X)
})
## See first six samples in both rows and columns
phenoFinder(esets2)[1:6, 1:6]
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)
> library(doppelgangR)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel
Loading required package: BiocParallel
> library(curatedOvarianData)
Loading required package: affy
> data(GSE32063_eset)
> data(GSE17260_eset)
> esets2 <- list(JapaneseB=GSE32063_eset,
+ Yoshihara2010=GSE17260_eset)
>
> ## standardize the sample ids to improve matching based on clinical annotation
> esets2 <- lapply(esets2, function(X){
+ X$alt_sample_name <- paste(X$sample_type, gsub("[^0-9]", "", X$alt_sample_name), sep="_")
+
+ ## Removal of columns that cannot possibly match also helps duplicated patients to stand out
+ pData(X) <- pData(X)[, !grepl("uncurated_author_metadata", colnames(pData(X)))]
+ X <- X[, 1:20] ##speed computations
+ return(X) })
>
> ## See first six samples in both rows and columns
> phenoFinder(esets2)[1:6, 1:6]
          GSM432220 GSM432221 GSM432222 GSM432223 GSM432224  GSM432225
GSM795125 0.2351904 0.1014047 0.3525417 0.7274151 0.2189890 0.27397077
GSM795126 0.5404524 0.2588727 0.4083015 0.4079720 0.2927870 0.74123368
GSM795127 0.3791279 0.5008562 0.4983502 0.4981226 0.6385506 0.04416984
GSM795128 0.2351904 0.1014047 0.3525417 0.3523760 0.2189890 0.27397077
GSM795129 0.1076309 0.2395470 0.2190910 0.2189890 0.3643260 0.16030839
GSM795130 0.2603947 0.1344290 0.1077761 0.1076793 0.2489234 0.29544860