Last data update: 2014.03.03

makeSummarizedExperimentFromExpressionSet    R Documentation

Make a RangedSummarizedExperiment object from an ExpressionSet and vice-versa

Description

Coercion between RangedSummarizedExperiment and ExpressionSet is supported in both directions.

For going from ExpressionSet to RangedSummarizedExperiment, the makeSummarizedExperimentFromExpressionSet function is also provided to let the user control how to map features to ranges.

Usage

makeSummarizedExperimentFromExpressionSet(from,
                                          mapFun=naiveRangeMapper,
                                          ...)

## range mapping functions
naiveRangeMapper(from)
probeRangeMapper(from)
geneRangeMapper(txDbPackage, key = "ENTREZID")

Arguments

from

An ExpressionSet object.

mapFun

A function that takes an ExpressionSet object and returns a GRanges or GRangesList object corresponding to the genomic ranges used in the ExpressionSet. The names of the returned object are used to match the featureNames of the ExpressionSet.

The naiveRangeMapper function is used by default.

...

Additional arguments passed to mapFun.

txDbPackage

A character string naming the TxDb (transcript database) package to use for the mapping, e.g. "TxDb.Hsapiens.UCSC.hg19.knownGene".

key

A character string giving the gene identifier key to use for the mapping. Defaults to "ENTREZID".
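
As an illustration of the mapFun and ... arguments, here is a minimal sketch of a user-supplied range mapper. The names toyRangeMapper, n and "chrToy" are hypothetical and exist only for this example, and the ranges are made up rather than real genomic coordinates. The sketch assumes that features are matched on the names of the returned object and that features without a matching name are dropped.

```r
library(SummarizedExperiment)
library(Biobase)

data(sample.ExpressionSet, package = "Biobase")

# Hypothetical mapper: assigns made-up, evenly spaced 100 bp ranges on a
# fictitious sequence "chrToy" to the first 'n' features. Illustrative only.
toyRangeMapper <- function(from, n = 10) {
    ids <- head(featureNames(from), n)
    starts <- seq(1, by = 1000, length.out = length(ids))
    gr <- GRanges(seqnames = "chrToy",
                  ranges = IRanges(start = starts, width = 100))
    names(gr) <- ids  # names are matched against featureNames(from)
    gr
}

# 'n = 5' is forwarded to toyRangeMapper() through '...'
rse <- makeSummarizedExperimentFromExpressionSet(sample.ExpressionSet,
                                                 toyRangeMapper, n = 5)
dim(rse)
```

Under these assumptions, only the features named by the mapper are kept in the result, which is consistent with the reduced row counts reported for probeRangeMapper and geneRangeMapper in the Results.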

Value

makeSummarizedExperimentFromExpressionSet takes an ExpressionSet object as input and a range mapping function that maps the features to ranges. It then returns a RangedSummarizedExperiment object that corresponds to the input.

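The range mapping functions return a GRanges (or GRangesList) object whose names correspond to the featureNames of the ExpressionSet object. In the Examples, geneRangeMapper is called with a TxDb package name and its result is what gets passed as mapFun.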
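
The correspondence between names and featureNames can be checked by calling a mapper directly. This is a small sketch using the default naiveRangeMapper on the Biobase sample data. The variable name ranges is arbitrary, and the exact class returned may depend on whether the featureData contains range columns.

```r
library(SummarizedExperiment)
library(Biobase)

data(sample.ExpressionSet, package = "Biobase")

# Call the default mapper directly to inspect what it returns.
ranges <- naiveRangeMapper(sample.ExpressionSet)

class(ranges)

# One element per feature, named after the feature.
length(ranges) == nrow(sample.ExpressionSet)
identical(names(ranges), featureNames(sample.ExpressionSet))
```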

Author(s)

Jim Hester, james.f.hester@gmail.com

See Also

  • RangedSummarizedExperiment objects.

  • ExpressionSet objects in the Biobase package.

  • TxDb objects in the GenomicFeatures package.

Examples

## ---------------------------------------------------------------------
## GOING FROM ExpressionSet TO RangedSummarizedExperiment
## ---------------------------------------------------------------------

data(sample.ExpressionSet, package="Biobase")

# 2 equivalent ways of doing the naive coercion
makeSummarizedExperimentFromExpressionSet(sample.ExpressionSet)
as(sample.ExpressionSet, "RangedSummarizedExperiment")

# using probe range mapper
makeSummarizedExperimentFromExpressionSet(sample.ExpressionSet, probeRangeMapper)

# using the gene range mapper
makeSummarizedExperimentFromExpressionSet(sample.ExpressionSet,
                                          geneRangeMapper("TxDb.Hsapiens.UCSC.hg19.knownGene"))

## ---------------------------------------------------------------------
## GOING FROM RangedSummarizedExperiment TO ExpressionSet
## ---------------------------------------------------------------------

example(RangedSummarizedExperiment)  # to create 'rse'
rse
as(rse, "ExpressionSet")
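
The warning shown in the Results for this coercion says that the first assay is renamed when none is called 'exprs'. The following sketch assumes that naming the assay 'exprs' up front avoids that renaming. The object names (m, gr, small_rse, eset) and the toy values are made up for illustration.

```r
library(SummarizedExperiment)
library(Biobase)

# Toy 4 x 3 matrix; feature and sample names are arbitrary.
m <- matrix(seq_len(12), nrow = 4,
            dimnames = list(paste0("f", 1:4), LETTERS[1:3]))

gr <- GRanges("chr1", IRanges(start = c(1, 101, 201, 301), width = 50))
names(gr) <- rownames(m)

small_rse <- SummarizedExperiment(
    assays = SimpleList(exprs = m),  # assay already named 'exprs'
    rowRanges = gr,
    colData = DataFrame(Treatment = c("ChIP", "Input", "ChIP"),
                        row.names = LETTERS[1:3]))

eset <- as(small_rse, "ExpressionSet")
dim(exprs(eset))
```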

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(SummarizedExperiment)
Loading required package: GenomicRanges
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

> ### Name: makeSummarizedExperimentFromExpressionSet
> ### Title: Make a RangedSummarizedExperiment object from an ExpressionSet
> ###   and vice-versa
> ### Aliases: makeSummarizedExperimentFromExpressionSet naiveRangeMapper
> ###   probeRangeMapper geneRangeMapper
> ###   coerce,ExpressionSet,RangedSummarizedExperiment-method
> ###   coerce,RangedSummarizedExperiment,ExpressionSet-method
> ###   coerce,SummarizedExperiment,ExpressionSet-method
> ### Keywords: manip
> 
> ### ** Examples
> 
> ## ---------------------------------------------------------------------
> ## GOING FROM ExpressionSet TO RangedSummarizedExperiment
> ## ---------------------------------------------------------------------
> 
> data(sample.ExpressionSet, package="Biobase")
> 
> # 2 equivalent ways of doing the naive coercion
> makeSummarizedExperimentFromExpressionSet(sample.ExpressionSet)
class: RangedSummarizedExperiment 
dim: 500 26 
metadata(3): experimentData annotation protocolData
assays(2): exprs se.exprs
rownames(500): AFFX-MurIL2_at AFFX-MurIL10_at ... 31738_at 31739_at
rowData names(0):
colnames(26): A B ... Y Z
colData names(3): sex type score
> as(sample.ExpressionSet, "RangedSummarizedExperiment")
class: RangedSummarizedExperiment 
dim: 500 26 
metadata(3): experimentData annotation protocolData
assays(2): exprs se.exprs
rownames(500): AFFX-MurIL2_at AFFX-MurIL10_at ... 31738_at 31739_at
rowData names(0):
colnames(26): A B ... Y Z
colData names(3): sex type score
> 
> # using probe range mapper
> makeSummarizedExperimentFromExpressionSet(sample.ExpressionSet, probeRangeMapper)


'select()' returned 1:many mapping between keys and columns
class: RangedSummarizedExperiment 
dim: 354 26 
metadata(3): experimentData annotation protocolData
assays(2): exprs se.exprs
rownames(354): AFFX-HUMISGF3A/M97935_5_at AFFX-HUMISGF3A/M97935_MA_at
  ... 31736_at 31737_at
rowData names(0):
colnames(26): A B ... Y Z
colData names(3): sex type score
Warning messages:
1: In .deprecatedColsMessage() :
  Accessing gene location information via 'CHR','CHRLOC','CHRLOCEND' is
  deprecated. Please use a range based accessor like genes(), or select()
  with columns values like TXCHROM and TXSTART on a TxDb or OrganismDb
  object instead.
2: 146 probes could not be mapped. 
> 
> # using the gene range mapper
> makeSummarizedExperimentFromExpressionSet(sample.ExpressionSet,
+                                           geneRangeMapper("TxDb.Hsapiens.UCSC.hg19.knownGene"))
'select()' returned 1:many mapping between keys and columns
class: RangedSummarizedExperiment 
dim: 331 26 
metadata(3): experimentData annotation protocolData
assays(2): exprs se.exprs
rownames(331): AFFX-HUMISGF3A/M97935_5_at AFFX-HUMISGF3A/M97935_5_at
  ... 31736_at 31737_at
rowData names(1): gene_id
colnames(26): A B ... Y Z
colData names(3): sex type score
Warning message:
169 probes could not be mapped. 
> 
> ## ---------------------------------------------------------------------
> ## GOING FROM RangedSummarizedExperiment TO ExpressionSet
> ## ---------------------------------------------------------------------
> 
> example(RangedSummarizedExperiment)  # to create 'rse'

RngdSE> nrows <- 200; ncols <- 6

RngdSE> counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)

RngdSE> rowRanges <- GRanges(rep(c("chr1", "chr2"), c(50, 150)),
RngdSE+                      IRanges(floor(runif(200, 1e5, 1e6)), width=100),
RngdSE+                      strand=sample(c("+", "-"), 200, TRUE),
RngdSE+                      feature_id=sprintf("ID%03d", 1:200))

RngdSE> colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
RngdSE+                      row.names=LETTERS[1:6])

RngdSE> rse <- SummarizedExperiment(assays=SimpleList(counts=counts),
RngdSE+                             rowRanges=rowRanges, colData=colData)

RngdSE> rse
class: RangedSummarizedExperiment 
dim: 200 6 
metadata(0):
assays(1): counts
rownames: NULL
rowData names(1): feature_id
colnames(6): A B ... E F
colData names(1): Treatment

RngdSE> dim(rse)
[1] 200   6

RngdSE> dimnames(rse)
[[1]]
NULL

[[2]]
[1] "A" "B" "C" "D" "E" "F"


RngdSE> assayNames(rse)
[1] "counts"

RngdSE> head(assay(rse))
            A        B         C        D        E        F
[1,] 9499.761 7031.571 2591.5844 6886.931 8220.292 9631.723
[2,] 2656.991 7371.272 1181.8730 5802.887 6487.127 2664.323
[3,] 6146.817 1909.953  525.4137 5677.161 7359.891 7766.353
[4,] 2121.819 3235.342  984.1765 5254.603 1548.752 7893.282
[5,] 6599.005 8669.629 3060.1926 4653.174 4853.539 3418.082
[6,] 8099.391 3158.599  913.8810 4225.905 7331.335 3044.385

RngdSE> assays(rse) <- endoapply(assays(rse), asinh)

RngdSE> head(assay(rse))
            A        B        C        D        E        F
[1,] 9.852169 9.551313 8.553172 9.530528 9.707508 9.865965
[2,] 8.578097 9.598493 7.768003 9.359258 9.470722 8.580852
[3,] 9.416837 8.247981 6.957334 9.337354 9.596948 9.650703
[4,] 8.353176 8.775037 7.584953 9.260007 8.038352 9.666914
[5,] 9.487821 9.760728 8.719380 9.138452 9.180611 8.829982
[6,] 9.692691 8.751031 7.510848 9.042136 9.593060 8.714201

RngdSE> rowRanges(rse)
GRanges object with 200 ranges and 1 metadata column:
        seqnames           ranges strand |  feature_id
           <Rle>        <IRanges>  <Rle> | <character>
    [1]     chr1 [597744, 597843]      + |       ID001
    [2]     chr1 [412119, 412218]      - |       ID002
    [3]     chr1 [461106, 461205]      + |       ID003
    [4]     chr1 [497343, 497442]      + |       ID004
    [5]     chr1 [707754, 707853]      + |       ID005
    ...      ...              ...    ... .         ...
  [196]     chr2 [521527, 521626]      - |       ID196
  [197]     chr2 [438383, 438482]      + |       ID197
  [198]     chr2 [404815, 404914]      + |       ID198
  [199]     chr2 [862832, 862931]      + |       ID199
  [200]     chr2 [625920, 626019]      + |       ID200
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths

RngdSE> rowData(rse)  # same as 'mcols(rowRanges(rse))'
DataFrame with 200 rows and 1 column
     feature_id
    <character>
1         ID001
2         ID002
3         ID003
4         ID004
5         ID005
...         ...
196       ID196
197       ID197
198       ID198
199       ID199
200       ID200

RngdSE> colData(rse)
DataFrame with 6 rows and 1 column
    Treatment
  <character>
A        ChIP
B       Input
C        ChIP
D       Input
E        ChIP
F       Input

RngdSE> rse[, rse$Treatment == "ChIP"]
class: RangedSummarizedExperiment 
dim: 200 3 
metadata(0):
assays(1): counts
rownames: NULL
rowData names(1): feature_id
colnames(3): A C E
colData names(1): Treatment

RngdSE> ## cbind() combines objects with the same ranges but different samples:
RngdSE> rse1 <- rse

RngdSE> rse2 <- rse1[,1:3]

RngdSE> colnames(rse2) <- letters[1:ncol(rse2)] 

RngdSE> cmb1 <- cbind(rse1, rse2)

RngdSE> dim(cmb1)
[1] 200   9

RngdSE> dimnames(cmb1)
[[1]]
NULL

[[2]]
[1] "A" "B" "C" "D" "E" "F" "a" "b" "c"


RngdSE> ## rbind() combines objects with the same samples but different ranges:
RngdSE> rse1 <- rse

RngdSE> rse2 <- rse1[1:50,]

RngdSE> rownames(rse2) <- letters[1:nrow(rse2)] 

RngdSE> cmb2 <- rbind(rse1, rse2)

RngdSE> dim(cmb2)
[1] 250   6

RngdSE> dimnames(cmb2)
[[1]]
  [1] ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  "" 
 [19] ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  "" 
 [37] ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  "" 
 [55] ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  "" 
 [73] ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  "" 
 [91] ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  "" 
[109] ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  "" 
[127] ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  "" 
[145] ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  "" 
[163] ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  "" 
[181] ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  "" 
[199] ""  ""  "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p"
[217] "q" "r" "s" "t" "u" "v" "w" "x" "y" "z" NA  NA  NA  NA  NA  NA  NA  NA 
[235] NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA 

[[2]]
[1] "A" "B" "C" "D" "E" "F"


RngdSE> ## Coercion to/from SummarizedExperiment:
RngdSE> se0 <- as(rse, "SummarizedExperiment")

RngdSE> se0
class: SummarizedExperiment 
dim: 200 6 
metadata(0):
assays(1): counts
rownames: NULL
rowData names(1): feature_id
colnames(6): A B ... E F
colData names(1): Treatment

RngdSE> as(se0, "RangedSummarizedExperiment")
class: RangedSummarizedExperiment 
dim: 200 6 
metadata(0):
assays(1): counts
rownames: NULL
rowData names(0):
colnames(6): A B ... E F
colData names(1): Treatment

RngdSE> ## Setting rowRanges on a SummarizedExperiment object turns it into a
RngdSE> ## RangedSummarizedExperiment object:
RngdSE> se <- se0

RngdSE> rowRanges(se) <- rowRanges

RngdSE> se  # RangedSummarizedExperiment
class: RangedSummarizedExperiment 
dim: 200 6 
metadata(0):
assays(1): counts
rownames: NULL
rowData names(1): feature_id
colnames(6): A B ... E F
colData names(1): Treatment

RngdSE> ## Sanity checks:
RngdSE> stopifnot(identical(assays(se0), assays(rse)))

RngdSE> stopifnot(identical(dim(se0), dim(rse)))

RngdSE> stopifnot(identical(dimnames(se0), dimnames(rse)))

RngdSE> stopifnot(identical(rowData(se0), rowData(rse)))

RngdSE> stopifnot(identical(colData(se0), colData(rse)))
> rse
class: RangedSummarizedExperiment 
dim: 200 6 
metadata(0):
assays(1): counts
rownames: NULL
rowData names(1): feature_id
colnames(6): A B ... E F
colData names(1): Treatment
> as(rse, "ExpressionSet")
ExpressionSet (storageMode: environment)
assayData: 200 features, 6 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: A B ... F (6 total)
  varLabels: Treatment
  varMetadata: labelDescription
featureData
  featureNames: 1 2 ... 200 (200 total)
  fvarLabels: seqnames start ... feature_id (6 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:  
Warning message:
In asMethod(object) :
  No assay named 'exprs' found, renaming counts to 'exprs'.
> 