R: A function assigning promoter regions to given probe IDs.
matchProbeToPromoter
R Documentation
Description
This function returns a GRangesList object assigning promoter regions
to probes. The assignment of transcripts to probes and the
transcriptional start sites (TSSs) must be given as arguments.
Arguments
probeToTranscript
A list with character vectors as elements. The elements' names are
probe IDs and the character vectors store the transcript IDs assigned to
that probe.
transcriptToTSS
A data.frame with at least four columns, in this order:
Transcript ID as given in the argument probeToTranscript
Chromosome
Transcriptional start site in base pairs
Strand
promWidth
Width of the promoter regions in base pairs. The position of each
region relative to the transcriptional start site is determined by
the argument fix. (default 4000bp)
mode
How probes with multiple transcripts should be handled. Must be either
"union", "keepAll" or "dropMultiple". (default "union")
fix
Denotes what to use as the anchor when defining the promoter
region. Must be either "center", "start" or "end". "center" means
that the TSS is in the middle of the promoter, whereas "end" means
that the promoter is placed upstream of the TSS. (default "center")
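The anchor arithmetic can be illustrated in base R. This is only a sketch, not the package's implementation, and the helper promoter_range is hypothetical. The "center" case is written so that it reproduces the ranges shown in the example output on this page (a TSS of 100000 with promWidth 4000 gives [98000, 101999]). The "start" and "end" cases assume strand-aware anchoring, which this page does not confirm.

```r
# Hypothetical helper, not part of epigenomix: compute one promoter interval
# from a TSS, a strand, a width, and an anchor.
promoter_range <- function(tss, strand, promWidth = 4000,
                           fix = c("center", "start", "end")) {
  fix <- match.arg(fix)
  if (fix == "center") {
    # TSS in the middle; agrees with the example output on this page.
    start <- tss - promWidth %/% 2
  } else {
    # Assumption: strand-aware anchoring. "end" places the whole promoter
    # upstream of the TSS, "start" places it downstream.
    upstream <- fix == "end"
    left_of_tss <- (strand == "+") == upstream
    start <- if (left_of_tss) tss - promWidth + 1 else tss
  }
  c(start = start, end = start + promWidth - 1)
}

promoter_range(100000, "-", 4000, "center")  # start 98000, end 101999
promoter_range(200000, "+", 4000, "end")     # start 196001, end 200000
```

On the minus strand, "upstream" means higher coordinates, so the same call with strand "-" and fix "end" starts at the TSS and extends to the right.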
Details
More than one transcript can be assigned to one probe in the given
probeToTranscript argument. The argument mode selects how such
cases are handled. "union":
The union of all promoters is calculated and assigned to the probe.
"keepAll": The promoters of all transcripts are assigned to the
probe. If some transcripts have identical TSSs, the same promoter
region occurs several times. "dropMultiple": All probes that have
more than one transcript with different TSSs are removed.
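The three modes can be sketched in base R on plain start/end intervals. This is an illustration under stated assumptions, not the code epigenomix runs: all helper names are hypothetical, promoters are centered on the TSS, "union" is taken to mean merging overlapping intervals, and probes without any transcript are dropped. The inputs are those of the Examples section.

```r
# Inputs copied from the Examples section of this page.
probeToTrans <- list("101" = "ENST00011",
                     "102" = c("ENST00021", "ENST00022"),
                     "103" = NA)
tss <- c(ENST00011 = 100000, ENST00021 = 200000, ENST00022 = 201000)
promWidth <- 4000

# Hypothetical helper: centered promoter intervals, one row per transcript.
promoters <- function(ids) {
  start <- unname(tss[ids]) - promWidth %/% 2
  data.frame(start = start, end = start + promWidth - 1)
}

# Hypothetical helper: merge overlapping intervals (assumed meaning of "union").
merge_intervals <- function(iv) {
  iv <- iv[order(iv$start), ]
  out <- iv[1, ]
  for (i in seq_len(nrow(iv))[-1]) {
    last <- nrow(out)
    if (iv$start[i] <= out$end[last]) {
      out$end[last] <- max(out$end[last], iv$end[i])
    } else {
      out <- rbind(out, iv[i, ])
    }
  }
  out
}

# Hypothetical helper: apply one mode to the transcripts of a single probe.
handle_probe <- function(ids, mode) {
  ids <- ids[!is.na(ids)]
  if (length(ids) == 0) return(NULL)  # probe without transcripts
  if (mode == "dropMultiple" && length(unique(tss[ids])) > 1) return(NULL)
  iv <- promoters(ids)
  if (mode == "keepAll") iv else merge_intervals(iv)
}

# Apply a mode to every probe and drop probes that yield nothing.
assign_promoters <- function(mode) {
  Filter(Negate(is.null), lapply(probeToTrans, handle_probe, mode = mode))
}

assign_promoters("union")         # probe 102: one interval, 198000 to 202999
assign_promoters("keepAll")       # probe 102: two overlapping intervals
assign_promoters("dropMultiple")  # probe 102 removed, only probe 101 left
```

For "union" and "keepAll" the intervals agree with the ranges printed in the example output on this page.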
The argument transcriptToTSS must have at least 4 columns
giving the information as described above. The columns are identified
by their position; the column names are ignored.
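Because columns are matched by position, an annotation table with a different layout has to be reordered before it is passed in. A minimal base R sketch follows; the table annot and all of its column names are made up for illustration.

```r
# An annotation table whose columns have arbitrary names and order.
# All names here are hypothetical.
annot <- data.frame(
  chrom       = c("1", "1", "1"),
  orientation = c("-", "+", "+"),
  tx          = c("ENST00011", "ENST00021", "ENST00022"),
  position    = c(100000, 200000, 201000),
  stringsAsFactors = FALSE
)

# Reorder into the expected layout:
# transcript ID, chromosome, TSS, strand.
transToTSS <- annot[, c("tx", "chrom", "position", "orientation")]
transToTSS
```

The names can stay as they are; only the order of the first four columns needs to match the description of transcriptToTSS.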
Value
An object of class GRangesList with one element for each probe.
If mode is not set to "dropMultiple", a GRanges element may consist
of more than one range. The names of the list's elements are the probe
IDs; in addition, each GRanges has a metadata column
"probe" giving the corresponding probe ID.
> library(epigenomix)
> ### Name: matchProbeToPromoter
> ### Title: A function assigning promoter regions to given probe IDs.
> ### Aliases: matchProbeToPromoter
> ### matchProbeToPromoter,list,data.frame-method
>
> ### ** Examples
>
> probeToTrans <- list("101"="ENST00011",
+                      "102"=c("ENST00021", "ENST00022"),
+                      "103"=NA)
> transToTSS <- data.frame(
+     transID=c("ENST00011", "ENST00021", "ENST00022"),
+     chr=c("1", "1", "1"),
+     tss=c(100000, 200000, 201000),
+     strand=c("-", "+", "+"))
>
> matchProbeToPromoter(probeToTrans, transToTSS,
+     promWidth=4000, mode="union")
GRangesList object of length 2:
$101
GRanges object with 1 range and 1 metadata column:
      seqnames          ranges strand |       probe
         <Rle>       <IRanges>  <Rle> | <character>
  [1]        1 [98000, 101999]      - |         101

$102
GRanges object with 1 range and 1 metadata column:
      seqnames           ranges strand | probe
  [1]        1 [198000, 202999]      + |   102

-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
> matchProbeToPromoter(probeToTrans, transToTSS,
+     promWidth=4000, mode="keepAll")
GRangesList object of length 2:
$101
GRanges object with 1 range and 1 metadata column:
      seqnames          ranges strand |       probe
         <Rle>       <IRanges>  <Rle> | <character>
  [1]        1 [98000, 101999]      - |         101

$102
GRanges object with 2 ranges and 1 metadata column:
      seqnames           ranges strand | probe
  [1]        1 [198000, 201999]      + |   102
  [2]        1 [199000, 202999]      + |   102

-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths