Last data update: 2014.03.03
R: REDseq
REDseq-package R Documentation
REDseq
Description
REDSeq is a Bioconductor package for building genomic map of restriction enzyme sites REmap, assigning sequencing tags to RE sites using five different strategies, visualizing genome-wide distribution of differentially cut regions with the REmap as reference and the distance distribution of sequence tags to corresponding RE sites, generating count table for identifying statistically significant RE sites using edgeR or DEseq.
Details
Package: REDseq
Type: Package
Version: 1.0
Date: 2011-05-10
License: GPL
LazyLoad: yes
~~ An overview of how to use the package, including the most important functions ~~
Author(s)
Lihua Julie Zhu
Maintainer:
Lihua Julie Zhu <julie.zhu@umassmed.edu>
References
1. Roberts, R.J., Restriction endonucleases. CRC Crit Rev Biochem, 1976. 4(2): p. 123-64.
2. Kessler, C. and V. Manta, Specificity of restriction endonucleases and DNA modification methyltransferases a review (Edition 3). Gene, 1990. 92(1-2): p. 1-248.
3. Pingoud, A., J. Alves, and R. Geiger, Restriction enzymes. Methods Mol Biol, 1993. 16: p. 107-200.
4. Anders, S. and W. Huber, Differential expression analysis for sequence count data. Genome Biol, 2010. 11(10): p. R106.
5. Robinson, M.D., D.J. McCarthy, and G.K. Smyth, edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 2010. 26(1): p. 139-40.
6. Zhu, L.J., et al., ChIPpeakAnno: a Bioconductor package to annotate ChIP-seq and ChIP-chip data. BMC Bioinformatics, 2010. 11: p. 237.
7. Pages, H., BSgenome package. http://bioconductor.org/packages/2.8/bioc/
vignettes/BSgenome/inst/doc/GenomeSearching.pdf
8. Zhu, L.J., et al., REDseq: A Bioconductor package for Analyzing High Throughput Sequencing Data from Restriction Enzyme Digestion. (In preparation)
See Also
buildREmap, assignSeq2REsit, plotCutDistribution, distanceHistSeq2RE, summarizeByRE, summarizeBySeq, compareREseq, binom.test.REDseq
Examples
if(interactive()){
library(ChIPpeakAnno)
REpatternFilePath = system.file("extdata", "examplePattern.fa", package="REDseq")
library(BSgenome.Celegans.UCSC.ce2)
buildREmap( REpatternFilePath, BSgenomeName=Celegans, outfile=tempfile())
library(REDseq)
data(example.REDseq)
data(example.map)
r.unique = assignSeq2REsite(example.REDseq, example.map, cut.offset = 1,
seq.length = 36, allowed.offset = 5, min.FragmentLength = 60,
max.FragmentLength = 300, partitionMultipleRE = "unique")
r.average = assignSeq2REsite(example.REDseq, example.map, cut.offset = 1,
seq.length = 36, allowed.offset = 5, min.FragmentLength = 60,
max.FragmentLength = 300, partitionMultipleRE = "average")
r.random = assignSeq2REsite(example.REDseq, example.map, cut.offset = 1,
seq.length = 36, allowed.offset = 5, min.FragmentLength = 60,
max.FragmentLength = 300, partitionMultipleRE = "random")
r.best = assignSeq2REsite(example.REDseq, example.map, cut.offset = 1,
seq.length = 36, allowed.offset = 5, min.FragmentLength = 60,
max.FragmentLength = 300, partitionMultipleRE = "best")
r.estimate = assignSeq2REsite(example.REDseq, example.map, cut.offset = 1,
seq.length = 36, allowed.offset = 5, min.FragmentLength = 60,
max.FragmentLength = 300, partitionMultipleRE = "estimate")
r.estimate$passed.filter
r.estimate$notpassed.filter
data(example.assignedREDseq)
plotCutDistribution(example.assignedREDseq,example.map,
chr="2", xlim =c(3012000, 3020000))
distanceHistSeq2RE(example.assignedREDseq,ylim=c(0,20))
summarizeByRE(example.assignedREDseq,by="Weight",sampleName="example")
REsummary =summarizeByRE(example.assignedREDseq,by="Weight")
binom.test.REDseq(REsummary)
}
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(REDseq)
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: BSgenome.Celegans.UCSC.ce2
Loading required package: BSgenome
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: Biostrings
Loading required package: XVector
Loading required package: rtracklayer
Loading required package: multtest
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: ChIPpeakAnno
Loading required package: grid
Loading required package: VennDiagram
Loading required package: futile.logger
No methods found in "IRanges" for requests: %in%
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/REDseq/REDseq-package.Rd_%03d_medium.png", width=480, height=480)
> ### Name: REDseq-package
> ### Title: REDseq
> ### Aliases: REDseq-package REDseq
> ### Keywords: package
>
> ### ** Examples
>
> # if(interactive()){
> library(ChIPpeakAnno)
> REpatternFilePath = system.file("extdata", "examplePattern.fa", package="REDseq")
> library(BSgenome.Celegans.UCSC.ce2)
> buildREmap( REpatternFilePath, BSgenomeName=Celegans, outfile=tempfile())
>>> Finding all hits in sequences chrI ...
>>> DONE searching
>>> Finding all hits in sequences chrII ...
>>> DONE searching
>>> Finding all hits in sequences chrIII ...
>>> DONE searching
>>> Finding all hits in sequences chrIV ...
>>> DONE searching
>>> Finding all hits in sequences chrV ...
>>> DONE searching
>>> Finding all hits in sequences chrX ...
>>> DONE searching
>>> Finding all hits in sequences chrM ...
>>> DONE searching
RangedData with 87959 rows and 2 value columns across 7 spaces
space ranges | strand score
<factor> <IRanges> | <character> <numeric>
ExampleRE.+.chrI.1 I [ 1899, 1903] | + 1
ExampleRE.+.chrI.2 I [ 6356, 6360] | + 1
ExampleRE.+.chrI.3 I [ 6775, 6779] | + 1
ExampleRE.+.chrI.4 I [ 6866, 6870] | + 1
ExampleRE.+.chrI.5 I [11431, 11435] | + 1
ExampleRE.+.chrI.6 I [11975, 11979] | + 1
ExampleRE.+.chrI.7 I [12241, 12245] | + 1
ExampleRE.+.chrI.8 I [13311, 13315] | + 1
ExampleRE.+.chrI.9 I [14083, 14087] | + 1
... ... ... ... ... ...
ExampleRE.+.chrX.15068 X [17706097, 17706101] | + 1
ExampleRE.+.chrX.15069 X [17709329, 17709333] | + 1
ExampleRE.+.chrX.15070 X [17710564, 17710568] | + 1
ExampleRE.+.chrX.15071 X [17711033, 17711037] | + 1
ExampleRE.+.chrX.15072 X [17712037, 17712041] | + 1
ExampleRE.+.chrX.15073 X [17712181, 17712185] | + 1
ExampleRE.+.chrX.15074 X [17714984, 17714988] | + 1
ExampleRE.+.chrX.15075 X [17716980, 17716984] | + 1
ExampleRE.+.chrX.15076 X [17717264, 17717268] | + 1
> library(REDseq)
> data(example.REDseq)
> data(example.map)
> r.unique = assignSeq2REsite(example.REDseq, example.map, cut.offset = 1,
+ seq.length = 36, allowed.offset = 5, min.FragmentLength = 60,
+ max.FragmentLength = 300, partitionMultipleRE = "unique")
Wed Jul 6 03:11:35 2016 Validating input ...
Wed Jul 6 03:11:35 2016 Prepare map data ...
Wed Jul 6 03:11:35 2016 Align to chromosome 2 ...
Wed Jul 6 03:11:35 2016 Finished 1st round of aligning! Start the 2nd round of aligning ...
Wed Jul 6 03:11:35 2016 Align to chromosome 2 ...
Wed Jul 6 03:11:35 2016 Start filtering ...
> r.average = assignSeq2REsite(example.REDseq, example.map, cut.offset = 1,
+ seq.length = 36, allowed.offset = 5, min.FragmentLength = 60,
+ max.FragmentLength = 300, partitionMultipleRE = "average")
Wed Jul 6 03:11:35 2016 Validating input ...
Wed Jul 6 03:11:35 2016 Prepare map data ...
Wed Jul 6 03:11:35 2016 Align to chromosome 2 ...
Wed Jul 6 03:11:35 2016 Finished 1st round of aligning! Start the 2nd round of aligning ...
Wed Jul 6 03:11:35 2016 Align to chromosome 2 ...
Wed Jul 6 03:11:35 2016 Start filtering ...
Wed Jul 6 03:11:35 2016 Partitioning reads over RE sites within 300 ...
> r.random = assignSeq2REsite(example.REDseq, example.map, cut.offset = 1,
+ seq.length = 36, allowed.offset = 5, min.FragmentLength = 60,
+ max.FragmentLength = 300, partitionMultipleRE = "random")
Wed Jul 6 03:11:35 2016 Validating input ...
Wed Jul 6 03:11:35 2016 Prepare map data ...
Wed Jul 6 03:11:35 2016 Align to chromosome 2 ...
Wed Jul 6 03:11:35 2016 Finished 1st round of aligning! Start the 2nd round of aligning ...
Wed Jul 6 03:11:35 2016 Align to chromosome 2 ...
Wed Jul 6 03:11:35 2016 Start filtering ...
Wed Jul 6 03:11:35 2016 Partitioning reads over RE sites within 300 ...
> r.best = assignSeq2REsite(example.REDseq, example.map, cut.offset = 1,
+ seq.length = 36, allowed.offset = 5, min.FragmentLength = 60,
+ max.FragmentLength = 300, partitionMultipleRE = "best")
Wed Jul 6 03:11:35 2016 Validating input ...
Wed Jul 6 03:11:35 2016 Prepare map data ...
Wed Jul 6 03:11:35 2016 Align to chromosome 2 ...
Wed Jul 6 03:11:35 2016 Finished 1st round of aligning! Start the 2nd round of aligning ...
Wed Jul 6 03:11:35 2016 Align to chromosome 2 ...
Wed Jul 6 03:11:35 2016 Start filtering ...
Wed Jul 6 03:11:35 2016 Partitioning reads over RE sites within 300 ...
Wed Jul 6 03:11:35 2016 get count for each RE ...
> r.estimate = assignSeq2REsite(example.REDseq, example.map, cut.offset = 1,
+ seq.length = 36, allowed.offset = 5, min.FragmentLength = 60,
+ max.FragmentLength = 300, partitionMultipleRE = "estimate")
Wed Jul 6 03:11:35 2016 Validating input ...
Wed Jul 6 03:11:35 2016 Prepare map data ...
Wed Jul 6 03:11:35 2016 Align to chromosome 2 ...
Wed Jul 6 03:11:36 2016 Finished 1st round of aligning! Start the 2nd round of aligning ...
Wed Jul 6 03:11:36 2016 Align to chromosome 2 ...
Wed Jul 6 03:11:36 2016 Start filtering ...
Wed Jul 6 03:11:36 2016 Partitioning reads over RE sites within 300 ...
Wed Jul 6 03:11:36 2016 get count for each RE ...
> r.estimate$passed.filter
SEQid REid Chr strand SEQstart SEQend REstart REend
1 00000036 Sau96I.chr10.29 2 -1 3012058 3012093 3012090 3012094
2 00000037 Sau96I.chr10.29 2 1 3012091 3012126 3012090 3012094
3 00000038 Sau96I.chr10.29 2 1 3012096 3012131 3012090 3012094
4 00000039 Sau96I.chr10.30 2 -1 3012266 3012301 3012299 3012303
5 00000040 Sau96I.chr10.30 2 1 3012300 3012335 3012299 3012303
6 00000052 Sau96I.chr10.40 2 -1 3017881 3017916 3017916 3017920
7 00000054 Sau96I.chr10.42 2 -1 3018313 3018348 3018314 3018318
8 00000055 Sau96I.chr10.42 2 1 3018315 3018350 3018314 3018318
9 00000056 Sau96I.chr10.42 2 1 3018315 3018350 3018314 3018318
10 00000057 Sau96I.chr10.42 2 1 3018315 3018350 3018314 3018318
11 00000058 Sau96I.chr10.42 2 1 3018315 3018350 3018314 3018318
12 00000059 Sau96I.chr10.42 2 1 3018315 3018350 3018314 3018318
13 00000060 Sau96I.chr10.42 2 1 3018315 3018350 3018314 3018318
14 00000061 Sau96I.chr10.42 2 1 3018316 3018351 3018314 3018318
15 00000065 Sau96I.chr10.43 2 -1 3019211 3019246 3019246 3019250
16 00000066 Sau96I.chr10.43 2 -1 3019214 3019249 3019246 3019250
17 00000067 Sau96I.chr10.43 2 1 3019247 3019282 3019246 3019250
18 00000068 Sau96I.chr10.43 2 1 3019247 3019282 3019246 3019250
19 00000073 Sau96I.chr10.45 2 -1 3019428 3019463 3019460 3019464
20 00000074 Sau96I.chr10.45 2 -1 3019428 3019463 3019460 3019464
21 00000075 Sau96I.chr10.47 2 1 3020085 3020120 3020083 3020087
22 00000076 Sau96I.chr10.49 2 1 3020111 3020146 3020110 3020114
23 00000079 Sau96I.chr10.50 2 1 3020358 3020393 3020355 3020359
24 00000080 Sau96I.chr10.50 2 1 3020361 3020396 3020355 3020359
25 00000084 Sau96I.chr10.51 2 -1 3020458 3020493 3020480 3020484
91 00000072 Sau96I.chr10.43 2 1 3019402 3019437 3019246 3019250
110 00000041 Sau96I.chr10.29 2 1 3012334 3012369 3012090 3012094
26 00000041 Sau96I.chr10.30 2 1 3012334 3012369 3012299 3012303
31 00000053 Sau96I.chr10.40 2 -1 3018089 3018124 3017916 3017920
41 00000053 Sau96I.chr10.42 2 -1 3018089 3018124 3018314 3018318
51 00000070 Sau96I.chr10.43 2 1 3019293 3019328 3019246 3019250
61 00000070 Sau96I.chr10.45 2 1 3019293 3019328 3019460 3019464
71 00000071 Sau96I.chr10.43 2 -1 3019322 3019357 3019246 3019250
81 00000071 Sau96I.chr10.45 2 -1 3019322 3019357 3019460 3019464
92 00000077 Sau96I.chr10.47 2 1 3020246 3020281 3020083 3020087
101 00000077 Sau96I.chr10.49 2 1 3020246 3020281 3020110 3020114
111 00000077 Sau96I.chr10.51 2 1 3020246 3020281 3020480 3020484
121 00000077 Sau96I.chr10.50 2 1 3020246 3020281 3020355 3020359
131 00000078 Sau96I.chr10.47 2 1 3020291 3020326 3020083 3020087
141 00000078 Sau96I.chr10.50 2 1 3020291 3020326 3020355 3020359
151 00000078 Sau96I.chr10.49 2 1 3020291 3020326 3020110 3020114
161 00000078 Sau96I.chr10.51 2 1 3020291 3020326 3020480 3020484
171 00000081 Sau96I.chr10.50 2 -1 3020382 3020417 3020355 3020359
181 00000081 Sau96I.chr10.51 2 -1 3020382 3020417 3020480 3020484
Distance Weight
1 -32 1.0000000
2 1 1.0000000
3 6 1.0000000
4 -33 1.0000000
5 1 1.0000000
6 -35 1.0000000
7 -1 1.0000000
8 1 1.0000000
9 1 1.0000000
10 1 1.0000000
11 1 1.0000000
12 1 1.0000000
13 1 1.0000000
14 2 1.0000000
15 -35 1.0000000
16 -32 1.0000000
17 1 1.0000000
18 1 1.0000000
19 -32 1.0000000
20 -32 1.0000000
21 2 1.0000000
22 1 1.0000000
23 3 1.0000000
24 6 1.0000000
25 -22 1.0000000
91 156 1.0000000
110 244 0.6000000
26 35 0.4000000
31 173 0.1111111
41 -225 0.8888889
51 47 0.6666667
61 -167 0.3333333
71 76 0.6666667
81 -138 0.3333333
92 163 0.2000000
101 136 0.2000000
111 -234 0.2000000
121 -109 0.4000000
131 208 0.2000000
141 -64 0.4000000
151 181 0.2000000
161 -189 0.2000000
171 27 0.6666667
181 -98 0.3333333
> r.estimate$notpassed.filter
[1] SEQid REid Chr strand SEQstart SEQend REstart REend
[9] Distance Weight
<0 rows> (or 0-length row.names)
> data(example.assignedREDseq)
> plotCutDistribution(example.assignedREDseq,example.map,
+ chr="2", xlim =c(3012000, 3020000))
> distanceHistSeq2RE(example.assignedREDseq,ylim=c(0,20))
> summarizeByRE(example.assignedREDseq,by="Weight",sampleName="example")
REid example
Sau96I.chr10.29 "Sau96I.chr10.29" "4"
Sau96I.chr10.30 "Sau96I.chr10.30" "2"
Sau96I.chr10.40 "Sau96I.chr10.40" "1"
Sau96I.chr10.42 "Sau96I.chr10.42" "9"
Sau96I.chr10.43 "Sau96I.chr10.43" "6"
Sau96I.chr10.45 "Sau96I.chr10.45" "3"
Sau96I.chr10.47 "Sau96I.chr10.47" "1"
Sau96I.chr10.49 "Sau96I.chr10.49" "1"
Sau96I.chr10.50 "Sau96I.chr10.50" "3"
Sau96I.chr10.51 "Sau96I.chr10.51" "2"
> REsummary =summarizeByRE(example.assignedREDseq,by="Weight")
> binom.test.REDseq(REsummary)
p.value total.weight.count REid cut.frequency
1 2.804822e-47 9 Sau96I.chr10.42 0.28125
2 9.061718e-31 6 Sau96I.chr10.43 0.18750
3 3.595919e-20 4 Sau96I.chr10.29 0.12500
4 4.959892e-15 3 Sau96I.chr10.50 0.09375
5 4.959892e-15 3 Sau96I.chr10.45 0.09375
6 4.959901e-10 2 Sau96I.chr10.30 0.06250
7 4.959901e-10 2 Sau96I.chr10.51 0.06250
8 3.199950e-05 1 Sau96I.chr10.40 0.03125
9 3.199950e-05 1 Sau96I.chr10.49 0.03125
10 3.199950e-05 1 Sau96I.chr10.47 0.03125
BH.adjusted.p.value
1 2.804822e-46
2 4.530859e-30
3 1.198640e-19
4 9.919784e-15
5 9.919784e-15
6 7.085573e-10
7 7.085573e-10
8 3.199950e-05
9 3.199950e-05
10 3.199950e-05
> #}
>
>
>
>
>
> dev.off()
null device
1
>