This is a utility function that creates a data frame.
The data frame contains the binding sites obtained by merging peaks from two conditions,
the ChIP read counts and smoothed control counts for each candidate region,
and an indicator of the peaks common to both conditions.
conf
A data frame that describes the ChIP experiments.
It contains 6 columns: sampleID, condition, factor, ipReads, ctReads, peaks.
condition refers to the treatment condition or cell line;
factor refers to the transcription factor or histone modification;
ipReads is the ChIP sequence data in bam or bed format;
ctReads is the control sequence data in bam or bed format;
peaks is the set of peaks called by existing peak-calling software.
design
Two-column design matrix. The number of rows equals the number of ChIP samples from the two conditions.
The first column is all 1s and represents the intercept in the regression model.
The second column is 1 for samples from one condition and 0 for samples from the other.
filetype
Two sequence file types are supported (bed or bam).
species
Two species are supported (hg19 or mm9).
peak.center
This argument is used together with peak.ext. Default is FALSE. Set it
when the regions centered on the peaks are of more interest.
peak.ext
This argument is coupled with peak.center. Default is 0.
binsize
Bin size in bp used to calculate the smoothed local lambda of the Poisson distribution. The default is 50 bp.
mva.span
1 kb, 5 kb or 10 kb window centered at the peak location in the control sample.
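The two-column design matrix described under design can be built with base R. The sketch below is illustrative only: the samples table and its values are hypothetical, and condition is declared as a factor explicitly so the result does not depend on the stringsAsFactors default of the R version in use.

```r
# Hypothetical sample table: two replicates in each of two conditions.
samples <- data.frame(
  sampleID  = 1:4,
  condition = factor(c("Helas3", "Helas3", "K562", "K562"))
)

# model.matrix() produces an intercept column of 1s and a 0/1 indicator
# column for the second factor level (here "K562").
design <- as.data.frame(model.matrix(~ condition, data = samples))
print(design)
```

Each row corresponds to one ChIP sample, in the same order as the rows of the sample table.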
Value
A ChIPComp object.
Columns chr, start, end are the genomic coordinates of the binding site;
Column ip_c(#condition)_r(#replicate) gives the ChIP counts for replicate #replicate in condition #condition;
Column ct_c(#condition)_r(#replicate) gives the smoothed control counts for replicate #replicate in condition #condition;
Column commonPeak indicates the common binding sites.
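As a rough illustration of the count column naming pattern above, the following sketch generates the names one would expect for two conditions with two replicates each. It assumes that #condition and #replicate are integer indices starting at 1, which the text does not state; n_conditions and n_replicates are hypothetical names.

```r
# Assumption: condition and replicate indices are integers starting at 1.
n_conditions <- 2
n_replicates <- 2

# expand.grid() varies its first argument fastest, so replicates cycle
# within each condition.
grid <- expand.grid(r = seq_len(n_replicates), c = seq_len(n_conditions))

ip_cols <- sprintf("ip_c%d_r%d", grid$c, grid$r)  # ChIP count columns
ct_cols <- sprintf("ct_c%d_r%d", grid$c, grid$r)  # smoothed control count columns

print(ip_cols)
print(ct_cols)
```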
> library(ChIPComp)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/ChIPComp/makeCountSet.Rd_%03d_medium.png", width=480, height=480)
> ### Name: makeCountSet
> ### Title: make differential binding sites data frame
> ### Aliases: makeCountSet
>
> ### ** Examples
>
> conf=data.frame(
+ SampleID=1:4,
+ condition=c("Helas3","Helas3","K562","K562"),
+ factor=c("H3k27ac","H3k27ac","H3k27ac","H3k27ac"),
+ ipReads=system.file("extdata",c("Helas3.ip1.bed","Helas3.ip2.bed","K562.ip1.bed","K562.ip2.bed"),package="ChIPComp"),
+ ctReads=system.file("extdata",c("Helas3.ct.bed","Helas3.ct.bed","K562.ct.bed","K562.ct.bed"),package="ChIPComp"),
+ peaks=system.file("extdata",c("Helas3.peak.bed","Helas3.peak.bed","K562.peak.bed","K562.peak.bed"),package="ChIPComp")
+ )
> design=as.data.frame(lapply(conf[,c("condition","factor")],as.numeric))-1
> design=as.data.frame(model.matrix(~condition,design))
> countSet=makeCountSet(conf,design,filetype="bed", species="hg19",binsize=1000)
Making peak list......
Making ip counts......
Making control counts......
> dev.off()
null device
1
>