a data.frame or matrix with two columns that contain the locations of the
sample and control BAM files, respectively. Each row pairs a sample with its
corresponding control, which is used for calling peaks.
destination.folder
the path to the folder to which output should be written. The path can be
either absolute or relative.
reference.folder
the path to the folder with the helper files generated by
preCopywriteR(). The helper files include the bin, mappability,
GC-content, and blacklist files in .bed format.
bp.param
a BiocParallelParam instance (see the BiocParallel Bioconductor package)
that determines the settings used for parallel computing. Please refer to the
vignette for more information.
capture.regions.file
optional; the path to the capture regions file, which should be in .bed
format. Overlapping bait regions should be reduced into single regions. If
included, statistics on the overlap of peaks called by MACS and the capture
regions will be provided.
keep.intermediary.files
optional; logical that indicates whether intermediary .bam, .bai and peak
regions files should be kept after the analysis is done. Defaults to FALSE.
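The capture.regions.file argument expects overlapping bait regions to be
reduced into single regions. A minimal base-R sketch of one way to do this
follows. The helper reduce_regions() and the column names chr, start and end
are hypothetical and not part of CopywriteR, and the sketch assumes that
touching regions may be merged as well.

```r
# Hypothetical helper (not part of CopywriteR): merge overlapping or touching
# regions per chromosome. Assumes columns named chr, start and end.
reduce_regions <- function(bed) {
  bed <- bed[order(bed$chr, bed$start, bed$end), ]
  merged <- lapply(split(bed, bed$chr), function(x) {
    # A new group starts when a region begins after the running maximum end
    running.end <- cummax(x$end)
    new.group <- c(TRUE, x$start[-1] > running.end[-nrow(x)])
    group <- cumsum(new.group)
    data.frame(chr = x$chr[1],
               start = as.vector(tapply(x$start, group, min)),
               end = as.vector(tapply(x$end, group, max)),
               stringsAsFactors = FALSE)
  })
  merged <- do.call(rbind, merged)
  rownames(merged) <- NULL
  merged
}

# Made-up bait regions; the first two overlap on chr1
baits <- data.frame(chr = c("chr1", "chr1", "chr1", "chr2"),
                    start = c(100, 150, 400, 50),
                    end = c(200, 300, 500, 80),
                    stringsAsFactors = FALSE)
reduced <- reduce_regions(baits)

# Write the reduced regions as a headerless, tab-separated .bed file
bed.file <- tempfile(fileext = ".bed")
write.table(reduced, bed.file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

The path stored in bed.file could then be passed as capture.regions.file.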
Details
CopywriteR uses off-target sequence reads from targeted sequencing to create
copy number profiles. First, it removes non-random off-target reads, and it
subsequently calculates the depth of coverage for the bins that are provided in
the helper files. It then performs GC-content and mappability corrections, and
removes blacklisted regions. plotCNA() generates a DNA copy number
profile from the output of the CopywriteR() function. Helper files
required for CopywriteR analysis can be created using preCopywriteR().
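To illustrate the idea behind a loess-based GC-content correction, the sketch
below simulates per-bin read counts with a GC-dependent bias and removes that
bias by dividing by a loess fit. The data, the bias model and all variable
names are made up for illustration. This is not CopywriteR's implementation,
only a simplified picture of the kind of correction described above.

```r
set.seed(1)
n.bins <- 2000

# Simulated GC fraction per bin and a made-up GC-dependent coverage bias
gc.content <- runif(n.bins, 0.3, 0.7)
bias <- 1 + 2 * (gc.content - 0.5) - 6 * (gc.content - 0.5)^2
bins <- data.frame(gc.content = gc.content,
                   counts = rpois(n.bins, lambda = 200 * bias))

# Fit the expected read count as a smooth function of GC content
fit <- loess(counts ~ gc.content, data = bins, span = 0.5)
expected <- predict(fit, newdata = bins)

# Divide out the fitted trend and rescale to the original median depth
bins$corrected <- bins$counts / expected * median(bins$counts)

# log2-transform relative to the median, as is common for copy number data
bins$log2.ratio <- log2(bins$corrected / median(bins$corrected))

# The correlation with GC content should shrink after the correction
cor.before <- abs(cor(bins$counts, bins$gc.content))
cor.after <- abs(cor(bins$corrected, bins$gc.content))
```

A mappability correction can be pictured the same way, with mappability in
place of GC content as the predictor.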
Value
BamBaiPeaksFiles
a folder with the .bam, .bai and peak regions files that are created during
the CopywriteR() run. The folder will only be created when the
keep.intermediary.files argument is set to TRUE.
input.Rdata
an R object that contains information for plotCNA().
log2_read_counts.igv
the file that contains the compensated read counts after
GC-content and mappability corrections, and after removal of data points in
blacklisted regions. Counts are log2-transformed. The file is a
tab-separated file formatted to be viewed in the IGV genome browser.
CopywriteR.log
log file of CopywriteR.
qc
a folder with quality control files. The .png files contain plots of the
loess fits that are used for the GC-content and mappability corrections. The
fraction_of-bin .pdf files display the empirical cumulative distribution
function for the fraction of bin (the bin size after removal of peak regions
expressed as a fraction of the original size).
read_counts.txt
the file that contains the raw and compensated read counts per bin.
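Because log2_read_counts.igv is tab-separated, it can also be read into R for
further inspection. The sketch below writes a small mock file and reads it
back. The track line, the sample column name log2.sampleA, the bin values and
the -0.5 cutoff are assumptions for illustration, so check the header of the
real file before relying on any of them.

```r
# Write a small mock file; the real file is produced by CopywriteR()
igv.file <- tempfile(fileext = ".igv")
writeLines(c(
  "#track viewLimits=-3:3 graphType=heatmap",
  "Chromosome\tStart\tEnd\tFeature\tlog2.sampleA",
  "1\t1\t20000\tbin1\t0.05",
  "1\t20001\t40000\tbin2\t-0.92",
  "2\t1\t20000\tbin3\t0.61"
), igv.file)

# Skip lines starting with "#" (such as an IGV track line) and read the table
log2.counts <- read.delim(igv.file, comment.char = "#",
                          stringsAsFactors = FALSE)

# Select bins below an arbitrary, illustrative log2 cutoff
low.bins <- log2.counts[log2.counts$log2.sampleA < -0.5, ]
```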
Author(s)
Thomas Kuilman (t.kuilman@nki.nl)
References
CopywriteR: DNA copy number detection from off-target sequence data. Thomas
Kuilman, Arno Velds, Kristel Kemper, Marco Ranzani, Lorenzo Bombardelli,
Marlous Hoogstraat, Ekaterina Nevedomskaya, Guotai Xu, Julian de Ruiter,
Martijn P. Lolkema, Bauke Ylstra, Jos Jonkers, Sven Rottenberg, Lodewyk F.
Wessels, David J. Adams, Daniel S. Peeper, Oscar Krijgsman. Submitted for
publication.
Examples
## Not run:
setwd("/PATH/TO/BAMFILES/")
samples <- list.files(pattern = "\\.bam$", full.names = TRUE)
## Use the first .bam file as a control for every sample
# controls <- samples[rep(1, length(samples))]
## Use every sample as its own control (i.e., peaks are called on sample itself)
controls <- samples
sample.control <- data.frame(samples, controls)
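## Run the samples in parallel, using one worker per sample
bp.param <- BiocParallel::SnowParam(workers = nrow(sample.control),
                                    type = "SOCK")
CopywriteR(sample.control = sample.control,
           destination.folder = "/PATH/TO/DESTINATIONFOLDER/",
           reference.folder = "/PATH/TO/REFERENCEFOLDER",
           bp.param = bp.param,
           capture.regions.file = "/PATH/TO/CAPTUREREGIONSFILE")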
## End(Not run)