Last data update: 2014.03.03

R: Correct GC content and mappability biases in sequencing data...
correctReadDepthR Documentation

Correct GC content and mappability biases in sequencing data read counts

Description

Correct GC content and mappability biases in tumour sequence read counts using Loess curve fitting. Wrapper for function in HMMcopy.

Usage

  correctReadDepth(tumWig, normWig, gcWig, mapWig, genomeStyle = "NCBI", targetedSequence = NULL)

Arguments

tumWig

File path to fixedStep WIG format file for the tumour sample. See wigToRangedData in the HMMcopy for more details.

normWig

File path to fixedStep WIG format file for the normal sample.

gcWig

File path to fixedStep WIG format file for the GC content based on the specific reference genome sequence used.

mapWig

File path to fixedStep WIG format file for the mappability scores computed on the specific reference genome used.

genomeStyle

The genome style to use for chromosomes by TitanCNA. Use one of ‘NCBI’ or ‘UCSC’. It does not matter what style is found in inCounts, genomeStyle will be the style returned.

targetedSequence

data.frame with 3 columns: chr, start position, stop position. Use this argument for exome capture sequencing or targeted deep sequencing data. This is experimental and may not work as desired.

Details

Wrapper for correctReadcount in HMMcopy package. It uses a sampling of 50000 bins to find the Loess fit. Then, the log ratio for every bin is returned as the log base 2 of the ratio between the corrected tumour read count and the corrected normal read count.

Value

data.frame containing columns:

chr

Chromosome; uses 'X' and 'Y' for sex chromosomes

start

Start genomic coordinate for bin in which read count is corrected

end

End genomic coordinate for bin in which read count is corrected

logR

Log ratio, log2(tumour:normal), for bin in which read count is corrected

Author(s)

Gavin Ha <gavinha@gmail.com>, Daniel Lai <jujubix@cs.ubc.ca>, Yikan Wang <ykwang@bccrc.ca>

References

Ha, G., Roth, A., Lai, D., Bashashati, A., Ding, J., Goya, R., Giuliany, R., Rosner, J., Oloumi, A., Shumansky, K., Chin, S.F., Turashvili, G., Hirst, M., Caldas, C., Marra, M. A., Aparicio, S., and Shah, S. P. (2012). Integrative analysis of genome wide loss of heterozygosity and monoallelic expression at nucleotide resolution reveals disrupted pathways in triple negative breast cancer. Genome Research, 22(10):1995,2007. (PMID: 22637570)

Ha, G., Roth, A., Khattra, J., Ho, J., Yap, D., Prentice, L. M., Melnyk, N., McPherson, A., Bashashati, A., Laks, E., Biele, J., Ding, J., Le, A., Rosner, J., Shumansky, K., Marra, M. A., Huntsman, D. G., McAlpine, J. N., Aparicio, S. A. J. R., and Shah, S. P. (2014). TITAN: Inference of copy number architectures in clonal cell populations from tumour whole genome sequence data. Genome Research, 24: 1881-1893. (PMID: 25060187)

See Also

correctReadcount and wigToRangedData in the HMMcopy package. WIG: http://genome.ucsc.edu/goldenPath/help/wiggle.html

Examples

tumWig <- system.file("extdata", "test_tum_chr2.wig", package = "TitanCNA")
normWig <- system.file("extdata", "test_norm_chr2.wig", package = "TitanCNA")
gc <- system.file("extdata", "gc_chr2.wig", package = "TitanCNA")
map <- system.file("extdata", "map_chr2.wig", package = "TitanCNA")

#### GC AND MAPPABILITY CORRECTION ####
cnData <- correctReadDepth(tumWig, normWig, gc, map)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(TitanCNA)
Loading required package: foreach
Loading required package: IRanges
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Loading required package: Rsamtools
Loading required package: Biostrings
Loading required package: XVector
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/TitanCNA/correctReadDepth.Rd_%03d_medium.png", width=480, height=480)
> ### Name: correctReadDepth
> ### Title: Correct GC content and mappability biases in sequencing data
> ###   read counts
> ### Aliases: correctReadDepth
> ### Keywords: IO manip
> 
> ### ** Examples
> 
> tumWig <- system.file("extdata", "test_tum_chr2.wig", package = "TitanCNA")
> normWig <- system.file("extdata", "test_norm_chr2.wig", package = "TitanCNA")
> gc <- system.file("extdata", "gc_chr2.wig", package = "TitanCNA")
> map <- system.file("extdata", "map_chr2.wig", package = "TitanCNA")
> 
> #### GC AND MAPPABILITY CORRECTION ####
> cnData <- correctReadDepth(tumWig, normWig, gc, map)
Reading GC and mappability files
Slurping: /home/ddbj/local/lib64/R/library/TitanCNA/extdata/gc_chr2.wig
Parsing: fixedStep chrom=2 start=1 step=1000 span=1000
Sorting by decreasing chromosome size
Slurping: /home/ddbj/local/lib64/R/library/TitanCNA/extdata/map_chr2.wig
Parsing: fixedStep chrom=2 start=1 step=1000 span=1000
Sorting by decreasing chromosome size
Loading tumour file:/home/ddbj/local/lib64/R/library/TitanCNA/extdata/test_tum_chr2.wig
Slurping: /home/ddbj/local/lib64/R/library/TitanCNA/extdata/test_tum_chr2.wig
Parsing: fixedStep chrom=2 start=1 step=1000 span=1000
Sorting by decreasing chromosome size
Loading normal file:/home/ddbj/local/lib64/R/library/TitanCNA/extdata/test_norm_chr2.wig
Slurping: /home/ddbj/local/lib64/R/library/TitanCNA/extdata/test_norm_chr2.wig
Parsing: fixedStep chrom=2 start=1 step=1000 span=1000
Sorting by decreasing chromosome size
Correcting Tumour
Applying filter on data...
Correcting for GC bias...
Correcting for mappability bias...
Correcting Normal
Applying filter on data...
Correcting for GC bias...
Correcting for mappability bias...
Normalizing Tumour by Normal
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>