Last data update: 2014.03.03

R: Calculate read-enrichment scores for each nucleotide position
ChIPseqScore    R Documentation

Calculate read-enrichment scores for each nucleotide position

Description

Calculate read-enrichment scores for each nucleotide position

Usage

ChIPseqScore(control, sample, backg = -1, file = NA, norm = 3 * 10^9, test = "Ratio", times = 1e6, digits = 2)

Arguments

control

data.frame structure obtained by mappedReads2Nhits

sample

data.frame structure obtained by mappedReads2Nhits

backg

Due to low coverage in the control, there could be regions with no hits. Any region with a hit value lower than backg in the control will be set to the value of backg.

file

Name of the file where you want to save the results (if desired)

norm

Integer value. Hit counts are reported as the number of hits per norm nucleotides

test

Use a score based on the Poisson distribution ("Poisson") or on the ratio ("Ratio")

times

To be memory efficient, CSAR loads into RAM only fragments of length times at once. A larger value requires more RAM, but the whole process will be faster

digits

Number of decimal digits used to report the score values
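
The roles of norm, times and digits can be illustrated with a small base-R sketch. This is not CSAR's implementation: the helper names scale_hits and process_in_chunks, the scaling formula, and the toy numbers are all assumptions made for illustration only.

```r
# Hypothetical helper (not part of CSAR): express raw hit counts as
# hits per `norm` sequenced nucleotides.
scale_hits <- function(hits, total_sequenced, norm = 3e9) {
  hits * norm / total_sequenced
}

# Hypothetical helper (not part of CSAR): apply `fun` to a long vector in
# fragments of at most `times` positions, one fragment at a time.
process_in_chunks <- function(x, times = 1e6, fun = identity) {
  if (length(x) == 0) return(x)
  starts <- seq(1, length(x), by = times)
  out <- vector("list", length(starts))
  for (i in seq_along(starts)) {
    idx <- starts[i]:min(starts[i] + times - 1, length(x))
    out[[i]] <- fun(x[idx])
  }
  unlist(out)
}

# Toy data: seven positions, processed three positions at a time.
hits <- c(0, 2, 5, 1, 0, 3, 4)
scaled <- process_in_chunks(
  hits,
  times = 3,
  fun = function(h) scale_hits(h, total_sequenced = 6e9)
)

# Report the values with two decimal digits, as `digits = 2` would.
rounded <- round(scaled, digits = 2)
print(rounded)
```

A smaller times keeps each fragment small at the cost of more loop iterations, which mirrors the memory versus speed trade-off described for the times argument.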

Details

Different sequencing efforts yield different numbers of sequenced reads. For this reason, the "number of hits" at each nucleotide position is normalized by the total number of nucleotides sequenced. Subsequently, the number of hits for the sample is normalized to have the same mean and variance as the control, either for each chromosome independently or for the whole set of chromosomes (depending on the value of normEachChrInd). Due to low coverage, there could be regions with no hits. Any region with a hit value lower than backg in the control will be set to the value of backg. For each nucleotide position, a read-enrichment score is then calculated, using either the Poisson test or the ratio.
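
The steps described above can be sketched in base R. This is a rough illustration, not CSAR's code. The function names, the linear rescaling used to match mean and variance, the order of the steps, and the use of -log10 of an upper-tail Poisson probability as the "Poisson" score are all assumptions. The normEachChrInd option is not modeled.

```r
# Hypothetical sketch (not CSAR code): rescale the sample hits so that
# they have the same mean and variance as the control hits.
match_mean_var <- function(sample_hits, control_hits) {
  (sample_hits - mean(sample_hits)) / sd(sample_hits) *
    sd(control_hits) + mean(control_hits)
}

# Raise every control value below `backg` up to `backg`.
floor_control <- function(control_hits, backg) {
  pmax(control_hits, backg)
}

# "Ratio" score: sample hits divided by (floored) control hits.
ratio_score <- function(sample_hits, control_hits) {
  sample_hits / control_hits
}

# Assumed form of a "Poisson" score: -log10 of P(X >= sample) for
# X ~ Poisson(lambda = control).
poisson_score <- function(sample_hits, control_hits) {
  -log10(ppois(sample_hits - 1, lambda = control_hits, lower.tail = FALSE))
}

# Toy hit counts for six nucleotide positions (invented numbers).
control <- c(0, 1, 2, 1, 0, 2)
sample  <- c(1, 3, 12, 4, 0, 2)

sample_n  <- match_mean_var(sample, control)
control_f <- floor_control(control, backg = 1)

ratio <- ratio_score(sample_n, control_f)
pois  <- poisson_score(round(sample_n), control_f)
print(round(ratio, 2))
print(round(pois, 2))
```

Flooring the control at backg keeps the ratio finite at positions where the control has no hits, which is the situation the backg argument is meant to handle.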

Value

A list to be used for other functions of the CSAR package

chr

Chromosome names

chrL

Chromosome lengths (bp)

filenames

Names of the files where the score values are stored

digits

Score values stored in the files need to be divided by 10^digits to recover the actual scores
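
The scaling by 10^digits can be illustrated with a mock object that uses the field names listed above. The file name and the stored integers below are invented for illustration, and the sketch does not read any file written by CSAR.

```r
# Mock result with the fields described in this section (values invented).
res <- list(
  chr = "CHR1v01212004",
  chrL = 10000,
  filenames = "scores_chr1.bin",  # hypothetical file name
  digits = 2
)

# Hypothetical stored values: scores multiplied by 10^digits.
stored <- c(150L, 275L, 1000L)

# Recover the scores by dividing by 10^digits.
scores <- stored / 10^res$digits
print(scores)

# The reverse step, assuming scores are rounded to `digits` decimals first.
restored <- as.integer(round(scores * 10^res$digits))
```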

Author(s)

Jose M Muino, jose.muino@wur.nl

References

Muino et al. (submitted). Plant ChIP-seq Analyzer: An R package for the statistical detection of protein-bound genomic regions.
Kaufmann et al. (2009). Target genes of the MADS transcription factor SEPALLATA3: integration of developmental and hormonal pathways in the Arabidopsis flower. PLoS Biology, 7(4): e1000090.

See Also

CSAR-package

Examples


library(CSAR)

## For this example we will use a subset of the SEP3 ChIP-seq data (Kaufmann, 2009)
data("CSAR-dataset")

## We calculate the number of hits at each nucleotide position for the control and the sample.
## We do this only for chromosome 1 (CHR1v01212004), and for positions 1 to 10 kb
nhitsS <- mappedReads2Nhits(sampleSEP3_test, file = "sampleSEP3_test", chr = c("CHR1v01212004"), chrL = c(10000))
nhitsC <- mappedReads2Nhits(controlSEP3_test, file = "controlSEP3_test", chr = c("CHR1v01212004"), chrL = c(10000))

## We calculate a score for each nucleotide position
test <- ChIPseqScore(control = nhitsC, sample = nhitsS)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(CSAR)
Loading required package: S4Vectors
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit


Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
> ### Name: ChIPseqScore
> ### Title: Calculate read-enrichment scores for each nucleotide position
> ### Aliases: ChIPseqScore score_chr
> 
> ### ** Examples
> 
> 
> ##For this example we will use a subset of the SEP3 ChIP-seq data (Kaufmann, 2009)
> data("CSAR-dataset");
> ##We calculate the number of hits for each nucleotide position for the control and sample. We do that just for chromosome chr1, and for positions 1 to 10kb
> nhitsS<-mappedReads2Nhits(sampleSEP3_test,file="sampleSEP3_test",chr=c("CHR1v01212004"),chrL=c(10000))
mappedReads2Nhits has just finished   CHR1v01212004 ...
> nhitsC<-mappedReads2Nhits(controlSEP3_test,file="controlSEP3_test",chr=c("CHR1v01212004"),chrL=c(10000))
mappedReads2Nhits has just finished   CHR1v01212004 ...
> 
> 
> ##We calculate a score for each nucleotide position
> test<-ChIPseqScore(control=nhitsC,sample=nhitsS)
CHR1v01212004  done...