Last data update: 2014.03.03

R: Statistical tools for the analysis of ChIP-seq data
CSAR-packageR Documentation

Statistical tools for the analysis of ChIP-seq data

Description

Statistical tools for ChIP-seq data analysis.
The package is oriented to plant organisms, and compatible with standard file formats in the plant research field.

Details

Package: CSAR
Type: Package
Version: 1.0
Date: 2009-11-09
License: Artistic-2.0
LazyLoad: yes

Author(s)

Jose M Muino

Maintainer: Jose M Muino <jose.muino@wur.nl>

References

Muino et al. (submitted). Plant ChIP-seq Analyzer: An R package for the statistcal detection of protein-bound genomic regions.
Kaufmann et al.(2009).Target genes of the MADS transcription factor SEPALLATA3: integration of developmental and hormonal pathways in the Arabidopsis flower. PLoS Biology; 7(4):e1000090.

Examples


##For this example we will use the a subset of the SEP3 ChIP-seq data (Kaufmann, 2009)
data("CSAR-dataset");
##We calculate the number of hits for each nucleotide posotion for the control and sample. We do that just for chromosome chr1, and for positions 1 to 10kb
nhitsS<-mappedReads2Nhits(sampleSEP3_test,file="sampleSEP3_test",chr=c("CHR1v01212004"),chrL=c(10000))
nhitsC<-mappedReads2Nhits(controlSEP3_test,file="controlSEP3_test",chr=c("CHR1v01212004"),chrL=c(10000))


##We calculate a score for each nucleotide position
test<-ChIPseqScore(control=nhitsC,sample=nhitsS)

##We calculate the candidate read-enriched regions
win<-sigWin(test)

##We generate a wig file of the results to visualize tehm in a genome browser
score2wig(test,file="test.wig")

##We calculate relative positions of read-enriched regions regarding gene position
d<-distance2Genes(win=win,gff=TAIR8_genes_test)

##We calculate table of genes with read-enriched regions, and their location
genes<-genesWithPeaks(d)

##We calculate two sets of read-enrichment scores through permutation
permutatedWinScores(nn=1,sample=sampleSEP3_test,control=controlSEP3_test,fileOutput="test",chr=c("CHR1v01212004"),chrL=c(100000))
permutatedWinScores(nn=2,sample=sampleSEP3_test,control=controlSEP3_test,fileOutput="test",chr=c("CHR1v01212004"),chrL=c(100000))

###Next function will get all permutated score values generated by permutatedWinScores function. 
##This represent the score distribution under the null hypotesis and therefore it can be use to control the error of our test.
nulldist<-getPermutatedWinScores(file="test",nn=1:2)

##From this distribution, several cut-off values can be calculated to control the error of our test. 
##Several functions  in R can be used for this purpose.
##In this package we had implemented a simple method for the control of the error based on FDR"
getThreshold(winscores=values(win)$score,permutatedScores=nulldist,FDR=.01)


Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(CSAR)
Loading required package: S4Vectors
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit


Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/CSAR/CSAR-package.Rd_%03d_medium.png", width=480, height=480)
> ### Name: CSAR-package
> ### Title: Statistical tools for the analysis of ChIP-seq data
> ### Aliases: CSAR-package
> 
> ### ** Examples
> 
> 
> ##For this example we will use the a subset of the SEP3 ChIP-seq data (Kaufmann, 2009)
> data("CSAR-dataset");
> ##We calculate the number of hits for each nucleotide posotion for the control and sample. We do that just for chromosome chr1, and for positions 1 to 10kb
> nhitsS<-mappedReads2Nhits(sampleSEP3_test,file="sampleSEP3_test",chr=c("CHR1v01212004"),chrL=c(10000))
mappedReads2Nhits has just finished   CHR1v01212004 ...
> nhitsC<-mappedReads2Nhits(controlSEP3_test,file="controlSEP3_test",chr=c("CHR1v01212004"),chrL=c(10000))
mappedReads2Nhits has just finished   CHR1v01212004 ...
> 
> 
> ##We calculate a score for each nucleotide position
> test<-ChIPseqScore(control=nhitsC,sample=nhitsS)
CHR1v01212004  done...
> 
> ##We calculate the candidate read-enriched regions
> win<-sigWin(test)
CHR1v01212004 done...
> 
> ##We generate a wig file of the results to visualize tehm in a genome browser
> score2wig(test,file="test.wig")
CHR1v01212004 done...
> 
> ##We calculate relative positions of read-enriched regions regarding gene position
> d<-distance2Genes(win=win,gff=TAIR8_genes_test)
Starting CHR1v01212004 ...
> 
> ##We calculate table of genes with read-enriched regions, and their location
> genes<-genesWithPeaks(d)
> 
> ##We calculate two sets of read-enrichment scores through permutation
> permutatedWinScores(nn=1,sample=sampleSEP3_test,control=controlSEP3_test,fileOutput="test",chr=c("CHR1v01212004"),chrL=c(100000))
mappedReads2Nhits has just finished   CHR1v01212004 ...
mappedReads2Nhits has just finished   CHR1v01212004 ...
CHR1v01212004  done...
CHR1v01212004 done...
Win file for permutation 1 can be found at test-1.permutatedWin
> permutatedWinScores(nn=2,sample=sampleSEP3_test,control=controlSEP3_test,fileOutput="test",chr=c("CHR1v01212004"),chrL=c(100000))
mappedReads2Nhits has just finished   CHR1v01212004 ...
mappedReads2Nhits has just finished   CHR1v01212004 ...
CHR1v01212004  done...
CHR1v01212004 done...
Win file for permutation 2 can be found at test-2.permutatedWin
> 
> ###Next function will get all permutated score values generated by permutatedWinScores function. 
> ##This represent the score distribution under the null hypotesis and therefore it can be use to control the error of our test.
> nulldist<-getPermutatedWinScores(file="test",nn=1:2)
Read 66 items
Read 70 items
> 
> ##From this distribution, several cut-off values can be calculated to control the error of our test. 
> ##Several functions  in R can be used for this purpose.
> ##In this package we had implemented a simple method for the control of the error based on FDR"
> getThreshold(winscores=values(win)$score,permutatedScores=nulldist,FDR=.01)
   threshold Error_type_I FDR
21      3.29            0   0
Warning message:
In getThreshold(winscores = values(win)$score, permutatedScores = nulldist,  :
  The number of permutated scores is low.
> 
> 
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>