R: Calculates a normalized correlation score from ChIP-seq and...
integrateData
R Documentation
Calculates a normalized correlation score from ChIP-seq and microarray
gene expression data.
Description
This function calculates the product of the standardized differences
between two conditions in ChIP-seq data and the respective standardized
differences in gene expression data. A score close to zero means that there
are no (large) differences in at least one of the two data sets. If the
score is positive, the differences point in the same direction in both data
sets. If the score is negative, the differences have opposite signs in the
two data sets.
Usage
integrateData(expr, chipseq, factor, reference)
Arguments
expr
An ExpressionSet holding the gene expression data.
chipseq
A ChIPseqSet holding the ChIP-seq data.
factor
A character giving the name of the factor that describes the
conditions to be compared. The factor must be present in the pheno data
slot of the objects expr and chipseq. Further, the factor must have
exactly two levels and the level names must be the same in both objects.
reference
Optionally, the name of the factor level that should be used as
reference. If missing, the first level of factor in the object
expr is used.
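The requirements on factor and reference can be sketched in base R as follows. This is only an illustration: exprStatus and chipStatus are hypothetical stand-ins for the pheno data columns of expr and chipseq, and how the package performs these checks internally is not shown here.

```r
# Hypothetical pheno data columns standing in for the factor "status"
# in expr and chipseq (names and values are illustrative only).
exprStatus <- factor(c("control", "control", "treated", "treated"))
chipStatus <- factor(c("control", "treated"))

# The factor must have exactly two levels, with the same names in both objects.
stopifnot(nlevels(exprStatus) == 2,
          setequal(levels(exprStatus), levels(chipStatus)))

# If reference is missing, the first level of the factor in expr is used.
reference <- levels(exprStatus)[1]
groupOfInterest <- setdiff(levels(exprStatus), reference)
```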
Details
Let A and B denote the gene expression values of one probe in the group
of interest and in the reference group (defined by the argument
reference), respectively, and let X and Y be the corresponding ChIP-seq
values assigned to that probe. This function returns for each probe
Z = (A - B)/σ_{ge} × (X - Y)/σ_{chip},
where σ_{ge} is the standard deviation estimated from all
observed differences in the gene expression data and σ_{chip}
the corresponding standard deviation in the ChIP-seq data.
If a group contains more than one sample in either data set, the
average of the replicates is calculated first and then plugged into
the formula above.
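The calculation can be reproduced by hand in base R with the values from the example on this page. This is a minimal sketch, not the package's implementation. It assumes that each σ is the sample standard deviation (sd) of the per-probe differences, which is consistent with the scores in the example output.

```r
# Values from the example on this page (rows are probes, columns are samples).
ge <- matrix(c(5, 12, 5, 11, 11, 10, 12, 11), nrow = 2,
             dimnames = list(c("100_at", "200_at"),
                             c("c1", "c2", "t1", "t2")))
chip <- matrix(c(10, 20, 20, 22), nrow = 2,
               dimnames = list(c("100_at", "200_at"), c("c", "t")))

# Average the replicates within each group first.
A <- rowMeans(ge[, c("t1", "t2"), drop = FALSE])  # group of interest
B <- rowMeans(ge[, c("c1", "c2"), drop = FALSE])  # reference group
X <- chip[, "t"]
Y <- chip[, "c"]

# Assumption: sigma is the sample standard deviation of the differences.
z <- (A - B) / sd(A - B) * (X - Y) / sd(X - Y)
z
```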
Features in expr need not be present in chipseq and
vice versa. Features present in only one of the two data types are
omitted.
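Restricting both data sets to their shared features corresponds to a set intersection of the feature names. The following sketch uses hypothetical feature names. The order in which the package returns the common features is not specified here.

```r
# Hypothetical feature names; "300_at" and "400_at" each occur in only one set.
exprFeatures <- c("100_at", "200_at", "300_at")
chipFeatures <- c("200_at", "100_at", "400_at")

# Only features present in both data types are kept.
common <- intersect(exprFeatures, chipFeatures)
common
```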
Value
A matrix with five columns. The first four columns store the (average)
expression values and the (average) ChIP-seq values for each of the two
conditions. The fifth column stores the correlation score. The row names
are the feature names common to expr and chipseq.
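Because the result is an ordinary numeric matrix, it can be handled with base R. The sketch below rebuilds the matrix from the example output on this page, so the package does not need to be loaded. It then orders the features by absolute score and labels the direction of each score. The variable names res, ranked and direction are illustrative.

```r
# Matrix rebuilt from the example output (values are filled column by column).
res <- matrix(c(11.5, 10.5, 5.0, 11.5, 20, 22, 10, 20,
                2.16666667, -0.06666667),
              nrow = 2,
              dimnames = list(c("100_at", "200_at"),
                              c("expr_treated", "expr_control",
                                "chipseq_treated", "chipseq_control", "z")))

# Order the features by the absolute value of the score, largest first.
ranked <- res[order(abs(res[, "z"]), decreasing = TRUE), , drop = FALSE]

# A positive score means differences in the same direction in both data sets.
direction <- ifelse(res[, "z"] > 0, "same direction", "opposite direction")
```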
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(epigenomix)
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/epigenomix/integrateData.Rd_%03d_medium.png", width=480, height=480)
> ### Name: integrateData
> ### Title: Calculates a normalized correlation score from ChIP-seq and
> ### microarray gene expression data.
> ### Aliases: integrateData
> ### integrateData,ExpressionSet,ChIPseqSet,character,character-method
> ### integrateData,ExpressionSet,ChIPseqSet,character,missing-method
> ### integrateData,ExpressionSetIllumina,ChIPseqSet,character,character-method
> ### integrateData,ExpressionSetIllumina,ChIPseqSet,character,missing-method
>
> ### ** Examples
>
> ge <- matrix(c(5,12,5,11,11,10,12,11), nrow=2)
> row.names(ge) <- c("100_at", "200_at")
> colnames(ge) <- c("c1", "c2", "t1", "t2")
> geDf <- data.frame(status=c("control", "control", "treated", "treated"),
+ row.names=colnames(ge))
> eSet <- ExpressionSet(ge, phenoData=AnnotatedDataFrame(geDf))
>
> chip <- matrix(c(10,20,20,22), nrow=2)
> row.names(chip) <- c("100_at", "200_at")
> colnames(chip) <- c("c", "t")
> rowRanges <- GRanges(IRanges(start=c(10,50), end=c(20,60)), seqnames=c("1","1"))
> names(rowRanges) = c("100_at", "200_at")
> chipDf <- DataFrame(status=factor(c("control", "treated")),
+ totalCount=c(100, 100),
+ row.names=colnames(chip))
> cSet <- ChIPseqSet(chipVals=chip, rowRanges=rowRanges, colData=chipDf)
>
> integrateData(eSet, cSet, factor="status", reference="control")
       expr_treated expr_control chipseq_treated chipseq_control           z
100_at         11.5          5.0              20              10  2.16666667
200_at         10.5         11.5              22              20 -0.06666667
>
> dev.off()
null device
1
>