GeoDE-package
R Documentation
Differential Expression and Enrichment Analysis with
(Geo)metrical (D)ifferential (E)xpression.
Description
This package contains functions for multivariate analysis of genome-wide expression data and for gene-set enrichment analysis.
Details
Package: GeoDE
Type: Package
Version: 1.0
Date: 2014-06-06
License: GPL-2
Given gene expression data from two classes (e.g. control versus perturbed samples) with biological replicates in each class, this package can be used to extract the most significant genes and gene sets.
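As a rough illustration of the expected input, the sketch below builds a toy data set in base R. The layout (a first column of gene names followed by one column per sample, plus a factor with levels 1 and 2) mirrors the bundled example data shown in the Results section; the object names toy_expression and toy_sampleclass, the gene names, and the simulated values are assumptions made for illustration only.

```r
# Toy input in the same layout as the bundled example data (values are simulated).
set.seed(42)
gene_ids <- paste0("GENE", 1:5)                      # hypothetical gene names
values <- matrix(round(runif(5 * 6, 0, 100), 2), nrow = 5)
colnames(values) <- c(paste0("Control.", 1:3), paste0("Pert.", 1:3))

# First column: gene names; remaining columns: one per sample
toy_expression <- data.frame(genenames = gene_ids, values,
                             stringsAsFactors = FALSE)

# Class factor in the same order as the sample columns:
# 1 = control replicates, 2 = perturbed replicates
toy_sampleclass <- factor(c(1, 1, 1, 2, 2, 2))

str(toy_expression)
table(toy_sampleclass)
```

The class factor must have one entry per sample column, in column order, so that each replicate is assigned to the right class.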
Differential expression is characterised by a single direction in expression space, which can be interpreted to extract the most significant genes; this is achieved with the chdirAnalysis function.
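To give a feel for what "a single direction in expression space" means, here is a minimal base-R sketch of a regularized linear-discriminant direction on simulated data. It is not the code used by chdirAnalysis: the shrinkage formula, the role given to gamma, and all object names are assumptions chosen for illustration, and the cited paper should be consulted for the actual method.

```r
# Illustrative sketch only: a shrinkage-regularized discriminant direction.
# This is NOT GeoDE's implementation; the shrinkage form is an assumption.
set.seed(1)
n_genes <- 20
ctrl <- matrix(rnorm(n_genes * 3), nrow = n_genes)   # 3 simulated control replicates
pert <- matrix(rnorm(n_genes * 3), nrow = n_genes)   # 3 simulated perturbed replicates
pert[1, ] <- pert[1, ] + 5                           # shift one gene between classes
rownames(ctrl) <- rownames(pert) <- paste0("gene", seq_len(n_genes))

# Difference of class means (perturbed minus control)
mean_diff <- rowMeans(pert) - rowMeans(ctrl)

# Pooled within-class covariance between genes
centered <- cbind(ctrl - rowMeans(ctrl), pert - rowMeans(pert))
pooled_cov <- centered %*% t(centered) / (ncol(centered) - 2)

# Shrink the covariance towards a scaled identity so that it can be inverted;
# in this sketch gamma = 1 would mean no shrinkage
gamma <- 0.5
target <- diag(mean(diag(pooled_cov)), n_genes)
shrunk_cov <- gamma * pooled_cov + (1 - gamma) * target

# Direction = inverse(shrunk_cov) %*% mean_diff, scaled to unit length
direction <- as.numeric(solve(shrunk_cov, mean_diff))
direction <- direction / sqrt(sum(direction^2))
names(direction) <- rownames(ctrl)

# Genes ranked by the absolute size of their component in the direction
head(sort(abs(direction), decreasing = TRUE))
```

Each gene contributes one component to the unit vector, so ranking genes by absolute component gives an ordered list comparable in spirit to the results element returned in the examples.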
Once the characteristic direction has been calculated, gene-set enrichment can be evaluated using the PAEAAnalysis function. The user is free to use any library of gene sets; however, this package includes a broad range of gene-set libraries, listed below:
BioCarta_pathways.gmt
Cancer_Cell_Line_Encyclopedia.gmt
ChEA.gmt
Chromosome_location.gmt
CORUM.gmt.gmt
GeneOntology_BP.gmt
GeneOntology_CC.gmt
GeneOntology_MF.gmt
GeneSigDB.gmt
Genome_Browser_PWMs.gmt
HMDB_Metabolites.gmt
Human_Gene_Atlas.gmt
KEA.gmt
KEGG_pathways.gmt
MGI_MP_top3.gmt
MGI_MP_top4.gmt
microRNA.gmt
Mouse_Gene_Atlas.gmt
NCI60.gmt
NURSA-IPMS.gmt
OMIM_disease_genes.gmt
OMIM_Expanded.gmt
Pfam-InterPro-domains.gmt
PPI_Hub_Proteins.gmt
Reactome_pathways.gmt
TF_PPIs.gmt
VirusMINT.gmt
WikiPathways_pathways.gmt
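For a user-supplied library, a GMT file can be read into an R list with base functions. The helper below, read_gmt_as_list, is hypothetical and not part of GeoDE. It assumes the common tab-delimited GMT layout (set name, description, then genes) and assumes that the enrichment functions expect each list element to hold the set name followed by its genes; compare against a bundled library (for example, data(GeneOntology_BP.gmt) followed by head(gmt[[1]])) before relying on it.

```r
# Hypothetical helper (not part of GeoDE): read a tab-delimited GMT file
# into a list of character vectors, one vector per gene set.
read_gmt_as_list <- function(path, drop_description = TRUE) {
  lines <- readLines(path, warn = FALSE)
  lines <- lines[nzchar(lines)]                      # skip empty lines
  lapply(strsplit(lines, "\t", fixed = TRUE), function(fields) {
    genes <- fields[-(1:2)]                          # fields after name and description
    genes <- genes[nzchar(genes)]                    # drop empty trailing fields
    if (drop_description) c(fields[1], genes) else c(fields[1:2], genes)
  })
}

# Small demonstration with a temporary two-set file
tmp <- tempfile(fileext = ".gmt")
writeLines(c("set_A\tdescription A\tMCL1\tSOD1\tDPP4",
             "set_B\tdescription B\tNOX4\tPOLR2I"), tmp)
toy_gmt <- read_gmt_as_list(tmp)
toy_gmt[[1]]
```

Setting drop_description = FALSE keeps the second GMT field, which may be useful if the description column carries information you want to retain.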
Author(s)
Author: Neil Clark and Avi Ma'ayan
Maintainer: Neil R. Clark <neil.clark@mssm.edu>
References
Clark, Neil R., et al. "The characteristic direction: a geometrical approach to identify differentially expressed genes." BMC Bioinformatics 15.1 (2014): 79.
Examples
##################################
#
# An example characteristic direction analysis
#
##################################
# Load the example data
data(example_expression_data)
data(example_sampleclass)
data(example_gammas)
# Examine the expression data
head(example_expression_data)
# Examine the corresponding sample class factor
example_sampleclass
# Run the analysis
chdir_analysis_example <- chdirAnalysis(example_expression_data, example_sampleclass,
                                        example_gammas, CalculateSig = TRUE, nnull = 10)
# Examine the results for the supplied values of the shrinkage parameter (gamma),
# showing the ten most important genes for each.
lapply(chdir_analysis_example$results, function(x) x[1:10])
# We can also extract the results of the chdirSig function.
# For example, chdir_analysis_example$chdirprops[[1]] gives the whole
# characteristic direction vector for each value of gamma:
lapply(chdir_analysis_example$chdirprops[[1]],head)
# and the estimated number of significant genes can be recovered with
chdir_analysis_example$chdirprops$number_sig_genes
##################################
#
# An example PAEA analysis
#
##################################
# Load the expression data
data(example_expression_data)
data(example_sampleclass)
data(example_gammas)
# Load a gmt file (this creates the object named gmt used below)
data(GeneOntology_BP.gmt)
# Run the characteristic direction analysis
chdir_analysis_example <- chdirAnalysis(example_expression_data, example_sampleclass,
                                        example_gammas, CalculateSig = FALSE)
# Run the PAEA analysis
PAEAtest <- PAEAAnalysis(chdir_analysis_example$chdirprops, gmt[1:100], example_gammas)
# Examine the p values
PAEAtest$p_values
# Examine the principal angles
PAEAtest$principal_angles
##################################
#
# An example multigmtPAEA analysis
#
##################################
# Load the expression data
data(example_expression_data)
data(example_sampleclass)
data(example_gammas)
# Load the GMT file names
data(AllGMTfiles)
# Run the characteristic direction analysis
chdir_analysis_example <- chdirAnalysis(example_expression_data, example_sampleclass,
                                        example_gammas, CalculateSig = FALSE)
# Run the PAEA analysis over two GMT files in the library
# (the second and third entries of AllGMTfiles)
multiPAEAtest <- multigmtPAEAAnalysis(chdir_analysis_example$chdirprops, AllGMTfiles[2:3],
                                      example_gammas)
# To run on all the gmt files
#multiPAEAtestAll <- multigmtPAEAAnalysis(chdir_analysis_example$chdirprops, gammas=example_gammas)
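The principal angles reported by the PAEA examples can be pictured geometrically: a gene set defines a coordinate subspace of expression space, and the angle between the direction vector and that subspace is small when the set's genes carry most of the vector's weight. The base-R sketch below computes such an angle for a toy vector. The function principal_angle and the toy values are assumptions for illustration; the sketch does not reproduce PAEAAnalysis and computes no p-values.

```r
# Sketch: angle (in radians) between a direction vector and the coordinate
# subspace spanned by the genes of one gene set. Not GeoDE's implementation.
principal_angle <- function(direction, gene_set) {
  in_set <- names(direction) %in% gene_set
  # Cosine = length of the projection onto the set's coordinates / total length
  cos_theta <- sqrt(sum(direction[in_set]^2)) / sqrt(sum(direction^2))
  acos(min(1, cos_theta))                            # guard against rounding above 1
}

# Toy direction vector with hypothetical component values
toy_direction <- c(MCL1 = -0.6, LIMD2 = 0.4, RPL27 = -0.3, SOD1 = 0.2)

# A set holding the two largest components gives the smaller angle
principal_angle(toy_direction, c("MCL1", "LIMD2"))

# A set holding only a small component gives the larger angle
principal_angle(toy_direction, c("SOD1"))
```

Because the sign of each component is squared away, up- and down-regulated genes in a set both reduce the angle.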
Results
> library(GeoDE)
Loading required package: Matrix
Loading required package: MASS
> ### Name: GeoDE-package
> ### Title: Differential Expression and Enrichment Analysis with
> ### (Geo)metrical(D)ifferential(E)expression.
> ### Aliases: GeoDE-package GeoDE
> ### Keywords: package gene differential expression enrichmnet multivariate
> ### characteristic direction
>
> ### ** Examples
>
>
> ##################################
> #
> # An example characteristic direction analysis
> #
> ##################################
>
> # Load the example data
>
> data(example_expression_data)
> data(example_sampleclass)
> data(example_gammas)
>
> # Examine the expression data
> head(example_expression_data)
  genenames  Controll Controll.1 Controll.2      Pert.    Pert..1   Pert..2
1   MTERFD2 138.64200 167.130000 156.199000 186.640000 122.005000 161.38200
2     SCRIB  52.65380  38.977800  68.963200  94.300900  60.634300  99.01180
3      ZXDC  59.37390  53.952500  55.103300  82.780500  52.770000  80.15000
4    MRPL32 333.80200 375.288000 475.200000 477.085000 327.193000 468.31600
5     WDR69   0.33557   0.614205   0.989874   0.421603   0.890432   1.05624
6     FOXL1   1.03177   0.901720   1.644250   1.170400   1.337110   1.31237
>
> # Examine the corresponding sample class factor
> example_sampleclass
[1] 1 1 1 2 2 2
Levels: 1 2
>
> # Run the analysis
> chdir_analysis_example <- chdirAnalysis(example_expression_data,example_sampleclass,example_gammas
+ ,CalculateSig=TRUE,nnull=10)
  |======================================================================| 100%
>
> # Examine the results with the first value of the shrinkage parameter (gamma)
>
> # show the first few of the most important genes.
>
> lapply(chdir_analysis_example$results, function(x) x[1:10])
[[1]]
      MCL1      LIMD2      RPL27    MRPS18A      TBL1X       SOD1       DPP4
-0.6078472  0.3791251 -0.3477567  0.2718869 -0.2083958  0.1979625  0.1789876
      NOX4     POLR2I    ZDHHC20
 0.1386403 -0.1214708 -0.1193214
>
> # We can also extract the results of the code{chdirSig} function
> # for example chdir_analysis_example$chdirprops[[1]] gives the whole
> # characteristic direction vector for each value of gamma:
>
> lapply(chdir_analysis_example$chdirprops[[1]],head)
[[1]]
                    1
MTERFD2 -0.0005105981
SCRIB    0.0148638842
ZXDC     0.0198058553
MRPL32  -0.0986935062
WDR69   -0.0002376169
FOXL1   -0.0006025896
>
> # and the estimated number of significant genes can be recovered with
>
> chdir_analysis_example$chdirprops$number_sig_genes
[[1]]
[1] 88
>
> ##################################
> #
> # An example PAEA analysis
> #
> ##################################
> # Load the expression data
>
> data(example_expression_data)
> data(example_sampleclass)
> data(example_gammas)
>
> #load a gmt file
> data(GeneOntology_BP.gmt)
>
> # Run the characteristic direction analysis
> chdir_analysis_example <- chdirAnalysis(example_expression_data,example_sampleclass,example_gammas
+ ,CalculateSig=FALSE)
>
> # Run the PAEA analysis
>
> PAEAtest <- PAEAAnalysis(chdir_analysis_example$chdirprops, gmt[1:100], example_gammas)
  |======================================================================| 100%
>
> # Examine the p values
>
> PAEAtest$p_values
translation (GO:0006412)
[1,] 0.01499437
RNA splicing, via transesterification reactions (GO:0000375)
[1,] 0.5233092
regulation of secretion (GO:0051046)
[1,] 0.6557127
positive regulation of secretion (GO:0051047)
[1,] 0.6557127
cellular amine metabolic process (GO:0009308)
[1,] 0.8229856
     signal transduction (GO:0007165) cell communication (GO:0007154)
[1,]                        0.9359704                       0.9657269
     protein complex assembly (GO:0006461) protein secretion (GO:0009306)
[1,]                             0.9900633                              1
rRNA transcription (GO:0009303)
[1,] 1
positive regulation of DNA replication (GO:0045740)
[1,] 1
respiratory burst (GO:0045730)
[1,] 1
positive regulation of protein catabolic process (GO:0045732)
[1,] 1
positive regulation of DNA repair (GO:0045739)
[1,] 1
negative regulation of adenylate cyclase activity (GO:0007194)
[1,] 1
inhibition of adenylate cyclase activity by G-protein signaling (GO:0007193)
[1,] 1
regulation of transcription factor activity (GO:0051090)
[1,] 1
activation of adenylate cyclase activity (GO:0007190)
[1,] 1
positive regulation of transcription factor activity (GO:0051091)
[1,] 1
positive regulation of NF-kappaB transcription factor activity (GO:0051092)
[1,] 1
response to radiation (GO:0009314)
[1,] 1
oligosaccharide metabolic process (GO:0009311)
[1,] 1
positive regulation of glycogen biosynthetic process (GO:0045725)
[1,] 1
positive regulation of tyrosine phosphorylation of Stat3 protein (GO:0042517)
[1,] 1
positive regulation of binding (GO:0051099)
[1,] 1
positive regulation of translation (GO:0045727)
[1,] 1
respiratory electron transport chain (GO:0022904)
[1,] 1
amine biosynthetic process (GO:0009309)
[1,] 1
response to hydrogen peroxide (GO:0042542)
[1,] 1
translational initiation (GO:0006413)
[1,] 1
     translational elongation (GO:0006414) RNA export from nucleus (GO:0006405)
[1,]                                     1                                    1
mRNA export from nucleus (GO:0006406)
[1,] 1
positive regulation of tyrosine phosphorylation of STAT protein (GO:0042531)
[1,] 1
negative regulation of cell cycle (GO:0045786)
[1,] 1
ER-associated protein catabolic process (GO:0030433)
[1,] 1
mRNA catabolic process (GO:0006402)
[1,] 1
positive regulation of cell cycle (GO:0045787)
[1,] 1
tRNA modification (GO:0006400)
[1,] 1
positive regulation of cell adhesion (GO:0045785)
[1,] 1
RNA catabolic process (GO:0006401)
[1,] 1
response to reactive oxygen species (GO:0000302)
[1,] 1
negative regulation of secretion (GO:0051048)
[1,] 1
positive regulation of ossification (GO:0045778)
[1,] 1
negative regulation of transport (GO:0051051)
[1,] 1
superoxide anion generation (GO:0042554)
[1,] 1
regulation of angiogenesis (GO:0045765)
[1,] 1
positive regulation of adenylate cyclase activity (GO:0045762)
[1,] 1
positive regulation of anti-apoptosis (GO:0045768)
[1,] 1
positive regulation of angiogenesis (GO:0045766)
[1,] 1
regulation of anti-apoptosis (GO:0045767)
[1,] 1
regulation of S phase (GO:0033261)
[1,] 1
negative regulation of hormone secretion (GO:0046888)
[1,] 1
positive regulation of lipid biosynthetic process (GO:0046889)
[1,] 1
protein folding (GO:0006457)
[1,] 1
positive regulation of hormone secretion (GO:0046887)
[1,] 1
reciprocal meiotic recombination (GO:0007131)
[1,] 1
regulation of translational initiation (GO:0006446)
[1,] 1
regulation of translation (GO:0006417)
[1,] 1
tRNA aminoacylation for protein translation (GO:0006418)
[1,] 1
     meiosis (GO:0007126) phospholipid catabolic process (GO:0009395)
[1,]                    1                                           1
protein amino acid O-linked glycosylation (GO:0006493)
[1,] 1
epidermal growth factor receptor signaling pathway (GO:0007173)
[1,] 1
regulation of epidermal growth factor receptor activity (GO:0007176)
[1,] 1
protein amino acid lipidation (GO:0006497)
[1,] 1
transmembrane receptor protein serine/threonine kinase signaling pathway (GO:0007178)
[1,] 1
transforming growth factor beta receptor signaling pathway (GO:0007179)
[1,] 1
ATP metabolic process (GO:0046034)
[1,] 1
spliceosomal snRNP biogenesis (GO:0000387)
[1,] 1
endothelial cell migration (GO:0043542)
[1,] 1
G-protein signaling, coupled to cyclic nucleotide second messenger (GO:0007187)
[1,] 1
protein amino acid glycosylation (GO:0006486)
[1,] 1
G-protein signaling, coupled to cAMP nucleotide second messenger (GO:0007188)
[1,] 1
protein amino acid N-linked glycosylation (GO:0006487)
[1,] 1
protein amino acid ADP-ribosylation (GO:0006471)
[1,] 1
protein amino acid deph