R: Gene Ontology enrichment analysis


Computes enrichment scores for Gene Ontology terms associated with genes in each topic.


compute.go.enrichment(lda.results, go.db, ontology.type = "BP",
  reformat.gene.names = FALSE, bonferroni.correct = TRUE,
  p.val.threshold = if (bonferroni.correct) 0.05 else 0.01,
  go.score.class = "weight01Score", dag.file.prefix = FALSE)
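The default for p.val.threshold in the signature above is an expression that refers to another argument. R evaluates default arguments lazily, so the threshold falls back to 0.05 when bonferroni.correct is TRUE and to 0.01 otherwise, unless a value is passed explicitly. A minimal sketch of that behaviour, using a hypothetical stand-in function (threshold.demo is not part of the package):

```r
# Hypothetical stand-in that copies only the two relevant arguments
# from the compute.go.enrichment signature and returns the threshold.
threshold.demo <- function(bonferroni.correct = TRUE,
                           p.val.threshold = if (bonferroni.correct) 0.05 else 0.01) {
  p.val.threshold
}

t.default  <- threshold.demo()                            # 0.05
t.uncorr   <- threshold.demo(bonferroni.correct = FALSE)  # 0.01
# An explicitly supplied threshold always overrides the default expression.
t.explicit <- threshold.demo(bonferroni.correct = TRUE, p.val.threshold = 0.01)  # 0.01
```

The last call mirrors the argument combination used in the example further down, where a stricter threshold of 0.01 is requested together with Bonferroni correction.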



lda.results: A fitted LDA model, as returned by compute.lda.


go.db: String. Genome-wide annotation with GO mapping for the appropriate organism (e.g. or


ontology.type: String (optional). “BP” for Biological Process, “MF” for Molecular Function, or “CC” for Cellular Component. Defaults to “BP”.


reformat.gene.names: Boolean. If set to TRUE, converts all gene names to lowercase with a capitalised first letter.


bonferroni.correct: Boolean. Unless set to FALSE, adjusts the p-value significance threshold for multiple testing (Bonferroni method).


p.val.threshold: Numeric (optional). P-value significance threshold. Defaults to 0.05 if bonferroni.correct is TRUE and to 0.01 otherwise.


go.score.class: String (optional). Name of the scoring method to use for the Kolmogorov-Smirnov test (e.g. “weight01Score” or “elimScore”). See the topGO documentation for a complete list of scoring methods.


dag.file.prefix: String or FALSE. If not set to FALSE, plots individual subgraphs of significant terms for each topic, using the string as the filename prefix.
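The scoring methods named above all build on a Kolmogorov-Smirnov test. The basic idea can be sketched in base R: compare the scores of the genes annotated to one term against the scores of all remaining genes. The gene scores and the term membership below are simulated, and this plain two-sample test is only an illustration; it is not topGO's weight01 or elim algorithm, which additionally take the structure of the GO graph into account.

```r
set.seed(1)

# Simulated per-gene scores where smaller values mean a stronger signal.
n.genes <- 500
scores <- runif(n.genes)
names(scores) <- paste0("gene", seq_len(n.genes))

# Hypothetical GO term annotating the first 30 genes; cubing values in
# [0, 1] pushes the scores of these genes towards zero.
in.term <- seq_len(30)
scores[in.term] <- scores[in.term]^3

# One-sided two-sample KS test. alternative = "greater" asks whether the
# CDF of the term's genes lies above that of the other genes, i.e. whether
# the term's genes tend to have smaller scores.
ks <- ks.test(scores[in.term], scores[-in.term], alternative = "greater")
ks$p.value
```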


Returns a named list with ranked tables of significantly enriched GO terms for each topic (‘all’), terms that appear in only one topic (‘unique’), and terms that appear in fewer than half of the other topics (‘rare’). In addition, the list contains an igraph object with the full GO DAG, annotated with each term's p-value and the significance threshold adjusted for multiple testing (Bonferroni method).
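The ‘unique’ and ‘rare’ tables can be read as set operations over the per-topic lists of significant terms. A minimal sketch under stated assumptions: sig.terms is a hypothetical list mapping each topic to its significant GO IDs, the assignment of terms to topics is invented for illustration, and the package's actual implementation may differ.

```r
# Hypothetical significant GO IDs per topic (invented assignment).
sig.terms <- list(
  topic1 = c("GO:0005813", "GO:0000785", "GO:0005739"),
  topic2 = c("GO:0030018", "GO:0005739"),
  topic3 = c("GO:0005604", "GO:0005739", "GO:0000785"),
  topic4 = c("GO:0005761", "GO:0005739")
)

# For each term of one topic, count the OTHER topics in which it is significant.
count.in.others <- function(topic) {
  others <- sig.terms[names(sig.terms) != topic]
  vapply(sig.terms[[topic]],
         function(term) sum(vapply(others, function(x) term %in% x, logical(1))),
         integer(1))
}

n.others <- length(sig.terms) - 1

# 'unique': significant in no other topic.
unique.terms <- lapply(names(sig.terms), function(tp) {
  n <- count.in.others(tp)
  names(n)[n == 0]
})
names(unique.terms) <- names(sig.terms)

# 'rare': significant in fewer than half of the other topics.
rare.terms <- lapply(names(sig.terms), function(tp) {
  n <- count.in.others(tp)
  names(n)[n < n.others / 2]
})
names(rare.terms) <- names(sig.terms)
```

Under this literal reading every unique term is also rare, because zero other topics is fewer than half of them. A term shared by all topics, such as GO:0005739 in the mock data, appears in neither set.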


# Load pre-computed LDA model for skeletal myoblast RNA-Seq data from HSMMSingleCell package:
data(HSMM_lda_model)

# Load GO mapping database for 'homo sapiens':
# Compute Cellular Component GO enrichment sets for each topic:
go.results = compute.go.enrichment(HSMM_lda_model,, ontology.type="CC", bonferroni.correct=TRUE, p.val.threshold=0.01)

# Print table of terms that are only significantly enriched in each topic:
print(go.results$unique)

> # Load pre-computed LDA model for skeletal myoblast RNA-Seq data from HSMMSingleCell package:
> data(HSMM_lda_model)
> # Load GO mapping database for 'homo sapiens':
> library(

> # Compute Cellular Component GO enrichment sets for each topic:
> go.results = compute.go.enrichment(HSMM_lda_model,, ontology.type="CC", bonferroni.correct=TRUE, p.val.threshold=0.01)
> # Print table of terms that are only significantly enriched in each topic: 
> print(go.results$unique)
        GO.ID                                   Term Total p-Value
12 GO:0000777       condensed chromosome kinetochore    89 8.7e-11
16 GO:0005681                   spliceosomal complex   161 1.2e-08
26 GO:0000784   nuclear chromosome, telomeric region   102 1.1e-06
27 GO:0005813                             centrosome   406 1.4e-06
28 GO:0000922                           spindle pole   106 1.5e-06
31 GO:0046540           U4/U6 x U5 tri-snRNP complex    18 3.0e-06
32 GO:0005686                               U2 snRNP    17 3.1e-06
36 GO:0005689          U12-type spliceosomal complex    25 4.7e-06
38 GO:0000785                              chromatin   326 6.0e-06
39 GO:0005876                    spindle microtubule    50 7.7e-06
40 GO:0000940 condensed chromosome outer kinetochore    13 9.3e-06

[1] GO.ID   Term    Total   p-Value
<0 rows> (or 0-length row.names)

        GO.ID                             Term Total p-Value
23 GO:0030018                           Z disc    76 5.6e-07
24 GO:0001725                     stress fiber    39 7.9e-07
31 GO:0000932 cytoplasmic mRNA processing body    59 2.9e-06

        GO.ID              Term Total p-Value
13 GO:0005604 basement membrane    60 1.2e-05

        GO.ID                           Term Total p-Value
29 GO:0005761         mitochondrial ribosome    70 7.9e-06
33 GO:0000139                 Golgi membrane   467 1.1e-05
35 GO:0005789 endoplasmic reticulum membrane   649 1.2e-05
36 GO:0005885         Arp2/3 protein complex    10 1.3e-05
38 GO:0005739                  mitochondrion  1318 1.4e-05
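The output above prints one table per topic, including an empty one. For export or plotting it can be convenient to stack these into a single table with a topic column. A minimal sketch, assuming go.results$unique is a list of data frames with the columns shown above. The stand-in list is rebuilt by hand from three of the printed tables, and its names (topicA to topicC) are placeholders, since the printed output does not show the real list names.

```r
# Stand-in for go.results$unique, copied from the printed output.
z.disc <- data.frame(
  GO.ID = c("GO:0030018", "GO:0001725", "GO:0000932"),
  Term = c("Z disc", "stress fiber", "cytoplasmic mRNA processing body"),
  Total = c(76, 39, 59),
  `p-Value` = c(5.6e-07, 7.9e-07, 2.9e-06),
  check.names = FALSE, stringsAsFactors = FALSE
)
basement <- data.frame(
  GO.ID = "GO:0005604", Term = "basement membrane", Total = 60,
  `p-Value` = 1.2e-05,
  check.names = FALSE, stringsAsFactors = FALSE
)
unique.tables <- list(topicA = z.disc[0, ], topicB = z.disc, topicC = basement)

# Drop topics without unique terms, tag each remaining row with its topic,
# then stack the tables.
non.empty <- Filter(function(tab) nrow(tab) > 0, unique.tables)
tagged <- Map(function(tab, topic) {
  tab$topic <- topic
  tab
}, non.empty, names(non.empty))
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL

# as.numeric() keeps the ordering correct even if the p-values are stored
# as character strings.
combined <- combined[order(as.numeric(combined[["p-Value"]])), ]
```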

