Gene Ontology: a shared biological vocabulary
Gene Ontology (GO) is a controlled, hierarchical vocabulary for describing gene and protein function, maintained by the Gene Ontology Consortium (founded in 1998; first described in print in 2000). Before GO, labs described the same biology in different terms, which made systematic comparison across datasets impractical. GO provides ~43,000 standardised terms organised as a directed acyclic graph (DAG), in which a term can have more than one parent. A protein annotated to a specific term is therefore implicitly annotated to every broader term above it in the hierarchy.
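The inheritance rule can be illustrated with a small sketch. The graph below is a hand-made toy, not an extract of the real ontology: the parent links are simplified assumptions, and the function names `ancestors` and `propagate` are hypothetical.

```python
# Toy GO-like DAG: term -> list of direct parents.
# The links are simplified for illustration and do not reproduce the real ontology.
PARENTS = {
    "positive regulation of MAPK cascade": ["regulation of MAPK cascade"],
    "regulation of MAPK cascade": [
        "regulation of signal transduction",
        "regulation of protein phosphorylation",  # second parent: a DAG, not a tree
    ],
    "regulation of signal transduction": ["regulation of cellular process"],
    "regulation of protein phosphorylation": ["biological regulation"],
    "regulation of cellular process": ["biological regulation"],
    "biological regulation": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def propagate(direct_annotations, parents=PARENTS):
    """Expand a protein's direct annotations to include all broader terms."""
    expanded = set(direct_annotations)
    for term in direct_annotations:
        expanded |= ancestors(term, parents)
    return expanded


full = propagate({"positive regulation of MAPK cascade"})
for term in sorted(full):
    print(term)
```

A single specific annotation expands to six terms here, including the very broad "biological regulation". This is why broad parent terms accumulate so many genes and tend to appear in almost every enrichment result.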
GO terms are divided into three namespaces. Understanding which namespace a result comes from is the first step in interpreting what your enrichment result means:
Biological Process (BP): what is the protein doing?
A series of molecular events with a defined beginning and end: the biological "story" a group of proteins is involved in. Most enrichment analyses focus on BP terms because they describe the most immediately interpretable biology.
Molecular Function (MF): what activity does it have?
The biochemical activity of a single protein at the molecular level, describing what it is capable of independent of context. MF terms are usually phrased as an activity or binding capability ("kinase activity", "tau protein binding") rather than as a process.
Cellular Component (CC): where in the cell?
The physical location where a protein functions, or the complex to which it belongs. For PPI analysis, CC enrichment (e.g. "postsynaptic density") is a useful check that your proteins plausibly occur together in the cell.
Results like "biological regulation" or "metabolic process" are very broad GO parent terms that would be enriched in almost any gene list. Tools like Metascape automatically collapse these by clustering redundant terms. In g:Profiler, look for terms deep in the hierarchy: specific terms like "positive regulation of MAPK cascade" carry far more interpretive value than "signal transduction".
What enrichment analysis actually tests
Enrichment analysis answers: "Is the number of my input proteins annotated to GO term X greater than I would expect if I picked the same number of proteins at random from the genome?"
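The most common test is the hypergeometric test (equivalent to a one-sided Fisher's exact test). The logic: if n genes are drawn at random, without replacement, from a background of N genes of which K are annotated to the term, the p-value is the probability of an overlap of k or more:
P(X ≥ k) = Σ (i = k … min(n, K)) [ C(K, i) · C(N − K, n − i) ] / C(N, n)
N = genome (background) size · K = GO term size · n = your gene list size · k = observed overlap · C(a, b) = "a choose b"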
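As a minimal sketch, the upper-tail hypergeometric probability can be computed with the standard library alone. The function name `hypergeom_enrichment_p` and the example numbers are illustrative assumptions, not values from a real analysis.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k): chance of an overlap of k or more when n genes are drawn
    without replacement from N genes, K of which carry the annotation."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Illustrative numbers only: 20,000 background genes, a 100-gene term,
# a 50-gene input list, and 5 input genes annotated to the term.
N, K, n, k = 20000, 100, 50, 5
expected_overlap = n * K / N
p = hypergeom_enrichment_p(N, K, n, k)
print(f"expected overlap by chance: {expected_overlap:.2f}")
print(f"observed overlap: {k}, p = {p:.2e}")
```

The expected overlap (n × K / N) is worth printing next to the p-value: an observed overlap of 5 against an expectation of 0.25 is easier to reason about than the p-value alone.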
Multiple testing correction is non-negotiable
An enrichment run can test up to ~43,000 GO terms simultaneously. At p < 0.05 without correction, you would expect ~2,150 false positives by chance alone (43,000 × 0.05), even if no term were truly enriched. Tools handle this via:
Benjamini–Hochberg FDR
Controls the expected proportion of false positives among your significant results. FDR q < 0.05 means that, on average, about 5% of reported terms are expected to be false positives. Standard for enrichment analysis. Used by Metascape and most modern tools, and available in g:Profiler as an alternative to its default g:SCS.
g:SCS (g:Profiler)
g:Profiler's custom correction method that accounts for the hierarchical structure of GO: child terms are not truly independent of their parents, so the tests are correlated. In practice its thresholds generally fall between Bonferroni (stricter) and BH (more permissive). It is the default in g:Profiler and preferred when using that tool.
Report the adjusted p-value (the FDR q-value if you used BH), not the raw p-value. Many published papers still report uncorrected enrichment p-values, and reviewers increasingly flag this. State the correction method in your Methods: "GO enrichment was performed using g:Profiler with g:SCS correction, significance threshold adjusted p < 0.05."
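To make the Benjamini–Hochberg adjustment concrete, here is a minimal sketch of the standard step-up procedure. The p-values are invented for illustration, and g:SCS is not reproduced here because it depends on g:Profiler's own simulations.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values (q-values) in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented raw p-values for eight hypothetical terms.
raw = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
q = benjamini_hochberg(raw)

n_raw = sum(p < 0.05 for p in raw)
n_bh = sum(x < 0.05 for x in q)
print(f"significant at raw p < 0.05: {n_raw}")
print(f"significant at BH q < 0.05:  {n_bh}")
```

In this toy set, five terms pass the raw threshold but only two survive correction, which is the gap that an uncorrected results table hides.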
g:Profiler: fast, transparent enrichment
g:Profiler (biit.cs.ut.ee/gprofiler, Raudvere et al., 2019) is a web tool with R and Python clients for gene list enrichment analysis. It tests against GO, KEGG, Reactome, WikiPathways, Human Phenotype Ontology (HPO), and others in one run. It is fast and, among the commonly used enrichment tools, unusually transparent about its statistics.
Running a g:Profiler analysis on a STRING-derived gene list
Export gene list from STRING
In STRING after building your network: Exports → Protein names (TSV). This gives a plain-text gene symbol list. Also works with gene lists from differential expression, GWAS, or literature curation. g:Profiler accepts symbols, Ensembl IDs, and UniProt accessions.
Paste into g:GOSt
Go to biit.cs.ut.ee/gprofiler/gost. Paste gene symbols (one per line or space-separated) into the query box. For a multi-list comparison (e.g. hub proteins vs peripheral proteins), put all lists in the same box, start each list with a header line beginning with ">" (e.g. ">hubs"), and tick "Run as multiquery".
Configure settings
Organism: Homo sapiens. Significance threshold: g:SCS at 0.05. Under data sources, check: GO:BP, GO:MF, GO:CC, KEGG, Reactome, and (for neuro research) HP (Human Phenotype Ontology — directly maps enriched genes to disease phenotypes like "Parkinsonism" or "Amyloid deposits").
Read the Manhattan plot
Each dot = a significant term, plotted at −log10(adjusted p-value). X-axis groups by data source. Hover to see term name and contributing genes. Look for the highest dots within GO:BP: these are your most statistically significant biological process findings.
Export results
Download as CSV for supplementary data. Include columns: term name, GO ID, adjusted p-value (plus the raw p-value if your tool reports it), term size, intersection size (overlap with your list), and the gene symbols in the intersection. All of this should appear in your paper's supplementary table.
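Once exported, the table can be filtered and sorted in a script so the supplementary table is reproducible. The sketch below assumes column names (`term_name`, `term_id`, `adjusted_p_value`, `term_size`, `intersection_size`, `intersections`) that should be checked against your own export. All numbers and gene memberships in the sample are invented placeholders, and `load_terms` is a hypothetical helper.

```python
import csv
import io

# Stand-in for an exported file. Column names are assumed; values are invented.
EXPORT = """source,term_name,term_id,adjusted_p_value,term_size,intersection_size,intersections
GO:BP,biological regulation,GO:0065007,0.004,12000,9,"APP,PSEN1,BACE1,MAPT,CLU,BIN1,APOE,CDK5,GSK3B"
GO:BP,amyloid precursor protein metabolic process,GO:0042982,0.00000002,60,5,"APP,PSEN1,BACE1,CLU,APOE"
GO:BP,response to oxidative stress,GO:0006979,0.03,400,3,"CLU,APOE,APP"
GO:MF,tau protein binding,GO:0048156,0.0005,45,4,"MAPT,CDK5,GSK3B,APOE"
"""


def load_terms(handle, max_adj_p=0.05, max_term_size=None):
    """Read an enrichment table, keep significant terms, sort by adjusted p-value.

    max_term_size optionally drops very broad terms; the cutoff is a judgment call.
    """
    kept = []
    for row in csv.DictReader(handle):
        adj_p = float(row["adjusted_p_value"])
        size = int(row["term_size"])
        if adj_p >= max_adj_p:
            continue
        if max_term_size is not None and size > max_term_size:
            continue
        kept.append({
            "term_id": row["term_id"],
            "term_name": row["term_name"],
            "adjusted_p_value": adj_p,
            "term_size": size,
            "intersection_size": int(row["intersection_size"]),
            "genes": row["intersections"].split(","),
        })
    kept.sort(key=lambda r: r["adjusted_p_value"])
    return kept


terms = load_terms(io.StringIO(EXPORT), max_term_size=1000)
for t in terms:
    print(t["term_id"], t["term_name"], t["adjusted_p_value"], ",".join(t["genes"]))
```

For a real file, replace `io.StringIO(EXPORT)` with `open("your_export.csv", newline="")`. If you apply a term-size cutoff, state it in your Methods.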
Metascape: enrichment integrated with networks
Metascape (metascape.org, Zhou et al., 2019) goes beyond pure enrichment by combining GO/pathway analysis with PPI network construction from STRING, MCODE complex detection, and automatic clustering of redundant terms. It is widely used in neuroscience PPI papers because it produces a complete figure panel (enrichment results, network, and functional clusters) in one automated pipeline.
Metascape pipeline — what happens step by step
Gene ID mapping
Metascape maps your input to Entrez Gene IDs. It handles synonyms (e.g. PARK2 and PRKN both map correctly). Unmapped genes are flagged — check these. If >20% of your list doesn't map, your gene symbols may use a different nomenclature system. Consider switching to Ensembl IDs.
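The same mapping check can be run on your list before upload. This is only a sketch of the idea: the alias table and the set of recognised symbols are tiny hypothetical stand-ins (only the PARK2 to PRKN synonym comes from the text above), and `map_symbols` is not part of Metascape.

```python
# Hypothetical stand-ins for a real nomenclature resource.
ALIASES = {"PARK2": "PRKN"}                  # outdated symbol -> current symbol
KNOWN = {"PRKN", "APP", "PSEN1", "MAPT"}     # symbols the tool would recognise


def map_symbols(symbols, known=KNOWN, aliases=ALIASES):
    """Split an input list into mapped symbols and unmapped entries."""
    mapped, unmapped = {}, []
    for raw in symbols:
        symbol = raw.strip()                  # stray whitespace is a common culprit
        current = aliases.get(symbol, symbol)
        if current in known:
            mapped[symbol] = current
        else:
            unmapped.append(symbol)
    return mapped, unmapped


query = ["PARK2", "APP", " PSEN1", "MAPT", "TAU_HUMAN"]
mapped, unmapped = map_symbols(query)
unmapped_fraction = len(unmapped) / len(query)
needs_review = unmapped_fraction > 0.20      # the >20% rule of thumb from the text

print("unmapped:", unmapped)
print(f"unmapped fraction: {unmapped_fraction:.0%}, review nomenclature: {needs_review}")
```

Here the UniProt-style entry name `TAU_HUMAN` fails to map as a gene symbol, which is the kind of mixed nomenclature the 20% check is meant to catch.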
Enrichment across 40+ databases
Metascape tests against GO (all three namespaces), KEGG, Reactome, WikiPathways, DisGeNET (disease gene associations), and more. Pre-filter: minimum 3 genes per term, p < 0.01 (uncorrected). Final correction: Benjamini–Hochberg. Only terms surviving both filters appear in results.
Hierarchical clustering of terms
Metascape clusters enriched terms by Kappa similarity (overlap of contributing genes). Related terms are grouped under a parent representative. This prevents your results from being overwhelmed by 80 nearly identical GO terms — instead you see ~8 clean functional themes. This is what gets plotted in the "enrichment network" figure commonly seen in papers.
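Kappa similarity between two terms can be sketched as Cohen's kappa over the input gene list, where each gene either is or is not a hit for each term. This is an illustration of the statistic, not Metascape's implementation: the term memberships are invented, and the 0.3 cutoff used below is an assumption for the example.

```python
def kappa(term_a, term_b, gene_list):
    """Cohen's kappa for agreement between two terms' gene memberships."""
    both = a_only = b_only = neither = 0
    for gene in gene_list:
        in_a, in_b = gene in term_a, gene in term_b
        if in_a and in_b:
            both += 1
        elif in_a:
            a_only += 1
        elif in_b:
            b_only += 1
        else:
            neither += 1
    total = len(gene_list)
    observed = (both + neither) / total
    # Agreement expected by chance, from each term's marginal hit rate.
    chance = ((both + a_only) * (both + b_only)
              + (neither + b_only) * (neither + a_only)) / total ** 2
    if chance == 1:
        return 1.0
    return (observed - chance) / (1 - chance)


genes = ["APP", "PSEN1", "BACE1", "MAPT", "CLU", "BIN1", "APOE", "CDK5", "GSK3B"]
# Invented memberships for three hypothetical enriched terms.
term_a = {"APP", "PSEN1", "BACE1", "CLU"}
term_b = {"APP", "PSEN1", "BACE1"}
term_c = {"MAPT", "CDK5", "GSK3B"}

CUTOFF = 0.3  # assumed similarity cutoff for illustration
for name, other in [("A vs B", term_b), ("A vs C", term_c)]:
    score = kappa(term_a, other, genes)
    print(f"{name}: kappa = {score:.2f}, same cluster: {score > CUTOFF}")
```

Terms A and B are driven by nearly the same genes and would be merged, while A and C share no genes and stay in separate themes.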
PPI network + MCODE
Metascape queries STRING (confidence ≥ 0.7 by default) to build a PPI network of your input proteins. It then runs MCODE to identify densely connected modules. Each module is annotated with the GO terms most enriched in its members. This gives you: "Module 1: APP, PSEN1, BACE1 — enriched for amyloid precursor processing". Immediately interpretable and figure-ready.
Export to Cytoscape
Download the network as node/edge tables (CSV) or a Cytoscape session file. This is where Metascape hands off to Cytoscape — the automated clustering becomes a starting point for manually refined, publication-quality figures. Chapter 4 covers this handoff in detail.
g:Profiler vs Metascape: when to use which
g:Profiler
Best for clean, transparent enrichment tables. Ideal for supplementary data in papers where statistical reproducibility is critical.
Strengths:
- Very fast (~5 seconds)
- Multiple annotation databases simultaneously
- Ordered gene list (GSEA-style) analysis
- R (gprofiler2) and Python (gprofiler-official) clients for reproducible scripts
- Fully documented statistical methods
Limitations:
- No built-in network visualisation
- No complex/module detection
Metascape
Best for producing full figure panels. Enrichment + PPI network + MCODE clusters all in one pipeline, exportable to Cytoscape.
Strengths:
- Enrichment + network + MCODE in one tool
- Automatic clustering of redundant terms
- Multi-gene-list comparison analysis
- Direct Cytoscape export
- DisGeNET disease annotation built-in
Limitations:
- Less transparent statistics than g:Profiler
- Slower (several minutes for large lists)
- No programmatic API
Use g:Profiler for the formal supplementary enrichment table (reproducible, citable statistics) and Metascape to generate the main-text network/cluster figure. This gives reviewers the rigour of g:Profiler's documented methods, plus the visual appeal and integrative analysis of Metascape's pipeline.
Interpreting enrichment results: a worked example
Suppose you run Metascape on the Alzheimer's-relevant protein set: APP, PSEN1, BACE1, MAPT, CLU, BIN1, APOE, CDK5, GSK3B. Here are typical results and what they mean:
GO:0042982 · amyloid precursor protein metabolic process · Biological Process
✅ Expected and validating. You specifically curated an APP-related protein set, so this result confirms your input is biologically coherent. The very low q-value and high gene ratio mean this enrichment is robust, not a fringe result.
GO:0043523 · regulation of neuron apoptotic process · Biological Process
✅ Biologically informative. This isn't obvious from looking at the individual proteins; it emerges only when the set is analysed as a whole. Tau, GSK3β, CDK5, APOE, and APP are all connected to neuronal survival decisions. This is the kind of insight enrichment analysis is designed to surface.
GO:0065007 · biological regulation · Biological Process (broad parent)
⚠️ Uninformative broad term. Almost any gene list will enrich "biological regulation". Metascape's term clustering typically collapses these. In g:Profiler, this would appear but should be deprioritised in your interpretation. Don't include broad parent terms like this in your paper's discussion of findings.
GO:0006979 · response to oxidative stress · Biological Process
🔍 Potentially interesting, warrants scrutiny. Only 3 genes drive this enrichment. Check which genes: CLU, APOE, and APP all have literature links to oxidative stress, so this is plausible. But the small overlap means this result is less robust. Report it as a trend, not a firm conclusion, and note the contributing genes explicitly.
Common misinterpretation to avoid
Enrichment analysis tells you what annotations are over-represented, not what the proteins necessarily do in your specific biological context. A set of synaptic proteins enriched for "positive regulation of cell proliferation" doesn't mean these proteins are proliferating — it means they share annotation overlap with proliferation-related GO terms, possibly because they regulate signalling pathways active in both contexts. Always interpret enrichment results in the context of your specific biological question and the existing literature.
Enrichment analysis is hypothesis-generating, not hypothesis-confirming. Finding that your AD protein set is enriched for "tau protein binding" (GO:0048156) is an interesting observation that directs further experimental investigation — it doesn't prove that all your proteins physically bind tau in vivo. The interpretation should be: "These results suggest that [processes] may be relevant, which warrants experimental validation." This framing is both accurate and acceptable to reviewers.
The hypergeometric test's p-value depends critically on the background set (N in the hypergeometric test). If your input genes came from an RNA-seq experiment on neurons, the appropriate background is "all genes expressed in neurons", not "all human genes". Using the wrong background inflates or deflates enrichment statistics. g:Profiler allows custom background upload. Always state the background used in your Methods section. This is one of the most common methodological flaws flagged by reviewers of bioinformatics studies.
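The effect of the background can be seen by running the same test against two backgrounds. The sketch below uses synthetic gene IDs and invented set sizes. `enrichment_p` is a hypothetical helper that restricts both the query and the term to the chosen background before testing.

```python
from math import comb


def hypergeom_tail(N, K, n, k):
    """P(X >= k) for a hypergeometric draw of n from N with K annotated."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(n, K) + 1)) / comb(N, n)


def enrichment_p(query, term_genes, background):
    """Restrict query and term to the background, then test the overlap."""
    query = query & background
    term_genes = term_genes & background
    return hypergeom_tail(len(background), len(term_genes),
                          len(query), len(query & term_genes))


# Synthetic setup (all sizes invented): 20,000 genes in the genome,
# 12,000 of them expressed in neurons.
genome = {f"g{i}" for i in range(20000)}
neuron_expressed = {f"g{i}" for i in range(12000)}

# A 300-gene term, 280 of whose genes are neuron-expressed.
term = {f"g{i}" for i in range(280)} | {f"g{i}" for i in range(12000, 12020)}

# A 200-gene query drawn entirely from neuron-expressed genes, 10 in the term.
query = {f"g{i}" for i in range(10)} | {f"g{i}" for i in range(1000, 1190)}

p_genome = enrichment_p(query, term, genome)
p_neuron = enrichment_p(query, term, neuron_expressed)
print(f"whole-genome background:     p = {p_genome:.2e}")
print(f"neuron-expressed background: p = {p_neuron:.2e}")
```

With the whole genome as background the expected overlap is 3 genes; with the neuron-expressed background it is about 4.7, so the same 10-gene overlap is less surprising and the p-value is larger. The whole-genome result overstates the enrichment in this scenario.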