Chapter 8: AlphaFold & PPI

// 8.1

What AlphaFold actually is

AlphaFold is a deep learning system developed by Google DeepMind that predicts the 3D structure of a protein from its amino acid sequence alone. Before AlphaFold 2 (released 2021), obtaining a reliable structure usually meant determining it experimentally by X-ray crystallography, cryo-electron microscopy, or NMR spectroscopy, all of which are expensive, slow (often months to years per protein), and not always successful. AlphaFold changed this dramatically: in 2022, DeepMind and EMBL-EBI released predicted structures for nearly every protein sequence catalogued in UniProt (~200 million structures in the AlphaFold Protein Structure Database), free to access.

The core output of an AlphaFold prediction is a coordinate file (PDB or mmCIF format) describing where every atom in the protein is located, along with a per-residue confidence score called pLDDT (predicted Local Distance Difference Test, 0–100). Regions with pLDDT > 70 are considered reliably modelled and > 90 is very high confidence; values of 50–70 are low confidence, and < 50 often indicates an intrinsically disordered region where the predicted coordinates should not be interpreted.
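As a rough illustration of how you might read these scores yourself: AlphaFold PDB files store pLDDT in the B-factor field of each atom record, so a few lines of Python can pull out one value per residue. The sketch below is a minimal example, not a full PDB parser. The function names and the band labels are assumptions made for illustration, and `toy_ca_record` only exists to build toy input.

```python
def plddt_per_residue(pdb_text):
    """Map residue number -> pLDDT, read from the B-factor field of CA atoms."""
    scores = {}
    for line in pdb_text.splitlines():
        # PDB is fixed-width: atom name in columns 13-16, residue number in
        # columns 23-26, B-factor (pLDDT in AlphaFold models) in columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Bin a pLDDT value using the cut-offs quoted in the text (90, 70, 50).

    The band labels themselves are illustrative, not official terminology.
    """
    if plddt > 90:
        return "high"
    if plddt > 70:
        return "reliable"
    if plddt >= 50:
        return "low"
    return "possibly disordered"


def toy_ca_record(res_num, plddt):
    """Build one fixed-width CA ATOM record for a glycine (toy data only)."""
    return (f"ATOM  {res_num:>5}  CA  GLY A{res_num:>4}    "
            f"{0.0:>8.3f}{0.0:>8.3f}{0.0:>8.3f}{1.0:>6.2f}{plddt:>6.2f}")
```

In practice you would pass the text of a downloaded AlphaFold PDB file to `plddt_per_residue` and then tally `confidence_band` over the values to see how much of the protein is interpretable.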

🔑

The key distinction: AlphaFold predicts structure, not function or interaction

AlphaFold tells you the shape of a single protein. It does not directly tell you which other proteins it binds to, where it is in the cell, or what it does. That is still the job of STRING, experimental assays, and network analysis. The integration between AlphaFold and PPI work is powerful but indirect — structure informs interaction confidence, interaction interface prediction, and drug target tractability. Understanding this boundary is the whole point of this chapter.

AlphaFold 2 vs AlphaFold 3 vs AlphaFold-Multimer

AlphaFold 2 (2021) modelled single protein chains. AlphaFold-Multimer extended this to complexes: it takes two or more sequences and predicts the structure of the complex, reporting an ipTM score (interface predicted TM-score, 0–1) as a measure of confidence in the predicted interface. An ipTM above roughly 0.75 is commonly treated as indicating a reliable predicted complex, although the exact cut-off varies between studies. AlphaFold 3 (2024) expanded further to model complexes involving DNA, RNA, small molecules, and post-translational modifications. For PPI work in 2024–2025, AlphaFold-Multimer is the most directly relevant tool.

// 8.2

Where AlphaFold fits in your PPI workflow

Think of the standard PPI workflow (GWAS → STRING network → enrichment → hub identification) as working at the systems level: which proteins interact with which, how they cluster into functional modules, which ones are hubs. AlphaFold operates at the molecular level: given that protein A and protein B interact, what does that interaction actually look like in 3D?

These two levels complement each other, and most sophisticated PPI studies now use both. Here is where AlphaFold becomes useful after your STRING/network analysis:

1

Validate that a predicted interaction is structurally plausible

STRING reports that proteins A and B interact with confidence 0.75, but is that interaction physically possible? You can run AlphaFold-Multimer on the A+B sequence pair and check whether it predicts a stable complex (ipTM > 0.75 and low PAE, the Predicted Aligned Error, at the interface). If yes, this is supporting structural evidence for the STRING edge. If AlphaFold-Multimer predicts no stable interface, the STRING interaction may be indirect (the proteins participate in the same pathway but don't physically touch) or a false positive, although a failed prediction on its own does not rule out a real interaction. This check is increasingly common in papers claiming a novel PPI.
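The check described above can be sketched in code. The example assumes you already have the PAE matrix as an N×N nested list with all residues of protein A first, followed by protein B, plus the ipTM value for the same model. The 10 Å PAE cut-off is an assumption chosen for illustration (the text only asks for "low" interface PAE), and the function names are hypothetical.

```python
def mean_interface_pae(pae, len_a):
    """Average PAE over the two off-diagonal (A-vs-B and B-vs-A) blocks."""
    n = len(pae)
    values = [
        pae[i][j]
        for i in range(n)
        for j in range(n)
        # Keep only pairs where exactly one residue belongs to chain A.
        if (i < len_a) != (j < len_a)
    ]
    if not values:
        raise ValueError("need at least one residue in each chain")
    return sum(values) / len(values)


def supports_edge(iptm, interface_pae, iptm_cutoff=0.75, pae_cutoff=10.0):
    """True when both scores pass; pae_cutoff (in angstroms) is an assumed value."""
    return iptm > iptm_cutoff and interface_pae < pae_cutoff
```

Averaging only the inter-chain blocks matters because the within-chain blocks are usually confident even when the two proteins are placed arbitrarily relative to each other.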

2

Characterise the interaction interface of a hub protein

Your hub protein (say, DRD2 from the schizophrenia example) has high betweenness centrality — it connects multiple network modules. AlphaFold-Multimer can be used to model DRD2 in complex with each of its top interaction partners. This reveals whether DRD2 uses the same binding surface for all its interactions (making them competitive — a drug blocking one interaction may block all) or different surfaces (allowing selective modulation). This is directly actionable for drug development.
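One simple way to compare binding surfaces across partners is sketched below. It assumes you have extracted C-alpha coordinates for the hub and for each partner from the predicted complexes, as dictionaries mapping residue number to an (x, y, z) tuple. The 8 Å C-alpha contact cut-off and the use of Jaccard overlap are assumptions for illustration, not a standard from the text.

```python
import math


def interface_residues(hub_ca, partner_ca, cutoff=8.0):
    """Hub residues whose CA atom lies within `cutoff` angstroms of any partner CA."""
    return {
        res
        for res, xyz in hub_ca.items()
        if any(math.dist(xyz, other) <= cutoff for other in partner_ca.values())
    }


def jaccard(a, b):
    """Overlap of two residue sets: 1.0 = same surface, 0.0 = disjoint surfaces."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

A high overlap between the hub's interface with partner X and its interface with partner Y suggests competitive binding at a shared surface, while an overlap near zero suggests distinct surfaces that might be modulated selectively.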

3

Assess druggability of hub proteins

A protein is "druggable" if it has a binding pocket — a cavity on its surface where a small molecule can fit and modulate activity. AlphaFold structure predictions, visualised in tools like Mol* or PyMOL, allow you to assess pocket geometry. Tools like FPocket, SiteMap, or DoGSiteScorer can take an AlphaFold PDB file and computationally identify putative binding pockets, with estimates of druggability. A hub protein with no predictable binding pocket is a poor drug target regardless of its network centrality.

4

Recognise disordered hub proteins: an important caveat

Many hub proteins in PPI networks are intrinsically disordered proteins (IDPs) — they lack a fixed 3D structure, which is partly why they are hubs (disorder enables binding to many partners with low specificity). AlphaFold will return a low-confidence structure for these regions (pLDDT < 50). If your hub comes back with large disordered regions, this is biologically informative: it suggests the protein likely operates via short linear motifs (SLiMs) that fold upon binding a partner, which is a different drugging challenge than a globular enzyme. Don't misinterpret low pLDDT as a poor prediction — it may accurately reflect that the protein is genuinely disordered.
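To make "large disordered regions" concrete, you could scan the per-residue pLDDT values for long contiguous stretches below 50. The sketch below does that for a plain list of scores (residue 1 first). The minimum segment length of 10 residues is an assumption for illustration; the text does not specify one.

```python
def low_confidence_segments(plddt, cutoff=50.0, min_len=10):
    """Return (start, end) residue ranges, 1-based inclusive, with pLDDT < cutoff."""
    segments = []
    start = None
    for pos, score in enumerate(plddt, start=1):
        if score < cutoff:
            if start is None:
                start = pos              # a new low-confidence run begins here
        elif start is not None:
            if pos - start >= min_len:   # the run ended at pos - 1
                segments.append((start, pos - 1))
            start = None
    # Close a run that extends to the final residue.
    if start is not None and len(plddt) - start + 1 >= min_len:
        segments.append((start, len(plddt)))
    return segments
```

Long segments returned by this function are the regions to check against disorder predictors or known short linear motifs before treating the hub as a conventional globular target.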

5

Prioritise novel interaction predictions

After network analysis, you may identify two proteins that co-cluster in the same module, are both implicated by GWAS, but have no documented direct interaction in STRING (they're in the same module via shared neighbours). AlphaFold-Multimer can be run on this protein pair as a computational screen for a novel PPI. A high ipTM score becomes a testable hypothesis for a co-immunoprecipitation or proximity ligation assay in the lab. This is an increasingly common use case in computational neuroscience.
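Selecting the pairs to screen can be done directly from your network tables. The sketch below assumes module membership is a dictionary of module name to a set of protein names, and that direct STRING edges are available as a list of pairs. The data structures and function name are hypothetical; it simply lists same-module pairs with no direct edge.

```python
from itertools import combinations


def novel_pair_candidates(modules, edges):
    """List (module, protein_a, protein_b) for same-module pairs lacking a direct edge."""
    known = {frozenset(edge) for edge in edges}   # treat edges as undirected
    candidates = []
    for module, members in modules.items():
        for a, b in combinations(sorted(members), 2):
            if frozenset((a, b)) not in known:
                candidates.append((module, a, b))
    return candidates
```

Each returned pair is a candidate input for an AlphaFold-Multimer run. Because the number of pairs grows quadratically with module size, you would normally restrict the members to GWAS-implicated proteins first.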

// 8.3

Practical tools: where to actually run this

You don't need to install anything locally to use AlphaFold. Here are the main resources:

AlphaFold Protein Structure Database (alphafold.ebi.ac.uk)

Pre-computed AlphaFold 2 structures for ~200 million proteins. Search by protein name, UniProt ID, or gene name. Download the PDB file or view the structure interactively in the browser with the built-in Mol* viewer. For any well-annotated human protein, start here: the structure is almost certainly already computed. No coding required. Each structure page also shows pLDDT per residue, so check this before drawing conclusions about a region's structure.
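Although no coding is required, scripting the download is convenient when you have many hubs. The sketch below assumes the database's public prediction endpoint at `/api/prediction/<UniProt accession>` returns a JSON list whose entries carry a `pdbUrl` field. Treat both the URL pattern and the field name as assumptions to verify against the current API documentation. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumed endpoint pattern; confirm against the AlphaFold DB API documentation.
API_TEMPLATE = "https://alphafold.ebi.ac.uk/api/prediction/{accession}"


def prediction_api_url(accession):
    """Build the metadata URL for one UniProt accession."""
    return API_TEMPLATE.format(accession=accession.strip().upper())


def pdb_url_from_response(entries):
    """Pick the PDB download link from the parsed JSON list (assumed 'pdbUrl' key)."""
    if not entries:
        raise LookupError("no AlphaFold entry returned for this accession")
    return entries[0]["pdbUrl"]


def fetch_pdb_text(accession, timeout=30):
    """Download the predicted structure as PDB text (requires network access)."""
    with urlopen(prediction_api_url(accession), timeout=timeout) as response:
        entries = json.load(response)
    with urlopen(pdb_url_from_response(entries), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Looking the file link up through the metadata response, rather than hard-coding a file name, avoids depending on the model version number embedded in download URLs.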

ColabFold

The easiest way to run AlphaFold-Multimer for a protein complex: no local compute needed, because it runs on Google Colab (free GPU time). Input: two (or more) protein sequences, separated by a colon in the sequence field. Output: predicted complex structure + ipTM score + PAE (Predicted Aligned Error) plot. ipTM > 0.75 and low PAE at the interface = confident predicted complex. This is the tool to use to test whether two proteins from your network module are predicted to directly interact.
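Preparing the colon-separated input by hand is error-prone for long sequences, so a small helper can do it. This is a minimal sketch: the function name is hypothetical, and restricting input to the 20 standard amino acid letters is an assumption made to catch pasted FASTA headers or stray characters early.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def colabfold_query(*sequences):
    """Join two or more protein sequences with ':' for the sequence field."""
    if len(sequences) < 2:
        raise ValueError("a complex prediction needs at least two sequences")
    cleaned = []
    for seq in sequences:
        # Drop whitespace and line breaks left over from copy-pasting.
        residues = "".join(seq.split()).upper()
        unexpected = set(residues) - STANDARD_AMINO_ACIDS
        if not residues or unexpected:
            raise ValueError(f"unexpected characters: {sorted(unexpected)}")
        cleaned.append(residues)
    return ":".join(cleaned)
```

To model only one domain of a large protein, slice the sequence before passing it in, for example `full_sequence[29:400]` for residues 30 to 400.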

Mol* Viewer

Browser-based 3D protein structure viewer: no download, no installation. Drag and drop a PDB file (from the AlphaFold database) and visualise the protein in 3D. Colour by pLDDT (confidence) to immediately see which regions are reliably modelled. Identify surface pockets visually. For a beginner, this is the most accessible way to get a feel for what a protein structure looks like and where interactions might occur.

FPocket

Takes a PDB file and identifies candidate binding pockets using Voronoi tessellation. Scores each pocket for druggability (volume, polarity, accessibility). Run it on your hub protein's AlphaFold structure to assess whether it has a tractable binding site for small-molecule drug development. A druggability score > 0.5 is generally considered promising. Free and widely used in academic drug discovery.
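If you run the tool locally, the per-pocket scores can be filtered with a short script. The sketch below assumes the summary text file lists each pocket as a `Pocket N :` header followed by indented `Field name : value` lines, including one called `Druggability Score`. That layout is an assumption to check against your own output, and the function names are hypothetical.

```python
import re

HEADER = re.compile(r"\s*Pocket\s+(\d+)\s*:")
FIELD = re.compile(r"\s*([^:]+?)\s*:\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_pocket_summary(text):
    """Parse the summary text into {pocket_number: {field_name: value}}."""
    pockets = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = int(header.group(1))
            pockets[current] = {}
            continue
        field = FIELD.match(line)
        if field and current is not None:
            pockets[current][field.group(1)] = float(field.group(2))
    return pockets


def druggable_pockets(pockets, cutoff=0.5):
    """Pockets scoring above the cut-off quoted in the text, best first."""
    hits = [
        (number, fields["Druggability Score"])
        for number, fields in pockets.items()
        if fields.get("Druggability Score", 0.0) > cutoff
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

An empty result from `druggable_pockets` is the scripted equivalent of "no predictable binding pocket", which is worth confirming visually before discarding a hub.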

// 8.4

What AlphaFold does not do — important limitations

AlphaFold is genuinely transformative, but its limitations are frequently misunderstood. Being clear about these is important for interpreting results and impressing reviewers:

Question	AlphaFold can help?	Notes
What does this protein look like in 3D?	✓ Yes	Core use case; check pLDDT for confidence
Do proteins A and B directly interact?	△ Partially	AlphaFold-Multimer can predict this, but ipTM > 0.75 is supportive not definitive — needs experimental validation
How strong is the A–B interaction (affinity)?	✗ No	AlphaFold does not predict binding affinity (Kd). Separate tools (e.g. FoldX, Rosetta) needed, with significant uncertainty
Does the interaction happen in vivo in human brain cells?	✗ No	Structural plausibility ≠ physiological occurrence. Cell-type context, expression levels, post-translational modifications matter
What happens to the structure when this protein is mutated?	△ Partially	You can run AlphaFold on the mutant sequence, but AlphaFold is not optimised for predicting subtle mutant effects — FoldX or RosettaDDG are better
Which proteins interact with my hub in a network sense?	✗ No	This is STRING's job, not AlphaFold's. AlphaFold operates on sequences you give it — it doesn't search for interaction partners
What does a dynamic/disordered protein look like?	△ Limited	Low pLDDT regions often correspond to disorder. AlphaFold may reflect genuine disorder, but it cannot capture conformational dynamics (use MD simulations for that)

// 8.5

An integrated example: SCZ hub → AlphaFold → drug pocket

Putting it all together using the schizophrenia example from Chapter 5. Your STRING network identified GRIN2B (GluN2B, an NMDA receptor subunit) as a high-degree hub in the glutamate module, with betweenness centrality placing it as one of the top-5 hubs across the whole SCZ GWAS network. How does AlphaFold fit in here?

1

Retrieve the AlphaFold structure of GRIN2B

Go to alphafold.ebi.ac.uk → search "GRIN2B" → select human (UniProt Q13224). The structure viewer opens showing the 1,484 amino acid protein. Note immediately: the large N-terminal domain (ATD, amino acids ~30–400) and the transmembrane domain (~550–800) show high pLDDT (>85), so they are well-folded and reliable. The intracellular C-terminal tail (~900–1484) shows pLDDT < 50 and is intrinsically disordered. This C-terminal tail is where most of GRIN2B's PPI partners engage, either directly (PSD-95, CaMKII) or through the scaffold complexes those partners assemble (SHANK). This is structurally why GRIN2B is such a hub: a long disordered tail enables promiscuous binding.
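The per-domain reading described above can be reproduced numerically by averaging pLDDT over each residue range. The sketch below assumes you already have a dictionary of residue number to pLDDT for the protein. The domain boundaries you pass in are the approximate ones quoted in the text, and the function name is hypothetical.

```python
def domain_mean_plddt(plddt_by_residue, domains):
    """Average pLDDT per named domain; domains maps name -> (start, end), inclusive."""
    summary = {}
    for name, (start, end) in domains.items():
        values = [
            score
            for residue, score in plddt_by_residue.items()
            if start <= residue <= end
        ]
        # None marks a range with no residues in the input, instead of dividing by zero.
        summary[name] = sum(values) / len(values) if values else None
    return summary
```

For this example you might call it with `{"ATD": (30, 400), "TMD": (550, 800), "CTD": (900, 1484)}` and expect the first two averages to be high and the last to be low, matching what the viewer shows.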

2

Run AlphaFold-Multimer: GRIN2B ATD + GluN1 ATD (via ColabFold)

The ATD (amino-terminal domain) of GRIN2B contains the binding site for ifenprodil, a GluN2B-selective NMDA receptor antagonist that was trialled for SCZ and neuropathic pain. Run AlphaFold-Multimer (ColabFold) on the GRIN2B ATD sequence + the GluN1 ATD sequence (GRIN1, GRIN2B's obligatory heterodimerisation partner). The predicted complex shows the two ATDs forming the characteristic "clam-shell" dimer, consistent with solved crystal structures, which validates the approach. AlphaFold-Multimer models only the protein chains, so ifenprodil itself does not appear in the prediction; what you can see is the cleft it occupies at the dimer interface.

3

Run FPocket on the GRIN2B ATD structure

Upload the GRIN2B ATD structure to FPocket (or an equivalent web server). Because the ifenprodil cleft sits at the GluN1/GluN2B ATD dimer interface, the predicted heterodimer from the previous step is the better input; the pocket is only partly formed in the GRIN2B monomer alone. In this example, FPocket identifies 3 candidate pockets, and the top pocket (druggability score ~0.71) corresponds to the ifenprodil binding cleft at the ATD dimer interface. This is supporting structural evidence that GRIN2B is not just a network hub, but is structurally druggable at a known, validated site. This kind of structure-informed druggability assessment is increasingly expected in computational drug discovery papers.

4

What you'd write in a paper

"Network analysis of SCZ GWAS-implicated proteins identified GRIN2B as a high-centrality hub (betweenness rank: top 5) in the glutamate receptor signalling module. Structural assessment of GRIN2B using the AlphaFold v2 predicted structure (pLDDT > 85 for the ATD; UniProt Q13224) and pocket detection (FPocket v3.0; top pocket druggability score = 0.71) confirmed structural druggability at the ATD dimer interface, consistent with the known ifenprodil binding site (Bhatt et al., 2017). These converging lines of computational evidence support GRIN2B as a high-priority drug target candidate for SCZ pharmacotherapy."

// 8.6

Summary: AlphaFold's role in the PPI toolkit

AlphaFold does not replace STRING, Cytoscape, g:Profiler, or Metascape — it operates at a different level of analysis and the tools are complementary, not competing. The sensible workflow is:

Work at the systems level first (STRING → network → MCODE → enrichment → hub identification) to determine which proteins matter and how they relate to each other. Then move to the molecular level (AlphaFold → structure review → AlphaFold-Multimer for complex prediction → FPocket for druggability) to understand what the interactions look like and whether they're actionable. AlphaFold gives your hub protein candidates structural credibility and drug development context. Without the network analysis upstream, you wouldn't know which of the ~20,000 human proteins to bother running AlphaFold on.

💡

The one-sentence version for your professor

AlphaFold tells you the shape of each protein your network analysis has flagged as important — this confirms whether a predicted interaction is structurally plausible, identifies where on the protein drugs could bind, and can computationally test novel interaction hypotheses that the network suggests but databases haven't confirmed yet.