Chapter 1: PPI Fundamentals

// section 1.1

What exactly is a protein–protein interaction?

A protein–protein interaction (PPI) is a physical contact between two or more protein molecules that leads to a biological outcome — a change in the activity, localisation, stability, or signalling behaviour of one or both proteins. This contact is mediated by specific structural surfaces on each protein, often referred to as the binding interface.

The key word here is specific. Proteins do not stick to everything they encounter. Their surfaces have evolved complementary shapes and charge distributions so that only the right partners fit, much as a lock accepts only matching keys. This specificity is what makes PPIs information-carrying: the fact that protein A binds protein B, but not protein C, is itself a meaningful biological signal.

💡

Why "interaction" and not just "binding"?

Binding implies a static lock-and-key relationship. "Interaction" better captures the dynamic reality: many PPIs are fleeting (transient), regulated, and occur only in specific cellular contexts — for example, only after a protein has been phosphorylated, or only in one cellular compartment.

The three main interaction types

Not all PPIs are the same. Understanding the distinctions matters for interpreting what databases like STRING report and what experimental methods can detect:

Direct (binary) interaction

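Two proteins physically contact each other. Detected by yeast two-hybrid (Y2H) or Förster resonance energy transfer (FRET, also called fluorescence resonance energy transfer); co-immunoprecipitation is commonly used as supporting evidence, but on its own it cannot exclude a bridging third protein. STRING reports such evidence under the "experimentally determined" evidence channel. Example: SNCA has been reported to bind PARK7 (DJ-1) directly.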

Co-complex (indirect) interaction

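Two proteins are found in the same macromolecular complex but may not directly touch. Detected by affinity purification coupled to mass spectrometry (AP-MS). Example: purifying one γ-secretase subunit co-purifies the others (presenilin, nicastrin, APH-1, and PEN-2), but the pull-down alone does not show which subunits contact each other directly.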

Transient vs. stable interactions

Stable interactions persist (e.g. ribosome subunits). Transient interactions are brief, often regulated by post-translational modifications: a kinase briefly docks onto its substrate, phosphorylates it, then dissociates. Most signalling PPIs are transient — making them harder to detect experimentally but functionally critical.
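The difference between direct and co-complex evidence can be made concrete with a toy example. The sketch below assumes a hypothetical four-subunit complex (the names P1 to P4 and the contact list are invented for illustration) and compares the pairs implied by shared complex membership with the pairs that actually touch.

```python
from itertools import combinations

# Hypothetical complex: subunit names and contacts are invented for illustration.
complex_members = ["P1", "P2", "P3", "P4"]
direct_contacts = {frozenset(p) for p in [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]}

# Co-complex evidence (e.g. AP-MS) supports every pair of members,
# whether or not the two proteins physically touch.
co_complex_pairs = {frozenset(p) for p in combinations(complex_members, 2)}

# Pairs that share the complex without a direct contact.
indirect_only = co_complex_pairs - direct_contacts

print(len(co_complex_pairs), "co-complex pairs")
print(len(direct_contacts), "direct pairs")
print(sorted(tuple(sorted(p)) for p in indirect_only))
```

Even in this small case, half of the co-complex pairs are not direct contacts, which is why the evidence type behind an edge matters when reading a database entry.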

// section 1.2

From interactions to networks

If you study just one PPI in isolation, you get limited information. The real power comes from studying all the interactions together, as a network (also called a protein interaction network or "interactome graph"). In network terms, proteins are nodes and their interactions are edges.

A network reveals things that single interactions cannot: which proteins are most connected (hubs), which proteins bridge different functional groups (bottlenecks), and how information flows from a receptor at the cell surface to a transcription factor in the nucleus.
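As a minimal sketch of the nodes-and-edges idea, the snippet below stores a network as an adjacency map and counts partners per protein. The edge list uses placeholder names (A to E), not curated interactions, and `build_network` is a helper name chosen here for illustration.

```python
from collections import defaultdict

# Toy edge list; the pairs are placeholders, not curated interactions.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]

def build_network(edge_list):
    """Return an undirected adjacency map: protein -> set of partners."""
    adjacency = defaultdict(set)
    for u, v in edge_list:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)

network = build_network(edges)

# Degree = number of distinct interaction partners.
degree = {protein: len(partners) for protein, partners in network.items()}
most_connected = max(degree, key=degree.get)

print(degree)
print("most connected:", most_connected)
```

Storing partners as sets keeps the network undirected and ignores duplicate edges, which is usually what you want for a first look at PPI data.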

[Interactive figure: PPI network. Hover over nodes to explore; drag nodes to rearrange.]

🔑

Reading the network above

Larger nodes = more interaction partners (higher degree). Brighter edges = higher confidence interactions (more experimental evidence). Clusters = proteins that tend to co-function. Try dragging a hub node — notice how much of the network moves with it.

What is a "scale-free" network, and is the interactome one?

In a random network, most nodes have close to the average number of connections. Biological PPI networks follow a very different pattern, often described as a scale-free (power-law) degree distribution, in which the fraction of proteins with k partners falls off roughly as k^(-γ): most proteins have very few interaction partners, while a small number of hub proteins have a great many. This creates a characteristic "long tail" when you plot it:

[Figure: Typical PPI network degree distribution]

This shape has profound consequences. Because hubs are so highly connected, removing them (through mutation or drug treatment) can collapse large sections of the network. This is one likely reason why many disease-causing mutations hit hub proteins, and why bioinformaticians specifically look for hubs when analysing a PPI network. It may also help explain why some proteins (like TP53, MAPT, or PSD-95) appear in so many different diseases.
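To see how a degree distribution is tabulated, here is a small sketch. It assumes the network is held as an adjacency map (protein to set of partners); the toy network and the function name `degree_distribution` are invented for illustration, and a network this small can only hint at the long-tailed shape, not demonstrate a real power law.

```python
from collections import Counter

def degree_distribution(adjacency):
    """Map each degree k to the fraction of nodes with exactly k partners."""
    counts = Counter(len(partners) for partners in adjacency.values())
    n = len(adjacency)
    return {k: counts[k] / n for k in sorted(counts)}

# Toy network: one hub ("H") linked to eight proteins, plus one extra edge.
toy = {"H": {f"p{i}" for i in range(8)}}
for i in range(8):
    toy[f"p{i}"] = {"H"}
toy["p0"].add("p1")
toy["p1"].add("p0")

dist = degree_distribution(toy)

# Crude text histogram: most nodes sit at low degree, one node far out in the tail.
for k, frac in dist.items():
    print(f"k={k}: {frac:.2f} " + "#" * round(frac * 20))
```

On real data you would normally plot these fractions on log-log axes, where a power law appears as an approximately straight line.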

⚠️

Important caveat on "scale-free"

The scale-free model is a useful approximation, but has been debated in recent literature (Broido & Clauset, 2019). Many observed high-degree hubs may partly reflect study bias — well-studied proteins attract more interaction screens. Keep this in mind when interpreting hub analyses from databases like STRING, which aggregate published literature.

// section 1.3

Hub proteins: the airports of the interactome

A hub protein is usually defined as a node whose degree (number of interaction partners) is well above the network average; the exact cutoff varies between studies. But beyond the definition, hub proteins represent something biologically important: they are physical meeting points for multiple signalling pathways.
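One way to turn "well above average" into a rule is a z-score cutoff on degree. The sketch below is an assumption-laden illustration: the cutoff of two standard deviations, the function name `find_hubs`, and the toy degrees are all chosen here, and other analyses use a fixed degree threshold or the top few percent of nodes instead.

```python
from statistics import mean, pstdev

def find_hubs(degree, z_cutoff=2.0):
    """Return proteins whose degree exceeds mean + z_cutoff * SD.

    The z-score cutoff is an illustrative assumption, not a standard.
    """
    values = list(degree.values())
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return []  # every node has the same degree, so there are no hubs
    return sorted(p for p, k in degree.items() if (k - mu) / sd > z_cutoff)

# Toy degrees (placeholders, not taken from any database).
toy_degree = {
    "HUB1": 12, "a": 1, "b": 2, "c": 1, "d": 3,
    "e": 2, "f": 1, "g": 2, "h": 1, "i": 2,
}

hubs = find_hubs(toy_degree)
print(hubs)
```

Because degree distributions in PPI networks are heavily skewed, a mean-and-SD rule is only a rough heuristic; a percentile cutoff is often more robust.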

Date hubs

Interact with many partners, but mostly one at a time. Their binding interfaces are context-dependent — they adopt different conformations depending on the current partner. Common in regulatory signalling (e.g. kinase substrates). Also called "sequential" hubs.

Party hubs

Interact with many partners simultaneously, often as part of a large stable complex. Tend to be highly expressed and broadly present across tissues. Disrupting them typically has severe, pleiotropic effects. Examples include ribosomal proteins and core chaperones like HSP90.

In neuroscience, several hub proteins come up repeatedly in PPI analyses. Understanding why requires knowing a bit about their biology:

APP (Amyloid Precursor Protein) — the Alzheimer's hub

APP is a transmembrane protein whose sequential cleavage by BACE1 (β-secretase) and the γ-secretase complex (PSEN1 or PSEN2 together with nicastrin, APH-1, and PEN-2) produces amyloid-β peptides. In STRING analysis, APP consistently appears as a high-degree hub because it physically interacts with the secretases that process it, with adaptor proteins like FE65, and with extracellular matrix components. The network around APP has been instrumental in identifying potential AD drug targets: direct interactors of APP are natural starting candidates for modulating Aβ production, although each needs experimental validation.

📄

In practice

When you query APP in STRING with a confidence threshold of 0.7 and limit the result to 20 interactors, you will typically recover AD-relevant partners such as PSEN1, BACE1, APOE, CLU, and BIN1; the exact list can shift between STRING releases. This is the kind of result you'll see in the worked examples in Chapter 5.
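The same query can be expressed against STRING's REST API. The sketch below only builds the request URL, so it runs without network access; it assumes the `interaction_partners` endpoint and its `identifiers`, `species`, `required_score`, and `limit` parameters as described in the STRING API documentation, so check the current docs before relying on it. The helper names `partners_url` and `fetch_partners` are invented here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

STRING_API = "https://string-db.org/api"

def partners_url(gene, species=9606, min_score=0.7, limit=20):
    """Build a STRING 'interaction_partners' request URL.

    The API takes the score on a 0-1000 scale, so 0.7 becomes 700.
    """
    params = {
        "identifiers": gene,
        "species": species,  # NCBI taxonomy ID; 9606 is human
        "required_score": round(min_score * 1000),
        "limit": limit,
    }
    return f"{STRING_API}/tsv/interaction_partners?{urlencode(params)}"

def fetch_partners(gene, **kwargs):
    """Download the raw TSV response as text (requires network access)."""
    with urlopen(partners_url(gene, **kwargs), timeout=30) as response:
        return response.read().decode("utf-8")

url = partners_url("APP")
print(url)
```

Calling `fetch_partners("APP")` would return tab-separated text that you can load with the `csv` module; it is left uncalled here so the example stays offline.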

MAPT (Tau) — the cytoskeletal hub

MAPT encodes tau, a microtubule-associated protein that stabilises the neuronal cytoskeleton. In Alzheimer's and other tauopathies, tau becomes abnormally phosphorylated (by CDK5 and GSK3β) and detaches from microtubules to form neurofibrillary tangles. The PPI network around MAPT is therefore highly relevant to disease: CDK5, GSK3β, DYRK1A, and PP2A all interact with tau and modulate its phosphorylation state. GO enrichment of MAPT's first-order interactors consistently returns terms related to "microtubule organisation", "protein phosphorylation", and "axon guidance".

SNCA (α-Synuclein) — the Parkinson's hub

α-Synuclein is a presynaptic protein implicated in dopamine neurotransmission. Its tendency to misfold and aggregate into Lewy bodies is a pathological hallmark of Parkinson's disease. In STRING, SNCA forms a tight cluster with PRKN (Parkin, formerly PARK2), PINK1, and LRRK2, all PD-causative genes that converge on mitochondrial quality control and the ubiquitin-proteasome system. This convergence was revealed partly through PPI network analysis, helping establish that PD may fundamentally be a disease of impaired protein degradation rather than solely a protein aggregation disorder.

PSD-95 (DLG4) — the synaptic scaffold hub

PSD-95 (encoded by DLG4) is a scaffolding protein at the postsynaptic density of excitatory synapses. It contains three PDZ domains, protein interaction modules that bind to the C-terminal tails of NMDA receptors, AMPA receptor trafficking proteins, and numerous signalling enzymes. With over 100 documented interactors, PSD-95 is one of the highest-degree hub proteins in the brain interactome. Mutations affecting its interaction network contribute to autism, schizophrenia, and intellectual disability, which shows how a single hub can be relevant to diverse neuropsychiatric conditions.

// section 1.4

Key network metrics you need to know

When you run a network analysis, whether in Metascape, Cytoscape, or STRING, you'll encounter several quantitative measures. The four most common are summarised below:

Degree

Count of direct interaction partners.

Betweenness centrality

How often a protein lies on shortest paths between others.

Clustering coefficient

How interconnected a protein's neighbours are.

Eigenvector centrality

Influence weighted by the importance of neighbours.
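Degree and clustering coefficient are simple enough to compute by hand. The following sketch uses a toy adjacency map with placeholder protein names; `clustering_coefficient` is a name chosen here, and it implements the standard local definition (connected neighbour pairs divided by all possible neighbour pairs).

```python
from itertools import combinations

def clustering_coefficient(adjacency, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbours; report 0 by convention
    linked = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return linked / (k * (k - 1) / 2)

# Toy network: A, B, C form a triangle; D hangs off A.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

# Columns: protein, degree, clustering coefficient.
for protein in sorted(toy):
    print(protein, len(toy[protein]), round(clustering_coefficient(toy, protein), 2))
```

Here A has the highest degree but a lower clustering coefficient than B or C, because its extra partner D is not connected to the rest of its neighbourhood.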


🔑

Which metric to use?

For identifying disease-relevant proteins: degree finds functional hubs. Betweenness finds cross-pathway bottlenecks, which are often proposed as drug target candidates. Clustering coefficient helps identify protein complexes. Cytoscape's NetworkAnalyzer computes degree, betweenness, and clustering coefficient for your network automatically, and Metascape networks can be exported to Cytoscape for the same analysis.
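The contrast between degree and betweenness is easiest to see on a small example. The sketch below is a plain-Python version of Brandes' algorithm for unweighted, undirected graphs, applied to a toy network in which a low-degree protein "X" (a placeholder name) bridges two clusters. It is meant as an illustration, not a replacement for the implementations in network tools.

```python
from collections import deque

def betweenness(adjacency):
    """Unnormalised betweenness centrality (Brandes' algorithm)
    for an undirected, unweighted graph."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search, counting shortest paths from the source.
        order, preds = [], {v: [] for v in adjacency}
        n_paths = {v: 0 for v in adjacency}
        dist = {v: -1 for v in adjacency}
        n_paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        dependency = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected path was counted once from each end.
    return {v: s / 2 for v, s in score.items()}

# Toy network: two triangles joined through a single bridging protein "X".
toy = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "X"},
    "X": {"C", "D"},
    "D": {"X", "E", "F"}, "E": {"D", "F"}, "F": {"D", "E"},
}

bc = betweenness(toy)
print(sorted(bc.items(), key=lambda kv: -kv[1]))
```

X has only two partners, fewer than C or D, yet it scores highest on betweenness because every path between the two clusters passes through it. That is the bottleneck pattern degree alone would miss.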

// section 1.5

Why PPIs are particularly important in neuroscience

The nervous system is unique among tissues for several reasons that make PPI analysis especially powerful and especially necessary:

🧠 Extreme molecular complexity

Neurons are among the most transcriptionally complex cell types. Proteomic studies of the postsynaptic density alone have identified over 1,000 proteins. Interactions between them cannot be studied one at a time, so network approaches are the only practical way to handle this scale.

⏱️ Rapid, dynamic signalling

Synaptic plasticity, learning, and memory all depend on rapidly shifting PPI landscapes — proteins associate and dissociate on millisecond to minute timescales. Understanding these dynamics requires knowing which interactions are possible, which is where the interactome map comes in.

🔬 Genetic disease convergence

GWAS and sequencing studies have identified hundreds of genes associated with AD, PD, ASD, and schizophrenia. By themselves these gene lists are hard to interpret. PPI network analysis reveals that seemingly unrelated disease genes often encode proteins in the same interaction network — pointing towards shared pathological mechanisms.

💊 Drug target identification

Most current neurological drugs target receptors or enzymes directly. But many researchers now argue that disrupting specific PPIs — rather than blocking whole protein activities — could be more selective. Identifying and targeting PPIs requires knowing the interactome first.
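The convergence idea described above can be sketched in a few lines. Everything in this example is hypothetical: the gene names (G1 to G5), the partner names (N1, N2), and the edges are invented to show the logic. A real analysis would also compare the result against randomised gene sets before claiming convergence.

```python
# Hypothetical gene lists and edges, invented for illustration only.
disease_1 = {"G1", "G2", "G3"}
disease_2 = {"G4", "G5"}
ppi_edges = [
    ("G1", "N1"), ("G2", "N1"), ("G4", "N1"),
    ("G5", "N2"), ("G3", "N2"), ("G2", "G5"),
]

def neighbours(genes, edges):
    """All proteins that interact with at least one gene in the set."""
    found = set()
    for u, v in edges:
        if u in genes:
            found.add(v)
        if v in genes:
            found.add(u)
    return found

# The two lists share no genes...
gene_overlap = disease_1 & disease_2

# ...but their products share interaction partners outside both lists,
shared_partners = (
    neighbours(disease_1, ppi_edges) & neighbours(disease_2, ppi_edges)
) - disease_1 - disease_2

# and some products interact with each other directly.
cross_links = [
    (u, v) for u, v in ppi_edges
    if (u in disease_1 and v in disease_2) or (u in disease_2 and v in disease_1)
]

print("gene overlap:", sorted(gene_overlap))
print("shared partners:", sorted(shared_partners))
print("direct cross-links:", cross_links)
```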

"Mapping the human brain interactome is not merely an academic exercise — it is a prerequisite for understanding how neurological diseases arise and how we might treat them."