Screening pipeline
1. The interaction likelihood for every pair of human proteins (~200 million possible combinations) was determined using a classifier called KIRC, which leverages biological information from omics databases such as DepMap and BioGRID.
2. The 1.6 million pairs with the highest KIRC scores were modeled using AlphaFold-multimer (AF-M, see below), and the confidence of each prediction was evaluated with another classifier (SPOC), which assesses structural and biological features of the prediction (see below).
The screen identified ~16,000 PPIs at high confidence (90% precision), or ~112,000 PPIs at a lower confidence threshold (50% precision).
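As a rough check on the scale of step 1, the number of unordered pairs among n proteins is n(n-1)/2. The sketch below assumes a proteome of about 20,000 proteins and excludes self-pairs; both are assumptions made here to illustrate the ~200 million figure, not details given above. The `top_pairs` helper and the toy scores are hypothetical and only illustrate the idea of keeping the highest-scoring pairs for structure prediction. They are not the KIRC implementation.

```python
import heapq
from itertools import combinations
from math import comb

# Assumed proteome size (not stated in the text above).
N_PROTEINS = 20_000
n_pairs = comb(N_PROTEINS, 2)  # unordered pairs, self-pairs excluded


def top_pairs(proteins, score, k):
    """Return the k unordered pairs with the highest score(a, b).

    `score` is a hypothetical stand-in for a pairwise classifier score.
    """
    return heapq.nlargest(k, combinations(proteins, 2), key=lambda p: score(*p))


# Toy example with made-up scores for three placeholder proteins.
toy_scores = {("A", "B"): 0.9, ("A", "C"): 0.2, ("B", "C"): 0.7}
best = top_pairs(["A", "B", "C"], lambda a, b: toy_scores[(a, b)], k=2)
print(n_pairs, best)
```

Using `heapq.nlargest` avoids sorting every pair when only a small top fraction is kept.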
AlphaFold-Multimer (AF-M)
AF-M is a deep learning algorithm trained to predict the structure of protein complexes. We used the ColabFold implementation of AF-M to predict binary protein-protein interactions (PPIs). Each protein pair was folded with 3 of the 5 uniquely trained AF-M models, with templates enabled. To save computing time, AF-M structures were not relaxed, and protein pairs exceeding ~3600 total residues were excluded.
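The size cutoff can be pictured as a simple pre-filter on the combined sequence length of each pair. This is a minimal sketch under stated assumptions: the function names, the protein identifiers, and the lengths are hypothetical, and the exact cutoff is only given approximately (~3600) in the text.

```python
# Approximate cutoff taken from the text; the exact value used is not stated.
MAX_TOTAL_RESIDUES = 3600


def within_size_limit(len_a, len_b, max_total=MAX_TOTAL_RESIDUES):
    """True if the combined length of the two chains does not exceed the cutoff."""
    return len_a + len_b <= max_total


def filter_pairs(pairs, lengths, max_total=MAX_TOTAL_RESIDUES):
    """Keep only the pairs whose combined sequence length is within the cutoff."""
    return [
        (a, b)
        for a, b in pairs
        if within_size_limit(lengths[a], lengths[b], max_total)
    ]


# Made-up lengths for three placeholder proteins.
lengths = {"P1": 500, "P2": 1200, "P3": 3200}
pairs = [("P1", "P2"), ("P1", "P3"), ("P2", "P3")]
kept = filter_pairs(pairs, lengths)
print(kept)
```

In this toy input only the P1/P2 pair (1700 residues) passes; the other two pairs exceed the cutoff and would be skipped.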
Estimating confidence
To help assess whether an interaction is likely to be true, we trained a classifier called SPOC (Structure Prediction and Omics Classifier; 0-1 scale). Each SPOC score is associated with a False Discovery Rate (FDR, available in the table of hits), which indicates the percentage of interactions at a given SPOC score that are false in proteome-wide screens. When screening a smaller group of proteins that should be enriched for real interactors (e.g. IP-mass spectrometry hits), lower SPOC scores with higher FDRs can be tolerated (see Figure 3F in Schmid and Walter).
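To make the FDR concrete: precision is 1 minus the FDR, so the expected number of true interactions in a hit list is roughly the list size times the precision. The sketch below applies that arithmetic to the approximate hit counts reported for the screen, and shows one way to filter a hit list by an FDR cutoff. The dictionary keys (`pair`, `spoc`, `fdr`), the example rows, and the representation of FDR as a fraction between 0 and 1 are assumptions for illustration; the actual table of hits may use different column names and units.

```python
def expected_true(n_hits, precision):
    """Expected number of true interactions among n_hits at a given precision."""
    return round(n_hits * precision)


def filter_by_fdr(hits, max_fdr):
    """Keep hits whose FDR (as a fraction, 0-1) is at or below max_fdr."""
    return [h for h in hits if h["fdr"] <= max_fdr]


# Approximate hit counts and precisions reported for the screen.
high_conf_true = expected_true(16_000, 0.90)
low_conf_true = expected_true(112_000, 0.50)

# Made-up rows; keys and values are hypothetical.
hits = [
    {"pair": ("P1", "P2"), "spoc": 0.95, "fdr": 0.05},
    {"pair": ("P1", "P3"), "spoc": 0.70, "fdr": 0.40},
    {"pair": ("P2", "P3"), "spoc": 0.50, "fdr": 0.60},
]
strict = filter_by_fdr(hits, max_fdr=0.10)   # proteome-wide style cutoff
relaxed = filter_by_fdr(hits, max_fdr=0.50)  # looser cutoff for an enriched candidate set
print(high_conf_true, low_conf_true, len(strict), len(relaxed))
```

The looser cutoff mirrors the point above: when the candidates are already enriched for real interactors, a higher FDR can be acceptable.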
In addition, we provide standard AF-M confidence metrics (PAE, pLDDT, pDOCKQ), as well as another metric, the average number of AF-M models that agree on an interface prediction ("avg_models").
Limitations
1. A high SPOC score is never definitive evidence of interaction. One instance where high SPOC scores can be especially misleading is among paralogs. For example, SPOC gives similarly high scores to pairs of MCM subunits that interact within the hetero-hexameric MCM2-7 helicase as to pairs that do not. We recommend prioritizing PPIs with the highest SPOC scores, and/or ones that are supported by independent evidence. We also recommend assessing whether a newly predicted interface clashes with other, constitutive interactions made by either protein in the pair.
2. Some proteins are computationally "sticky." A few proteins are predicted to have many partners. While these can be physiological (e.g. PCNA has many interactors), in some cases they appear to interact promiscuously with the bait (e.g. via a coiled-coil).
3. False negatives. Considering the high KIRC and SPOC thresholds applied, and the fact that AlphaFold-Multimer fails to predict 60% of known complexes in our implementation, our pipeline will miss many true interactions. Therefore, the predictome is incomplete, and a low SPOC score does not mean that two proteins do not interact.
4. Residue positioning. The structures in this database have not been relaxed. Therefore, side chain positioning may be non-optimal. We recommend relaxing structures with AMBER before using them to guide mutagenesis studies or other detailed mechanistic analyses.
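One simple way to spot candidates for the "sticky" behavior described in limitation 2 is to count how many predicted partners each protein has in a hit list and flag the ones above a chosen threshold. This is a minimal sketch: the function names, the threshold, and the placeholder partner names (X1, X2, X3) are hypothetical, and the input is assumed to contain each unordered pair only once. A high count alone does not distinguish a physiological hub such as PCNA from a promiscuous predictor, so flagged proteins still need manual inspection.

```python
from collections import Counter


def partner_counts(pairs):
    """Count predicted partners per protein (each unordered pair listed once)."""
    counts = Counter()
    for a, b in pairs:
        counts[a] += 1
        if b != a:  # do not double-count a homodimer
            counts[b] += 1
    return counts


def flag_many_partners(pairs, min_partners):
    """Return proteins with at least min_partners predicted partners, sorted by name."""
    counts = partner_counts(pairs)
    return sorted(p for p, n in counts.items() if n >= min_partners)


# Toy hit list; X1-X3 are placeholder names, not real predictions.
toy_hits = [("PCNA", "X1"), ("PCNA", "X2"), ("PCNA", "X3"), ("X1", "X2")]
flagged = flag_many_partners(toy_hits, min_partners=3)
print(flagged)
```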
Attribution and Code
This website was created by Ernst Schmid in collaboration with Johannes Walter.
For in-house scripts used to analyze AlphaFold-Multimer data, visit our lab's GitHub.
For SPOC and PPIs found on predictomes.org, please cite: Schmid and Walter and Schmid et al.
For AF-M, please cite:
- Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S and Steinegger M. ColabFold: Making protein folding accessible to all. Nature Methods (2022) doi: 10.1038/s41592-022-01488-1
- Evans et al. "Protein complex prediction with AlphaFold-Multimer." bioRxiv (2022) doi: 10.1101/2021.10.04.463034v1
Changelog
| Date | Comment |
|---|---|
| 11-07-2025 | Predictomes.org migrates to AWS, and 1.6M pairs in the human predictome are loaded. |
| 02-25-2025 | The online SPOC analysis tool has now been updated to report SPOC rather than cSPOC scores. |
| 02-17-2025 | SPOC scores have been updated and now reflect the most current version of SPOC. |
| 12-04-2023 | We found an error in our original calculation of pDOCKQ. The calculation has been corrected to match https://doi.org/10.1038/s41467-022-28865-w, and all values on the site have been updated. We sincerely apologize for any inconvenience this may have caused. |
| 09-05-2023 | The site is officially released to the public. |