Methodology
AlphaFold-Multimer (AF-M) uses the same deep learning principles as AlphaFold to predict the structure of protein complexes. We installed the ColabFold version of AF-M locally and rented cloud-based GPUs to predict all binary protein-protein interactions (PPIs) among the core genome maintenance machinery. Each protein pair was folded in three of the five independently trained AF-M models with templates enabled. This pipeline generated an "all by all" matrix of potential PPIs. To save computing time, AF-M structures were not relaxed. Protein pairs that caused AF-M to exceed our available GPU capacity (pairs longer than ~4,000 residues in total) were not folded (white squares in matrix).
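The pair selection described above can be sketched as follows. This is only an illustration, not the pipeline's actual code: the function name `plan_pairs`, the protein names, and the lengths are hypothetical, the cutoff is treated as a hard limit of exactly 4,000 residues, and self-pairs (homodimers) are assumed to be excluded, which the text does not specify.

```python
from itertools import combinations

# Approximate cutoff taken from the text; treating it as an exact limit is an assumption.
MAX_TOTAL_RESIDUES = 4000


def plan_pairs(lengths, max_total=MAX_TOTAL_RESIDUES):
    """Split all unordered protein pairs into those to fold and those to skip.

    `lengths` maps a protein name to its sequence length in residues.
    Self-pairs are not generated (an assumption of this sketch).
    """
    to_fold, skipped = [], []
    for a, b in combinations(sorted(lengths), 2):
        total = lengths[a] + lengths[b]
        if total <= max_total:
            to_fold.append((a, b, total))
        else:
            # Corresponds to a white square in the matrix.
            skipped.append((a, b, total))
    return to_fold, skipped


# Hypothetical proteins and lengths, for illustration only.
lengths = {"PROT_A": 900, "PROT_B": 1500, "PROT_C": 3300}
to_fold, skipped = plan_pairs(lengths)
```

With these made-up lengths, only the PROT_A/PROT_B pair falls under the cutoff; the two pairs involving PROT_C would be left unfolded.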
To help assess whether an interaction is likely to be true, we provide standard AF-M confidence metrics (PAE, pLDDT, pDOCKQ), as well as another metric, the average number of AF-M models that agree on a prediction ("avg model #").
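One possible reading of "avg model #" is sketched below. The definition used here is an assumption, not the site's documented calculation: each model is reduced to a set of predicted residue-residue contacts, and the score is the mean number of models that contain each contact seen in at least one model. The function name and the contact data are hypothetical.

```python
def avg_models_per_contact(contacts_by_model):
    """Mean number of models that predict each contact found in any model.

    `contacts_by_model` is a list with one collection of contacts per model;
    each contact is a (residue_in_chain_A, residue_in_chain_B) tuple.
    """
    counts = {}
    for contacts in contacts_by_model:
        # set() guards against a contact being listed twice within one model.
        for contact in set(contacts):
            counts[contact] = counts.get(contact, 0) + 1
    if not counts:
        return 0.0
    return sum(counts.values()) / len(counts)


# Hypothetical contacts from three models of the same protein pair.
models = [
    {(10, 200), (11, 201)},
    {(10, 200), (11, 201), (50, 300)},
    {(10, 200)},
]
score = avg_models_per_contact(models)
```

Under this reading, a score close to the number of models run means the models largely agree on the same interface, while a score near 1 means each contact is supported by a single model.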
We are not uploading datasets that were not generated in-house, but we welcome suggestions for new proteins to fold.
For more detailed information about some of the in-house scripts we use to analyze AF-M data, please visit our lab's GitHub.
Limitations
Predicting interactions on a large scale inevitably yields 'false positives' and 'false negatives.' AF-M clearly generates false negatives because it fails to correctly predict some binary protein complexes reported in the PDB, particularly those with small interaction surfaces. On the other hand, false positives are suggested by the fact that folding proteins that reside in different cellular compartments yields many complexes with strong confidence metrics (data not shown). False positives are also frequently observed among paralogs (e.g. MCM2 and MCM7) that do not normally interact.
Until errors in structure prediction are reduced or better understood, we recommend focusing on interactions that have the strongest confidence values (apply the stringent default filter in the matrix) and are supported by independent evidence (cross-linking MS, co-IP, genetic epistasis, co-dependency in DepMap, etc.).
AF-M predictions alone are insufficient to provide evidence of interaction.
Attribution
This website, the high-throughput AF-M folding pipeline, and the structure analysis pipelines were created by Ernst Schmid in consultation with Johannes Walter.
Until this dataset is published in a scientific journal, please attribute PPIs found here to predictomes.org and cite the following publications for AF-M:
- Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S and Steinegger M. ColabFold: Making protein folding accessible to all. Nature Methods (2022) doi: 10.1038/s41592-022-01488-1
- Evans et al. Protein complex prediction with AlphaFold-Multimer. bioRxiv (2022) doi: 10.1101/2021.10.04.463034v1
Changelog
Date | Comment |
---|---|
12-04-2023 | We found an error in our original calculation of pDOCKQ. The calculation has been corrected to match https://doi.org/10.1038/s41467-022-28865-w, and all values on the site have been updated. The same error occurred in Lim et al. 2023, which will be corrected. We sincerely apologize for any inconvenience this may have caused. |
09-05-2023 | The site is officially released to the public. |
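
For readers who want to check pDOCKQ values themselves, the sketch below shows the sigmoid form of the score from the paper linked in the changelog. It is not the site's code. The constants are given as best recalled from that paper and should be verified against it before use. The function takes the average interface pLDDT and the number of interface contacts as inputs, so the paper's definition of an interface contact is not reproduced here. Returning 0 when there are no contacts is an assumption of this sketch.

```python
import math


def pdockq(avg_interface_plddt, n_interface_contacts,
           L=0.724, x0=152.611, k=0.052, b=0.018):
    """Sigmoid pDOCKQ estimate from interface pLDDT and contact count.

    The default constants are assumed to match the cited paper; verify
    them there before relying on the output.
    """
    if n_interface_contacts < 1:
        return 0.0  # no interface detected (assumed convention)
    # Base-10 logarithm of the contact count, scaled by interface confidence.
    x = avg_interface_plddt * math.log10(n_interface_contacts)
    return L / (1.0 + math.exp(-k * (x - x0))) + b


# Hypothetical inputs: average interface pLDDT of 85 over 120 contacts.
example = pdockq(85.0, 120)
```

Mixing up natural and base-10 logarithms or counting contacts differently shifts `x`, and with it the score, so both are worth checking when reproducing published values.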