Bulk Data Downloads
Human Predictome dataset (Schmid et al., bioRxiv 2025)
Includes:
All human protein-protein interactions (~1.6 million pairs)
If you use this data for your research, please cite: Proteome-wide in silico screening for human protein-protein interactions
Download instructions:
Smaller dataset (top 16K by SPOC score):
The smaller dataset contains the top 16,000 protein-protein interactions ranked by SPOC score.
Full dataset (all 1.6 million pairs):
The full dataset is compressed and split across 100 files. We recommend using wget or curl to download the dataset. The following manifest files contain a list of the URLs for each file and can be used with either tool.
Using wget (resumable):
wget -c -i manifest.txt
You can also use the same manifest with your tool of choice to download the dataset.
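For example, a curl-based download could look like the sketch below. It assumes the manifest is a plain-text file with one URL per line (as wget -i expects); the function name download_manifest is hypothetical and not part of the dataset tooling.

```shell
# Minimal sketch: download every URL listed in a manifest file with curl.
# Assumes one URL per line; download_manifest is a hypothetical helper name.
download_manifest() {
  manifest="$1"
  while IFS= read -r url || [ -n "$url" ]; do
    # Skip blank lines
    [ -n "$url" ] || continue
    # -f: fail on HTTP errors    -L: follow redirects
    # -O: save under the remote file name
    # -C -: resume a partially downloaded file if one exists
    curl -f -L -O -C - "$url" || echo "failed: $url" >&2
  done < "$manifest"
}
```

Running download_manifest manifest.txt in the target directory fetches each file in turn. Failed URLs are reported on standard error and the loop continues, so the command can be rerun to pick up where it left off.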
Predictomes paper associated data (Schmid and Walter, Mol Cell 2025)
Includes:
All genome maintenance pairs (~40,000 pairs)
3 proteome-wide screens (DONSON, STK19, USP37) (~60,000 pairs)
All SPOC training/testing pairs (~50,000 pairs)
All 30 ranking experiment datasets (~30,000 pairs)
If you use this data for your research, please cite: Predictomes, a classifier-curated database of AlphaFold-modeled protein-protein interactions