Bulk Data Downloads
Human Predictome dataset (Schmid et al., bioRxiv 2025)
Includes:
All human protein-protein interactions (~1.6 million pairs)
If you use this data for your research, please cite: Proteome-wide in silico screening for human protein-protein interactions
Download instructions:
Smaller dataset (top 16K by SPOC score):
The smaller dataset contains the top 16,000 protein-protein interactions ranked by SPOC score.
Download top 16K dataset (53.21 GB)
Pair scores file:
The pair scores file contains the SPOC scores for all protein-protein interactions in the dataset.
Download pair scores (28.4 MB compressed, 116 MB uncompressed)
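Once the pair scores file is decompressed, a small filter can pull out the high-scoring pairs without loading the whole file into another tool. The sketch below is illustrative only: it assumes the decompressed file is a comma-separated table with a header row, and the file name `pair_scores.csv` and column name `spoc_score` are hypothetical placeholders, not names confirmed by this page. Check the real header before using it.

```shell
# filter_by_score FILE COLUMN THRESHOLD
# Print the header row plus every row whose COLUMN value is >= THRESHOLD.
# Assumption: FILE is a comma-separated table whose first line is a header.
filter_by_score() {
  awk -F',' -v col="$2" -v min="$3" '
    NR == 1 {
      # Locate the requested column by its header name.
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
      print
      next
    }
    # Force a numeric comparison on the score column.
    ($idx + 0) >= (min + 0)
  ' "$1"
}

# Usage (hypothetical file and column names, not run here):
#   filter_by_score pair_scores.csv spoc_score 0.9 > high_confidence_pairs.csv
```

Looking the column up by name rather than by position keeps the filter working if the column order differs from what is assumed here.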
Full dataset (all 1.6 million pairs, ~4.32 TB):
The full dataset is compressed and split across 100 files. We recommend using wget or curl to download the dataset. The following manifest files contain a list of the URLs for each file and can be used with either tool.
Using wget (resumable):
wget -c -i manifest.txt
You can also use the same manifest with your tool of choice to download the dataset.
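For curl, which has no direct equivalent of wget's `-i` option for a plain list of URLs, one possible approach is to loop over the manifest. This is a minimal sketch, assuming the manifest contains one URL per line (the format `wget -i` expects). The function name `download_manifest` is illustrative.

```shell
# download_manifest MANIFEST
# Download every URL listed in MANIFEST with curl, resuming partial files.
# Assumption: MANIFEST holds one URL per line.
download_manifest() {
  manifest="$1"
  while IFS= read -r url || [ -n "$url" ]; do
    # Skip blank lines.
    [ -z "$url" ] && continue
    # -f: treat HTTP errors as failures
    # -L: follow redirects
    # -C -: resume from the end of any partial file already on disk
    # -O: save under the remote file name
    curl -f -L -C - -O "$url" || echo "failed: $url" >&2
  done < "$manifest"
}

# Usage (not run here):
#   download_manifest manifest.txt
```

Failed URLs are reported on standard error and the loop continues, so the function can be rerun until nothing is reported. Some servers return an error when asked to resume a file that is already complete, so a reported failure is worth checking against the file sizes in the table below.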
Individual dataset file information:
| File | Size | Index (list of protein pairs in the file) |
|---|---|---|
| data_split_00.tar | 38.0 GB | index_data_split_00.csv |
| data_split_01.tar | 41.4 GB | index_data_split_01.csv |
| data_split_02.tar | 37.9 GB | index_data_split_02.csv |
| data_split_03.tar | 39.9 GB | index_data_split_03.csv |
| data_split_04.tar | 37.7 GB | index_data_split_04.csv |
| data_split_05.tar | 36.3 GB | index_data_split_05.csv |
| data_split_06.tar | 40.7 GB | index_data_split_06.csv |
| data_split_07.tar | 41.1 GB | index_data_split_07.csv |
| data_split_08.tar | 39.0 GB | index_data_split_08.csv |
| data_split_09.tar | 38.3 GB | index_data_split_09.csv |
| data_split_10.tar | 47.2 GB | index_data_split_10.csv |
| data_split_11.tar | 51.0 GB | index_data_split_11.csv |
| data_split_12.tar | 47.5 GB | index_data_split_12.csv |
| data_split_13.tar | 49.6 GB | index_data_split_13.csv |
| data_split_14.tar | 45.9 GB | index_data_split_14.csv |
| data_split_15.tar | 44.7 GB | index_data_split_15.csv |
| data_split_16.tar | 50.7 GB | index_data_split_16.csv |
| data_split_17.tar | 51.0 GB | index_data_split_17.csv |
| data_split_18.tar | 48.6 GB | index_data_split_18.csv |
| data_split_19.tar | 47.6 GB | index_data_split_19.csv |
| data_split_20.tar | 40.2 GB | index_data_split_20.csv |
| data_split_21.tar | 43.6 GB | index_data_split_21.csv |
| data_split_22.tar | 40.6 GB | index_data_split_22.csv |
| data_split_23.tar | 42.9 GB | index_data_split_23.csv |
| data_split_24.tar | 40.1 GB | index_data_split_24.csv |
| data_split_25.tar | 38.3 GB | index_data_split_25.csv |
| data_split_26.tar | 43.9 GB | index_data_split_26.csv |
| data_split_27.tar | 43.5 GB | index_data_split_27.csv |
| data_split_28.tar | 41.7 GB | index_data_split_28.csv |
| data_split_29.tar | 40.9 GB | index_data_split_29.csv |
| data_split_30.tar | 43.0 GB | index_data_split_30.csv |
| data_split_31.tar | 46.9 GB | index_data_split_31.csv |
| data_split_32.tar | 43.1 GB | index_data_split_32.csv |
| data_split_33.tar | 45.7 GB | index_data_split_33.csv |
| data_split_34.tar | 43.0 GB | index_data_split_34.csv |
| data_split_35.tar | 41.4 GB | index_data_split_35.csv |
| data_split_36.tar | 46.4 GB | index_data_split_36.csv |
| data_split_37.tar | 46.7 GB | index_data_split_37.csv |
| data_split_38.tar | 45.0 GB | index_data_split_38.csv |
| data_split_39.tar | 42.7 GB | index_data_split_39.csv |
| data_split_40.tar | 38.7 GB | index_data_split_40.csv |
| data_split_41.tar | 42.2 GB | index_data_split_41.csv |
| data_split_42.tar | 39.1 GB | index_data_split_42.csv |
| data_split_43.tar | 41.6 GB | index_data_split_43.csv |
| data_split_44.tar | 38.6 GB | index_data_split_44.csv |
| data_split_45.tar | 37.8 GB | index_data_split_45.csv |
| data_split_46.tar | 41.9 GB | index_data_split_46.csv |
| data_split_47.tar | 42.6 GB | index_data_split_47.csv |
| data_split_48.tar | 40.5 GB | index_data_split_48.csv |
| data_split_49.tar | 39.2 GB | index_data_split_49.csv |
| data_split_50.tar | 37.0 GB | index_data_split_50.csv |
| data_split_51.tar | 40.2 GB | index_data_split_51.csv |
| data_split_52.tar | 37.1 GB | index_data_split_52.csv |
| data_split_53.tar | 39.2 GB | index_data_split_53.csv |
| data_split_54.tar | 37.2 GB | index_data_split_54.csv |
| data_split_55.tar | 35.9 GB | index_data_split_55.csv |
| data_split_56.tar | 39.8 GB | index_data_split_56.csv |
| data_split_57.tar | 40.5 GB | index_data_split_57.csv |
| data_split_58.tar | 38.8 GB | index_data_split_58.csv |
| data_split_59.tar | 37.5 GB | index_data_split_59.csv |
| data_split_60.tar | 44.4 GB | index_data_split_60.csv |
| data_split_61.tar | 48.1 GB | index_data_split_61.csv |
| data_split_62.tar | 44.4 GB | index_data_split_62.csv |
| data_split_63.tar | 47.2 GB | index_data_split_63.csv |
| data_split_64.tar | 43.3 GB | index_data_split_64.csv |
| data_split_65.tar | 41.9 GB | index_data_split_65.csv |
| data_split_66.tar | 47.0 GB | index_data_split_66.csv |
| data_split_67.tar | 48.2 GB | index_data_split_67.csv |
| data_split_68.tar | 46.0 GB | index_data_split_68.csv |
| data_split_69.tar | 45.3 GB | index_data_split_69.csv |
| data_split_70.tar | 44.0 GB | index_data_split_70.csv |
| data_split_71.tar | 48.1 GB | index_data_split_71.csv |
| data_split_72.tar | 44.9 GB | index_data_split_72.csv |
| data_split_73.tar | 46.8 GB | index_data_split_73.csv |
| data_split_74.tar | 42.7 GB | index_data_split_74.csv |
| data_split_75.tar | 42.7 GB | index_data_split_75.csv |
| data_split_76.tar | 47.5 GB | index_data_split_76.csv |
| data_split_77.tar | 47.3 GB | index_data_split_77.csv |
| data_split_78.tar | 46.7 GB | index_data_split_78.csv |
| data_split_79.tar | 44.7 GB | index_data_split_79.csv |
| data_split_80.tar | 45.9 GB | index_data_split_80.csv |
| data_split_81.tar | 48.6 GB | index_data_split_81.csv |
| data_split_82.tar | 45.1 GB | index_data_split_82.csv |
| data_split_83.tar | 46.8 GB | index_data_split_83.csv |
| data_split_84.tar | 43.9 GB | index_data_split_84.csv |
| data_split_85.tar | 42.8 GB | index_data_split_85.csv |
| data_split_86.tar | 47.8 GB | index_data_split_86.csv |
| data_split_87.tar | 48.4 GB | index_data_split_87.csv |
| data_split_88.tar | 47.8 GB | index_data_split_88.csv |
| data_split_89.tar | 45.2 GB | index_data_split_89.csv |
| data_split_90.tar | 41.8 GB | index_data_split_90.csv |
| data_split_91.tar | 45.3 GB | index_data_split_91.csv |
| data_split_92.tar | 42.4 GB | index_data_split_92.csv |
| data_split_93.tar | 44.1 GB | index_data_split_93.csv |
| data_split_94.tar | 40.9 GB | index_data_split_94.csv |
| data_split_95.tar | 40.0 GB | index_data_split_95.csv |
| data_split_96.tar | 44.4 GB | index_data_split_96.csv |
| data_split_97.tar | 45.0 GB | index_data_split_97.csv |
| data_split_98.tar | 43.1 GB | index_data_split_98.csv |
| data_split_99.tar | 41.6 GB | index_data_split_99.csv |
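Because each archive has its own index, the index files can be searched to find which archive holds a given pair, so only that archive needs to be downloaded. The sketch below assumes the index files have been downloaded into one directory and that each pair occupies a single line containing both protein identifiers. The exact column layout is not documented on this page, so the function matches on the whole line rather than on specific columns. The identifiers in the usage line are placeholders.

```shell
# find_pair_archive ID_A ID_B [INDEX_DIR]
# Print the name of each archive whose index has a line containing both IDs.
# Assumption: one pair per line in index_data_split_NN.csv, in either order.
find_pair_archive() {
  id_a="$1"
  id_b="$2"
  dir="${3:-.}"
  for index in "$dir"/index_data_split_*.csv; do
    # Skip the unexpanded pattern when no index files are present.
    [ -e "$index" ] || continue
    # -F matches the identifiers as fixed strings, not regular expressions.
    if grep -F -- "$id_a" "$index" | grep -q -F -- "$id_b"; then
      name=$(basename "$index" .csv)
      # index_data_split_NN -> data_split_NN.tar
      echo "${name#index_}.tar"
    fi
  done
}

# Usage (placeholder identifiers, not run here):
#   find_pair_archive P12345 P67890 ./indexes
```

The match is a plain substring search, so a short identifier can also match a longer one that contains it. Inspect the matching index lines before committing to a download of roughly 40 GB.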
Predictomes paper associated data (Schmid and Walter, Mol Cell 2025)
Includes:
All genome maintenance pairs (~40,000 pairs)
3 proteome-wide screens (DONSON, STK19, USP37) (~60,000 pairs)
All SPOC training/testing pairs (~50,000 pairs)
All 30 ranking experiment datasets (~30,000 pairs)
If you use this data for your research, please cite: Predictomes, a classifier-curated database of AlphaFold-modeled protein-protein interactions