Benchmark

Cluster-split evaluation of the currently deployed allosteric-site model on 226 held-out proteins — none with more than 30% sequence identity to the training set. No cherry-picking, no per-protein tuning, no hidden fallbacks. The model version, training data, and this page's source JSON are all linked at the bottom.

Methodology

Task: predict which residues of a protein form an allosteric binding site.
Test set: data/mega/split_test.csv — 226 entries from AlloBench + Allosteric DB, each in an MMseqs2 cluster with <30% sequence identity to every training entry.
Scoring: for each test protein we call the live /predict endpoint, take the per-pocket allosteric probability, and project it onto every residue the pocket contains. Residues outside the top-5 predicted pockets get probability 0. Ground truth per residue comes from the curated allo_residues list in each dataset row. We then pool residue-level predictions and labels across all test proteins and compute ROC-AUC, PR-AUC, precision, recall, F1, and best-F1.
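The pocket-to-residue projection and the pooled metrics can be sketched as below. This is a minimal illustration, not the harness code: the pocket dict shape (`"probability"`, `"residues"` fields) and the helper names are assumptions about what /predict returns.

```python
# Sketch of the residue-level scoring described above (assumed response schema).
import numpy as np
from sklearn.metrics import (average_precision_score, precision_recall_curve,
                             roc_auc_score)

def residue_scores(n_residues, pockets):
    """Project the top-5 pocket probabilities onto member residues; all other
    residues keep probability 0."""
    scores = np.zeros(n_residues)
    top5 = sorted(pockets, key=lambda p: p["probability"], reverse=True)[:5]
    for pocket in top5:
        for i in pocket["residues"]:
            # A residue in two pockets keeps the higher probability.
            scores[i] = max(scores[i], pocket["probability"])
    return scores

def aggregate_metrics(y_true, y_score):
    """Pool residue labels/predictions across proteins, then compute the
    headline metrics; best-F1 is the maximum over the PR-curve thresholds."""
    prec, rec, _ = precision_recall_curve(y_true, y_score)
    f1 = 2 * prec * rec / np.clip(prec + rec, 1e-12, None)
    return {
        "roc_auc": roc_auc_score(y_true, y_score),
        "pr_auc": average_precision_score(y_true, y_score),
        "best_f1": float(np.max(f1)),
    }
```

In the real run, `y_true`/`y_score` are the concatenation of every test protein's per-residue labels and projected scores, so proteins with more residues weigh proportionally more.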
Comparison baselines: PASSer 2.0 and VN-EGNN report on overlapping but not identical test sets. Their cited numbers are shown for scale; the only apples-to-apples number is our own, which you can reproduce by running /opt/allopath/benchmark_harness.py against this same split.

Comparison

| Model | ROC-AUC | PR-AUC | F1 (best) | Test set | Source |
| --- | --- | --- | --- | --- | --- |
| loak allonet GBT v1 | — | — | — | 226 cluster-split | ours |
| PASSer 2.0 (2023) | ~0.80 | — | — | PASSer dataset | cited |
| VN-EGNN (2024) | ~0.82 | — | — | COACH420 + PDBbind | cited |

Per-protein AUC distribution

[Interactive chart: per-protein AUC distribution, rendered from this page's source JSON.]