Disease Atlas is not a single-score ranker. It is a map of disease biology that produces calibrated, uncertainty-aware predictions for every gene–disease pair across roughly 34,000 human diseases. Causal evidence, tractability, organ-resolved safety, and modality-specific design all run as separate pipelines, with disagreements between them surfaced rather than averaged away.
The page below explains the architecture visually in two minutes. The full methodology paper proves the system in thirty.
60 pages · published 2026 · references throughout
Most target-prioritisation tools count gene–disease associations from the literature. Disease Atlas asks a causal question instead: would perturbing this gene change the disease? The platform begins with the disease as a structured biological object, resolved at organ, tissue, and cell-type level, and ranks the full protein-coding genome against that resolved biology.
A target with strong causal biology can be untouchable in clinic if its class carries a defining safety pattern. A target with elegant small-molecule chemistry can have no biological reason to be in the disease. The pipelines run side by side so disagreements stay visible.
Every gene–disease pair carries a probability, not a rank percentile, at three distinct evidence thresholds. Probabilities are constrained to remain hierarchically coherent: a pair cannot be more likely to be approved than to be in development.
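One simple way to satisfy such a constraint is a running minimum over the thresholds, ordered from weakest to strictest. The sketch below assumes the ordering causal support, then clinical development, then approval; the function name `enforce_coherence` is hypothetical, and the page does not say which method the platform actually uses.

```python
from typing import Sequence


def enforce_coherence(probs: Sequence[float]) -> list[float]:
    """Cap each probability at the value of the threshold before it.

    `probs` is ordered from the weakest threshold to the strictest,
    e.g. (causal support, clinical development, approval).
    This is an illustrative assumption, not the platform's method.
    """
    coherent = []
    ceiling = 1.0
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        ceiling = min(ceiling, p)  # running minimum keeps the chain non-increasing
        coherent.append(ceiling)
    return coherent


# Illustrative numbers only, not platform output.
print(enforce_coherence([0.62, 0.08, 0.11]))  # [0.62, 0.08, 0.08]
```

A running minimum only ever lowers the stricter thresholds. A scheme that averages the violating values would be an equally valid choice with different trade-offs.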
A target with high causal support and low clinical-advancement probability identifies biology the field has not yet developed. A target that scores well on all three sits in the evidence band where historically successful drug targets have lived.
The dashed band shows the uncertainty interval propagated from the per-disease confidence tier and evidence-source disagreement.
The platform models causal support, clinical advancement, and approval as three independent classifiers, each with its own calibration. Raw outputs pass through Platt scaling at the ranking layer; the translational layer applies branch-specific isotonic regression and tracks Brier and Expected Calibration Error after every training run.
| Threshold | Mean AUC | Mean AP | Class prevalence |
|---|---|---|---|
| Causal support | 0.947 | 0.890 | 9.4 × 10⁻³ |
| Clinical development | 0.917 | 0.041 | 3.7 × 10⁻⁵ |
| Approval | 0.885 | 0.0073 | 1.3 × 10⁻⁵ |
The cancer and non-cancer branches are calibrated separately because the underlying prevalence patterns differ systematically. Hierarchical coherence is enforced post-calibration.
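For readers unfamiliar with the two calibration metrics named above, here is a minimal sketch of their standard definitions. The equal-width binning and the default of ten bins are assumptions for illustration; the page does not state how the platform bins its predictions.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean of |observed positive rate - mean predicted probability|
    over equal-width probability bins (binning scheme is an assumption)."""
    bins: list[list[tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins carry no weight
        mean_conf = sum(p for p, _ in members) / len(members)
        frac_pos = sum(y for _, y in members) / len(members)
        ece += len(members) / total * abs(frac_pos - mean_conf)
    return ece


# Toy data, not platform output.
probs = [0.1, 0.1, 0.9, 0.9]
labels = [0, 0, 1, 1]
print(round(brier_score(probs, labels), 3))                 # 0.01
print(round(expected_calibration_error(probs, labels), 3))  # 0.1
```

At class prevalences as low as those in the table, nearly all predictions fall into the lowest equal-width bin, so quantile-based bins are a common alternative.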
A single confidence number tells the reader nothing about why a prediction is uncertain. Disease Atlas separates the three reasons a probability might sit in the middle of the scale, then exposes them on every assessment card.
Irreducible noise in the evidence itself. A pair whose calibrated probability sits near the middle carries more ambiguity than one near either tail. This is data uncertainty.
Disagreement between the genetic, network, expression, and transfer layers. Low when they converge, high when they diverge. This is the marker a single composite score would hide.
Sensitivity to the specific training population. A prediction that holds across held-out folds is more trustworthy than one that swings. This is model uncertainty.
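The three components can be illustrated with simple stand-in statistics. The sketch below assumes binary entropy for data uncertainty, the spread of per-layer scores for source disagreement, and the spread of per-fold predictions for model uncertainty. These formulas and the names `binary_entropy` and `uncertainty_card` are hypothetical; the page does not publish the platform's definitions.

```python
import math
from statistics import pstdev
from typing import Mapping, Sequence


def binary_entropy(p: float) -> float:
    """Entropy in bits: 0 at p = 0 or 1, maximal (1 bit) at p = 0.5."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def uncertainty_card(
    calibrated_p: float,
    layer_scores: Mapping[str, float],
    fold_predictions: Sequence[float],
) -> dict[str, float]:
    """Return the three uncertainty components as separate numbers."""
    return {
        # Data uncertainty: highest when the probability sits mid-scale.
        "data": binary_entropy(calibrated_p),
        # Source disagreement: spread across the evidence layers.
        "source_disagreement": pstdev(list(layer_scores.values())),
        # Model uncertainty: spread across held-out folds.
        "model": pstdev(list(fold_predictions)),
    }


# Illustrative numbers only, not platform output.
card = uncertainty_card(
    0.5,
    {"genetic": 0.9, "network": 0.4, "expression": 0.7, "transfer": 0.3},
    [0.48, 0.52, 0.50, 0.47, 0.53],
)
print(card)
```

Keeping the three values in separate fields, rather than summing them, mirrors the page's point that each reason for uncertainty should stay visible.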
Roughly 15,000 diseases have rich genetic, perturbation, and single-cell evidence. The remaining long tail does not. The architecture is the same; the confidence with which any individual call should be taken is not.
Genes are ranked per cell type per disease, not shown alongside the cell types they happen to express in. The candidate that emerges at cell-type resolution is often not the candidate a literature-weighted ranking would put first.
The Disease Atlas cell-type score across the integrated single-cell atlases. Top hits are anatomically coherent with the underlying biology. The platform is not told this is a gut disease; the ranking reads the answer from the data.
Measured against the AACT clinical-trial database: 2,587 completed Phase 2 and Phase 3 trials, leakage-clean cohort, primary-endpoint success defined as p < 0.05 with effect direction matching the trial's stated hypothesis.
The cohort is restricted to trials that started before 2021 and whose targets could be mapped cleanly via a five-stage drug-to-target resolution. Every per-pipeline score contributes a signal that discriminates outcomes at Mann-Whitney p < 10⁻³; the integrated four-pipeline combination outperforms the strongest single score at DeLong p = 0.006.
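The Mann-Whitney test and the AUC are two views of the same quantity: the AUC equals the Mann-Whitney U statistic divided by the number of success-failure pairs. The sketch below makes that relationship concrete on toy scores; the function name `auc_from_pairs` is hypothetical and the numbers are not trial data.

```python
from typing import Sequence


def auc_from_pairs(
    success_scores: Sequence[float], failure_scores: Sequence[float]
) -> float:
    """AUC as the fraction of (success, failure) pairs in which the
    successful trial received the higher score; ties count as half."""
    wins = 0.0
    for s in success_scores:
        for f in failure_scores:
            if s > f:
                wins += 1.0
            elif s == f:
                wins += 0.5
    # `wins` is the Mann-Whitney U statistic for the success group.
    return wins / (len(success_scores) * len(failure_scores))


# Toy scores for illustration only.
successes = [0.9, 0.7, 0.4]
failures = [0.5, 0.3]
print(round(auc_from_pairs(successes, failures), 3))  # 0.833
```

The pairwise loop is quadratic and meant for reading, not for a cohort of thousands of trials, where a rank-based implementation would be the usual choice.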
The signal concentrates where the biology says it should: AUC 0.58–0.68 in Phase 1, 0.54–0.64 in Phase 2, and near 0.5 by Phase 3+ where failures become more idiosyncratic.
The recovery test asks whether the model finds correct answers in a space where the correct answers are already known. It does not establish prospective novel-target discovery at statistical scale, which the field cannot run at the timescale of any methodology paper.
The cohort base success rate is 66.1%, considerably higher than the field-wide rate, because pharma already filters target candidates on tractability and biology before initiating trials. AUC magnitudes are therefore conservative against balanced-cohort benchmarks.
One target. One disease. The four pipelines side by side, the disagreements surfaced, the confidence band exposed, and the cell-type context attached. Below is the assessment card for TNFSF15 in ulcerative colitis: the top-ranked target of 30,889 scored genes, currently in Phase 3 development.
Strong causal evidence (top 0% for this disease). Strongest support from genetic association: multiple genome-wide-significant loci, replicated in independent cohorts, with Mendelian-randomisation evidence linking expression to disease risk. Network propagation and cell-type expression context corroborate without driving the score. Foundation-model embedding contributes additional support.
Direct causal inference (MR). Per-disease confidence tier: A.
Secreted TNF-superfamily ligand. The architecture lands on an antibody as the design the biology supports; the modality comparison is auditable.
TNFSF15 (the TL1A ligand) sits at rank 1 in both ulcerative colitis and Crohn's disease. The assessment is consistent with the front-running biology the development community is currently pursuing, including anti-TL1A programmes from multiple sponsors.
Every claim traces back to the primary record that produced it.
Disease-stratified five-fold cross-validation with leave-seed-out masking.
The training pipeline aborts on held-out divergence beyond a strict tolerance.
Platt and isotonic calibration; Brier and ECE tracked per training run.
The signal carries forward through every layer; nothing important is averaged away.
The whitepaper covers every pipeline, every reference, and the limitations explicitly. If you would like to see what the assessment view produces for one target in your therapeutic area, send us the gene and we will walk through it across the diseases where it scores.
Request a target walkthrough
60 pages. Four pipelines. Every reference cited by PMID. Enter your details below and we will send the PDF directly.
We do not sell or share your address. One follow-up at most.