Drug-discovery programs spend most of their analytical effort on target discovery: given a disease, find the gene. Indication selection is the inverse question: given a gene with established biology, find the patient population that gives the program the best chance of clinical success.
The economic case for asking the inverse question well is strong. Indication selection determines which trial gets run, which patient subgroup is enrolled, which endpoint is measured, and which competitive context the program enters. Get it right, and a candidate molecule may have multiple plausible diseases to target across its development life. Get it wrong, and a strong target hypothesis runs into the wrong patient population and the program reads out flat.
The Euretos AI Platform’s Indication Selection capability has been part of the platform since its early releases. This post is a methodological walk through what it does, how the underlying disease models support it, and how a researcher uses the output.
Indication selection without integrated data is a literature-search problem with an unbounded answer space. Every gene of interest will have published associations to dozens or hundreds of diseases, ranging from established mechanism to spurious co-mention. Sorting that signal manually is the bottleneck.
The platform’s Indication Selection view sits on top of the same integrated knowledge graph the rest of the AI Platform uses. The knowledge graph carries gene-disease associations from genetic-association databases (GWAS catalogues, ClinVar, OMIM, Orphanet), from perturbation evidence (model-organism studies, knockout phenotypes), from expression evidence (across tissues and, where atlases are available, across cell types), and from literature-derived associations indexed under common ontology terms.
For a query target, the Indication Selection capability returns a ranked landscape of diseases with weighted multi-source evidence per disease. The ranking is not a single literature count; it is an aggregation of independent evidence streams, each contributing to the score with a weight that reflects the strength and specificity of its evidence type.
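To make the aggregation idea concrete, here is a minimal sketch of weighted multi-source scoring. The evidence types, weights, and per-disease scores are illustrative placeholders, not the platform's actual values or API.

```python
# Hypothetical sketch of weighted multi-source evidence aggregation.
# Weights and scores are illustrative, not the platform's actual values.

EVIDENCE_WEIGHTS = {
    "genetic_association": 1.0,   # GWAS / ClinVar / OMIM-style evidence
    "perturbation": 0.8,          # knockout phenotypes, model organisms
    "expression": 0.5,            # tissue / cell-type expression
    "literature": 0.3,            # text-mined co-occurrence
}

def disease_score(evidence: dict[str, float]) -> float:
    """Aggregate per-evidence-type scores (0..1) into one weighted score."""
    return sum(EVIDENCE_WEIGHTS[etype] * s for etype, s in evidence.items())

def rank_landscape(candidates: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """Rank candidate diseases by aggregated score, highest first."""
    return sorted(
        ((d, disease_score(ev)) for d, ev in candidates.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

landscape = rank_landscape({
    "psoriatic arthritis": {"genetic_association": 0.9, "expression": 0.7, "literature": 0.8},
    "literature-only hit": {"literature": 0.9},
})
```

The point of the sketch is the shape of the computation: a disease backed by several evidence streams outranks one backed by a single heavily-published stream, even when the single stream's raw score is high.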
Over the years of running Indication Selection queries with translational research customers, three patterns recur often enough to call out.
Pattern 1: cross-tissue same-pathway hits. A target validated in one inflammatory indication often scores well across other inflammatory indications that share the same downstream cell-type biology. The IL-23 axis is the canonical example: a target built originally for psoriasis lands at the top of the disease landscape for psoriatic arthritis, inflammatory bowel disease, hidradenitis suppurativa, and several adjacent autoimmune indications. The platform recovers this pattern automatically because the underlying genetic, expression, and perturbation evidence is shared across indications. Not all of those will be commercially attractive, but the analytical view across them gives the program a defensible map of where the biology supports a trial.
Pattern 2: rare-disease anchors for common-disease programs. A target with a strong Mendelian rare-disease association sometimes scores well for a more common indication that shares the same pathway disrupted at lower penetrance. The classic case is loss-of-function variants in lipid-handling genes that anchor in rare familial dyslipidemia and extend into common-population cardiovascular risk. The platform surfaces these patterns because the genetic-association evidence layer is integrated across rare and common disease, with weights appropriate to each.
Pattern 3: orthogonal phenotypic associations. A target may sit close to a disease in the knowledge graph not because of a direct mechanism but because of a strong association at the tissue or cell-type level. These hits look like surprises and need careful handling. The platform's breakdown of the supporting evidence makes it easier to distinguish a real cross-disease signal from a noisy co-association: researchers can see whether the score comes from many independent evidence types or from a single well-replicated literature line.
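That triage step can be sketched as a simple evidence-breadth check. The helper and its threshold below are hypothetical, assuming the per-disease evidence is available as scores keyed by evidence type as in the earlier sketch.

```python
# Hypothetical helper: flag disease hits whose score rests on a single
# evidence type, so they get extra scrutiny before being treated as real
# cross-disease signals. The threshold and field names are illustrative.

def needs_scrutiny(evidence: dict[str, float], min_types: int = 2) -> bool:
    """True when fewer than `min_types` evidence types support the hit."""
    supporting = [etype for etype, score in evidence.items() if score > 0]
    return len(supporting) < min_types

print(needs_scrutiny({"literature": 0.9}))                              # → True
print(needs_scrutiny({"genetic_association": 0.6, "expression": 0.4}))  # → False
```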
The output of an Indication Selection query is a ranked landscape, not a recommendation. The platform’s role is to make the candidate-disease space navigable. The decision about which indication to pursue still belongs with the team — and depends on factors the platform deliberately stays out of: portfolio fit, competitive intensity, IP position, regulatory pathway, internal capabilities.
What the platform does is make the upstream analytical step shorter. A target’s full cross-disease evidence map, integrated across more than 275 public databases, is available in one query rather than reconstructed manually from a literature search. For programs that work this way habitually, the time saved is not the headline benefit; the consistency is. Two researchers running the same Indication Selection query against the same target get the same ranked landscape. That reproducibility is what makes the analysis usable for portfolio review and for committee-level decisions.
Indication Selection and Target Discovery use the same underlying knowledge graph, run from opposite directions. A typical program uses both: target-discovery work to define the candidate gene against a primary indication, then indication selection to map the cross-disease landscape and inform whether the development plan is single-indication or platform-style.
The combination is the practical case for an integrated knowledge graph. Ranking a single gene-disease pair from one direction is one query; ranking it from both directions and seeing the answers reconcile is the validation that the underlying evidence is consistent.
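The two-directional check described above can be expressed as a small consistency test: a gene-disease pair reconciles when each entity ranks highly in the other's landscape. The function, rankings, and cutoff here are hypothetical, shown only to pin down what "seeing the answers reconcile" means operationally.

```python
# Illustrative consistency check for the two query directions.
# Rankings and the rank cutoff are hypothetical placeholders.

def reconciles(gene: str, disease: str,
               diseases_for_gene: list[str],
               genes_for_disease: list[str],
               top_k: int = 10) -> bool:
    """True when each entity appears in the other's top-k ranked list."""
    return (disease in diseases_for_gene[:top_k]
            and gene in genes_for_disease[:top_k])

# A pair that ranks highly in both directions reconciles.
print(reconciles("IL23A", "psoriasis",
                 diseases_for_gene=["psoriasis", "psoriatic arthritis"],
                 genes_for_disease=["IL23A", "IL12B"]))  # → True
```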