EVEE: Interpretable variant effect prediction from genomic foundation model embeddings

External IDs

doi

10.64898/2026.04.10.717844

title_hash

cd7792015b8d342a02b59f5a59f42b72a4fa0978

legacy_slug

pearce-2026-evee

Frontmatter (14 fields)

{
  "doi": "10.64898/2026.04.10.717844",
  "year": 2026,
  "title": "EVEE: Interpretable variant effect prediction from genomic foundation model embeddings",
  "authors": [
    "Michael T Pearce",
    "Thomas Dooms",
    "Ryō Yamamoto",
    "Joshua Meehl",
    "Carl Molnar",
    "Mark Bissell",
    "Dron Hazra",
    "Ching Fang",
    "Nam Nguyen",
    "Michael Anderson",
    "Collin Osborne",
    "Patrick E. Duffy",
    "Bridget Toomey",
    "Eric W. Klee",
    "Elena Myasoedova",
    "Alexander J. Ryu",
    "Shant Ayanian",
    "Panos Korfiatis",
    "Matt Redlon",
    "Archa Jain",
    "Daniel Balsam",
    "Nicholas K. Wang"
  ],
  "abstract": "Predicting the clinical significance of genetic variants remains a central challenge in genomic medicine, with most observed variants classified as variants of uncertain significance. Here we show that representations from Evo 2, a 7-billion-parameter genomic foundation model, support accurate and interpretable pathogenicity prediction across variant types from a single framework. An embedding-based classifier, or “probe”, trained on Evo 2 embeddings achieves state-of-the-art performance across single nucleotide variant consequence types (0.997 overall AUROC on 833k ClinVar variants) and generalizes zero-shot to indels (0.991 AUROC), outperforming bioinformatic meta-predictors, protein models, and existing foundation model approaches. Performance is robust across conservation levels and transfers to deep mutational scanning datasets for BRCA1, BRCA2, TP53, and LDLR. To make these predictions interpretable, we train supervised annotation probes to quantify predicted disruptions caused by each variant, then synthesize these disruption profiles into natural language explanations using a frontier reasoning model. We provide pre-computed predictions and on-demand explanations for all 4.2 million ClinVar variants through the Evo Variant Effect Explorer (EVEE), an interactive web resource for the community. This work establishes that representations from genomic foundation models can serve as a unified substrate for both accurate variant effect prediction and mechanistic interpretation, reframing interpretability in computational genomics from a trade-off into a complementary product of learned biological structure.",
  "arxiv_id": null,
  "pdf_status": "not-available",
  "openalex_id": "W7153570004",
  "ingested_via": "ingest_one_url (metadata-only)",
  "openalex_year": 2026,
  "openalex_enriched_at": 1778988553,
  "openalex_match_title": "EVEE: Interpretable variant effect prediction from genomic foundation model embeddings",
  "openalex_cited_by_count": 0,
  "openalex_referenced_works": [
    "W2051978340",
    "W2070730971",
    "W2160995259",
    "W2174602966",
    "W2889874867",
    "W3209435229",
    "W4386861232",
    "W4387966979",
    "W4389081060",
    "W4390616142",
    "W4404821554",
    "W4406150887",
    "W4413312092",
    "W4414055741",
    "W4415708527",
    "W7117243582",
    "W7125967877",
    "W7133517058",
    "W7143417053"
  ]
}

Outgoing (0)

None.

Incoming (22)

Authored by (22)

Alexander J. Ryu (thinker)
Archa Jain (thinker)
Bridget Toomey (thinker)
Carl Molnar (thinker)
Ching Fang (thinker)
Collin Osborne (thinker)
Daniel Balsam (thinker)
Dron Hazra (thinker)
Elena Myasoedova (thinker)
Eric W. Klee (thinker)
Joshua Meehl (thinker)
Mark Bissell (thinker)
Matt Redlon (thinker)
Michael Anderson (thinker)
Michael T Pearce (thinker)
+7 more