Generalist large language models complement tailor-made predictors for tumor genomics interpretation

Preprint Created on 23 May 2026 bioRxiv

General-purpose large language models (LLMs) are trained on large corpora to acquire broad knowledge, but whether LLMs can replace or augment task-specific models is unclear. We evaluated LLMs on three real-world, clinically important tumor genomic interpretation tasks, in order of increasing difficulty: (i) distinguishing tumor from non-tumor mutations (n=34,415 variants), (ii) distinguishing driver from passenger mutations (n=13,469 variants), and (iii) inferring cancer type from tumor sequencing reports across multiple assays and institutions (n=102,791 samples). For task (i), the best general-purpose LLMs performed as well as the benchmark tailor-made predictor. For tasks (i) and (ii), ensembling tailor-made models with zero-shot LLMs improved the performance of the tailor-made models. For task (iii), LLMs outperformed or supplemented tailor-made models on out-of-distribution data. Even without fine-tuning, current LLMs can already be useful in clinical genomic interpretation by adding complementary expertise to tailor-made, state-of-the-art predictors.
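The abstract does not say how the tailor-made models and zero-shot LLMs were ensembled. As a purely illustrative sketch, one common approach is a weighted average of the two models' predicted probabilities for the same variant. The function names, the `weight` parameter, and the example scores below are all hypothetical and are not taken from the preprint.

```python
from typing import List, Sequence


def ensemble_probability(p_tailored: float, p_llm: float, weight: float = 0.5) -> float:
    """Weighted average of two predicted probabilities for one variant.

    `weight` is the share given to the tailor-made model (an assumed
    hyperparameter; the preprint's actual ensembling method is not stated).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * p_tailored + (1.0 - weight) * p_llm


def ensemble_scores(
    tailored: Sequence[float], llm: Sequence[float], weight: float = 0.5
) -> List[float]:
    """Apply the weighted average element-wise to two aligned score lists."""
    if len(tailored) != len(llm):
        raise ValueError("score lists must have the same length")
    return [ensemble_probability(a, b, weight) for a, b in zip(tailored, llm)]


if __name__ == "__main__":
    # Hypothetical driver-vs-passenger probabilities for two variants.
    tailored_scores = [0.9, 0.2]
    llm_scores = [0.7, 0.4]
    print(ensemble_scores(tailored_scores, llm_scores))
```

In practice the weight would be tuned on held-out data, and setting `weight=1.0` recovers the tailor-made model alone, which gives a natural baseline for checking whether the LLM adds anything.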

Yu, J., Darmofal, M., Waters, M., Choy, J., Tran, T. N., Fu, C., Morales, L., U, K., Levine, R. L., Schultz, N., Berger, M. F., Morris, Q., Jee, J.
