R-loop Prediction Reveals Generalization Limits of DNA Foundation Models Beyond Regulatory Genomics

DNA foundation models are increasingly proposed as general-purpose representations for genomic prediction and design, yet their evaluation remains largely centered on conventional regulatory tasks. This leaves a critical question unresolved: do DNA foundation models generalize to sequence biology beyond conventional gene regulation? To answer this question, we introduce RloopBench, a systematic benchmark for R-loop-forming sequence prediction as a biophysically distinct, genome-stability-associated task. We compare rule-based methods, task-specific models, classical sequence encodings, and foundation model representations across in-distribution, cross-platform, consensus-level, and cross-species evaluations. Foundation models achieve strong performance when positive and negative sequences are compositionally separable, but this advantage does not consistently transfer to cross-platform and cross-species settings, where they are often comparable to classical k-mer representations. Unexpectedly, a one-hot classifier baseline shows the strongest overall sensitivity to R-loop-forming sequences, exceeding more complex models across several generalization tests. Rule-based and task-specific models also exhibit limited transfer outside their original training regimes. Performance is further shaped by sequence properties, negative-control design, experimental platform, and species-specific genomic context. Together, RloopBench establishes genome-stability-associated sequence prediction as a complementary direction for DNA foundation model development and evaluation, while underscoring that simple sequence encodings remain necessary baselines for assessing model generalization beyond conventional regulatory tasks.

Zhang, Y., Ganesan, A., Lin, X.
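
The abstract contrasts foundation model representations with two classical sequence encodings, one-hot and k-mer, without describing how they are computed. The following is a minimal sketch of what such encodings typically look like, not the authors' implementation. The function names `one_hot` and `kmer_frequencies`, the default `k=3`, and the handling of ambiguous bases are all illustrative assumptions.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumption: unambiguous DNA bases only


def one_hot(seq):
    """Flatten a DNA sequence into 4 * len(seq) zeros and ones.

    Bases outside ACGT (for example N) are encoded as all zeros;
    this is an illustrative choice, not one taken from the paper.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        encoded.extend(row)
    return encoded


def kmer_frequencies(seq, k=3):
    """Return normalized k-mer frequencies in lexicographic order.

    The vector always has 4**k entries, so sequences of different
    lengths map to features of the same size. k-mers containing
    bases outside ACGT are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    vocab = ["".join(p) for p in product(ALPHABET, repeat=k)]
    total = sum(counts[kmer] for kmer in vocab)
    return [counts[kmer] / total if total else 0.0 for kmer in vocab]
```

Either feature vector could then be passed to any standard classifier. The one-hot form preserves base position but requires fixed-length inputs, while the k-mer form discards position and keeps only local composition.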
