Premium accounts now available! Sign up and create a premium account. Read more Close

Advertisement

Image

Reconstructability of evolutionary intermediates in generative epistatic landscapes

Preprint Created on 20 Jun 2026 bioRxiv

Evolutionary intermediates connect observed proteins, but the sequence of steps that produced them is rarely recoverable from extant data alone. Here we ask what can, and cannot, be inferred about such intermediates from the endpoints. Using generative sequence landscapes as controlled models of protein-family evolution, we benchmark data-driven reconstruction against ground-truth simulated trajectories. We find that the best point prediction is not necessarily the most faithful evolutionary reconstruction: maximum-likelihood intermediates can be residue-wise accurate yet statistically atypical, whereas conditional sampling better captures the ensemble of plausible histories. Predictability is limited by the topology of the landscape. Constrained, low-mutability regions preserve information about the path, while permissive high-mutability regions open many alternative routes and erase path-specific memory. We also show that sequence divergence alone is an insufficient measure of elapsed evolutionary time; incorporating endpoint mutability provides a more reliable way to place intermediates in the landscape. These results recast intermediate reconstruction as a calibrated probabilistic problem. Rather than seeking a single ``true'' sequence, data-driven models should identify when endpoints contain evolutionary information, and return realistic ensembles.

Netti, R., Weigt, M.

Advertisement

Stats

  • Recommendations n/a n/a positive of 0 vote(s)
  • Views 1
  • Comments 0

Recommended by

  • No recommendations yet.

Post a comment

You need to be signed in to post comments. You can sign in here.

Comments

There are no comments yet.

Advertisement