Generalized Chargaff symmetry in codon usage across the tree of life

Preprint Created on 23 Jun 2026 bioRxiv

Codon usage bias is a central record of mutation, selection, drift, and translational constraints, but it is usually treated separately from generalized Chargaff symmetry, the tendency for words and their reverse complements to occur at similar frequencies in long DNA sequences. Here we ask whether codon usage contains a measurable reverse-complement component, and whether departures from that component can be quantified. Rather than imposing reverse-complement symmetry from the outset, we exhaustively evaluated all nontrivial symbolic involutions (f(f(x)) = x) of the 64 codons, constructed from self-inverse nucleotide maps and self-inverse codon-position permutations. Across taxa, the reverse-complement transformation was the optimum, giving the highest median correlation between codon frequencies and transformed codon frequencies. Random and amino-acid-preserving reference models showed that the signal is not a generic property of codon profiles and is only partly explained by protein composition. Additional controls preserving amino-acid composition and matching GC3 in expectation showed that the observed reverse-complement correlation remains higher than expected from these constraints alone, and genus-level aggregation confirmed that the optimum is not driven by overrepresented genera. The symmetry breaks in a biologically ordered manner: the third, most degenerate codon position remains closest to the reverse-complement baseline, whereas the first position departs most strongly and the second is intermediate and lineage dependent. Taxonomic comparisons reveal broad and fine-scale heterogeneity in the degree of codon-level symmetry preservation.
Together, these results show that codon usage can be quantified as a combination of reverse-complement preservation and position-, lineage-, and function-dependent departures from the GCT-associated compositional baseline, linking sequence-level symmetry, evolutionary mechanisms, and codon-level organization within a single measurable framework.
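
The involution search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and rests on several assumptions: a candidate is read as a self-inverse nucleotide map applied to each base, combined with a self-inverse permutation of the three codon positions (10 × 4 − 1 = 39 nontrivial candidates under this reading, which may differ from the preprint's enumeration); the correlation is taken to be Pearson's; and all names (`transform`, `score`, `rank_involutions`, and so on) are hypothetical.

```python
from itertools import permutations, product
from math import sqrt
from statistics import median

BASES = "ACGT"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

def involutions(n):
    """All permutations p of range(n) that are their own inverse."""
    return [p for p in permutations(range(n))
            if all(p[p[i]] == i for i in range(n))]

# Assumed building blocks: 10 self-inverse nucleotide maps, 4 position permutations.
NUC_MAPS = [dict(zip(BASES, (BASES[j] for j in p))) for p in involutions(4)]
POS_PERMS = involutions(3)

# Reverse complement = complement map combined with position reversal.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
REVERSE = (2, 1, 0)

def transform(codon, nuc_map, pos_perm):
    """Permute codon positions, then map each nucleotide."""
    return "".join(nuc_map[codon[pos_perm[i]]] for i in range(3))

def candidate_involutions():
    """Every (nucleotide map, position permutation) pair except the identity."""
    identity = {b: b for b in BASES}
    return [(m, p) for m in NUC_MAPS for p in POS_PERMS
            if not (m == identity and p == (0, 1, 2))]

def pearson(xs, ys):
    """Pearson correlation (an assumed choice of correlation measure)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def score(freqs, nuc_map, pos_perm):
    """Correlate codon frequencies with frequencies of the transformed codons."""
    xs = [freqs[c] for c in CODONS]
    ys = [freqs[transform(c, nuc_map, pos_perm)] for c in CODONS]
    return pearson(xs, ys)

def rank_involutions(tables):
    """Sort candidates by median correlation across codon-frequency tables."""
    scored = [(median(score(t, m, p) for t in tables), m, p)
              for m, p in candidate_involutions()]
    return sorted(scored, key=lambda s: s[0], reverse=True)
```

Each `tables` entry is assumed to be a dict mapping all 64 codons to a frequency or count for one taxon. For example, `transform("ATG", COMPLEMENT, REVERSE)` gives `"CAT"`, the reverse complement of `ATG`. Because both components are self-inverse and act independently (one on symbols, one on positions), every pair yields an involution on codons.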

Fariselli, P.
