Streptococcus pyogenes (group A Streptococcus, GAS) causes over 700 million infections annually and has resurged globally as a cause of invasive disease. However, the genomic basis by which distinct GAS lineages produce diverse clinical phenotypes remains incompletely understood. Here, we analyzed 399 quality-controlled complete genomes spanning 103 emm types and 27 countries and applied non-negative matrix factorization (NMF) to decompose the GAS pangenome into co-inherited gene modules capturing both clonal lineages and mobile elements. This framework revealed that the deepest division in the accessory genome is defined not by emm type but by a 27-gene Sda1-encoding prophage linked to the hypervirulent M1T1 pandemic clone. This neutrophil extracellular trap (NET)-degrading module unites the emm1, emm12, and emm77 lineages and is fixed in emm77/ST63, extending its distribution beyond classically invasive lineages. Across lineages, virulence determinants segregate into distinct combinations, indicating that invasive disease arises through convergent but mechanistically distinct programs. Consistent with this, the highly invasive ST52-emm28 lineage lacks Sda1 and deploys an alternative repertoire. The decomposition also resolves a gene module corresponding to the emm-pattern D regulon associated with skin tropism, linking accessory genome structure to host niche adaptation. We further identify three sequence-divergent speC paralogs on independent prophages. These findings define modular GAS virulence architectures and establish NMF-based pangenome analysis as a framework for genomic surveillance beyond single-locus typing.
Chauhan, S. M., Monk, J., Palsson, B. O., Nizet, V.
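
To make the decomposition step concrete, the sketch below factors a toy gene presence/absence matrix into two co-inherited modules. It is an illustration only, not the authors' pipeline: the 6 x 6 matrix, the rank k = 2, and the helper names `nmf` and `assign_modules` are hypothetical, and the factorization uses standard Lee-Seung multiplicative updates, which may differ from the solver used in the study.

```python
import numpy as np


def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix X (genes x genomes) as W @ H.

    Uses Lee-Seung multiplicative updates for the squared-error objective.
    W (genes x k) holds gene weights per module; H (k x genomes) holds
    module activity per genome. Both stay non-negative by construction.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_genomes = X.shape
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_genomes)) + eps
    for _ in range(n_iter):
        # eps in the denominators guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def assign_modules(W):
    """Assign each gene to the module in which it has the largest weight."""
    return W.argmax(axis=1)


# Hypothetical toy pangenome: rows are accessory genes, columns are genomes.
# Genes 0-2 co-occur in genomes 0-2; genes 3-5 co-occur in genomes 3-5.
X = np.zeros((6, 6))
X[:3, :3] = 1.0
X[3:, 3:] = 1.0

W, H = nmf(X, k=2)
print("gene -> module:", assign_modules(W))
print("reconstruction error:", np.linalg.norm(X - W @ H))
```

In this toy case the two gene blocks should land in separate modules, and the rows of `H` indicate which genomes carry each module. On real data the rank k would have to be chosen and validated, which this sketch does not attempt.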
