The conformational ensembles of intrinsically disordered proteins (IDPs) respond sensitively to ionic strength, with the direction and magnitude of the response varying widely across sequence classes, from polyelectrolyte contraction to polyampholyte swelling. Recent sequence-conditioned models provide rapid access to full ensembles or ensemble-averaged properties at specified solution conditions, but they do not directly identify which low-dimensional polymer-physics descriptors organize salt-response behavior. Here, we construct a 511-sequence library spanning controlled κ-variants, NCPR series, IDRome-stratified natural IDRs, and low-FCR IDRome sequences, and perform 2,555 CALVADOS-2 simulations across five monovalent salt concentrations (50 mM to 500 mM). For each sequence, we extract the salt-response slope, dRg/d[salt], and assign one of four regimes: polyelectrolyte contraction, polyampholyte swelling, non-monotonic response, or salt-insensitive behavior. Using eight theory-motivated sequence descriptors, we find that sequence charge decoration weighted by chain length, SCD × N, is the dominant coordinate organizing salt response, accounting for ≈40% of total SHAP attribution, more than twice that of the next-ranked feature. Ridge regression explains substantial in-distribution variance (R² = 0.83 under random cross-validation), whereas gradient-boosted trees improve in-distribution performance (R² = 0.97) and retain predictive power under the more stringent leave-one-subset-out validation test (R² = 0.60), indicating that salt response contains transferable but nonlinear sequence-encoded structure. Regime classification robustly recovers the direction of salt response, with no polyelectrolyte-polyampholyte confusion, whereas non-monotonic and salt-insensitive sequences remain harder to distinguish from static sequence features alone.
Together, these results establish SCD × N as a compact, interpretable organizing coordinate for CALVADOS-2-derived IDP salt response and provide a polymer-physics, feature-level complement to ensemble-level generative models.
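
As an illustration of the two quantities central to this abstract, the following is a minimal Python sketch, not the authors' pipeline. It assumes the standard sequence charge decoration definition, SCD = (1/N) Σ_{i<j} q_i q_j |i − j|^{1/2}, with charges of +1 for K/R, −1 for D/E, and 0 for all other residues (histidine is treated as neutral here, which is an assumption). The slope is taken from an ordinary least-squares line through Rg versus salt concentration. The function names, the tolerance `tol`, and the regime-assignment rule are all hypothetical; the abstract does not state the criteria actually used.

```python
import numpy as np

# Assumed per-residue charges; histidine is treated as neutral (assumption).
CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def scd(sequence):
    """Sequence charge decoration: (1/N) * sum_{i<j} q_i q_j |i - j|^0.5."""
    q = np.array([CHARGE.get(aa, 0.0) for aa in sequence])
    n = len(q)
    idx = np.arange(n)
    separation = np.abs(idx[:, None] - idx[None, :])
    pair_terms = np.outer(q, q) * np.sqrt(separation)
    # Keep only the strict upper triangle so each pair i < j is counted once.
    return np.triu(pair_terms, k=1).sum() / n


def scd_times_n(sequence):
    """SCD weighted by chain length, the coordinate named in the abstract."""
    return scd(sequence) * len(sequence)


def salt_slope(conc_mM, rg_nm):
    """Least-squares slope dRg/d[salt], in nm per mM."""
    slope, _intercept = np.polyfit(conc_mM, rg_nm, 1)
    return slope


def assign_regime(conc_mM, rg_nm, tol=0.05):
    """Hypothetical four-regime rule; `tol` (nm) is an illustrative threshold."""
    rg = np.asarray(rg_nm, dtype=float)
    if rg.max() - rg.min() < tol:
        return "salt-insensitive"
    # An interior extremum that differs from both endpoints by more than tol
    # is treated as a non-monotonic response.
    for extreme in (int(rg.argmax()), int(rg.argmin())):
        if 0 < extreme < len(rg) - 1:
            if (abs(rg[extreme] - rg[0]) > tol
                    and abs(rg[extreme] - rg[-1]) > tol):
                return "non-monotonic"
    if salt_slope(conc_mM, rg) > 0:
        return "polyampholyte swelling"
    return "polyelectrolyte contraction"
```

Under this sketch, a chain that shrinks steadily as salt is added has a negative slope and is labeled as polyelectrolyte contraction, while one that expands is labeled as polyampholyte swelling. The salt concentrations passed in should be the five simulated values; only their range (50 mM to 500 mM) is given above.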
Aryal, M.
