sstar2: A Python Package for S*-based Archaic Introgression Detection with Machine Learning

Preprint Created on 03 Jun 2026 bioRxiv

Detecting introgressed genomic fragments from unsampled or extinct source populations remains challenging. The S* statistic is widely used for this purpose, but the original sstar implementation relies on generalized additive models to smooth quantile-specific values precomputed from fixed count bins, requiring simulations with fixed numbers of segregating sites. Here, we present sstar2, a Python update that replaces this procedure with quantile regression to directly estimate S* thresholds at specified null quantiles from simulated genomic windows. We benchmarked sstar2 against the original sstar, linear quantile regression, and random forest quantile regression across three demographic models with both phased and unphased simulated data. sstar2 showed the best overall performance among the evaluated methods, with the most pronounced improvement under a challenging demographic model of ghost introgression in bonobos. These results show that sstar2 improves S* threshold calibration while making S*-based introgression analyses more flexible and compatible with modern simulation workflows.

Koca, A., Stöckl, A., Chen, S., Kuhlwilm, M., Huang, X.
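
To make the idea of estimating a threshold "at a specified null quantile" concrete, the following is a minimal sketch of plain linear quantile regression on toy data. It is not the sstar2 implementation, and the abstract does not say which quantile regression model sstar2 uses. Linear quantile regression appears in the abstract only as one of the benchmarked methods. All names (`pinball_loss`, `fit_linear_quantile`, `seg_sites`, `null_scores`) and all numeric values are hypothetical stand-ins, not S* scores from real simulations.

```python
import random


def pinball_loss(residual, q):
    """Quantile (pinball) loss for one residual = observed - predicted."""
    return q * residual if residual >= 0 else (q - 1.0) * residual


def fit_linear_quantile(xs, ys, q):
    """Exact linear quantile regression for a single covariate.

    At least one optimal line passes through two data points with
    different x values, so for a small sample we can try every such
    pair and keep the line with the lowest total pinball loss.
    """
    best = None
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            if xs[i] == xs[j]:
                continue  # same x: no finite slope through this pair
            slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
            intercept = ys[i] - slope * xs[i]
            loss = sum(
                pinball_loss(y - (intercept + slope * x), q)
                for x, y in zip(xs, ys)
            )
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    return best[1], best[2]


# Toy "null" windows (assumption): one covariate per window, here called
# the number of segregating sites, and a score that grows with it.
random.seed(0)
n_windows = 120
seg_sites = [random.randint(20, 100) for _ in range(n_windows)]
null_scores = [10.0 * s + random.gauss(0.0, 50.0) for s in seg_sites]

# Fit the 0.95 null quantile as a function of the covariate.
q = 0.95
intercept, slope = fit_linear_quantile(seg_sites, null_scores, q)


def threshold(s):
    """Score threshold for a window with covariate value s."""
    return intercept + slope * s


# Fraction of null windows strictly above their threshold; the small
# tolerance keeps the two points that define the line from counting.
exceed_fraction = sum(
    y > threshold(x) + 1e-9 for x, y in zip(seg_sites, null_scores)
) / n_windows
print(f"intercept={intercept:.2f} slope={slope:.3f} exceed={exceed_fraction:.3f}")
```

In this sketch a window would be flagged as a candidate when its observed score exceeds `threshold(s)` for its own covariate value, so the threshold varies continuously with the covariate rather than being read off fixed count bins. The brute-force search is only practical for small samples; a real workflow would use a dedicated quantile regression solver.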
