SAI: A Python Package for Statistics for Adaptive Introgression
Huang, X.; Chen, S.; Hackl, J.; Kuhlwilm, M.
Abstract
Adaptive introgression is an important evolutionary process, yet widely used summary statistics, such as the number of uniquely shared sites and the quantile of the derived allele frequencies at such sites, lack accessible implementations, limiting reproducibility and methodological clarity. Here, we present SAI, a Python package for computing these statistics, and apply it to three datasets. First, using the 1000 Genomes Project data, we replicated previously reported candidate regions and identified additional ones, including a region detected by studies using supervised deep learning. Second, reanalysis of a Lithuanian genome dataset revealed no candidates in the HLA region. Finally, we investigated bonobo introgression into central chimpanzees and identified a candidate region that overlaps a high-frequency Denisovan-introgressed haplotype block reported in modern Papuans, an intriguing co-occurrence across divergent lineages. Discrepancies with prior results highlight the importance of transparent and reproducible analysis workflows, especially as machine learning becomes increasingly prevalent in evolutionary genomics.
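
To make the two statistics named in the abstract concrete, the following is a minimal sketch of how they are commonly formulated. It is not the SAI API. The function names, the thresholds `w`, `x`, and `y`, the linear-interpolation quantile, and the toy frequencies are all assumptions chosen for illustration. A site is treated as uniquely shared when the derived allele is rare in a reference population (frequency below `w`) and at frequency `y` in the source population. The count statistic is the number of such sites where the target population frequency exceeds `x`, and the quantile statistic is a quantile of target frequencies over the uniquely shared sites.

```python
def quantile_linear(values, q):
    """Quantile with linear interpolation between order statistics (an assumed convention)."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def shared_site_stats(ref_freq, tgt_freq, src_freq, w=0.01, x=0.5, y=1.0, q=0.95):
    """Return (count, quantile) for one genomic window.

    ref_freq, tgt_freq, src_freq: per-site derived allele frequencies in the
    reference, target, and source populations (hypothetical inputs).
    w, x, y, q: illustrative thresholds, not values taken from the paper.
    """
    # Target frequencies at sites that are rare in the reference and at frequency y in the source.
    shared = [
        t
        for r, t, s in zip(ref_freq, tgt_freq, src_freq)
        if r < w and abs(s - y) < 1e-12
    ]
    # Number of uniquely shared sites that are also common in the target.
    count = sum(1 for t in shared if t > x)
    # Quantile of target frequencies over uniquely shared sites; NaN if there are none.
    quant = quantile_linear(shared, q) if shared else float("nan")
    return count, quant


# Toy window with five sites (made-up frequencies).
ref = [0.0, 0.0, 0.2, 0.0, 0.0]
tgt = [0.8, 0.3, 0.9, 0.6, 0.1]
src = [1.0, 1.0, 1.0, 1.0, 0.0]
count, quant = shared_site_stats(ref, tgt, src)
print(count, round(quant, 2))
```

In this toy window, three sites pass the reference and source filters, and two of them exceed the target threshold. In practice such statistics are computed per window along the genome, and outlier windows are treated as candidates.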