Undergraduate researchers Owen and Dalya publish lead-author papers!
Congratulations to undergraduate researchers Owen Moosman and Dalya Salih on the publication of their first lead-author manuscripts! Abstracts and links to the manuscripts are provided below.
Salih, D., E.E. Armstrong, C.T. Robbins, L.P. Waits, and J.L. Kelley. 2025. Bridging the gap between legacy PCR-based microsatellite data with high-throughput sequencing data in conservation genomics. Journal of Heredity. doi.org/10.1093/jhered/esaf090
Abstract
Microsatellites are powerful markers for tracking genetic variation in wildlife populations due to their high polymorphism and genome-wide abundance. While PCR-based fragment size analysis has been the standard for genotyping microsatellites, high-throughput sequencing offers greater resolution and the opportunity to sync historical datasets with modern analyses. We evaluated how genotypes from whole-genome sequencing align with PCR data for 15 microsatellite loci in 11 North American brown bears (Ursus arctos). Brown bear populations in the lower 48 United States have declined from approximately 50,000 to fewer than 2,000 over the past decades. Their endangered status has prompted extensive research and genetic monitoring, yielding large, multi-year microsatellite datasets upon which future conservation efforts can build. We achieved a microsatellite genotype concordance rate of 94.5% with PCR results. All discrepancies occurred at complex loci containing multiple insertions and/or deletions (indels). Physically linked indels or single nucleotide polymorphisms (SNPs) occurring within the loci were misinterpreted as independent insertions, underscoring the need for genotyping tools that incorporate phasing when genotyping. To evaluate coverage effects, we downsampled from 30x to 2x. Concordance remained high at 20–30x but dropped sharply at 10x, with 5x and 2x having discordant genotypes or insufficient coverage for genotyping. Accurate genotyping required both sufficient depth and number of reads spanning the entire repeat regions. Our results show that short-read whole-genome sequencing can recover microsatellite genotypes with high accuracy when paired with careful variant interpretation. By aligning historical PCR datasets with modern sequencing data, we can preserve decades of genetic insight and strengthen long-term monitoring of at-risk populations.
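For readers curious how a genotype concordance rate like the one reported above might be computed, here is a minimal sketch. It is not the authors' pipeline: the function names, the locus and sample labels, and the allele sizes are all hypothetical, and it assumes concordance is scored per sample-locus genotype (both alleles must match, in any order), which the abstract does not specify.

```python
from typing import Dict, Tuple

# A diploid genotype as two allele sizes in base pairs (hypothetical encoding).
Genotype = Tuple[int, int]
# Calls are keyed by (sample, locus).
Calls = Dict[Tuple[str, str], Genotype]


def normalize(genotype: Genotype) -> Genotype:
    """Sort alleles so that (144, 140) and (140, 144) compare equal."""
    return tuple(sorted(genotype))


def concordance_rate(pcr: Calls, wgs: Calls) -> float:
    """Fraction of genotypes called by both methods that agree exactly."""
    compared = 0
    matched = 0
    for key, pcr_genotype in pcr.items():
        wgs_genotype = wgs.get(key)
        if wgs_genotype is None:
            # No sequencing-based call (e.g. too few spanning reads): skip.
            continue
        compared += 1
        if normalize(pcr_genotype) == normalize(wgs_genotype):
            matched += 1
    if compared == 0:
        raise ValueError("no genotypes were called by both methods")
    return matched / compared


# Made-up example data, for illustration only.
pcr_calls: Calls = {
    ("bear01", "locusA"): (140, 144),
    ("bear01", "locusB"): (172, 176),
    ("bear02", "locusA"): (144, 144),
    ("bear02", "locusB"): (176, 180),
    ("bear03", "locusA"): (140, 148),
}
wgs_calls: Calls = {
    ("bear01", "locusA"): (144, 140),  # same alleles, different order
    ("bear01", "locusB"): (172, 176),
    ("bear02", "locusA"): (144, 144),
    ("bear02", "locusB"): (176, 182),  # discordant allele
    # bear03 / locusA has no sequencing-based call and is skipped
}

rate = concordance_rate(pcr_calls, wgs_calls)
print(f"Concordance: {rate:.1%}")  # prints "Concordance: 75.0%"
```

Whether uncalled genotypes are skipped, as here, or counted as discordant changes the reported rate, so that choice should be stated alongside any concordance figure.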
Moosman, O.W., J.L. Kelley, and S.N. Bogan. 2025. Mitigating assembly and switch errors in phased genomes of polar fishes reveals haplotype diversity in copy number of antifreeze protein genes. Heredity. doi.org/10.1038/s41437-025-00803-8
Abstract
Phased genomes and pangenomes are enhancing our understanding of genetic variation. Accurate phasing and assembly in repetitive regions of the genome remain challenging, however. Addressing this obstacle is crucial for studying structural genomic variation, such as copy number variations (CNVs) common to repetitive regions. Polar fishes, for example, evolved repetitive tandem arrays of antifreeze protein (AFP) genes that facilitate adaptation to freezing and expanded in copy number in colder environments. AFP CNVs remain poorly characterized in polar fishes and may be illuminated by haplotype-aware approaches. We performed long-read sequencing for two polar fishes in the suborder Zoarcoidei and leveraged additional published long-read data to assemble phased genomes. We developed a workflow to measure haplotype diversity in CNV while controlling for misassembly and switch errors—a change from one parental haplotype to another in a contiguous assembly. We present gfa_parser, which computes and extracts all possible contiguous sequences for phased or primary assemblies from graphical fragment assembly (GFA) files, and switch_error_screen, which flags potential switch errors. gfa_parser revealed that assembly uncertainty was ubiquitous across AFP array haplotypes and that standard processing of graphical fragment assemblies can bias measurement of haplotype CNVs. We detected no switch errors in AFP arrays. After controlling for misassembly and switch error, we detected haplotype diversity of AFP CNVs in all studied polar Zoarcoidei species and in 60% of AFP arrays. Intraindividual haplotype diversity spanned differences of 3–16 copies. Our workflow revealed intraspecific genomic diversity in zoarcoids that likely fueled the evolution of AFP copy number across temperature.