Massively parallel sequencing of autosomal STRs and identity-informative SNPs highlights consanguinity in Saudi Arabia

While many studies of Middle Eastern populations have been undertaken using autosomal STR profiling by capillary electrophoresis, little has so far been published from this region on the forensic use of massively parallel sequencing (MPS). Here, we carried out MPS of 27 autosomal STRs and 91 identity-informative SNPs (iiSNPs) with the Verogen ForenSeq™ DNA Signature Prep Kit on a representative sample of 89 Saudi Arabian males, and analysed the resulting sequence data using Verogen's ForenSeq Universal Analysis Software (UAS) v1.3 and STRait Razor v3.0. This revealed sequence variation in the composition of complex STR arrays, and SNPs in their flanking regions, which raised the number of STR alleles from 238 distinct length variants to 357 sequence sub-variants. Similarly, between one and three additional polymorphic sites were observed within the amplicons of 37 of the 91 iiSNPs, forming up to six microhaplotypes per locus. These further enhance discrimination compared to the biallelic target SNP data presented by the primary UAS interface. In total, we observed 22 STR alleles previously unrecognised in the STRait Razor v3.0 default allele list, along with nine SNPs flanking target iiSNPs that were not highlighted by the UAS. Sequencing reduced the STR-based random match probability (RMP) from 2.62E-30 to 3.49E-34, and analysis of the iiSNP microhaplotypes reduced RMP from 9.97E-37 to 6.83E-40. The lack of significant linkage disequilibrium between STRs and target iiSNPs allowed the two marker types to be combined using the product rule, yielding an RMP of 2.39E-73. Evidence of consanguinity was apparent from both marker types. While TPOX was the only locus displaying a significant deviation from Hardy-Weinberg equilibrium, 23 out of 27 STRs and 63 out of 91 iiSNPs showed fewer than expected heterozygotes, demonstrating an overall homozygote excess that probably reflects the high frequency of first-cousin marriages in Saudi Arabia.
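The product-rule combination of the two marker sets can be illustrated with a minimal sketch. It assumes only the rounded sequence-based RMPs quoted above (3.49E-34 for STRs, 6.83E-40 for iiSNP microhaplotypes) and works in log10 space, which is a common way to avoid underflow when many small probabilities are multiplied. The function names `combine_rmp_log10` and `format_sci` are illustrative and do not come from the study or from any of the software it used.

```python
import math


def combine_rmp_log10(*rmps):
    """Return log10 of the product of independent match probabilities.

    The product rule is only valid when the marker sets are independent,
    e.g. when no significant linkage disequilibrium is found between them.
    """
    return sum(math.log10(p) for p in rmps)


def format_sci(log10_value, digits=2):
    """Format a log10 value in the E-notation used in the text."""
    exponent = math.floor(log10_value)
    mantissa = 10 ** (log10_value - exponent)
    return f"{mantissa:.{digits}f}E{exponent:+03d}"


# Rounded values quoted in the text (assumed inputs for this sketch).
rmp_str_sequence = 3.49e-34
rmp_iisnp_microhaplotype = 6.83e-40

combined_log10 = combine_rmp_log10(rmp_str_sequence, rmp_iisnp_microhaplotype)
print(format_sci(combined_log10))
```

Multiplying the rounded inputs gives a value of about 2.38E-73, which agrees with the reported 2.39E-73 to within the rounding of the published figures.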
We placed our data in a global context by considering the same markers in the Human Genome Diversity Panel (HGDP), revealing that the Saudi sample was typical of Middle Eastern populations, with a higher level of inbreeding than is seen in most European, African and Central/South Asian populations, correlating with known patterns of endogamy. Given reduced levels of diversity within endogamous groups, the ability to combine the discrimination power of both STRs and SNPs offers significant benefits in the analysis of forensic evidence in Saudi Arabia and the Middle East region more generally.
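The heterozygote deficit described above can be made concrete with a short sketch that compares observed and expected heterozygosity at a single locus and derives a simple inbreeding coefficient, F = 1 - H_obs / H_exp. The genotypes below are hypothetical toy data, not taken from the study, and the estimator is the basic uncorrected form; the software and any small-sample corrections used by the authors are not stated here, so this should be read as an illustration of the concept only.

```python
from collections import Counter


def heterozygosity_stats(genotypes):
    """Return (H_obs, H_exp, F) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    H_exp is the Hardy-Weinberg expectation 1 - sum(p_i^2), without
    small-sample correction. F > 0 indicates a homozygote excess.
    """
    n = len(genotypes)
    allele_counts = Counter(a for pair in genotypes for a in pair)
    total_alleles = 2 * n
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    h_exp = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    f = 1.0 - h_obs / h_exp if h_exp > 0 else 0.0
    return h_obs, h_exp, f


# Hypothetical toy locus with two alleles at equal frequency.
toy_genotypes = (
    [("8", "8")] * 4
    + [("8", "11")] * 2
    + [("11", "11")] * 4
)

h_obs, h_exp, f = heterozygosity_stats(toy_genotypes)
print(f"H_obs={h_obs:.2f}  H_exp={h_exp:.2f}  F={f:.2f}")
```

In this toy example only 2 of 10 individuals are heterozygous, whereas 5 would be expected under Hardy-Weinberg equilibrium, so F is positive. A consistent excess of this kind across many loci is the pattern the text attributes to consanguinity.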