PRDM9 Diversity, Recombination Landscapes and Childhood Leukaemia
Thesis posted on 13.07.2020, 13:02 by Ihthisham Ali
The PR/SET domain 9 (PRDM9) protein is a potent selector of meiotic recombination activation sites via DNA sequence recognition and is known to be highly polymorphic in its critical zinc finger (ZnF) array. This diversity can influence the recombination landscape and translate into genome-wide differences between major population groups. PRDM9 variation has also been implicated in genomic instability in cancer. To investigate a potential link between rare PRDM9 alleles and elevated ancestral recombination rates in the Major Histocompatibility Complex II (MHCII) region associated with childhood Acute Lymphoblastic Leukaemia (ALL) in a British cohort, two sub-regions were targeted using sperm-typing methods. This revealed that the DNA3 hotspot is PRDM9 A-regulated, whilst the African-enriched AA hotspot is activated by rare C-type (Ct) alleles containing K-type ZnFs. However, the AA hotspot may not be as active as historical population estimates indicate, or unsampled PRDM9 alleles may activate it more efficiently. Screening for Ct alleles and other K-ZnF-containing alleles in the British ALL cohort provided potential links with PRDM9, though not strong support for previous work. Investigation of other candidate genome-wide associated markers indicated a link with FIGNL-1, a protein complex involved in homologous recombination, which was supported by an independent German cohort.
A large Next-Generation Sequencing (NGS) dataset including rarely sampled populations was used for PRDM9 allele discovery, along with a comparison study on the capabilities of NGS platforms to characterise the long PRDM9 ZnF arrays. The Illumina 100 bp paired-end read format was useful for filtering known alleles, but de novo assembly was unable to resolve ZnF array structure. Ion Torrent 400 bp read data provided only incremental improvement over 200 bp reads. Finally, nanopore sequencing showed promising results, although improved basecalling and read-mapping methods, as well as de novo assembly, would be required to displace Sanger sequencing as the definitive method for ZnF array structure characterisation.