The assignment of protein NMR spectra using a genetic algorithm
Thesis posted on 15.12.2014, 10:31 by Bartlet Gilbert. Ailey
NMR spectroscopy is one of the two principal methods for determining the structures of proteins. Producing a structure by NMR involves a number of phases, of which the assignment phase is one of the most time-consuming. Any automation of the assignment process, even partial, would be of enormous benefit. This thesis describes five modules (2D-SAM, 3D-SAM, BAM-1, BAM-2 and SCAM) that use a genetic algorithm (GA) to assign protein NMR spectra. The 2D-SAM and 3D-SAM are Sequential Assignment Modules. They take the relevant spin system identifications and sequentially assign either two-dimensional homonuclear or three-dimensional heteronuclear NOESY spectra. The 2D-SAM is effective with small proteins that generate high-quality spectra, while the 3D-SAM is effective with larger, isotopically labelled proteins. The BAM-1 and BAM-2 are Backbone Assignment Modules. The BAM-1 takes several triple resonance spectra and assigns the peaks to the relevant nuclei, creating peak systems. The BAM-2 takes these peak systems and sequentially assigns them. The SCAM is a prototype Side Chain Assignment Module; it is designed to take either an HCCH C13 TOCSY or COSY spectrum and assign its peaks to certain types of amino acid. The BAM-1, BAM-2 and SCAM were designed to work in sequence to assign a whole protein. Although each module is designed to assign a specific type of spectrum or set of spectra, all five are built around the same GA core. This core uses a crowding factor, phenotypic domain-specific genetic operators and a novel age concept to improve its performance. When evaluated, the modules achieved the following average rates of correct assignment: 2D-SAM 100%, 3D-SAM 71%, BAM-1 96% and BAM-2 75%.
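To give a feel for what a GA core with a crowding factor and an age limit might look like, the following is a minimal Python sketch on a toy assignment problem. It is not the implementation described in the thesis. The abstract does not specify the fitness function, the genetic operators or how the age concept works, so everything here is an illustrative assumption: the permutation encoding, the position-matching `fitness`, the `swap_mutation` operator, De Jong style crowding replacement and the rule that replaces an individual once it exceeds `max_age`.

```python
import random


def fitness(assignment, target):
    """Toy score: number of positions assigned correctly.

    Hypothetical stand-in for a spectrum-based scoring function.
    """
    return sum(a == t for a, t in zip(assignment, target))


def hamming(a, b):
    """Number of positions at which two assignments differ."""
    return sum(x != y for x, y in zip(a, b))


def swap_mutation(assignment, rng):
    """Return a copy with two positions exchanged (assumed operator)."""
    child = list(assignment)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


def run_ga(target, pop_size=30, generations=200,
           crowding_factor=3, max_age=25, seed=0):
    """Steady-state GA with crowding replacement and an assumed age limit."""
    rng = random.Random(seed)
    n = len(target)
    population = [{"genes": rng.sample(range(n), n), "age": 0}
                  for _ in range(pop_size)]
    best = max((ind["genes"] for ind in population),
               key=lambda g: fitness(g, target))

    for _ in range(generations):
        # Binary tournament: the fitter of two random individuals is the parent.
        a, b = rng.sample(population, 2)
        parent = a if fitness(a["genes"], target) >= fitness(b["genes"], target) else b
        child = swap_mutation(parent["genes"], rng)

        # Crowding: among `crowding_factor` random individuals, the one most
        # similar to the child is replaced, which helps preserve diversity.
        candidates = rng.sample(range(pop_size), crowding_factor)
        victim = min(candidates,
                     key=lambda k: hamming(population[k]["genes"], child))
        population[victim] = {"genes": child, "age": 0}

        # Assumed age rule: individuals older than `max_age` are replaced
        # by fresh random assignments.
        for k, ind in enumerate(population):
            ind["age"] += 1
            if ind["age"] > max_age:
                population[k] = {"genes": rng.sample(range(n), n), "age": 0}

        # Track the best assignment seen so far, since crowding and aging
        # can both remove it from the population.
        current = max((ind["genes"] for ind in population),
                      key=lambda g: fitness(g, target))
        if fitness(current, target) > fitness(best, target):
            best = list(current)

    return best


if __name__ == "__main__":
    target = list(range(8))
    best = run_ga(target)
    print(best, fitness(best, target))
```

In this sketch the crowding factor sets how many individuals are compared with each new child before one is replaced. Larger values make replacement more similarity-driven, so distinct partial assignments survive longer in the population.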