AUTOMATING THE USAGE OF UNAMBIGUOUS NOES IN NUCLEAR VECTOR REPLACEMENT FOR NMR PROTEIN STRUCTURE-BASED ASSIGNMENTS
Industrial Engineering, M.Sc. Thesis, 2013
Assoc. Prof. Bülent Çatay (Thesis Supervisor), Asst. Prof. Mehmet Serkan Apaydın, Prof. Dr. Uğur Osman Sezerman, Asst. Prof. Kemal Kılıç, Asst. Prof. Hasan Otu
Date & Time: July 22, 2013 – 13:00
Place: FENS L055
Keywords: Automated NMR Assignment, Tabu Search, NMR Structural Biology, Structural Bioinformatics, Nuclear Overhauser Effect (NOE), Ant Colony Optimization (ACO), Computational Biology, Metaheuristics
Proteins perform various functions and tasks in living organisms. The structure of a protein is essential to identifying its function, so determining protein structure is of utmost importance. Nuclear Magnetic Resonance (NMR) is one of the experimental methods used to determine protein structure. The bottleneck in NMR protein structure determination is assigning NMR peaks to their corresponding nuclei, which is known as the assignment problem. This assignment is still performed manually in many laboratories. In this thesis, we develop methodologies and software to automate the process.
Structure-Based Assignment (SBA) is an approach to solving this computationally challenging problem that uses prior information about the protein obtained from a template structure. NVR-BIP uses the Nuclear Vector Replacement (NVR) framework to model SBA as a binary integer programming problem. NVR-TS is a tabu search algorithm equipped with a guided perturbation mechanism to handle proteins with a larger number of residues. NVR-ACO is an ant colony optimization approach, inspired by the behavior of living ants, that minimizes the peak-nucleus matching cost.
One of the inputs to these approaches is Nuclear Overhauser Effect (NOE) data. An NOE is an interaction observed between two protons that are located close together in space. These protons may be amide protons (HN), protons attached to the alpha-carbon atom in the protein backbone (HA), or side-chain protons. NVR uses only backbone protons. Previous approaches using the NVR framework did not distinguish the proton type in the NOEs and used only the HN coordinates to incorporate the NOEs into the computation. In this thesis, we address this limitation by using both the HA and HN coordinates and the corresponding distances in our computations. In addition, previous studies in this context tuned the NOE distance threshold manually for each protein, which limits the applicability of the methodology to novel proteins. In this thesis, we set the threshold in a standard manner for all proteins by extracting the NOE upper bound distances from the data. Furthermore, for the Maltose Binding Protein (MBP), we extract the NOE upper bound distances directly from the NMR peak intensity values.
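The role of proton types and per-NOE upper bounds can be illustrated with a minimal Python sketch. It is not the thesis implementation: the coordinates, peak names, upper bound values, and the helper violated_noes are all hypothetical and chosen only for illustration. The sketch checks, for a candidate peak-to-residue assignment, which NOEs connect protons that lie farther apart in the template structure than the NOE's own upper bound, using the HN or HA coordinate as indicated by each NOE.

```python
import math

# Hypothetical template coordinates (in angstroms), keyed by
# (residue number, proton type). Real values would come from a template structure.
coords = {
    (1, "HN"): (0.0, 0.0, 0.0),
    (1, "HA"): (1.0, 2.0, 0.0),
    (2, "HN"): (3.0, 0.0, 0.0),
    (2, "HA"): (3.5, 2.5, 0.0),
    (3, "HN"): (9.0, 0.0, 0.0),
    (3, "HA"): (9.0, 2.0, 1.0),
}

# Hypothetical NOEs: (peak_i, proton type_i, peak_j, proton type_j, upper bound).
# Each NOE carries its own upper bound instead of one hand-tuned global threshold.
noes = [
    ("p1", "HN", "p2", "HN", 4.0),
    ("p1", "HA", "p2", "HN", 3.5),
]


def violated_noes(assignment, noes, coords):
    """Return the NOEs whose assigned protons are farther apart than the upper bound.

    assignment maps a peak id to a residue number in the template.
    """
    violations = []
    for peak_i, type_i, peak_j, type_j, upper in noes:
        a = coords[(assignment[peak_i], type_i)]
        b = coords[(assignment[peak_j], type_j)]
        if math.dist(a, b) > upper:
            violations.append((peak_i, type_i, peak_j, type_j))
    return violations


# Two candidate assignments: one consistent with the NOEs, one with p2 and p3 swapped.
good = {"p1": 1, "p2": 2, "p3": 3}
swapped = {"p1": 1, "p2": 3, "p3": 2}

print(violated_noes(good, noes, coords))
print(violated_noes(swapped, noes, coords))
```

In a formulation of this kind, an assignment that violates NOEs would be penalized or excluded, so keeping the proton type in each NOE determines which template distance is compared against the bound.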
We tested our approach on NVR-BIP's data set and compared it with NVR-BIP, NVR-TS, and NVR-ACO. The experimental results show that the proposed approach improves the assignment accuracies significantly. In particular, we achieved 100% assignment accuracy on EIN and 78% on MBP, compared with 83% and 73%, respectively, obtained by the previous approaches.