A. Schoenhuth; "Exploring the Twilight...", April 25, 10:40, L067

Faculty of Engineering and Natural Sciences



Exploring the Twilight Zone of Bacteria: An Insertion and Deletion Study


Alexander Schoenhuth

Although insertions and deletions (indels) are a common type of evolutionary sequence variation, their evolutionary origins and functional consequences are not yet fully understood. There is evidence that the gap penalty models employed by classical alignment procedures only roughly reflect the actual evolutionary processes. There is also evidence that indels can easily cause structural changes in the surfaces of proteins. In addition to the classical evolutionary processes, bacteria are also subject to horizontal gene transfer, which facilitates rapid adaptation to environmental changes. Therefore, studying their paralogous protein pairs is of particular interest.

In analogy to classical alignment score statistics, we have developed a sound statistical framework based on pair hidden Markov models. It allows for the efficient computation of tables that report significant indel lengths for classical dynamic programming procedures with affine gap penalty scoring schemes. We obtained paralogous protein pairs in E. coli by computing global alignments with affine gap penalties. In a second step, we grouped the aligned protein pairs into indel, non-indel and insignificant pairs according to our novel statistics. We measured the functional similarity within each pair by computing Gene Ontology (GO) based functional distances, as recently suggested. We found that, in the twilight zone of pairs with 20% to 40% sequence identity, indel pairs are significantly less functionally similar than non-indel pairs. This suggests that indels cause more severe functional changes than substitutions alone.
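
For readers unfamiliar with affine gap scoring, the following is a minimal sketch of global alignment scoring by dynamic programming with affine gap penalties (the three-matrix recurrence usually attributed to Gotoh). It is not the procedure or the parameters used in the study: the function name and the match, mismatch, gap_open and gap_extend values are illustrative assumptions, and real protein alignments would use a substitution matrix rather than a plain match/mismatch score.

```python
NEG_INF = float("-inf")


def affine_global_score(a, b, match=1, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best global alignment score of a and b under affine gap penalties.

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    All scoring parameters are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # M: last column pairs a[i-1] with b[j-1]
    # X: last column pairs a[i-1] with a gap (deletion relative to a)
    # Y: last column pairs b[j-1] with a gap (insertion relative to a)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Opening a new gap pays gap_open; continuing one pays gap_extend.
            X[i][j] = max(M[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend,
                          Y[i - 1][j] + gap_open)
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend,
                          X[i][j - 1] + gap_open)
    return max(M[n][m], X[n][m], Y[n][m])
```

Because extending a gap is cheaper than opening one, this scoring scheme prefers one long indel over several short ones, which is why the length of an indel in the resulting alignment is a meaningful quantity to test for significance.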

Alexander Schoenhuth

PhD: 2006, University of Cologne, Germany, in Information Theory, on discrete-valued random sources, in particular relatives of (hidden) Markov sources. In parallel, I worked on projects in computational biology in collaboration with the Max Planck Institute for Molecular Biology, Berlin. Since then, I have been a postdoctoral researcher with the Pacific Institute for the Mathematical Sciences, affiliated with the School of Computing Science and supervised by Cenk Sahinalp and Martin Ester. My research interests are information theory, Markovian models and statistical learning, as well as systems biology and drug target research.

April 25, 2008, 10:40, FENS L067