Algorithms for Computational Biology
This project will investigate algorithm design and analysis for problems in computational biology. This research has a significant theoretical component (analyzing the computational complexity of problems and developing algorithms and heuristics) as well as some system-building and testing (implementing our algorithms and testing them against existing methods currently used by biologists). We will work closely with biologists throughout the summer.
An example of a problem of interest arises in the field of cophylogeny. We are given the phylogenetic tree (also known as an evolutionary tree) for some group of "host" species (e.g. gophers) and a second phylogenetic tree for a group of "parasite" species (e.g. lice); we believe that the two species coevolved.
The two trees have the current species as their leaves and the internal nodes are hypothesized ancestral species which no longer exist. In addition to these two trees, we are given a mapping between the leaves of the parasite tree to the leaves of the host tree, indicating the parasite species that are currently specialized to the the existing host species.
These two trees are almost never completely identical in their structure. (That is, they are not isomorphic.) However, there are certain biologically plausible events that can explain the deviation between these two trees. The fundamental question is "what is the most plausible set of events that would explain the coevolution of parasite and host species?" Our research group has made some progress on this problem (including a recent paper to appear in the Journal of Computational Biology) but there are many interesting issues that have not yet been resolved. Ultimately, we would like to develop general algorithms and software tools that would allow evolutionary biologists to investigate any host/parasite pair of phylogenetic trees.
Mentor: Professor Ran Libeskind-Hadas
Ran has been a professor at Harvey Mudd since 1993. His general research interests are in algorithm design and analysis. He has recently begun working on algorithmic problems in computational biology. He received is B.A. in applied mathematics from Harvard University and his M.S. and Ph.D. in computer science from the University of Illinois at Urbana-Champaign. In addition to research and teaching, he enjoys hiking, biking, and playing with his two children.
Required Background
Students working on this project should have completed an undergraduate course in algorithms. In particular, familiarity with algorithm design paradigms such as dynamic programming are important as are knowledge of basic graph algorithms (e.g. shortest path algorithms, network flow, etc.). Some exposure to the theory of NP-completeness is desirable but not absolutely required. Applicants should clearly indicate all courses and experiences related to discrete mathematics, algorithms, and complexity theory.


