Welcome to DTL-RnB

What is DTL-RnB?

DTL-RnB (Reconciliation Browser) is a tool for browsing the potentially large landscape of maximum parsimony reconciliations (MPRs) of pairs of phylogenetic trees (e.g., species and gene trees or host and parasite trees). This tool is based on algorithms and techniques described in the paper “DTL-RnB: Algorithms and Tools for Summarizing the Space of DTL Reconciliations” by Weiyun Ma, Dmitriy Smirnov, Juliet Forman, Annalise Schweickart, Carter Slocum, Srinidhi Srinivasan, and Ran Libeskind-Hadas (under review).

DTL-RnB takes as input a pair of undated phylogenetic trees in newick format, along with their tip associations. DTL-RnB prompts the user for non-negative costs for duplication, transfer, and loss costs (speciations are assumed to be null events and are set to cost zero) and a method for scoring events (see “Scoring Functions" below). The tool then displays a sequence of maximum parsimony reconciliations that, collectively, come close to maximizing the total sum of the scores of the events with respect to the given scoring function. For example, if all events have unit scores, then the tool seeks to find a small set of maximum parsimony reconciliations that include as many events as possible from the set of events found in all maximum parsimony reconciliations for the given input.

More formally, DTL-RnB seeks to solve the following k-Reconciliations Cover Problem:
Input: A pair of phylogenetic trees, a mapping between the tips of the trees, DTL event costs, a scoring function that assigns a non-negative score to each event found in any maximum parsimony reconciliation, and a parameter k.
Output: A set of k maximum parsimony reconciliations that maximize the sum of the event scores. (Note that an event score is only counted once in this sum, but the same event may still appear in more than of the k maximum parsimony reconciliations.)

Although we have shown that this problem is NP-complete, DTL-RnB uses a polynomial-time approximation algorithm that is guaranteed to find solutions that are within 1 - 1/e (approximately 0.632) times optimal. In other words, if the optimal solution could “collect” OPT points using k maximum parsimony reconciliations, our algorithm will collect at least 0.632 OPT points. The algorithm is based on an existing approximation algorithm for the Maximum Coverage Problem.

Scoring Functions

DTL-RnB currently supports three scoring functions, although many others are possible and can be added by modifying the available source code.

  • Unit: Each event in the reconciliation graph has a score of 1. Thus, the kRCP problem seeks to find k reconciliations that include as many events as possible in the space of MPRs.
  • Frequency: Each event in the reconciliation graph is scored by its frequency, namely the number of MPRs that contain that event divided by the total number of MPRs.
  • Region-Based: This function scores each event with its robustness with respect to perturbations in event costs using our Xscape tool. The underlying idea is that an event that occurs in many MPRs in a specified range of “nearby” event costs should have a high score and one that occurs in fewer MPRs in that range should have a lower score. Since event costs are unit-less, we assume that the cost of duplication is normalized to 1 and the cost of transfer and loss are relative to this normalized duplication cost. Given an original set of DTL event costs (1, T, L), we first compute the space of MPRs using xScape. Next, given two parameters x and y, we compute the regions for transfer costs ranging from T - x to T + x and for loss costs ranging from L - y to L+y. For each such region, we compute the set of events that arise in the MPRs in that region. Then, for each event, its score is the total number of regions that contain that event. Finally, we normalize scores to be between 0 and 1. Ultimately, a score of 1 indicates that this event occurs in at least one MPR in every region in the given range whereas a score close to 0 indicates that the event occurs in relatively few other regions. Note that this scoring function assigns a strictly positive cost to every event since every event occurs in some reconciliation in the region corresponding to the original event costs.

Sample Files

Making the Images Bigger

Reconciliations are displayed in .svg format. In large trees, it may be hard to see the details (e.g., event types, scores, and taxa names). The images can be downloaded (e.g., by using CTRL and clicking on the image) and then viewed in an image viewer where they can be magnified. We recommend using ImageMagick to first convert the image into jpeg format.

Acknowledgements

The authors thank Yi-Chieh Wu for her generous assistance and advice on numerous aspects of this project, Mukul Bansal for useful conversations, and Matthew D. Rasmussen for the development of the vistrans software that is used in DTL-RnB.

This work was funded by the National Science Foundation under Grant Number IIS-1419739. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Questions and Comments

Please address questions and comments to Prof. Ran Libeskind-Hadas at hadas@cs.hmc.edu.