Jane File Formats

This page describes the file formats supported by Jane. When you click on a file, it will render in your browser without whitespace separation. To see the file more clearly, either view the page source in your browser or download the contents. The examples provided here are small synthetic ones intended only to illustrate the formats. If you are interested in trying Jane with some real biological problem instances, you can find them here.

Format 1: .nex files

Here are a few synthetic example nexus files:
Un-timed
Timed
A nexus file must begin with the comment #NEXUS, followed by a series of blocks. A block is of the form
begin blockname;
internal data
endblock;
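As an illustration of the block layout above, here is a minimal Python sketch that splits a nexus-style file into its blocks. It is not Jane's own parser. The function name split_blocks is hypothetical, and the sketch assumes that block names are single words and that keywords are matched case-insensitively.

```python
import re

def split_blocks(text):
    """Return {blockname: internal data} for a file laid out as described above."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip().upper() != "#NEXUS":
        raise ValueError("file must begin with #NEXUS")
    body = "\n".join(lines[1:])
    # Assumption: block names are single words; matching ignores case.
    pattern = re.compile(
        r"begin\s+(\w+)\s*;(.*?)endblock\s*;",
        re.IGNORECASE | re.DOTALL,
    )
    return {name.lower(): data.strip() for name, data in pattern.findall(body)}


sample = """#NEXUS
begin host;
tree host = (a,b);
endblock;
begin parasite;
tree parasite = (x,y);
endblock;
"""
blocks = split_blocks(sample)
```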
We expect three blocks:

Host/Parasite

The Host and Parasite blocks should each contain a single line,
tree host = tree;
tree parasite = tree;
respectively, where tree is defined by the following grammar:
T → (T,T)
T → Species Name
To indicate time zone information for a vertex in the tree, add [time zone] or [time zone,time zone] after the corresponding T. If time zone information is included anywhere, it has to be included everywhere, as you see in the timed example.
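The grammar above lends itself to a small recursive-descent parser. The following Python sketch is one possible reading of it, not Jane's implementation. It assumes that time zones are integers, that species names contain none of the characters ( ) , [ ] ; and that a trailing semicolon is allowed. The names parse_tree and zones_consistent, and the dictionary layout of the result, are hypothetical.

```python
def parse_tree(text):
    """Parse T -> (T,T) | name, each optionally followed by [z] or [z1,z2]."""
    pos = 0

    def peek():
        # Skip whitespace, then return the next character ("" at end of input).
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1
        return text[pos] if pos < len(text) else ""

    def expect(ch):
        nonlocal pos
        if peek() != ch:
            raise ValueError(f"expected {ch!r} at offset {pos}")
        pos += 1

    def node():
        nonlocal pos
        if peek() == "(":
            expect("(")
            left = node()
            expect(",")
            right = node()
            expect(")")
            result = {"children": [left, right]}
        else:
            start = pos
            while pos < len(text) and text[pos] not in "(),[];":
                pos += 1
            name = text[start:pos].strip()
            if not name:
                raise ValueError(f"expected a species name at offset {start}")
            result = {"name": name}
        if peek() == "[":
            # Assumption: time zones are written as integers.
            end = text.index("]", pos)
            result["zones"] = [int(z) for z in text[pos + 1:end].split(",")]
            pos = end + 1
        return result

    tree = node()
    if peek() not in ("", ";"):
        raise ValueError(f"unexpected text at offset {pos}")
    return tree


def zones_consistent(tree):
    """True if either every vertex carries time zones or none does."""
    flags, stack = set(), [tree]
    while stack:
        vertex = stack.pop()
        flags.add("zones" in vertex)
        stack.extend(vertex.get("children", []))
    return len(flags) == 1
```

For example, parse_tree("(a[2],b[2])[1]") yields a root with zones [1] and two tips with zones [2], while zones_consistent rejects a tree such as (a[2],b) in which only some vertices are timed.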

Distribution

The distribution block should contain a line beginning with Range, followed by a list of pairs of parasite:host.
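A short Python sketch of reading the Range line follows. The description above does not say how the pairs are separated, so this sketch assumes commas between pairs and an optional trailing semicolon. The function name parse_range is hypothetical.

```python
def parse_range(line):
    """Return a list of (parasite, host) pairs from a line beginning with Range."""
    body = line.strip()
    if not body.lower().startswith("range"):
        raise ValueError("line must begin with Range")
    # Assumption: pairs are comma-separated and the line may end with ';'.
    body = body[len("range"):].strip().rstrip(";")
    pairs = []
    for item in body.split(","):
        item = item.strip()
        if not item:
            continue
        parasite, host = (part.strip() for part in item.split(":", 1))
        pairs.append((parasite, host))
    return pairs


pairs = parse_range("Range p1:h1, p2:h2;")
```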

Format 2: .tree files

Here are some synthetic example tree files:
Un-timed
Timed
Regioned
A tree file must consist of a series of blocks: HOSTTREE, HOSTNAMES, PARASITETREE, PARASITENAMES, PHI, HOSTRANKS, PARASITERANKS, and optionally HOSTREGIONS and REGIONCOSTS, in that order.
HOSTTREE and PARASITETREE should consist of a series of entries, one per line, one for each vertex of the tree, of the form
vertex child1 child2
for internal vertices, or
vertex null null
for tips.
Every vertex here must be identified by a number.
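The entry format above can be read into a simple child table. This Python sketch assumes that the fields on each line are separated by whitespace and that vertex numbers are integers. The function name parse_tree_block is hypothetical.

```python
def parse_tree_block(lines):
    """Map each vertex number to (child1, child2), or to None for a tip."""
    children = {}
    for line in lines:
        if not line.strip():
            continue
        vertex, child1, child2 = line.split()
        if (child1 == "null") != (child2 == "null"):
            raise ValueError(f"vertex {vertex} must have two children or none")
        if child1 == "null":
            children[int(vertex)] = None          # tip
        else:
            children[int(vertex)] = (int(child1), int(child2))
    return children


host_tree = parse_tree_block(["1 2 3", "2 null null", "3 null null"])
tips = sorted(v for v, kids in host_tree.items() if kids is None)
```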

HOSTNAMES and PARASITENAMES should be a series of lines, each listing the host's or parasite's number, a tab, then a human-readable name for that host or parasite.

PHI should be a list of entries, one for each parasite, each giving a host number followed by a parasite number. An entry indicates that the parasite infects that host.
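To show how the name blocks and PHI fit together, here is a Python sketch that reads them and reports each association by name. It assumes one entry per line, whitespace between the two numbers in PHI, and a single host per parasite. The function names parse_names and parse_phi are hypothetical.

```python
def parse_names(lines):
    """Map a host or parasite number to its human-readable name."""
    names = {}
    for line in lines:
        if not line.strip():
            continue
        number, name = line.rstrip("\n").split("\t", 1)
        names[int(number)] = name.strip()
    return names


def parse_phi(lines):
    """Map each parasite number to the number of the host it infects."""
    host_of = {}
    for line in lines:
        if not line.strip():
            continue
        # Entry order follows the description: host number, then parasite number.
        host, parasite = (int(field) for field in line.split())
        host_of[parasite] = host
    return host_of


host_names = parse_names(["2\tHost A", "3\tHost B"])
parasite_names = parse_names(["5\tParasite X", "6\tParasite Y"])
host_of = parse_phi(["2 5", "3 6"])
associations = [
    (parasite_names[p], host_names[h]) for p, h in sorted(host_of.items())
]
```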

HOSTRANKS and PARASITERANKS should consist of lines of the form vertex time zone, where both values are numbers.

HOSTREGIONS should be like HOSTRANKS or PARASITERANKS, but with region numbers instead of time zone numbers.

REGIONCOSTS should be a list of triples indicating the region from which a switch occurs, the region to which the switch occurs, and the additional cost of such a switch. This list may be incomplete, and missing entries will be assumed to be zero.
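The zero default for missing entries can be sketched in Python as a dictionary lookup. This assumes one triple per line, whitespace-separated, with integer costs. The names parse_region_costs and switch_cost are hypothetical.

```python
def parse_region_costs(lines):
    """Map (from_region, to_region) to the additional cost of a switch."""
    costs = {}
    for line in lines:
        if not line.strip():
            continue
        source, target, cost = (int(field) for field in line.split())
        costs[(source, target)] = cost
    return costs


def switch_cost(costs, source, target):
    """Additional cost of a switch; pairs not listed are assumed to cost zero."""
    return costs.get((source, target), 0)


costs = parse_region_costs(["1 2 3", "2 1 5"])
listed = switch_cost(costs, 1, 2)
unlisted = switch_cost(costs, 2, 2)
```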
