to_nexus() does not emit compliant Nexus #1785

Closed
jeetsukumaran opened this issue Oct 13, 2021 · 4 comments · Fixed by #1836
Labels
bug (Something isn't working), Python API (Issue is about the Python API)

Comments

@jeetsukumaran
Contributor

As per the above, the Nexus string emitted by tskit is not standard Nexus and cannot be parsed by most software.

There are two different problems.

(1) The TAXA block requires a DIMENSIONS NTAX=## statement.

(2) The list of taxon labels should be separated by SPACES, NOT by commas.

The TreeSequence.to_nexus() method currently emits:

#NEXUS
BEGIN TAXA;
TAXLABELS tsk_0_1,tsk_1_1,tsk_2_1,tsk_3_1,tsk_4_0,tsk_5_0,tsk_6_0;
END;
BEGIN TREES;
	TREE tree0.00000000000000_100.00000000000000 = ((tsk_1_1:0.07086621911734,tsk_2_1:0.07086621911734)tsk_4_0:4.21281381972469,(tsk_0_1:1.20610709140453,tsk_3_1:1.20610709140453)tsk_5_0:3.07757294743750)tsk_6_0;
END;

Multiple programs fail to parse this, including PAUP, BEAST, and, in fact, most phylogenetic programs and processing libraries.

Adding a DIMENSIONS statement and separating the taxon labels using spaces makes this not only compliant with standard Nexus, but also readable by most Nexus-supporting phylogenetic programs (whatever else their idiosyncrasies! :) )

#NEXUS
BEGIN TAXA;
DIMENSIONS NTAX=7;
TAXLABELS tsk_0_1 tsk_1_1 tsk_2_1 tsk_3_1 tsk_4_0 tsk_5_0 tsk_6_0;
END;
BEGIN TREES;
	TREE tree0.00000000000000_100.00000000000000 = ((tsk_1_1:0.07086621911734,tsk_2_1:0.07086621911734)tsk_4_0:4.21281381972469,(tsk_0_1:1.20610709140453,tsk_3_1:1.20610709140453)tsk_5_0:3.07757294743750)tsk_6_0;
END;
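
The two changes above can be sketched in a few lines of Python. This is only an illustration of the TAXA block layout described in this report, not tskit's implementation: `format_taxa_block` is a hypothetical helper, and it assumes the labels contain no whitespace or Nexus punctuation (such labels would need quoting).

```python
def format_taxa_block(labels):
    """Build a TAXA block with a DIMENSIONS statement and space-separated labels.

    Hypothetical helper for illustration; assumes labels need no quoting.
    """
    lines = [
        "BEGIN TAXA;",
        f"DIMENSIONS NTAX={len(labels)};",  # fix (1): declare the taxon count
        "TAXLABELS " + " ".join(labels) + ";",  # fix (2): spaces, not commas
        "END;",
    ]
    return "\n".join(lines)


labels = ["tsk_0_1", "tsk_1_1", "tsk_2_1", "tsk_3_1", "tsk_4_0", "tsk_5_0", "tsk_6_0"]
print("#NEXUS")
print(format_taxa_block(labels))
```

Deriving NTAX from `len(labels)` keeps the declared count and the label list from drifting apart.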
@benjeffery
Member

@jeetsukumaran Thanks for this excellent issue report.

Would you like to submit a PR to fix these issues? Contributions to tskit are recognised by authorship on the upcoming tskit paper.

@benjeffery added the bug and Python API labels Oct 13, 2021
@benjeffery added this to the Python upcoming milestone Oct 13, 2021
@jeromekelleher
Member

(Not directly involved here, but related is #1671)

@jeetsukumaran
Contributor Author

> @jeetsukumaran Thanks for this excellent issue report.
>
> Would you like to submit a PR to fix these issues? Contributions to tskit are recognised by authorship on the upcoming tskit paper.

That sounds great! I would love to contribute! I'll update you (and this issue) when I make progress

@jeromekelleher
Member

The basic issue was fixed in #1835. This issue can be closed when we add dendropy based tests of the functionality to confirm.
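
Until such tests exist, the two problems from the original report can be checked with a small dependency-free sketch. This is a stand-in, not the dendropy-based test proposed here and not a Nexus parser: `check_taxa_block` is a hypothetical helper that only inspects the TAXA block with regular expressions and assumes unquoted labels.

```python
import re


def check_taxa_block(nexus):
    """Return a list of problems found in the TAXA block (empty if none).

    Hypothetical helper; checks only the two issues raised in this thread.
    """
    taxa = re.search(
        r"BEGIN\s+TAXA\s*;(.*?)\bEND\s*;", nexus, re.IGNORECASE | re.DOTALL
    )
    if taxa is None:
        return ["no TAXA block found"]
    body = taxa.group(1)
    problems = []
    ntax = re.search(r"DIMENSIONS\s+NTAX\s*=\s*(\d+)\s*;", body, re.IGNORECASE)
    labels = re.search(r"TAXLABELS\s+(.*?);", body, re.IGNORECASE | re.DOTALL)
    if ntax is None:
        problems.append("missing DIMENSIONS NTAX statement")
    if labels is None:
        problems.append("missing TAXLABELS statement")
        return problems
    if "," in labels.group(1):
        problems.append("taxon labels are comma-separated")
    elif ntax is not None and int(ntax.group(1)) != len(labels.group(1).split()):
        problems.append("NTAX does not match the number of labels")
    return problems


old_style = "#NEXUS\nBEGIN TAXA;\nTAXLABELS tsk_0_1,tsk_1_1;\nEND;\n"
new_style = "#NEXUS\nBEGIN TAXA;\nDIMENSIONS NTAX=2;\nTAXLABELS tsk_0_1 tsk_1_1;\nEND;\n"
print(check_taxa_block(old_style))
print(check_taxa_block(new_style))
```

A real parser such as dendropy remains the stronger check, since it also exercises the TREES block.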

jeromekelleher added a commit to jeromekelleher/tskit that referenced this issue Oct 22, 2021
Closes tskit-dev#1785

Add dendropy to various requirements lists
Also refactor the tests to use pytest a little better
3 participants