Feature request: least common ancestor for equally scoring haplogroups #31
Comments
Hi Stephen, sorry for my late response. It makes sense to implement this directly in Haplogrep. I already talked to Pete at ASHG about this feature. For haplocheck we already implemented a method to find the LCA of two haplogroups (major and minor). Nevertheless, if you have a code snippet ready, it would be great to have a look at it.
We've been doing this externally using some R code to traverse Phylotree, which was put into a JSON structure using some Python code Pete found. We'll try to clean this up and provide an example. cc @vpnagraj
@seppinho thanks for circling back to this! yeah we put together a method for calculating mito haplogroup lca in R. it's a bit involved:
you'll need a text file with all possible haplogroups: and a copy of the phylotree data in json format (not attached but there are parsers out there: https://github.com/munky69rock/phylotree-parser-python). i guess i should also note that i did see that you all have an xml version of the tree. some of the code above tries to parse an xml input, but i didn't have reliable success with that and moved forward with json. anyways, with the above you can run the lca:
hope this gives you an idea of what we have in mind. please let us know if there's anything else you need on our end.
Dear Dr. seppinho: I am trying to use the fantastic tool Haplogrep2 from your GitHub and I have some questions.
Thanks! Best regards,
The feature released in 497cf6c in response to #12 allows for the export of the best n haplogroups.
When running with microarray data using the `--chip` flag, there are often many equally top-ranked haplogroups with the same score, because the variants on the array are too low-resolution to distinguish between different leaves on the phylogeny.

We have implemented a post-hoc LCA approach where we first parse Phylotree to JSON, then traverse the tree given multiple haplogroups to find the LCA haplogroup node for all the supplied haplogroups.
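To make the idea concrete, here is a minimal Python sketch of that kind of traversal. It is not our R code and not anything in Haplogrep: it assumes the tree has already been parsed into nested JSON objects with hypothetical `name` and `children` keys, and the tree below is a simplified toy, not the real Phylotree topology. The function names are made up for illustration.

```python
import json

# Toy tree in an ASSUMED JSON layout ("name"/"children" keys are hypothetical).
# The topology is simplified for illustration and is not the real Phylotree.
toy_tree = json.loads("""
{"name": "R", "children": [
    {"name": "HV", "children": [
        {"name": "H", "children": [{"name": "H1"}, {"name": "H2"}]},
        {"name": "V"}
    ]},
    {"name": "U"}
]}
""")


def build_parent_map(root):
    """Map every haplogroup name to its parent's name (the root maps to None)."""
    parents = {root["name"]: None}
    stack = [root]
    while stack:  # iterative walk, so deep trees do not hit the recursion limit
        node = stack.pop()
        for child in node.get("children", []):
            parents[child["name"]] = node["name"]
            stack.append(child)
    return parents


def path_from_root(name, parents):
    """Return the list of names from the root down to `name`.

    Raises KeyError if `name` is not in the tree.
    """
    path = []
    while name is not None:
        path.append(name)
        name = parents[name]
    return path[::-1]


def lca(haplogroups, parents):
    """Return the deepest node shared by the root paths of all haplogroups.

    Returns None for an empty input list.
    """
    paths = [path_from_root(h, parents) for h in haplogroups]
    ancestor = None
    for level in zip(*paths):  # compare the paths one depth level at a time
        if len(set(level)) != 1:
            break
        ancestor = level[0]
    return ancestor


parents = build_parent_map(toy_tree)
print(lca(["H1", "H2"], parents))       # H
print(lca(["H1", "H2", "V"], parents))  # HV
print(lca(["H1", "U"], parents))        # R
```

In the `--chip` case the input list would be the set of equally top-ranked haplogroups for one sample, and the returned node would be reported in place of an arbitrary leaf.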
Having this functionality natively implemented could be useful. I could imagine this option (e.g. `--lca`) could be exclusive with the `--hits` option.

Feel free to close this issue if this feature request is beyond the scope of haplogrep itself. Thanks!