output taxid for certain rank or complete lineage #8
Comments
It's a good suggestion. But in some cases, e.g., viruses, there are not that many rank levels. There will be some missing/"unclassified xxx" ranks, which do not actually refer to any taxid. For example, for taxid
|
Even if there are not that many rank levels, the last rank is still species (see https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?name=1327037). |
Implemented in v0.2.4-dev3 and later versions. Note that lots of taxids share the same taxon name, like "diastema", "solieria" and "environmental samples", so it's not just a matter of mapping a taxon name to a taxid. I use both the taxon name and the name of its parent taxon to get the right taxid. This also brings more accurate results when using flag PS: I spent 7 hours 😫
|
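The name-plus-parent lookup described above can be sketched roughly as follows. This is only an illustration of the idea, not taxonkit's actual code: the toy taxonomy, all taxids, the parent names, and the helpers `build_index` and `lineage_to_taxids` are invented for the example.

```python
from typing import Dict, List, Optional, Tuple

# Toy taxonomy: taxid -> (taxon name, parent taxid).
# All ids and the "parent A"/"parent B" names are made up for illustration.
NODES: Dict[int, Tuple[str, int]] = {
    1: ("root", 1),
    10: ("parent A", 1),
    20: ("parent B", 1),
    11: ("environmental samples", 10),
    21: ("environmental samples", 20),
}


def build_index(nodes: Dict[int, Tuple[str, int]]) -> Dict[Tuple[str, str], int]:
    """Map (taxon name, parent taxon name) to taxid."""
    index: Dict[Tuple[str, str], int] = {}
    for taxid, (name, parent) in nodes.items():
        parent_name = nodes[parent][0]
        index[(name, parent_name)] = taxid
    return index


def lineage_to_taxids(
    lineage: List[str], index: Dict[Tuple[str, str], int]
) -> List[Optional[int]]:
    """Resolve each name in a root-to-leaf lineage using its parent's name.

    Names that cannot be resolved (e.g. placeholder ranks) yield None.
    """
    taxids: List[Optional[int]] = []
    parent_name = "root"
    for name in lineage:
        taxids.append(index.get((name, parent_name)))
        parent_name = name
    return taxids


INDEX = build_index(NODES)
print(lineage_to_taxids(["parent A", "environmental samples"], INDEX))
print(lineage_to_taxids(["parent B", "environmental samples"], INDEX))
```

Keying on the pair rather than the name alone keeps the two "environmental samples" entries apart, which a plain name-to-taxid map could not do.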
Thanks for your great work. I suggest taxid |
Thanks for your suggestions. I was a fool. I'll improve it soon. |
The code has been optimized for speed and memory usage. You may redownload the binaries And
|
@tolot27 Tell me if you have more suggestions or feedback; I would like to release a new version soon. |
Two points I thought about: First, with the parameter |
For the first issue, you may use a special symbol and remove it afterwards. Subspecies is supported using placeholder |
I know, but that was not my point. I was talking about a column containing the taxid of the subspecies if one is defined, or the taxid of the species if no subspecies is defined. Currently, the subspecies taxid column is empty if no subspecies is defined, which is indeed correct. But a merged column is required, for instance, for KronaTools. But that's probably out of scope for taxonkit. |
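For illustration, the merged column described here could be produced downstream with a few lines of Python. This is a sketch under assumptions: the input is taken to be tab-separated with hypothetical columns (query, species taxid, subspecies taxid), the taxids are made up, and `merged_taxid` and `add_merged_column` are not part of taxonkit.

```python
from typing import Iterable, List


def merged_taxid(species_taxid: str, subspecies_taxid: str) -> str:
    """Return the subspecies taxid if present, otherwise the species taxid."""
    return subspecies_taxid.strip() or species_taxid.strip()


def add_merged_column(lines: Iterable[str]) -> List[str]:
    """Append a merged taxid column to each tab-separated line.

    Assumed column layout (hypothetical): query, species taxid, subspecies taxid.
    """
    out: List[str] = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        query, species, subspecies = fields[0], fields[1], fields[2]
        out.append("\t".join([query, species, subspecies, merged_taxid(species, subspecies)]))
    return out


# Made-up rows: the second one has no subspecies defined.
ROWS = ["q1\t100\t101", "q2\t200\t"]
for row in add_merged_column(ROWS):
    print(row)
```

The empty subspecies field is kept as-is, so the original columns stay correct and only the extra column carries the fallback.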
Yes, that can be done using |
v0.2.4 is out. You can set prefix "unclassified" by flag |
Since no "deepest" taxid of supported/requested ranks exists, you can close this issue. |
Like the format string in reformat, it would be interesting to have placeholders for the taxids of the available ranks. Most useful for downstream analyses and visualization, i.e. of metagenomic data, would be the taxids of the species and subspecies ranks. Often, taxonomic classifiers randomly or incorrectly choose a certain strain, or many different strains, of the same species/subspecies. That clutters the output.
Such a placeholder could easily be used to filter the dataset during visualization rather than during processing.