This issue concerns the user-provided proteins feature.
I am trying to use bakta to annotate a phage genome with a protein file predicted by Phanotate. I expected an annotation for every protein in my input file, but it seems that overlapping proteins are being filtered out by bakta.
I would like to deactivate the overlap detection so that bakta does not filter out the previously predicted proteins that I am using as input.
I parse my input gbk and pass the proteins to bakta, but I get this output. The protein for WARQSXNU_10 is missing, probably because of the overlap in the genome.
gene complement(40007..40405)
/locus_tag="MKOBIG_00315"
CDS complement(40007..40405)
/db_xref="SO:0001217"
/db_xref="UniRef:UniRef50_W7P0V4"
/db_xref="UniRef:UniRef90_A0A1B1W263"
/db_xref="UserProtein:WARQSXNU_11"
/product="hypothetical protein"
/locus_tag="MKOBIG_00315"
/protein_id="gnl|Bakta|MKOBIG_00315"
/translation="MPAKLRGVRKAVERTSQIVDEIIATKAVRALKSATYIIRTESATL
TPIDTSTLINSQFDTVEVSGTRITGKVGYSAKYALYVHNASGKLAGKPRSNGNGTYWSP
GGEPQFLTKAAQRTKDLVDGVIKKEMKL"
/codon_start=1
/transl_table=11
/inference="ab initio prediction:Prodigal:2.6"
/inference="similar to AA
sequence:UniRef:UniRef90_A0A1B1W263"
gene complement(40743..41135)
/locus_tag="MKOBIG_00320"
CDS complement(40743..41135)
/db_xref="SO:0001217"
/db_xref="UniRef:UniRef50_A0A173GBZ4"
/db_xref="UniRef:UniRef90_A0A1B1W265"
/db_xref="UserProtein:WARQSXNU_9"
/product="hypothetical protein"
/locus_tag="MKOBIG_00320"
/protein_id="gnl|Bakta|MKOBIG_00320"
/translation="MAAPTPEELVSQMASRGMTITTTDASGILCLVASISECLELNYPN
DECRQNAIMLWASILISANTAGRYVTSQSAPSGASQSFAYGSKPWVALYNQMKLLDSAG
CTGDLVEDPDGSGKPWFAVVRGSKCK"
/codon_start=1
/transl_table=11
/inference="ab initio prediction:Prodigal:2.6"
/inference="similar to AA
sequence:UniRef:UniRef90_A0A1B1W265"
I am currently running bakta with this line within a Docker container:
bakta --db $bakta_db/ --protein $faa_input_bakta --skip-trna --skip-tmrna --skip-rrna --skip-ncrna --skip-ncrna-region --skip-crispr --skip-pseudo --skip-gap --skip-ori --skip-plot --output ${assembly_input_bakta.simpleName}_bakta/ --threads ${params.threads} $assembly_input_bakta
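To check which input proteins were dropped, something like the following could compare the IDs in the protein FASTA against the `UserProtein` cross-references in Bakta's GenBank output. This is only a rough sketch: the function name `missing_user_proteins` and the file names are hypothetical, and it assumes that the first whitespace-delimited token of each FASTA header is the ID that ends up in `/db_xref="UserProtein:..."`.

```shell
# Sketch (assumption): list FASTA IDs that have no UserProtein db_xref
# in the annotated GenBank file. Names and paths are hypothetical.
missing_user_proteins() {
  faa=$1    # user-provided protein FASTA
  gbff=$2   # GenBank file written by Bakta
  tmp=$(mktemp -d)

  # Expected IDs: first token of every FASTA header line.
  grep '^>' "$faa" | sed 's/^>//' | awk '{print $1}' \
    | LC_ALL=C sort -u > "$tmp/expected"

  # Annotated IDs: everything after "UserProtein:" up to the closing quote.
  grep -o 'UserProtein:[^"]*' "$gbff" | sed 's/^UserProtein://' \
    | LC_ALL=C sort -u > "$tmp/annotated"

  # Lines only in the first file are the IDs missing from the output.
  LC_ALL=C comm -23 "$tmp/expected" "$tmp/annotated"
  rm -rf "$tmp"
}
```

Calling `missing_user_proteins proteins.faa sample.gbff` (both file names are placeholders) prints one missing ID per line, which makes it easy to see whether only the overlapping proteins are affected.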
Hi, thanks for reaching out. To make sure that I correctly understand what you're ultimately trying to achieve: you would like to annotate a phage genome sequence with Bakta using a user-provided proteins file with functional annotations from Phanotate? Is this correct?
Hmm, in principle, you can do this. However, Bakta was designed to annotate bacterial genomes, hence the overlap filters. I could add an option to deactivate all overlap filters in the next release, but I cannot make any promises as to when this will be. Meanwhile, you could try pharokka?
Hey @Daniel-Tichy, I just added a new --skip-filter option to Bakta, which is now available in the main branch and will be released with the upcoming v1.10.0 soon.
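For reference, the command from the original post could be adapted roughly as follows once a version with --skip-filter is installed. This is a sketch rather than a tested recipe: the wrapper function `run_bakta_unfiltered`, its argument order, and the `DRY_RUN` switch are hypothetical conveniences, and the paths are placeholders to be replaced with your own.

```shell
# Sketch (assumption): the original invocation plus --skip-filter.
# Arguments: db dir, proteins FASTA, assembly FASTA, output dir, threads.
run_bakta_unfiltered() {
  db=$1; proteins=$2; assembly=$3; outdir=$4; threads=${5:-4}

  # Build the argument list; --skip-filter disables the overlap filters.
  set -- bakta --db "$db" --proteins "$proteins" --skip-filter \
    --skip-trna --skip-tmrna --skip-rrna --skip-ncrna --skip-ncrna-region \
    --skip-crispr --skip-pseudo --skip-gap --skip-ori --skip-plot \
    --output "$outdir" --threads "$threads" "$assembly"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Only print the command so it can be inspected first.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

Setting `DRY_RUN=1` before calling the function prints the assembled command instead of running it, which is handy inside a container or workflow where a failed run is expensive.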
I hope this fits your needs in this case. I'll close this for now. If there are any further comments, ideas, or suggestions, please do not hesitate to re-open this (or open a new one). Thanks again and best regards!