Missing trait literals are converted to lower case in augur export stderr output #1584

Open
corneliusroemer opened this issue Aug 17, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@corneliusroemer
Member

Current Behavior

augur export converts trait values (literals) to lower case in its stderr output, which is probably not what we want.

Expected behavior

Literals are output exactly as they appear in the input, with their case preserved.

How to reproduce

Steps to reproduce the current behavior:

  1. Include some extra literals in a colors file, including upper case in non-initial position, e.g. 21L
  2. Run augur export with --colors
  3. Observe log message
$ augur export v2 \
    --tree builds/wuhan/tree.nwk \
    --metadata builds/wuhan/metadata_with_bloom_scores.tsv \
    --node-data builds/wuhan/branch_lengths.json builds/wuhan/muts.json builds/wuhan/clades_display.json builds/wuhan/clades.json builds/wuhan/clades_nextstrain.json builds/wuhan/clades_who.json builds/wuhan/internal_pango.json \
    --colors builds/wuhan/colors.tsv \
    --auspice-config profiles/clades/wuhan/auspice_config.json \
    --title 'SARS-CoV-2 phylogeny' \
    --description profiles/clades/description.md \
    --include-root-sequence-inline \
    --minify-json \
    --output auspice/wuhan/auspice_raw.json

Validating schema of 'builds/wuhan/muts.json'...
Validating config file profiles/clades/wuhan/auspice_config.json against the JSON schema
Validating schema of 'profiles/clades/wuhan/auspice_config.json'...
WARNING: Requested color-by field 'placement_priors' does not exist and will not be used as a coloring or exported.

WARNING: These values for trait clade_membership were not specified in the colors file you provided:
        21k, 21f, 20f, 20h, 20b, 21m, 19a, 20i, 20c, recombinant, 21e, 21g, 20g, 20a, 21j, 20e, 20j, 21d, 21i, 21a, 20d, 19b, 21b, 21c, 21h.
        Auspice will create colors for them.
WARNING: These values for trait clade_who were not specified in the colors file you provided:
        recombinant.
        Auspice will create colors for them.

WARNING: These values for trait clade_nextstrain were not specified in the colors file you provided:
        21k, 21f, 20f, 20h, 20b, 21m, 19a, 20i, 20c, recombinant, 21e, 21g, 20g, 20a, 21j, 20e, 20j, 21d, 21i, 21a, 20d, 19b, 21b, 21c, 21h.
        Auspice will create colors for them.

Validating produced JSON
Validating schema of 'auspice/wuhan/auspice_raw.json'...
Validating that the JSON is internally consistent...
        WARNING:  The filter "new_node" does not appear as a property on any tree nodes.
Validation of 'auspice/wuhan/auspice_raw.json' succeeded, but there were warnings you may want to resolve.

Note this line:

 21k, 21f, 20f, 20h, 20b, 21m, 19a, 20i, 20c, recombinant, 21e, 21g, 20g, 20a, 21j, 20e, 20j, 21d, 21i, 21a, 20d, 19b, 21b, 21c, 21h.

The input colors were 21K, not 21k (the original issue includes a screenshot here).
corneliusroemer added the bug label on Aug 17, 2024
@jameshadfield
Member

Here's the code behind this. The erroneous console output is a side effect of the matching being done in lower case:

augur/augur/export_v2.py

Lines 330 to 344 in 988380c

elif key.lower() in provided_colors:
    # `provided_colors` typically originates from a colors.tsv file
    scale = []
    trait_values = {str(val).lower(): val for val in get_values_across_nodes(node_attrs, key)}
    trait_values_unseen = {k for k in trait_values}
    for provided_key, provided_color in provided_colors[key.lower()]:
        if provided_key.lower() in trait_values:
            scale.append([trait_values[provided_key.lower()], provided_color])
            trait_values_unseen.discard(provided_key.lower())
    if len(scale):
        coloring["scale"] = scale
        if len(trait_values_unseen):
            warn(f"These values for trait {key} were not specified in the colors file you provided:\n\t{', '.join(trait_values_unseen)}.\n\tAuspice will create colors for them.")
        return coloring
    warn(f"You've supplied a colors file with information for {key} but none of the values found on the tree had associated colors. Auspice will generate its own color scale for this trait.")
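
One possible way to keep the case-insensitive matching while reporting the original spellings is sketched below. This is a standalone illustration, not a patch to augur: `match_colors` is a hypothetical helper, and its inputs stand in for the result of `get_values_across_nodes(node_attrs, key)` and for `provided_colors[key.lower()]`. The idea is to track unseen values by their lower-cased key, as the quoted code does, but to map each key back through the lookup dict before building the warning text.

```python
def match_colors(tree_values, provided_pairs):
    """Hypothetical helper (not part of augur).

    tree_values: values of one trait found on the tree.
    provided_pairs: (value, color) pairs, e.g. from a colors.tsv file.
    Returns (scale, missing), where `missing` keeps the original case.
    """
    # Lower-cased value -> original value, as in the quoted code.
    by_lower = {str(val).lower(): val for val in tree_values}
    unseen = set(by_lower)
    scale = []
    for provided_key, provided_color in provided_pairs:
        lowered = provided_key.lower()
        if lowered in by_lower:
            scale.append([by_lower[lowered], provided_color])
            unseen.discard(lowered)
    # Map the unseen keys back to their original spelling before reporting.
    # Sorting is an added choice here so the message is stable between runs.
    missing = sorted(str(by_lower[k]) for k in unseen)
    return scale, missing


scale, missing = match_colors(
    ["21K", "21L", "recombinant"],  # values on the tree
    [("21l", "#aaaaaa")],           # colors file entry with different case
)
print(scale)    # [['21L', '#aaaaaa']]
print(missing)  # ['21K', 'recombinant']
```

With this approach the warning would list `21K` rather than `21k`, while a colors file entry such as `21l` would still match `21L` on the tree.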
