-
Notifications
You must be signed in to change notification settings - Fork 823
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
[TRIS-722] Custom Tagging #2144
base: master
Are you sure you want to change the base?
Conversation
2810093
to
0d45ad5
Compare
0d45ad5
to
4c5cebf
Compare
TestingSetup: Environment variable:
Step functions create command:
Flow:
Output/Result: Additional notes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
notes for testing, the config variable
METAFLOW_BATCH_EMIT_TAGS=True
needs to be set in order for tags to have any effect (even validation).
For a better UX, there could be a check somewhere in the path that if aws tags are being set, but emit tags is set to false, we print a warning on this.
Validation could also maybe be run despite the emit_tags feature flag, in order to catch issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to add similar logic that is present in f.ex. the kubernetes_decorator.py#runtime_step_cli
which json.dumps some attributes to the cli_args.commands
, namely the new aws_tags
in this case.
otherwise the passed in value will be not handled correctly in the CLI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for example replacing the cli_args.commands.update(self.attributes)
with the following should work:
# dictionaries need to be json dumped for the CLI
json_types = ["aws_tags"]
for k, v in self.attributes.items():
if k in json_types:
cli_args.command[k] = json.dumps(v)
else:
cli_args.command_options[k] = v
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also for the batch_cli.py
side, the --aws_tags
option needs to be
@click.option("--aws-tags", multiple=False, type=JSONTypeClass(), default=None, help="AWS tags.")
in order to convert types correctly. note the multiple=False also, was there a reason to have the plural option act as multiple?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We wanted it to be a multiple arg so --aws-tags can be used like: --aws-tags "hello=world" --aws-tags "something=else"
This makes it a bit easier to generate the command based on specs. But I'm open to changing if it makes it easier to setup.
metaflow/plugins/aws/batch/batch.py
Outdated
|
||
if not isinstance(BATCH_DEFAULT_TAGS, dict): | ||
raise BatchException( | ||
"The BATCH_DEFAULT_TAGS config option must be a dictionary of key-value tags." | ||
) | ||
|
||
for name, value in BATCH_DEFAULT_TAGS.items(): | ||
aws_tag = {'key': name, 'value': value} | ||
validate_aws_tag(aws_tag) | ||
job.tag(name, value) | ||
|
||
if step_function_tags is not None: | ||
aws_tags_list = [] | ||
for tag in step_function_tags: | ||
key_value = tag.split("=", 1) | ||
if len(key_value) == 2: | ||
aws_tags_list.append({ | ||
'key': key_value[0], | ||
'value': key_value[1] | ||
}) | ||
for tag in aws_tags_list: | ||
validate_aws_tag(tag) | ||
job.tag(tag['key'], tag['value']) | ||
|
||
if aws_tags is not None: | ||
if not isinstance(aws_tags, dict): | ||
raise BatchException("tags must be a dictionary of key-value tags.") | ||
decorator_aws_tags_list = [ | ||
{'key': key, | ||
'value': val} for key, val in aws_tags.items() | ||
] | ||
for tag in decorator_aws_tags_list: | ||
validate_aws_tag(tag) | ||
job.tag(tag['key'], tag['value']) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't test this part thoroughly, but there is one significant caveat with running validations in the batch setup phase, namely: when validation fails, the metadata for the run/step/task have already been registered in this case, leading to a failed task with no logs.
An alternative approach with a better UX would be to perform batch tag validation inside the batch_decorator.py#step_init
, which will happen before job submission. This allows for erroring out fairly quickly due to malformed tags.
Is there any issue with moving the validation and the BATCH_DEFAULT_TAGS
usage into the decorator completely?
also can you verify after the changes that the following flows run as expected? both with batch, and step-functions expected to fail from metaflow import step, FlowSpec, batch
class BatchTags(FlowSpec):
@batch(aws_tags={"inv@lid": "t@g"})
@step
def start(self):
print("Starting 👋")
self.next(self.end)
@step
def end(self):
print("Done! 🏁")
if __name__ == "__main__":
BatchTags() expected to run from metaflow import (
FlowSpec,
step,
batch,
)
custom_step_tags = {
"goodbye": "world",
"hello": "universe",
}
class BatchTags2(FlowSpec):
@step
def start(self):
self.next(self.hello)
@batch(aws_tags=custom_step_tags)
@step
def hello(self):
self.next(self.end)
@step
def end(self):
pass
if __name__ == "__main__":
BatchTags2() |
if self.attributes["aws_tags"] is not None: | ||
if not isinstance(self.attributes["aws_tags"], dict) and not all(isinstance(k, str) and isinstance(v, str) for k, v in self.attributes["aws_tags"].items()): | ||
raise BatchException("aws_tags must be Dict[str, str]") | ||
decorator_aws_tags_list = [ | ||
{'key': key, | ||
'value': val} for key, val in self.attributes["aws_tags"].items() | ||
] | ||
for tag in decorator_aws_tags_list: | ||
validate_aws_tag(tag) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move these under step_init
instead to get cleaner formatting for exceptions. Unless there is some restriction that requires them to be in __init__
from ..aws_utils import ( | ||
compute_resource_attributes, | ||
get_docker_registry, | ||
get_ec2_instance_metadata, | ||
) | ||
from .batch import BatchException | ||
from metaflow.tagging_util import validate_tags, validate_aws_tag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
validate_tags
is unused.
metaflow/plugins/aws/batch/batch.py
Outdated
for name, value in BATCH_DEFAULT_TAGS.items(): | ||
aws_tag = {'key': name, 'value': value} | ||
validate_aws_tag(aws_tag) | ||
job.tag(name, value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merging the default values from BATCH_DEFAULT_TAGS
could be moved into the batch_decorator. This way we have a single source to read tags from, and validation can be performed there as well.
metaflow/plugins/aws/batch/batch.py
Outdated
aws_tags=None, | ||
step_function_aws_tags=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need the step_function_aws_tags
? as a comparison, the way custom labels/annotations are implemented for argo workflows&kubernetes we made a choice to treat the kubernetes decorator as the single source of truth.
right now it seems that step-functions create --aws-tags "test=tag"
will only apply the tags to the batch jobs spun up by the state machine. There already exists a mechanism for applying tags to those without the new CLI option though, so keeping things simple might be appropriate at this point.
# tagging all steps
python test.py --with batch:aws_tags='{"test":"tag"}' step-functions create
# or set the environment variable
METAFLOWBATCH_DEFAULT_TAGS='{"envtag":"testvalue"}' python test.py step-functions create
# tagging a specific step, simply add a batch deco to the step before deploying
@batch(aws_tags={"tracking":"test_step"})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removing the step-function level --aws-tags
option would reduce possible confusion about the scope of the option. One might think that it would apply tags to the state machine as well, which at the moment is not the case.
My suggestion is that we should keep this PR scoped to tagging batch jobs and revisit the need to tag other AWS resources as required, unless you already have a need for that?
Candidates for tagging would include the state machine, and eventbridge rules, but deciding how to configure these on the Metaflow side is nontrivial
bdf3930
to
2040a7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened a stacked PR in your fork to address some of the low hanging fruit. zendesk#8
@@ -71,6 +73,9 @@ class BatchDecorator(StepDecorator): | |||
A swappiness value of 0 causes swapping not to happen unless absolutely | |||
necessary. A swappiness value of 100 causes pages to be swapped very | |||
aggressively. Accepted values are whole numbers between 0 and 100. | |||
aws_batch_tags: Dict[str, str], optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
calling this aws_batch_tags
seems a bit excessive given that we're inside the Batch decorator. Could it simply be aws_tags
as before, or was there some issue with that?
self.attributes["aws_batch_tags"] = BATCH_DEFAULT_TAGS | ||
|
||
if self.attributes["aws_batch_tags"] is not None: | ||
if not isinstance(self.attributes["aws_tags"], dict) and not all(isinstance(k, str) and isinstance(v, str) for k, v in self.attributes["aws_batch_tags"].items()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be "aws_batch_tags" in the check. now its throwing an error.
raise BatchException("aws_batch_tags must be Dict[str, str]") | ||
|
||
batch_default_tags_copy = BATCH_DEFAULT_TAGS.copy() | ||
self.attributes["aws_batch_tags"] = batch_default_tags_copy.update(self.attributes["aws_batch_tags"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is leading to the aws_batch_tags being None
in the default case and failing in the following dict comprehension. Can it be simplified into
self.attributes["aws_batch_tags"] = {
**BATCH_DEFAULT_TAGS,
**self.attributes["aws_batch_tags"]
}
pending ordering. Should env take precedence over decorator values, or the other way around?
batch_default_tags_copy = BATCH_DEFAULT_TAGS.copy() | ||
self.attributes["aws_batch_tags"] = batch_default_tags_copy.update(self.attributes["aws_batch_tags"]) | ||
|
||
decorator_aws_tags_list = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this conversion needed? would a dict not suffice?
@@ -179,6 +185,26 @@ def __init__(self, attributes=None, statically_defined=False): | |||
if self.attributes["trainium"] is not None: | |||
self.attributes["inferentia"] = self.attributes["trainium"] | |||
|
|||
if not isinstance(BATCH_DEFAULT_TAGS, dict) and not all(instance(k, str) and isinstance(v, str) for k, v in BATCH_DEFAULT_TAGS.items()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isinstance
, not instance
@@ -29,6 +30,12 @@ def is_utf8_decodable(x): | |||
# How long may an individual tag value be | |||
MAX_TAG_SIZE = 500 | |||
|
|||
def validate_aws_tag(tag): | |||
validate_tags([tag['key'], tag['value']]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
validate_tags
is for metaflow specific metadata tags validation. Some of the rules, like max tag set size probably do not apply to aws tags. Does it make sense to reuse the metaflow tag validation here, or can we do without?
It makes more sense to me that the AWS tag validation be kept completely separate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this the reason that the tags were turned into a more complex type?
ddcfefb
to
49793ff
Compare
This reverts commit 65fd888.
f3da850
to
b42e360
Compare
b42e360
to
1442ecf
Compare
…ing (RikishK: Edited, only taking the decorator changes from here )
a6faf9a
to
e791a6d
Compare
e791a6d
to
7b89b81
Compare
61ac55b
to
25eae34
Compare
25eae34
to
475e282
Compare
TestingFlow:Here we set the decorator level tags. NOTE: the invalid tag is commented out for this test
Step functions creation commandHere we set the step functions cli level tags
Env vars tagsHere we set the environment variables level tags
ExpectationsWe should see env vars tags and step functions cli tags being set on batch jobs. Batch jobs with decorator tags will apply their decorator tags last and override env vars and step functions cli tags. Output Batch JobsStart Step:Hello StepGoodbye stepEnd step |
Testing Invalid TagsTesting with the same flow as previous test but with the invalid tag enabled. Flow
Output of trying to deploy stepfunction:
|
Love this PR – we could really benefit from it. Is there anything we can do to help get it merged? ❤️ |
@@ -97,6 +98,12 @@ def step_functions(obj, name=None): | |||
"with the given tag. You can specify this option multiple " | |||
"times to attach multiple tags.", | |||
) | |||
@click.option( | |||
"--aws-batch-tags", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are to keep this, the flag itself should be singular in my opinion as it is used to set one tag at a time
--aws-batch-tag some=tag --aws-batch-tag another=tag
Options to add custom tags are in the order:
environment variable dictionary,
step functions cli create flag,
batch decorator variable dictionary
This PR builds ontop of this PR: #1628 - this original PR did not work when testing but I would like to credit it here as some of the changes are still used.