added some flexibility to create your custom benchmark splits #307
Conversation
Here are my comments:
This is an API change for users; ideally it would be best to avoid that. Also, the method's behavior is quite different depending on whether task_splits is provided or not, which is a bit confusing IMO. Why not implement a new explicit method, like subset_from_custom_splits() or subset_from_task_list()? This would solve both problems.
Otherwise LGTM. I'll let you decide whether to keep it like this or not @optimass :)
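
For concreteness, here is a self-contained sketch of what such a separate explicit method could look like, using a toy Benchmark dataclass. The task_name column, the dataclass layout, and the subsetting logic are all assumptions for illustration, not the project's actual API:

```python
from dataclasses import dataclass, replace

import pandas as pd


@dataclass
class Benchmark:
    """Toy stand-in for the real benchmark class, only to illustrate the API shape."""

    name: str
    task_metadata: pd.DataFrame  # assumed to contain a "task_name" column

    def subset_from_task_list(
        self, task_names: list[str], benchmark_name_suffix: str = "custom"
    ) -> "Benchmark":
        """Return a new benchmark restricted to an explicit list of task names,
        leaving subset_from_split untouched."""
        unknown = set(task_names) - set(self.task_metadata["task_name"])
        if unknown:
            raise ValueError(f"Unknown task names: {sorted(unknown)}")
        subset = self.task_metadata[self.task_metadata["task_name"].isin(task_names)]
        return replace(
            self,
            name=f"{self.name}_{benchmark_name_suffix}",
            task_metadata=subset.reset_index(drop=True),
        )


# Example usage with made-up task names:
bench = Benchmark("toy_bench", pd.DataFrame({"task_name": ["task_a", "task_b", "task_c"]}))
custom = bench.subset_from_task_list(["task_a", "task_c"], benchmark_name_suffix="my_split")
```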
```python
if split_column not in self.task_metadata.columns:
    raise NotImplementedError(
        f"This benchmark does not provide default train/valid/test splits (missing a {repr(split_column)} column in task metadata)"
    )

def subset_from_split(
```
It seems like it would be a much better fit to have two separate functions instead of one function with an if and two completely separate code paths. I would keep the original subset_from_split function untouched and add a subset_from_task_list(task_names: list[str], benchmark_name_suffix: str) function.
But technically, you could achieve exactly this with subset_from_regex by passing a long regex. subset_from_task_list could be a wrapper that generates the regex; that might be more convenient than having to specify each task name exactly, roughly as sketched below.
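
A minimal sketch of that wrapper, assuming subset_from_regex takes a regex pattern and a benchmark name suffix (the real signature should be double-checked); re.escape keeps task names containing regex metacharacters from being misread:

```python
import re


def subset_from_task_list(self, task_names: list[str], benchmark_name_suffix: str = "custom"):
    # Anchored alternation so the pattern matches exactly the listed task names and nothing else.
    pattern = "^(?:" + "|".join(re.escape(name) for name in task_names) + ")$"
    # The subset_from_regex signature is assumed here; adapt it to the actual method.
    return self.subset_from_regex(pattern, benchmark_name_suffix)
```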
In your code, task_splits is only used as task_splits[split], so instead of creating a dict, just use a list directly.