Feature proposal: gto show settings controlled by .gto config #428

Open
tibor-mach opened this issue Oct 17, 2023 · 2 comments
Labels
stages Stages mechanics and how they work

Comments

@tibor-mach

One of the most powerful features of GTO is the ability to audit model lifecycle and have its history tightly coupled with your git repository.

But the result of GTO is, in a sense, not completely immutable right now: changing the parameters of gto show can lead to different interpretations of the model lifecycle history.

In Studio we only allow the default for now, and most users will probably use that. Still, I think it would be good to let users make the default explicit, and also to let them switch to a non-default mode (potentially with multiple models per stage) and make that explicit too.

The use-case is probably best illustrated with the following two images:

image
image

Both show a simple Git history with two model versions. With the default GTO settings, both produce the history shown in the following picture, but with non-default settings only the second one corresponds to it.
image

This creates some uncertainty in auditing. What did the author of the repository intend? In an organization I might want to make sure everyone follows the same standard.

So I propose that gto show first check .gto for settings, so that it is explicit to everyone which setup is used. If nothing is specified in .gto, it would still fall back to the current default. If you specify parameters manually when running gto show, the .gto config would be ignored. But if you are doing an audit and really want to make sure things are done in a standardized way, you would be able to specify not just standard stages in .gto but also a standard "model registry mode".
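To make the proposed precedence concrete, here is a minimal sketch of how it could be resolved. This is not GTO's actual code: the function name, the `versions_per_stage` key, and the assumed built-in default of 1 are all hypothetical and only illustrate the order "command line, then .gto, then default".

```python
from typing import Any, Dict, Mapping

# Hypothetical built-in default: at most one version per stage.
BUILTIN_DEFAULTS: Dict[str, Any] = {"versions_per_stage": 1}


def resolve_show_settings(
    cli_args: Mapping[str, Any],
    repo_config: Mapping[str, Any],
) -> Dict[str, Any]:
    """Pick each setting from the CLI first, then .gto, then the default.

    `cli_args` holds values typed by the user (None means "not given");
    `repo_config` holds values read from a hypothetical section of .gto.
    """
    settings = dict(BUILTIN_DEFAULTS)
    for key in settings:
        if cli_args.get(key) is not None:
            # An explicit command-line value overrides the repository config.
            settings[key] = cli_args[key]
        elif key in repo_config:
            # Otherwise use what the repository declares in .gto.
            settings[key] = repo_config[key]
    return settings
```

With this ordering, an auditor who runs the command without any flags always gets the interpretation the repository declares, while ad hoc exploration with explicit flags still works.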

At the same time, I think the default is reasonable enough, and most people won't even know there are alternatives, so I see this as a nice-to-have feature. Still, if you think it is not a waste of time, I would try to take it on and create a contribution (although probably a slow-moving one, given the relatively low priority).

tibor-mach added the "stages" label (Stages mechanics and how they work) on Oct 17, 2023
@shcheklein
Member

@tibor-mach what is the high-level scenario that we are trying to solve here? What is the sequence of actions that leads to confusing results in Studio? Or is it about allowing multiple models in the same stage at the same time?

@tibor-mach
Author

@shcheklein I am thinking mostly about auditability here.

Currently there is no persistent way of making GTO behave one way or another. By default it is assumed that no more than a single version can be assigned to a stage at a time. But that default is only materialised in Studio; otherwise it depends on how you call gto show.

I am imagining a scenario where you have multiple projects in an organisation. Some of the projects decide to organise the model registry so that multiple model versions can be assigned to the same stage at the same time (it might even make sense for some use-cases), while others use the default.

Without explicit stage un-assignments it is not clear from git whether the defaults are used or whether people actually intended to have more models in the same stage. You basically have to ask them.
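A small sketch of the ambiguity, independent of GTO's real implementation: the same list of stage assignments, with no un-assignments, gives different answers depending on how many versions per stage the reader assumes. The function name, the tuple layout, and the convention that a negative limit means "all" are assumptions made for this illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def versions_in_stage(
    assignments: List[Tuple[str, str]],
    versions_per_stage: int,
) -> Dict[str, List[str]]:
    """Return the versions considered "in" each stage, newest first.

    `assignments` is a chronological list of (version, stage) pairs.
    A negative `versions_per_stage` means "keep all versions".
    """
    by_stage: Dict[str, List[str]] = defaultdict(list)
    for version, stage in assignments:
        # Re-assigning a version moves it to the newest position.
        if version in by_stage[stage]:
            by_stage[stage].remove(version)
        by_stage[stage].append(version)

    result: Dict[str, List[str]] = {}
    for stage, versions in by_stage.items():
        newest_first = versions[::-1]
        if versions_per_stage < 0:
            result[stage] = newest_first
        else:
            result[stage] = newest_first[:versions_per_stage]
    return result


# Two versions assigned to the same stage, with no un-assignment in between.
history = [("v1.0.0", "prod"), ("v2.0.0", "prod")]

print(versions_in_stage(history, 1))   # {'prod': ['v2.0.0']}
print(versions_in_stage(history, -1))  # {'prod': ['v2.0.0', 'v1.0.0']}
```

Nothing in the history itself says which of the two readings the author intended, which is why a setting recorded in the repository would help.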

If there were an option in .gto to specify how gto show works, then these settings could be enforced across the organisation, and you could always tell from the config which repos use the approach with multiple models per stage. Studio could then also react to that and visualise things accordingly.

Like I said, I think it is a minor issue, but it makes the GTO model registry slightly ambiguous in these scenarios.

2 participants