Adding components and refactoring of schedulers #285
Signed-off-by: Eric Kerfoot <eric.kerfoot@kcl.ac.uk>
Hi @ericspod
I'm a fan of the idea in general, it makes sense to factor a lot of the noise-related parts out of each individual scheduler. Questions/thoughts:
- I think it would be really great if people were able to play with their own custom noise schedules without having to commit them. I've recently been playing with funky noise schedules, but nothing I think would be useful enough to others that I would want to push it to the codebase. As it stands I don't think this is possible, right? If I try to add my own custom schedule, I need to modify scheduler.py?
- A related general thought - at the moment we force noise schedules to be designed by setting beta, and we calculate alpha and alpha_cumprod from beta. It is sometimes useful to define the noise schedule in terms of the other params. For example, if you want a noise schedule that alters the SNR as a function of t, it is easiest to define it in terms of alpha_cumprod, which is a direct function of the SNR (discussed here). Do you think it would be useful to have some helper functions so users can write noise schedules in terms of alpha or alpha_cumprod and we can return the corresponding beta values? (Alpha to beta is easy; going from alpha_cumprod to beta isn't hard but is a bit fiddly, as you have to reverse the cumulative product.)
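A minimal sketch of the kind of helper functions suggested above, written with plain Python lists so it runs without any dependencies. The function names are hypothetical and are not part of this PR; the only relationships assumed are alpha_t = 1 - beta_t and alpha_cumprod_t = alpha_0 * ... * alpha_t.

```python
from itertools import accumulate
from operator import mul


def alphas_to_betas(alphas):
    """Hypothetical helper: beta_t = 1 - alpha_t."""
    return [1.0 - a for a in alphas]


def betas_to_alphas_cumprod(betas):
    """Forward direction: cumulative product of alpha_t = 1 - beta_t."""
    return list(accumulate((1.0 - b for b in betas), mul))


def alphas_cumprod_to_betas(alphas_cumprod):
    """Hypothetical helper: undo the cumulative product.

    alpha_t = alpha_cumprod_t / alpha_cumprod_{t-1}, taking the value
    before the first step to be 1, and then beta_t = 1 - alpha_t.
    Every entry of alphas_cumprod must be strictly positive.
    """
    betas = []
    previous = 1.0
    for current in alphas_cumprod:
        betas.append(1.0 - current / previous)
        previous = current
    return betas


# Round trip: betas -> alpha_cumprod -> betas recovers the input
# up to floating-point error.
betas = [0.1, 0.2, 0.3]
recovered = alphas_cumprod_to_betas(betas_to_alphas_cumprod(betas))
```

With helpers like these, a schedule could be designed in whichever parameterisation is convenient and still hand beta back to the scheduler.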
class DDPMPRedictionType(StrEnum):
    EPSILON = "epsiolon"
typo, should be "epsilon"
Thanks for looking at this @marksgraham! The idea with the What it comes down to is that we need an idea of what is common to all noise schedulers, define our base class with that, and then specialise the others, so consider what to define to make the designs modular and flexible enough to account for the variations you're describing.
Hi @ericspod
Ah I see, I had tried to add my own scheduler and failed, but it was because I was missing the decorator.
Regarding flexibility - I think this should be flexible enough in terms of return type. As I said, users will want to define the schedule in terms of alpha, beta, or alpha_cumprod, but I think it will be sufficient to provide helper functions for alpha -> beta and alpha_cumprod -> beta, so the user can define the noise in whatever set of parameters they want but return beta. We could alternatively require that the noise schedule return alpha, beta, and alpha_cumprod, rather than having the scheduler calculate alpha and alpha_cumprod from beta.
Regarding arguments - It would be good if we could allow other arguments more flexibly, without forcing the user to leave in an unused beta start/beta end, which feels a bit clunky.
In terms of whether this is over-engineered: it certainly took me some time to get my head around the code. But I think it will be pretty easy for users to use in practice, if we supply examples. One thing I'm unclear on is what advantage this offers over allowing the user to pass a callable to the scheduler?
Tagging @Warvito in case he has any opinions on this.
Hi @marksgraham, I need to update this PR with this feedback in mind and we'll come around to it again. If beta/alpha/alpha_cumprod are all we need to worry about, we can define functions to return these, or, if only one value is returned, assume it is beta and calculate the others from it, as is currently done.
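One way the "single value means beta" convention could look, as a sketch only. The helper name resolve_schedule and the tuple order (beta, alpha, alpha_cumprod) are assumptions made for illustration and are not taken from the PR.

```python
from itertools import accumulate
from operator import mul


def resolve_schedule(result):
    """Hypothetical helper: normalise a schedule function's return value.

    If `result` is a 3-tuple it is taken to be (betas, alphas, alphas_cumprod)
    and passed through unchanged. Otherwise it is assumed to be betas only,
    and alphas and alphas_cumprod are derived from it.
    """
    if isinstance(result, tuple):
        betas, alphas, alphas_cumprod = result
        return list(betas), list(alphas), list(alphas_cumprod)

    betas = list(result)
    alphas = [1.0 - b for b in betas]              # alpha_t = 1 - beta_t
    alphas_cumprod = list(accumulate(alphas, mul))  # running product of alphas
    return betas, alphas, alphas_cumprod


# A schedule that returns only betas gets the other two values derived.
betas, alphas, alphas_cumprod = resolve_schedule([0.1, 0.2])
```

Using a tuple as the marker for "all three values supplied" keeps the check simple, at the cost of requiring beta-only schedules to return a list or other non-tuple sequence.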
We could do that easily enough. My concern in MONAI and elsewhere is to make explicit what functions are available and to give them useful names. We have enumerations for int and string constants in MONAI for this reason, as opposed to the approach other libraries take of just stating valid values in the docstring. Our layer factories make it easy to choose a layer by name with the dimensionality being a parameter, but they also give clear names to the concepts they store. Explicit is better than implicit, but simple is better than complex, so the question is whether the tradeoff makes sense here.
Hi @ericspod
OK, then I think the tradeoff does make sense here! I guess the next step is to update all the other schedulers? In which case I'll hold off carefully reviewing the latest batch of changes until that's done?
Signed-off-by: Eric Kerfoot <eric.kerfoot@kcl.ac.uk>
Hi @marksgraham, I finally have things updated if you want to take a look.
I still have the conflict from the scheduler addition to fix, however.
Hi @ericspod
Signed-off-by: Eric Kerfoot <eric.kerfoot@kcl.ac.uk>
Thanks @marksgraham, I think it's ready now. I looked through the tutorials and made changes where needed, though yes, I haven't rerun them, which raises the question of what to do for CI/CD with these.
Signed-off-by: Eric Kerfoot <eric.kerfoot@kcl.ac.uk>
@Warvito Updates from comments added!
Hi @ericspod
This looks good. Just need to run ./runtests.sh --autofix as it picks up some formatting corrections.
Co-authored-by: Mark Graham <markgraham539@gmail.com> Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
Signed-off-by: Eric Kerfoot <eric.kerfoot@kcl.ac.uk>
I am getting these flake8 errors when running
@Warvito I've made those fixes, but my version of the tools brought up some other issues. Different versions of flake8 keep coming up with new issues not seen before.
Do we need to settle on a version of flake8 to use then?
We should always use the current version I think, it just means we'll encounter errors unrelated to our own changes on occasion.
Addresses #280.
The objective here is to define a base class for all schedulers that we can use to reduce code duplication and to support type checking.
One thing added is a way of adding new noise schedule functions by adding them to a component store object. This version adds the cosine schedule but others can be added by users in their own scripts.
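To illustrate the idea of registering schedule functions by name, here is a rough, self-contained sketch. The ScheduleStore class, its add decorator, and the function signatures are hypothetical stand-ins and not the component store implemented in this PR. The cosine function is assumed to follow the commonly used form from Nichol and Dhariwal's improved DDPM, with beta clipped at 0.999.

```python
import math


class ScheduleStore:
    """Hypothetical stand-in for a component store: maps names to functions."""

    def __init__(self):
        self._funcs = {}

    def add(self, name):
        """Decorator that registers a schedule function under `name`."""
        def decorator(func):
            if name in self._funcs:
                raise ValueError(f"schedule '{name}' is already registered")
            self._funcs[name] = func
            return func
        return decorator

    def __getitem__(self, name):
        return self._funcs[name]

    def names(self):
        return sorted(self._funcs)


schedules = ScheduleStore()


@schedules.add("linear_beta")
def linear_beta(num_train_timesteps, beta_start=1e-4, beta_end=2e-2):
    """Betas evenly spaced from beta_start to beta_end (inclusive)."""
    if num_train_timesteps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_train_timesteps - 1)
    return [beta_start + i * step for i in range(num_train_timesteps)]


@schedules.add("cosine")
def cosine_beta(num_train_timesteps, s=8e-3):
    """Cosine schedule: alpha_cumprod follows a squared cosine in t."""
    def f(t):
        return math.cos((t / num_train_timesteps + s) / (1 + s) * math.pi / 2) ** 2

    # beta_t = 1 - alpha_cumprod_{t+1} / alpha_cumprod_t, clipped at 0.999
    return [min(1.0 - f(t + 1) / f(t), 0.999) for t in range(num_train_timesteps)]


# A user script could register its own schedule the same way, then look
# any schedule up by name without editing the library source.
cosine_betas = schedules["cosine"](10)
```

The decorator returns the function unchanged, so a registered schedule can still be called directly as well as looked up by name.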