
Commit ea41ea6

bors[bot] and darsnack authored
Merge #1511
1511: Add ParameterSchedulers.jl to docs r=darsnack a=darsnack

As per the discussion in #1506, this adds a section to the docs that briefly describes scheduling with ParameterSchedulers.jl. The only concerns before merging are the naming conventions in ParameterSchedulers.jl; if I could get some feedback on those, then I can submit a minor release before officially merging this into the Flux docs.

### PR Checklist

- [ ] ~~Tests are added~~
- [ ] ~~Entry in NEWS.md~~
- [x] Documentation, if applicable
- [ ] ~~Final review from `@dhairyagandhi96` (for API changes).~~

Co-authored-by: Kyle Daruwalla <daruwalla@wisc.edu>
2 parents 15a0ebf + 1279ba4 commit ea41ea6

File tree

2 files changed: +31, -0 lines


docs/src/ecosystem.md (+1)

@@ -16,5 +16,6 @@ machine learning and deep learning workflows:
- [Parameters.jl](https://github.com/mauro3/Parameters.jl): types with default field values, keyword constructors and (un-)pack macros
- [ProgressMeter.jl](https://github.com/timholy/ProgressMeter.jl): progress meters for long-running computations
- [TensorBoardLogger.jl](https://github.com/PhilipVinc/TensorBoardLogger.jl): easy peasy logging to [tensorboard](https://www.tensorflow.org/tensorboard) in Julia
- [ParameterSchedulers.jl](https://github.com/darsnack/ParameterSchedulers.jl): standard scheduling policies for machine learning

This tight integration among Julia packages is shown in some of the examples in the [model-zoo](https://github.com/FluxML/model-zoo) repository.

docs/src/training/optimisers.md (+30)

@@ -137,6 +137,36 @@ In this manner it is possible to compose optimisers for some added flexibility.
Flux.Optimise.Optimiser
```

## Scheduling Optimisers

In practice, it is fairly common to schedule the learning rate of an optimiser to obtain faster convergence. There are a variety of popular scheduling policies, and you can find implementations of them in [ParameterSchedulers.jl](https://darsnack.github.io/ParameterSchedulers.jl/dev/README.html). The ParameterSchedulers.jl documentation provides a more detailed overview of the different scheduling policies and how to use them with Flux optimisers. Below, we provide a brief snippet illustrating a [cosine annealing](https://arxiv.org/pdf/1608.03983.pdf) schedule with a momentum optimiser.

First, we import ParameterSchedulers.jl and initialize a cosine annealing schedule that varies the learning rate between `1e-4` and `1e-2` with a period of 10 steps. We also create a new [`Momentum`](@ref) optimiser.
```julia
using ParameterSchedulers

# Cosine annealing between λ0 = 1e-4 and λ1 = 1e-2, repeating every 10 steps
schedule = ScheduleIterator(Cos(λ0 = 1e-4, λ1 = 1e-2, period = 10))
opt = Momentum()
```

Next, you can use your schedule directly in a `for`-loop:
```julia
for epoch in 1:100
  opt.eta = next!(schedule)
  # your training code here
end
```

`schedule` can also be indexed (e.g. `schedule[100]`) or iterated like any iterator in Julia:
```julia
for (eta, epoch) in zip(schedule, 1:100)
  opt.eta = eta
  # your training code here
end
```
ParameterSchedulers.jl allows for many more scheduling policies, including arbitrary functions, looping any function with a given period, or sequences of many schedules. See the ParameterSchedulers.jl documentation for more information.
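
As a rough illustration of the "arbitrary functions" point, any plain Julia function that maps a step to a value can drive the learning rate in the same loop pattern used above. The `warmup` policy below is a made-up example for illustration, not part of ParameterSchedulers.jl:

```julia
# Hypothetical hand-written policy: exponential warmup from 1e-4, capped at 1e-2.
warmup(t) = min(1e-2, 1e-4 * 10^(t / 10))

for epoch in 1:100
  opt.eta = warmup(epoch)  # any function of the epoch can set the learning rate
  # your training code here
end
```

Policies built from ParameterSchedulers.jl constructors slot into the same place: anything that yields one number per step can be assigned to `opt.eta`.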
## Decays
Similar to optimisers, Flux also defines some simple decays that can be used in conjunction with other optimisers, or standalone.
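
For example (a minimal sketch with default hyperparameters; the exact constructor arguments for each decay are given in its docstring), an exponential learning-rate decay can be composed with one of the optimisers above using `Flux.Optimise.Optimiser`:

```julia
using Flux

# Compose an exponential learning-rate decay with a Momentum optimiser:
# ExpDecay() discounts the step size at fixed intervals, Momentum() applies the update.
opt_with_decay = Flux.Optimise.Optimiser(ExpDecay(), Momentum())

# opt_with_decay can then be passed to Flux.train! like any other optimiser.
```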
