
Commit ea41ea6

bors[bot] and darsnack authored
Merge #1511
1511: Add ParameterSchedulers.jl to docs r=darsnack a=darsnack

As per the discussion in #1506, this adds a section to the docs that briefly describes scheduling with ParameterSchedulers.jl. The only concerns before merging are the naming conventions in ParameterSchedulers.jl; if I could get some feedback on those, then I can submit a minor release before officially merging this into the Flux docs.

### PR Checklist

- [ ] ~~Tests are added~~
- [ ] ~~Entry in NEWS.md~~
- [x] Documentation, if applicable
- [ ] ~~Final review from `@dhairyagandhi96` (for API changes).~~

Co-authored-by: Kyle Daruwalla <daruwalla@wisc.edu>
2 parents 15a0ebf + 1279ba4 commit ea41ea6

File tree

2 files changed: +31, -0 lines


docs/src/ecosystem.md (+1)

@@ -16,5 +16,6 @@ machine learning and deep learning workflows:
- [Parameters.jl](https://github.com/mauro3/Parameters.jl): types with default field values, keyword constructors and (un-)pack macros
- [ProgressMeter.jl](https://github.com/timholy/ProgressMeter.jl): progress meters for long-running computations
- [TensorBoardLogger.jl](https://github.com/PhilipVinc/TensorBoardLogger.jl): easy peasy logging to [tensorboard](https://www.tensorflow.org/tensorboard) in Julia
- [ParameterSchedulers.jl](https://github.com/darsnack/ParameterSchedulers.jl): standard scheduling policies for machine learning

This tight integration among Julia packages is shown in some of the examples in the [model-zoo](https://github.com/FluxML/model-zoo) repository.

docs/src/training/optimisers.md (+30)

@@ -137,6 +137,36 @@ In this manner it is possible to compose optimisers for some added flexibility.
Flux.Optimise.Optimiser
```

## Scheduling Optimisers

In practice, it is fairly common to schedule the learning rate of an optimiser to obtain faster convergence. There are a variety of popular scheduling policies, and you can find implementations of them in [ParameterSchedulers.jl](https://darsnack.github.io/ParameterSchedulers.jl/dev/README.html). The ParameterSchedulers.jl documentation provides a more detailed overview of the different scheduling policies and how to use them with Flux optimisers. Below, we provide a brief snippet illustrating a [cosine annealing](https://arxiv.org/pdf/1608.03983.pdf) schedule with a momentum optimiser.

First, we import ParameterSchedulers.jl and initialize a cosine annealing schedule that varies the learning rate between `1e-4` and `1e-2` with a period of 10 steps. We also create a new [`Momentum`](@ref) optimiser.
```julia
using ParameterSchedulers

# Cosine annealing between λ0 = 1e-4 and λ1 = 1e-2, repeating every 10 steps
schedule = ScheduleIterator(Cos(λ0 = 1e-4, λ1 = 1e-2, period = 10))
opt = Momentum()
```

Next, you can use your schedule directly in a `for`-loop:
```julia
for epoch in 1:100
  opt.eta = next!(schedule)
  # your training code here
end
```

`schedule` can also be indexed (e.g. `schedule[100]`) or iterated like any iterator in Julia:
```julia
for (eta, epoch) in zip(schedule, 1:100)
  opt.eta = eta
  # your training code here
end
```
ParameterSchedulers.jl allows for many more scheduling policies, including arbitrary functions, looping any function with a given period, or sequences of many schedules. See the ParameterSchedulers.jl documentation for more information.
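
As a rough illustration of the "arbitrary functions" point, any plain Julia function that maps a step to a value can drive the learning rate in the same loop pattern used above. The `warmup` policy below is a made-up example for illustration, not part of ParameterSchedulers.jl:

```julia
# Hypothetical hand-written policy: exponential warmup from 1e-4, capped at 1e-2.
warmup(t) = min(1e-2, 1e-4 * 10^(t / 10))

for epoch in 1:100
  opt.eta = warmup(epoch)  # any function of the epoch can set the learning rate
  # your training code here
end
```

Policies built from ParameterSchedulers.jl constructors slot into the same place: anything that yields one number per step can be assigned to `opt.eta`.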
## Decays
Similar to optimisers, Flux also defines some simple decays that can be used in conjunction with other optimisers, or standalone.
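
For example (a minimal sketch with default hyperparameters; the exact constructor arguments for each decay are given in its docstring), an exponential learning-rate decay can be composed with one of the optimisers above using `Flux.Optimise.Optimiser`:

```julia
using Flux

# Compose an exponential learning-rate decay with a Momentum optimiser:
# ExpDecay() discounts the step size at fixed intervals, Momentum() applies the update.
opt_with_decay = Flux.Optimise.Optimiser(ExpDecay(), Momentum())

# opt_with_decay can then be passed to Flux.train! like any other optimiser.
```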
