add gradient clipping to create_supervised_trainer() #419

Open
lmarti opened this issue Jan 30, 2019 · 8 comments

lmarti commented Jan 30, 2019

It would be good to add gradient clipping to the trainers created by create_supervised_trainer. The clipping itself is already provided by torch.nn.utils.clip_grad_norm_.

One possible implementation could be:

import math

from torch.nn.utils import clip_grad_norm_

from ignite.engine import Engine, _prepare_batch

def create_supervised_trainer(model, optimizer, loss_fn,
                              device=None, non_blocking=False,
                              prepare_batch=_prepare_batch,
                              gradient_clip=math.inf):
    """
    Factory function for creating a trainer for supervised models.
    Args:
        model (`torch.nn.Module`): the model to train.
        optimizer (`torch.optim.Optimizer`): the optimizer to use.
        loss_fn (torch.nn loss function): the loss function to use.
        device (str, optional): device type specification (default: None).
            Applies to both model and batches.
        non_blocking (bool, optional): if True and this copy is between CPU and GPU, the copy may occur asynchronously
            with respect to the host. For other cases, this argument has no effect.
        prepare_batch (callable, optional): function that receives `batch`, `device`, `non_blocking` and outputs
            tuple of tensors `(batch_x, batch_y)`.
        gradient_clip (float, optional): maximum norm of the gradients, passed to clip_grad_norm_
            (default: math.inf, i.e. no clipping).
    Note: `engine.state.output` for this engine is the loss of the processed batch.
    Returns:
        Engine: a trainer engine with supervised update function.
    """
    if device:
        model.to(device)

    def _update(engine, batch):
        model.train()
        optimizer.zero_grad()
        x, y = prepare_batch(batch, device=device, non_blocking=non_blocking)
        y_pred = model(x)
        loss = loss_fn(y_pred, y)
        loss.backward()
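        # with the default gradient_clip=math.inf, clip_grad_norm_ leaves the gradients unchanged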
        clip_grad_norm_(model.parameters(), gradient_clip)
        optimizer.step()
        return loss.item()

    return Engine(_update)
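
For illustration (gradient_clip is the argument proposed here, not part of the released API; MyModel and train_loader are placeholders), a trainer built with this modified factory could then be used as:

import torch

model = MyModel()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
trainer = create_supervised_trainer(model, optimizer, loss_fn=torch.nn.functional.cross_entropy,
                                    gradient_clip=1.0)  # clip gradient norms at 1.0
trainer.run(train_loader, max_epochs=10)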

vfdev-5 (Collaborator) commented Jan 30, 2019

@lmarti thanks for the feedback. We discussed a similar question in #375.
Methods like create_supervised_trainer are just helper methods for basic usage; for anything more specific, use Engine directly with a custom process function.

We can discuss whether such a trainer would be useful and could be placed in the contrib.engines module.
cc @willprice
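
A minimal sketch of that suggestion (assuming model, optimizer and loss_fn are already defined; the clip value 1.0 is arbitrary): the update logic from the snippet above is passed straight to Engine, without changing the factory.

from torch.nn.utils import clip_grad_norm_

from ignite.engine import Engine


def update_with_clipping(engine, batch):
    model.train()
    optimizer.zero_grad()
    x, y = batch                                        # or prepare_batch(batch, device, non_blocking)
    loss = loss_fn(model(x), y)
    loss.backward()
    clip_grad_norm_(model.parameters(), max_norm=1.0)   # clip before the optimizer step
    optimizer.step()
    return loss.item()


trainer = Engine(update_with_clipping)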

lmarti (Author) commented Jan 30, 2019

Sorry, I missed that one. I had the same doubts about moving it to contrib.engines. My main reservation is that the code would be almost identical to the code in create_supervised_trainer. In any case, you are driving here.

AntoinePrv (Contributor) commented Feb 1, 2019

A general way to maintain this would be to fire a new event (GRADIENT_COMPUTED?) between loss.backward() and optimizer.step().

It doesn't have to be added to the core events; it can be added just for supervised_trainer, as we did with supervised_tbptt_trainer.
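
A rough sketch of that idea (the event name and the model/optimizer/loss_fn placeholders are illustrative only; this is not an existing ignite event):

from torch.nn.utils import clip_grad_norm_

from ignite.engine import Engine

GRADIENT_COMPUTED = "gradient_computed"  # hypothetical custom event


def update_fn(engine, batch):
    model.train()
    optimizer.zero_grad()
    x, y = batch
    loss = loss_fn(model(x), y)
    loss.backward()
    engine.fire_event(GRADIENT_COMPUTED)  # handlers run between backward() and step()
    optimizer.step()
    return loss.item()


trainer = Engine(update_fn)
trainer.register_events(GRADIENT_COMPUTED)


@trainer.on(GRADIENT_COMPUTED)
def clip_gradients(engine):
    clip_grad_norm_(model.parameters(), max_norm=1.0)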

vfdev-5 (Collaborator) commented Feb 1, 2019

@AntoinePrv I think it would be simpler to write a custom processing function than to introduce custom events.

@sudarshan85

@vfdev-5 While I agree with you, it would be nice to have options. In particular, it would be great to have more events, comparable to the fastai callback system. The callbacks listed there are (corresponding ignite events in parentheses):

  1. on_train_begin() (Events.STARTED)
  2. on_epoch_begin() (Events.EPOCH_STARTED)
  3. on_batch_begin() (Events.ITERATION_STARTED)
  4. on_loss_begin()*: Called after forward pass but before loss has been computed.
  5. on_backward_begin()*: Called after forward pass and loss computation but before backprop.
  6. on_backward_end()*: Called after backprop but before optimizer step.
  7. on_step_end()*: Called after optimizer step but before gradients are zeroed.
  8. on_batch_end() (Events.ITERATION_COMPLETED)
  9. on_epoch_end() (Events.EPOCH_COMPLETED)
  10. on_train_end() (Events.COMPLETED)
  • These fastai callbacks have no corresponding ignite events. Having them as options would provide the following advantages:
  1. It adds even more flexibility to the engine.
  2. Many of fastai's callbacks are used to provide features such as LRFinder, gradient clipping, etc. It would be easy to port those over if we had these events.

vfdev-5 (Collaborator) commented Apr 1, 2019

@sudarshan85 we can think about providing a generic callback class in the contrib module.
But I can hardly imagine a class that uses all of these on_* methods. The example you cited, LRFinder, implements just 3 methods: on_train_begin, on_batch_end, on_train_end. This is very similar to the behaviour of our classes that have an attach method and handle 2-3 events of the Engine: Metric, ProgressBar, etc.
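
For context, a minimal sketch of that attach pattern (LossRecorder is illustrative, not an existing ignite class); it handles three Engine events, much like LRFinder's three on_* methods:

from ignite.engine import Events


class LossRecorder:
    """Attach-style handler in the spirit of Metric/ProgressBar."""

    def started(self, engine):
        self.losses = []

    def iteration_completed(self, engine):
        # engine.state.output is the loss returned by the update function
        self.losses.append(engine.state.output)

    def completed(self, engine):
        print(f"mean loss over the run: {sum(self.losses) / len(self.losses):.4f}")

    def attach(self, engine):
        engine.add_event_handler(Events.STARTED, self.started)
        engine.add_event_handler(Events.ITERATION_COMPLETED, self.iteration_completed)
        engine.add_event_handler(Events.COMPLETED, self.completed)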

@TilakSanghvi

@lmarti I am interested in this issue and would like to contribute. Please assign it to me.

vfdev-5 (Collaborator) commented Oct 15, 2024

@TilakSanghvi you can start from this PR: #1693 and add tests.
