
Current ModuleT traits do not allow in-place ops #405

Closed
lostmsu opened this issue Aug 15, 2021 · 4 comments

lostmsu commented Aug 15, 2021

Ideally, they should consume and return Tensor. E.g.

impl Module for InPlaceReLU {
  fn forward(&self, mut xs: Tensor) -> Tensor {
    xs.relu_()
  }
}

It does not work with a mutable reference, because one needs to return the original tensor, and you can't get an owned tensor back from the reference. (I am unsure whether shallow_clone works with autograd; the documentation should explicitly mention that.)

impl Module for InPlaceReLU {
  fn forward(&self, xs: &mut Tensor) -> Tensor {
    xs.relu_().shallow_clone()
  }
}

Rant:
The current approach also makes it annoying to implement some modules, because you can't do xs = submodule.forward(xs). This is very painful when the logic for finding the first submodule to apply is complicated. Even for Sequential, one has to do something like:

fn forward(&self, xs: &Tensor) -> Tensor {
  let mut out = self.submodules[0].forward(xs);
  for i in 1..self.submodules.len() {
     out = self.submodules[i].forward(&out);
  }
  out
}
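The consuming signature proposed at the top of this issue composes naturally. Below is a minimal self-contained sketch (not tch code: `Tensor` here is a hypothetical stand-in backed by a plain `Vec<f64>`, and `ConsumingModule` is an invented trait name) showing that `xs = submodule.forward(xs)` just works:

```rust
// Sketch only: `Tensor` is a stand-in type, not tch's Tensor.
type Tensor = Vec<f64>;

trait ConsumingModule {
    // Consumes the input and returns it, so in-place ops need no clone.
    fn forward(&self, xs: Tensor) -> Tensor;
}

struct InPlaceReLU;

impl ConsumingModule for InPlaceReLU {
    // Mutates the buffer in place and hands ownership back to the caller.
    fn forward(&self, mut xs: Tensor) -> Tensor {
        for x in xs.iter_mut() {
            *x = x.max(0.0);
        }
        xs
    }
}

struct Sequential {
    submodules: Vec<Box<dyn ConsumingModule>>,
}

impl ConsumingModule for Sequential {
    // With a consuming signature, `xs = m.forward(xs)` chains directly,
    // with no special case for the first submodule.
    fn forward(&self, mut xs: Tensor) -> Tensor {
        for m in &self.submodules {
            xs = m.forward(xs);
        }
        xs
    }
}
```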
@LaurentMazare (Owner) commented:

You mention shallow_clone(), and I think this should indeed solve most of the issues you mention. It is a very lightweight operation, as it only consists of getting a new ref-counted reference to a tensor, and it should work well with autograd.
For your Sequential approach, this should let you do something like the following (or a fold).

fn forward(&self, xs: &Tensor) -> Tensor {
  let mut out = xs.shallow_clone();
  for submodule in self.submodules.iter() {
     out = submodule.forward(&out);
  }
  out
}
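The fold mentioned above could look like the sketch below. Since this snippet is not run against tch itself, `Tensor` is a hypothetical stand-in: a ref-counted buffer where `shallow_clone` is modeled as a cheap `Rc` handle copy, matching the description of the real operation given above.

```rust
use std::rc::Rc;

// Stand-in for tch's Tensor: a ref-counted buffer, so `shallow_clone`
// (modeled here as `Rc::clone`) only copies a handle, not the data.
#[derive(Clone, Debug, PartialEq)]
struct Tensor(Rc<Vec<f64>>);

impl Tensor {
    fn shallow_clone(&self) -> Tensor {
        Tensor(Rc::clone(&self.0))
    }
    fn relu(&self) -> Tensor {
        Tensor(Rc::new(self.0.iter().map(|x| x.max(0.0)).collect()))
    }
}

trait Module {
    fn forward(&self, xs: &Tensor) -> Tensor;
}

struct ReLU;
impl Module for ReLU {
    fn forward(&self, xs: &Tensor) -> Tensor {
        xs.relu()
    }
}

struct Sequential {
    submodules: Vec<Box<dyn Module>>,
}

impl Module for Sequential {
    // The fold form of the loop above: start from a cheap shallow clone
    // of the input and thread the output through each submodule.
    fn forward(&self, xs: &Tensor) -> Tensor {
        self.submodules
            .iter()
            .fold(xs.shallow_clone(), |out, m| m.forward(&out))
    }
}
```

Note that for an empty Sequential this returns a shallow clone of the input, which shares storage with it, exactly the aliasing subtlety discussed below.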


lostmsu commented Aug 16, 2021

@LaurentMazare I understand that shallow_clone does the trick. But it also makes the code misleading in the in-place modification scenario: &Tensor implies that in-place modification is impossible, and if I do xs.shallow_clone().relu_(), I break the very contract that makes a direct xs.relu_() impossible.

If we go deeper into the whole references question, shallow_clone should not even exist in the first place: you should not be able to pass references to the same Tensor to two modules when one of them does an in-place op.
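The aliasing hazard described here can be sketched without tch at all. In this hypothetical stand-in (an `Rc<RefCell<...>>` buffer, not tch's actual implementation), two `shallow_clone` handles share one buffer, so an in-place op through either handle is visible through the other, which is precisely what plain `&`/`&mut` references would rule out for an ordinary Rust type.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Stand-in illustrating the aliasing hazard: the buffer is shared and
// mutable through any handle, sidestepping Rust's exclusive-borrow rules.
struct Tensor(Rc<RefCell<Vec<f64>>>);

impl Tensor {
    fn new(data: Vec<f64>) -> Tensor {
        Tensor(Rc::new(RefCell::new(data)))
    }
    // A cheap handle copy; both handles now alias the same storage.
    fn shallow_clone(&self) -> Tensor {
        Tensor(Rc::clone(&self.0))
    }
    // Models an in-place op like relu_: mutates through a shared handle,
    // even though the receiver is only `&self`.
    fn relu_(&self) {
        for x in self.0.borrow_mut().iter_mut() {
            *x = x.max(0.0);
        }
    }
}
```

Mutating through one handle changes what every other handle observes, which is the contract violation this comment is pointing at.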

@LaurentMazare (Owner) commented:

I agree that this low-level API doesn't reflect aliasing properly, and this has been the case from the start (#33); the issue is not restricted to mutability either.
On that front, I don't have much better to suggest than trying to write a higher-level API that follows the aliasing rules properly. That's hard to do at the tch level, as most of the code is generated automatically and has no knowledge of the underlying assumptions of the C++ functions. I don't think there have been many attempts at such higher-level APIs.


lostmsu commented Aug 16, 2021

@LaurentMazare that makes sense, and it does not seem unlike File::try_clone, which also violates all the similar contracts.

lostmsu closed this as completed Aug 16, 2021