
Current ModuleT traits do not allow in-place ops #405

Closed
lostmsu opened this issue Aug 15, 2021 · 4 comments

lostmsu commented Aug 15, 2021

Ideally, they should consume and return Tensor. E.g.

impl Module for InPlaceReLU {
  fn forward(&self, mut xs: Tensor) -> Tensor {
    xs.relu_()
  }
}

It does not work with a mutable reference, because one needs to return the original tensor, and you can't get an owned tensor back from the reference. (I am unsure whether shallow_clone works with autograd; the documentation should explicitly mention that.)

impl Module for InPlaceReLU {
  fn forward(&self, xs: &mut Tensor) -> Tensor {
    xs.relu_().shallow_clone()
  }
}

Rant:
The current approach also makes it annoying to implement some modules, because you can't do xs = submodule.forward(xs). This is very painful when the logic for finding the first submodule to apply is complicated. Even for Sequential, one has to do something like:

fn forward(&self, xs: &Tensor) -> Tensor {
  let mut out = self.submodules[0].forward(xs);
  for i in 1..self.submodules.len() {
     out = self.submodules[i].forward(&out);
  }
  out
}
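The consuming signature proposed at the top of this issue composes naturally. Below is a minimal self-contained sketch (not tch code: `Tensor` here is a hypothetical stand-in backed by a plain `Vec<f64>`, and `ConsumingModule` is an invented trait name) showing that `xs = submodule.forward(xs)` just works:

```rust
// Sketch only: `Tensor` is a stand-in type, not tch's Tensor.
type Tensor = Vec<f64>;

trait ConsumingModule {
    // Consumes the input and returns it, so in-place ops need no clone.
    fn forward(&self, xs: Tensor) -> Tensor;
}

struct InPlaceReLU;

impl ConsumingModule for InPlaceReLU {
    // Mutates the buffer in place and hands ownership back to the caller.
    fn forward(&self, mut xs: Tensor) -> Tensor {
        for x in xs.iter_mut() {
            *x = x.max(0.0);
        }
        xs
    }
}

struct Sequential {
    submodules: Vec<Box<dyn ConsumingModule>>,
}

impl ConsumingModule for Sequential {
    // With a consuming signature, `xs = m.forward(xs)` chains directly,
    // with no special case for the first submodule.
    fn forward(&self, mut xs: Tensor) -> Tensor {
        for m in &self.submodules {
            xs = m.forward(xs);
        }
        xs
    }
}
```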
@LaurentMazare (Owner) commented:

You mention shallow_clone(), and I think this should indeed solve most of the issues you mention. It is a very lightweight operation, as it only consists of getting a new ref-counted reference to a tensor, and it should work well with autograd.
For your Sequential approach, this should let you do something like the following (or a fold).

fn forward(&self, xs: &Tensor) -> Tensor {
  let mut out = xs.shallow_clone();
  for submodule in self.submodules.iter() {
     out = submodule.forward(&out);
  }
  out
}
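The fold mentioned above could look like the sketch below. Since this snippet is not run against tch itself, `Tensor` is a hypothetical stand-in: a ref-counted buffer where `shallow_clone` is modeled as a cheap `Rc` handle copy, matching the description of the real operation given above.

```rust
use std::rc::Rc;

// Stand-in for tch's Tensor: a ref-counted buffer, so `shallow_clone`
// (modeled here as `Rc::clone`) only copies a handle, not the data.
#[derive(Clone, Debug, PartialEq)]
struct Tensor(Rc<Vec<f64>>);

impl Tensor {
    fn shallow_clone(&self) -> Tensor {
        Tensor(Rc::clone(&self.0))
    }
    fn relu(&self) -> Tensor {
        Tensor(Rc::new(self.0.iter().map(|x| x.max(0.0)).collect()))
    }
}

trait Module {
    fn forward(&self, xs: &Tensor) -> Tensor;
}

struct ReLU;
impl Module for ReLU {
    fn forward(&self, xs: &Tensor) -> Tensor {
        xs.relu()
    }
}

struct Sequential {
    submodules: Vec<Box<dyn Module>>,
}

impl Module for Sequential {
    // The fold form of the loop above: start from a cheap shallow clone
    // of the input and thread the output through each submodule.
    fn forward(&self, xs: &Tensor) -> Tensor {
        self.submodules
            .iter()
            .fold(xs.shallow_clone(), |out, m| m.forward(&out))
    }
}
```

Note that for an empty Sequential this returns a shallow clone of the input, which shares storage with it, exactly the aliasing subtlety discussed below.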


lostmsu commented Aug 16, 2021

@LaurentMazare I understand that shallow_clone does the trick. But it also makes the code misleading in the in-place modification scenario: &Tensor implies that in-place modification is impossible, and if I do xs.shallow_clone().relu_(), I break the very contract that makes a direct xs.relu_() impossible.

If we go deeper into the whole references question, shallow_clone should not even exist in the first place: you should not be able to pass references to the same Tensor to two modules when one of them does an in-place op.
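The aliasing hazard described here can be sketched without tch at all. In this hypothetical stand-in (an `Rc<RefCell<...>>` buffer, not tch's actual implementation), two `shallow_clone` handles share one buffer, so an in-place op through either handle is visible through the other, which is precisely what plain `&`/`&mut` references would rule out for an ordinary Rust type.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Stand-in illustrating the aliasing hazard: the buffer is shared and
// mutable through any handle, sidestepping Rust's exclusive-borrow rules.
struct Tensor(Rc<RefCell<Vec<f64>>>);

impl Tensor {
    fn new(data: Vec<f64>) -> Tensor {
        Tensor(Rc::new(RefCell::new(data)))
    }
    // A cheap handle copy; both handles now alias the same storage.
    fn shallow_clone(&self) -> Tensor {
        Tensor(Rc::clone(&self.0))
    }
    // Models an in-place op like relu_: mutates through a shared handle,
    // even though the receiver is only `&self`.
    fn relu_(&self) {
        for x in self.0.borrow_mut().iter_mut() {
            *x = x.max(0.0);
        }
    }
}
```

Mutating through one handle changes what every other handle observes, which is the contract violation this comment is pointing at.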

@LaurentMazare (Owner) commented:

I agree that this low-level API doesn't reflect aliasing properly, and this has been the case from the start (#33); the issue is not restricted to mutability either.
On that front, I don't have much better to suggest than trying to write a higher-level API that follows the aliasing rules properly. That's hard to do at the tch level, as most of the code is generated automatically and has no knowledge of the underlying assumptions of the C++ functions. I don't think there have been many attempts at such higher-level APIs.


lostmsu commented Aug 16, 2021

@LaurentMazare that makes sense, and it does not seem unlike File::try_clone, which also violates all the similar contracts.

lostmsu closed this as completed Aug 16, 2021