Possible UB in Tensor #33
Comments
Thanks for reporting this issue. Indeed |
Great, thanks for the prompt response! |
The code generation has been tweaked to reflect that functions ending with underscores can mutate self. I'm not sure whether this covers all possible scenarios but this should at least improve the current situation. Feel free to re-open if you see further inconsistencies. |
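As a rough sketch of the naming rule described above (the function name `receiver_for` and the string-based output are hypothetical and are not taken from the actual generator), the choice of receiver could look like this:

```rust
/// Hypothetical helper: picks the receiver for a generated method
/// from the name of the op it wraps.
/// Assumption: a trailing underscore marks an in-place op, as described above.
fn receiver_for(op_name: &str) -> &'static str {
    if op_name.ends_with('_') {
        // In-place ops may mutate the tensor they are called on.
        "&mut self"
    } else {
        // Everything else is treated as read-only on `self`.
        "&self"
    }
}
```

A name-based rule like this only captures mutation of `self`; it says nothing about outputs that share storage with an input.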
This is slightly unrelated to this issue, but I see that a number of functions return a In general, there are some lifetime relations between
|
You're perfectly right. Again the issue comes from the code being mostly generated from some ops definition file and this file doesn't include much information about what is shared between the different inputs and output. |
In that case could having |
Unsafe seems a bit too much to me in this case: it cannot cause a segfault but there may be some data sharing that is not represented in the type system. I would have thought that the only issues this creates are potential race conditions when using multiple threads - maybe not implementing the |
Hmm, it can still cause problems. The following example (adapted from the swap example in this blog post) demonstrates some unexpected issues:
// Say we have Tensor t1
let t2 = t1.reshape(t1.shape()); // t2 is now a view of t1
let swap = |x, y| {
    *x = *x ^ *y;
    *y = *x ^ *y;
    *x = *x ^ *y;
};
swap(&mut t1, &mut t2); // Now t1 and t2 are both 0
Unless |
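The effect in the swap example can be reproduced without tch using a toy model. In the sketch below, `ToyTensor`, `view`, `xor_` and `xor_swap` are hypothetical names, and `Rc<RefCell<...>>` stands in for storage shared between a tensor and its view; it only models the sharing, not the real `Tensor` type.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Toy stand-in for a tensor handle: several handles may share one buffer.
/// This is NOT the tch `Tensor` type, only a model of its sharing behaviour.
struct ToyTensor {
    storage: Rc<RefCell<Vec<i64>>>,
}

impl ToyTensor {
    fn new(data: Vec<i64>) -> Self {
        ToyTensor { storage: Rc::new(RefCell::new(data)) }
    }

    /// Returns a second handle to the same buffer, like a view.
    fn view(&self) -> Self {
        ToyTensor { storage: Rc::clone(&self.storage) }
    }

    /// In-place element-wise XOR: self ^= other.
    fn xor_(&mut self, other: &ToyTensor) {
        // Snapshot the right-hand side first so the two borrows never overlap,
        // even when both handles point at the same buffer.
        let rhs = other.storage.borrow().clone();
        for (a, b) in self.storage.borrow_mut().iter_mut().zip(rhs) {
            *a ^= b;
        }
    }

    fn to_vec(&self) -> Vec<i64> {
        self.storage.borrow().clone()
    }
}

/// The classic XOR swap; it only swaps if `x` and `y` do not share storage.
fn xor_swap(x: &mut ToyTensor, y: &mut ToyTensor) {
    x.xor_(y);
    y.xor_(x);
    x.xor_(y);
}
```

With two independent tensors the contents are exchanged. With a tensor and its view, the first step XORs the buffer with itself, so both handles end up reading zeros, even though the borrow checker accepted two `&mut` arguments.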
Very interesting example, thanks. Indeed, there is more to it than race conditions. I'm still not sure whether fixing this is in the scope of this crate or of another layer on top of it. I have to think more about it, but meanwhile I have reopened this issue so that the current behavior is clearer. |
Also see PyO3/pyo3#342, this chapter in wasm-bindgen, and finally this stackoverflow question |
I'll bump this just to point out the comment I recently added to the PyO3 issue, which is basically that |
The title of this issue may be unnecessarily alarmist; I was certainly alarmed. It’s UB for a Rust reference to alias mutable data, and that would be very bad. But this library doesn’t appear to actually take Rust references to the underlying data inside a |
Linking burn's ticket #235: burn uses tch-rs and may have the same issue, so if we find a solution here you can refer to it too. |
@antimora that's very interesting, do you have a small tch example that triggers the UB? I don't think we ever came up with one and it's unclear that it could actually happen but if you have an example that would be very helpful in order to propose a fix. |
The problem arises when using For instance, reshaping a tensor followed by an in-place operation may or may not change the original tensor depending on the shape. So it seems like you have to treat tensors as unsafe since they are not really owners of their data.
let tensor = ...
let mut tensor_cloned = tensor.shallow_clone();
tensor_cloned.relu_(); // Changes the first tensor
It's not a big problem when you explicitly call I fixed the problem by carefully tracking references to |
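To make the difference between a shared handle and a fresh allocation concrete, here is a small self-contained model. `ToyTensor`, `deep_copy` and `shares_storage_with` are hypothetical names chosen for illustration, and the `Rc<RefCell<...>>` storage is an assumption standing in for LibTorch's shared buffers.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Toy model of a tensor handle whose buffer can be shared (not the tch type).
struct ToyTensor {
    storage: Rc<RefCell<Vec<f32>>>,
}

impl ToyTensor {
    fn new(data: Vec<f32>) -> Self {
        ToyTensor { storage: Rc::new(RefCell::new(data)) }
    }

    /// New handle, same buffer: in-place ops are visible through both handles.
    fn shallow_clone(&self) -> Self {
        ToyTensor { storage: Rc::clone(&self.storage) }
    }

    /// New handle, new buffer: in-place ops on the copy leave the source alone.
    fn deep_copy(&self) -> Self {
        ToyTensor::new(self.storage.borrow().clone())
    }

    /// In-place ReLU: negative entries become 0.
    fn relu_(&mut self) {
        for v in self.storage.borrow_mut().iter_mut() {
            *v = v.max(0.0);
        }
    }

    /// True when both handles point at the same buffer.
    fn shares_storage_with(&self, other: &ToyTensor) -> bool {
        Rc::ptr_eq(&self.storage, &other.storage)
    }

    fn to_vec(&self) -> Vec<f32> {
        self.storage.borrow().clone()
    }
}
```

In this model, calling `relu_` on a `shallow_clone` changes what the original handle reads, while calling it on a `deep_copy` does not. An operation that returns one or the other depending on its input would make the effect of a later in-place call hard to predict from the types alone.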
Do you have evidence that this is actually undefined behavior, or is it merely a confusing use of interior mutability? You can do confusing things with |
I asked @nathanielsimard to respond to your request since he is the one who fixed a bug on our end. |
@andersk I would say this:
The difference with From the same link:
|
That’s my point though—as far as I’m aware, the data in a Lines 18 to 20 in 123c96c
If this library were to expose an operation that dereferences this raw pointer and returns a Rust reference, that could be used by safe code to trigger undefined behavior. But I don’t see an operation like that. ( |
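The distinction drawn here can be sketched with a toy handle that only ever touches its data through a raw pointer. `Handle`, `alias`, `get` and `set` are hypothetical names, and nothing below is a claim about how tch is implemented; it only illustrates why mutation behind `&self` through a raw pointer is a different situation from aliasing a `&mut`.

```rust
/// Toy handle that reaches its data only through a raw pointer.
/// Hypothetical type for illustration; it is not how tch defines `Tensor`.
struct Handle {
    ptr: *mut i64,
}

impl Handle {
    /// Allocates one value on the heap and keeps only a raw pointer to it.
    /// The allocation is deliberately never freed to keep the sketch short;
    /// a real wrapper would reference-count it.
    fn new(value: i64) -> Self {
        Handle { ptr: Box::into_raw(Box::new(value)) }
    }

    /// Second handle to the same allocation, like a shallow clone.
    fn alias(&self) -> Self {
        Handle { ptr: self.ptr }
    }

    /// Reads through the raw pointer without creating a `&i64`.
    fn get(&self) -> i64 {
        // SAFETY: `ptr` comes from `Box::into_raw` and is never freed. The raw
        // pointer field makes `Handle` neither `Send` nor `Sync`, so safe code
        // cannot race on it from another thread.
        unsafe { self.ptr.read() }
    }

    /// Writes through the raw pointer from a shared `&self`.
    fn set(&self, value: i64) {
        // SAFETY: same reasoning as `get`; no Rust reference to the pointee
        // exists, so this write cannot invalidate one.
        unsafe { self.ptr.write(value) }
    }
}
```

Two handles created with `alias` observe each other's writes, which can be surprising, but no `&i64` or `&mut i64` to the shared value is ever handed out. Adding a method that returned `&mut i64` from `&self` would change that and let safe callers create aliasing mutable references.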
I'm not qualified enough to say with certainty what is or is not undefined behavior; I'm just pointing out that in-place operations can have surprising side effects, especially when coupled with operations that may or may not return a newly allocated tensor. I'm also not sure if there is something to do here: LibTorch was not written in Rust, and if a view is modified, they may choose to modify the parent tensor as well! I don't see how this could be fixed while still keeping an API as close as possible to the C++ API.
|
Hi,
I'm not super familiar with the design of the library, but I've been trying to use it in a project of mine. In doing so I found that Tensor::f_copy mutates a Tensor that's passed as an immutable reference, which is UB in Rust. I think one of the arguments should be changed to &mut to fix this. Sorry if I've missed some context.
Thanks!
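A minimal sketch of the suggested signature change, using a toy type rather than the real `Tensor`: `ToyTensor`, `f_copy_from` and `LengthMismatch` are hypothetical names, and the length check is an assumption added only to give the fallible method something to report.

```rust
/// Toy tensor that owns its data outright (not the tch type).
#[derive(Debug, PartialEq)]
struct ToyTensor {
    data: Vec<f32>,
}

/// Hypothetical error for the sketch: source and destination differ in length.
#[derive(Debug, PartialEq)]
struct LengthMismatch {
    expected: usize,
    found: usize,
}

impl ToyTensor {
    /// Copies `src` into `self`. The destination is taken as `&mut self`,
    /// so the mutation is visible in the signature and checked by the borrow
    /// checker; the source only needs `&`.
    fn f_copy_from(&mut self, src: &ToyTensor) -> Result<(), LengthMismatch> {
        if self.data.len() != src.data.len() {
            return Err(LengthMismatch {
                expected: self.data.len(),
                found: src.data.len(),
            });
        }
        self.data.copy_from_slice(&src.data);
        Ok(())
    }
}
```

With this shape a caller has to write `let mut dst = ...` before copying into it, which is the property the report asks for.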