Batch: don't create new objects on getitem #1086
Comments
I can look into this. I think the example above has a small typo. It should be |
Yes, you're right about the typo. Of all the batch issues, this might be the hardest one. I'm not sure how it can be solved at all, tbh. |
Interesting:
|
Even more confusing: for batches with only sub-batches, getitem does work as expected, but if a sequence is involved it creates a new object:
b = Batch(a=[1, 2, 3])
b[0] == b[0]
>>> False |
Note that if there is a solution, it should also work for slices. Right now
b[:2] == b[:2]
>>> False
One idea: we likely can't make it return the same object, but we could add |
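To make the reported behaviour concrete, here is a minimal sketch using a hypothetical stand-in class, `ToyBatch`, which is not Tianshou's actual `Batch`. It assumes only what the thread describes: `__getitem__` builds a fresh object on every call and no `__eq__` is defined, so `==` falls back to identity comparison.

```python
class ToyBatch:
    """Hypothetical stand-in for a batch-like container (not Tianshou's Batch)."""

    def __init__(self, **kwargs):
        self.data = dict(kwargs)

    def __getitem__(self, index):
        # Every call builds a brand-new ToyBatch, for integer indices and slices alike.
        return ToyBatch(**{key: value[index] for key, value in self.data.items()})


b = ToyBatch(a=[1, 2, 3])

first, second = b[0], b[0]
same_object = first is second        # two distinct objects
equal_by_default = first == second   # no __eq__ defined, so == compares identity
slices_equal = b[:2] == b[:2]        # slices hit the same problem

print(same_object, equal_by_default, slices_equal)  # False False False
```

Under these assumptions, the failure needs nothing special about sequences: any container that returns new objects and inherits the default `object.__eq__` behaves this way.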
Yes, seems to be quite involved at this point. I wonder how
Yes, this sounds good. I'll try this out. I don't think it would hurt later if we do find a solution for the object equality. |
As I found out just now, python's own list actually cannot do this, so
Since |
Huh, actually, I was slightly wrong, but in a weird way. There seems to be some magic happening when a var is assigned the id of a list view. Anyhow, the id of Python list slices is not completely fixed |
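The observation about built-in lists can be checked with plain Python. The sketch below keeps both slices alive while comparing them, which is the only situation in which `id()` values are guaranteed to differ. The Python documentation notes that two objects with non-overlapping lifetimes may share the same `id()`, which is possibly the "magic" mentioned above, though that is an assumption about what was observed.

```python
items = [1, 2, 3]

# Keep both slices alive so that their ids can be compared meaningfully.
left = items[:2]
right = items[:2]

distinct_objects = left is not right   # each slice is a new list object
distinct_ids = id(left) != id(right)   # guaranteed while both objects exist
equal_values = left == right           # list.__eq__ compares element-wise

print(distinct_objects, distinct_ids, equal_values)  # True True True

# By contrast, id(items[:2]) == id(items[:2]) compares the ids of temporaries
# that may already have been freed, so its result is implementation-dependent.
```

So lists do not return the same object on slicing either. Their `==` works because `list` defines value-based equality.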
Closes: #1086
### Api Extensions
- Batch received new method: `to_numpy_`. #1098
- `to_dict` in Batch supports also non-recursive conversion. #1098
- Batch `__eq__` now implemented, semantic equality check of batches is now possible. #1098
### Breaking Changes
- The method `to_numpy` in `data.utils.batch.Batch` is not in-place anymore. Instead, a new method `to_numpy_` does the conversion in-place. #1098
For reference: the objects returned on getitem still have different ids. This issue was resolved by implementing |
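As a rough illustration of what a semantic equality check means here, the sketch below adds `__eq__` to a hypothetical `ToyBatch` class. This is not the Tianshou implementation: it compares plain Python contents via dict equality, whereas a real `Batch` holds arrays and tensors that need a more careful comparison (the thread mentions DeepDiff in that context).

```python
class ToyBatch:
    """Hypothetical stand-in for a batch-like container (not Tianshou's Batch)."""

    def __init__(self, **kwargs):
        self.data = dict(kwargs)

    def __getitem__(self, index):
        # Still returns a new object on every call.
        return ToyBatch(**{key: value[index] for key, value in self.data.items()})

    def __eq__(self, other):
        # Semantic equality: same type and equal contents, regardless of identity.
        if not isinstance(other, ToyBatch):
            return NotImplemented
        return self.data == other.data

    # Note: defining __eq__ sets __hash__ to None, which suits a mutable container.


b = ToyBatch(a=[1, 2, 3], nested=ToyBatch(x=[4, 5, 6]))

ids_differ = b[0] is not b[0]     # getitem still creates new objects
index_equal = b[0] == b[0]        # but they now compare equal by content
slice_equal = b[:2] == b[:2]      # the same holds for slices
different_unequal = b[0] != b[1]  # different contents remain unequal

print(ids_differ, index_equal, slice_equal, different_unequal)  # True True True True
```

The nested `ToyBatch` is compared recursively because dict equality calls `__eq__` on each value.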
I just had the case where I wanted to compare two batches that contained torch distributions logged during the training process. This comparison fails with a |
Thx for spotting it! It should indeed work. There are some tests that cover this, but as I was digging into it I noticed that it fails for some other cases, e.g:
I will look into it asap. I apologize for the inconvenience. EDIT:
|
@maxhuettenrauch So far it seems that the issue arises when dealing with zero-dimensional arrays. To remain flexible wrt DeepDiff's, I suggest that we perform an additional processing step in |
In the last months I implemented a lot of helper things that also could help with this issue. Gonna open a PR tomorrow and assign you two as reviewers |
@MischaPanch Should I go ahead with the proposal above? Or does one of your helper methods already cover this edge case? |
@MischaPanch I experimented today with the new Batch API (#1181), specifically |
Currently `Batch.__getitem__` will always create a new object. This is counterintuitive and destroys equality checks. E.g., will result in `id1 != id2`, which leads to `b[0] == b[0]` being `False`.
Related to #922