Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Allow two (same/different) Batch objs to be tested for equality #1098

Merged
merged 12 commits into from
Apr 16, 2024

Conversation

dantp-ai
Copy link
Contributor

@dantp-ai dantp-ai commented Apr 4, 2024

Closes: #1086

Api Extensions

Breaking Changes

tianshou/data/batch.py Outdated Show resolved Hide resolved
tianshou/data/batch.py Outdated Show resolved Hide resolved
tianshou/data/batch.py Outdated Show resolved Hide resolved
test/base/test_batch.py Outdated Show resolved Hide resolved
dantp-ai added 5 commits April 7, 2024 16:14
  * Introduce a new test class named `TestBatchEquality`.
  * Got rid of comments and assert messages as the test method names are expressive enough.
  * Add tests for `Batch.to_dict()`
  * Add deepdiff library (for testing equality of dicts in the tests mentioned above)
    * Update poetry.toml and poetry.lock
  * Note: `Batch.to_numpy()` should be extended to support also a non in-place operation.
if not isinstance(other, self.__class__):
return False

self.to_numpy()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This modifies self and other inplace on a supposedly harmless equality check - this should never happen.

Starting a new thread about the questions you raised regarding to_numpy.

  1. For other methods we have the convention that things ending in _ are inplace. So it should really be that to_numpy_ operates inplace and returns nothing, whereas to_numpy() doesn't change the original and returns a new batch. That's a breaking change, so you'd need to check all the places where to_numpy is currently being used.
  2. I think the name to_numpy is ok for the moment, since all values of batch are either batches, tensors or arrays, right? Then it's clear that to_numpy would turn all tensors into arrays.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Batch.to_torch() has same problem because it is also in-place. We can change this in another PR?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for bringing it up! We should generally reduce the number of inplace operations as much as possible

tianshou/data/batch.py Outdated Show resolved Hide resolved
  * Breaking change: Previous in-place `Batch.to_numpy` is now `Batch.to_numpy_` (following naming convention of other in-place methods).
  * Update places where in-place was expected
  * Add tests for both to_numpy/to_numpy_
tianshou/data/batch.py Outdated Show resolved Hide resolved
@MischaPanch MischaPanch merged commit ca4f74f into thu-ml:master Apr 16, 2024
4 checks passed
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Batch: don't create new objects on getitem
2 participants