Support for 0-Dimensional Tensors in Burn #1689

Open
antimora opened this issue Apr 24, 2024 · 6 comments
Labels
enhancement Enhance existing features

Comments

@antimora
Collaborator

Motivation

Currently, the Burn deep learning framework in Rust lacks support for 0-dimensional tensors (scalars). Adding support for 0-dimensional tensors would enhance the framework's capabilities and provide several benefits:

  1. Completeness: Supporting 0-dimensional tensors would make Burn more complete and consistent with other deep learning frameworks that already support scalars. Notably, ONNX (Open Neural Network Exchange) format, widely used for interoperability between frameworks, often deals with 0-dimensional tensors.

  2. Simplified Operations: Many deep learning operations, such as loss functions and regularization terms, often involve scalar values. Loss functions, in particular, are crucial for training models and are typically represented as 0-dimensional tensors. Having native support for 0-dimensional tensors would simplify the implementation and usage of such operations, making it easier to compute and optimize losses during training.

  3. Interoperability: Seamless integration with other libraries and frameworks that utilize 0-dimensional tensors would be improved, enabling smoother interoperability and data exchange. This is particularly important when working with ONNX models that frequently incorporate 0-dimensional tensors.

  4. Reduced Workarounds: Without 0-dimensional tensor support, users may need to resort to workarounds such as 1-dimensional tensors with a single element, which are less intuitive and can be less efficient.

  5. Avoiding Unnecessary Data Copying: By supporting 0-dimensional tensors directly, Burn can avoid unnecessary data copying from the device (e.g., GPU) to the host (CPU) and vice versa. This can lead to improved performance and reduced memory overhead, especially when dealing with large-scale models and datasets.
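To make the distinction in point 4 concrete, here is a minimal sketch of how a 0-dimensional tensor differs from the single-element 1-dimensional workaround. `ToyTensor` is a hypothetical, dynamically shaped stand-in written only for this illustration; it is not Burn's `Tensor` type and says nothing about how Burn would store shapes.

```rust
/// Hypothetical stand-in for a dynamically shaped tensor; NOT Burn's `Tensor`.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyTensor {
    pub shape: Vec<usize>,
    pub data: Vec<f32>,
}

impl ToyTensor {
    /// A 0-dimensional tensor: empty shape, exactly one element.
    pub fn scalar(value: f32) -> Self {
        Self { shape: Vec::new(), data: vec![value] }
    }

    /// The workaround: a 1-dimensional tensor of length 1.
    pub fn single_element(value: f32) -> Self {
        Self { shape: vec![1], data: vec![value] }
    }

    /// Number of dimensions (rank).
    pub fn rank(&self) -> usize {
        self.shape.len()
    }

    /// Number of elements: the product of the dimensions.
    /// The empty product is 1, so a 0-dim tensor holds one value.
    pub fn num_elements(&self) -> usize {
        self.shape.iter().product()
    }
}
```

Both `ToyTensor::scalar(0.25)` and `ToyTensor::single_element(0.25)` hold one element, but their ranks are 0 and 1 respectively, so code that inspects the shape (for example an ONNX importer) has to treat them differently.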

Proposed Solution

To address this limitation, we propose the following:

  1. Extend the Tensor struct in Burn to support 0-dimensional tensors.
  2. Implement necessary methods and traits for creating, manipulating, and operating on 0-dimensional tensors.
  3. Update relevant functions and operations to handle 0-dimensional tensors correctly, with a focus on loss computation and optimization.
  4. Ensure proper broadcasting and type promotion rules are followed when mixing 0-dimensional tensors with higher-dimensional tensors.
  5. Add comprehensive unit tests to verify the correctness and consistency of 0-dimensional tensor support, including tests specifically related to loss functions.
  6. Update the documentation and examples to showcase the usage of 0-dimensional tensors, particularly in the context of loss computation and ONNX interoperability.
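As an illustration of step 4, the sketch below computes a broadcast shape when one operand is 0-dimensional. It assumes NumPy-style rules (align shapes from the trailing dimension; sizes must match or one of them must be 1); the issue does not say which rules Burn would adopt, and `broadcast_shape` is a hypothetical helper, not a Burn function.

```rust
/// Compute the broadcast shape of two shapes under NumPy-style rules
/// (an assumption for this sketch). Returns `None` if they are incompatible.
pub fn broadcast_shape(a: &[usize], b: &[usize]) -> Option<Vec<usize>> {
    let rank = a.len().max(b.len());
    let mut out = vec![1; rank];
    for i in 0..rank {
        // Walk from the trailing dimension; missing leading dims count as 1.
        let da = if i < a.len() { a[a.len() - 1 - i] } else { 1 };
        let db = if i < b.len() { b[b.len() - 1 - i] } else { 1 };
        out[rank - 1 - i] = match (da, db) {
            (x, y) if x == y => x,
            (1, y) => y,
            (x, 1) => x,
            _ => return None,
        };
    }
    Some(out)
}
```

Under these rules a 0-dimensional operand (empty shape) broadcasts against any shape without changing it: `broadcast_shape(&[], &[2, 3])` yields `Some(vec![2, 3])`, while mismatched shapes such as `[2, 3]` and `[4]` yield `None`.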

Benefits

By implementing support for 0-dimensional tensors, Burn will:

  • Provide a more complete and consistent API for tensor operations, aligning with ONNX and other frameworks.
  • Simplify the implementation of common deep learning operations involving scalars, especially loss functions.
  • Enhance interoperability with other libraries and frameworks, particularly when working with ONNX models.
  • Improve usability and reduce the need for workarounds.
  • Optimize performance by avoiding unnecessary data copying between devices.

Potential Challenges

  • Ensuring backward compatibility with existing code and models.
  • Handling edge cases and maintaining consistency with broadcasting rules.
  • Optimizing performance for operations involving 0-dimensional tensors, especially in the context of loss computation.

Next Steps

  1. Discuss and refine the proposed solution with the Burn community.
  2. Create a detailed implementation plan and allocate resources.
  3. Implement the necessary changes and additions to support 0-dimensional tensors, with a focus on loss computation and ONNX compatibility.
  4. Conduct thorough testing and address any issues or edge cases, including tests for loss functions and ONNX interoperability.
  5. Update the documentation and examples, highlighting the usage of 0-dimensional tensors in loss computation and ONNX scenarios.
  6. Release a new version of Burn with 0-dimensional tensor support.

We believe that adding support for 0-dimensional tensors will significantly enhance the capabilities and usability of the Burn deep learning framework in Rust, particularly in the context of loss computation and ONNX interoperability. We look forward to feedback and collaboration from the community to make this feature a reality.

@antimora
Collaborator Author

CC @nathanielsimard, @laggui, @louisfd

@antimora
Collaborator Author

antimora commented Apr 24, 2024

@LaurentMazare has confirmed Candle supports 0D tensors:

Zermelo Fraenkel: Scalars values (tensors with 0 dimension) should be supported. Empty tensors (multiple dimensions but with one of them being zero) should also be supported but only to some extent. Certainly interested if you find places where this doesn't work properly.

PyTorch supports 0D tensors

@antimora antimora added the enhancement Enhance existing features label Apr 24, 2024
@antimora
Collaborator Author

cc @ashdtu

@antimora
Collaborator Author

@laggui found that Ndarray supports 0dim arrays: https://docs.rs/ndarray/latest/ndarray/type.Array0.html

@antimora
Collaborator Author

@nathanielsimard and I had an offline conversation.

Here's a summary of the conversation for others:

We discussed the need to support scalar tensors in the Burn deep learning framework. While scalar values can be encoded as rank-1 tensors, the main issue is that an automatic broadcasting API cannot currently be provided in stable Rust because of limitations with const generics.

As a better long-term solution, we proposed introducing a new Scalar type, which would be an enum that can hold either a native value (e.g., f32) or a rank-1 tensor. This explicit Scalar type would provide more security and avoid unnecessary broadcast operations. It would also be beneficial for exporting to other formats like ONNX, since all operations can be tracked in a computation graph.

We plan to modify the burn_tensor module to include this Scalar type, with variants like Scalar<Int>, Scalar<Float>, and Scalar<Bool>. This change would not introduce any breaking changes to the existing API.

Overall, while the naming and exact implementation details still need to be finalized, we agreed that introducing a dedicated Scalar type is a good idea to handle scalar values properly in the Burn framework.
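A minimal sketch of what such a Scalar enum could look like, under stated assumptions: the names `Rank1`, `Native`, `Tensor`, `multiply`, and `value` are hypothetical, the element type is fixed to `f32` for brevity (the discussion mentions Int, Float, and Bool kinds), and `Rank1` is a plain `Vec` standing in for a real device tensor. The final design may differ.

```rust
/// Hypothetical stand-in for a rank-1, single-element device tensor; NOT a Burn type.
#[derive(Debug, Clone, PartialEq)]
pub struct Rank1(pub Vec<f32>);

/// Sketch of the discussed `Scalar`: a native host value or a rank-1 tensor.
#[derive(Debug, Clone, PartialEq)]
pub enum Scalar {
    Native(f32),
    Tensor(Rank1),
}

impl Scalar {
    /// Lift to tensor form (a native value becomes a one-element tensor).
    fn to_tensor(&self) -> Rank1 {
        match self {
            Scalar::Native(v) => Rank1(vec![*v]),
            Scalar::Tensor(t) => t.clone(),
        }
    }

    /// Multiply two scalars. The result is native only if both inputs are
    /// native; otherwise it stays in tensor form.
    pub fn multiply(&self, other: &Scalar) -> Scalar {
        match (self, other) {
            (Scalar::Native(a), Scalar::Native(b)) => Scalar::Native(a * b),
            (a, b) => {
                let (ta, tb) = (a.to_tensor(), b.to_tensor());
                Scalar::Tensor(Rank1(vec![ta.0[0] * tb.0[0]]))
            }
        }
    }

    /// Read the value on the host. For the tensor variant this models the
    /// device-to-host read a real backend would have to perform.
    pub fn value(&self) -> f32 {
        match self {
            Scalar::Native(v) => *v,
            Scalar::Tensor(t) => t.0[0],
        }
    }
}
```

The point of the two variants is that mixing a native value with a tensor-backed one keeps the result in tensor form, so the operation could remain part of a tracked computation graph, and the value is only read out when `value` is called explicitly.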

@laggui
Member

laggui commented Oct 8, 2024

While most of the discussion so far has been about 0-dim tensors for scalar support, two other supporting arguments came up on Discord this morning, both concerning 0-length or 0-size tensors (i.e., where one of the dimensions is zero):

  • Sparse tensors (when a tensor has only zeros in a dimension, that dimension would become 0 when converted to its sparse representation)
  • Current ops like argwhere() already return 0-size tensors for inputs that are all zeros (#2212, "Nonzero should return an empty vec for zero tensors", fixed the crash for the operation results, but using that tensor with other operations would lead to other problems)

This is related to scalar support (0-dim) but could possibly be tracked as a separate issue.
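To show how a 0-size result arises, here is a toy `argwhere` over a 1-D slice. It is a sketch only: `argwhere_1d` and `num_elements` are hypothetical helpers, and the `[count, 1]` result shape is an assumption modelled on the usual "one row per non-zero element" convention rather than taken from Burn's implementation.

```rust
/// Toy `argwhere` for a 1-D input: returns the indices of the non-zero
/// elements together with the assumed result shape `[count, 1]`.
pub fn argwhere_1d(input: &[f32]) -> (Vec<usize>, [usize; 2]) {
    let indices: Vec<usize> = input
        .iter()
        .enumerate()
        .filter(|(_, v)| **v != 0.0)
        .map(|(i, _)| i)
        .collect();
    let shape = [indices.len(), 1];
    (indices, shape)
}

/// Element count for a shape; any zero-sized dimension gives zero elements.
pub fn num_elements(shape: &[usize]) -> usize {
    shape.iter().product()
}
```

For an all-zero input the index list is empty and the shape is `[0, 1]`: a rank-2 tensor with no elements. That is different from a 0-dim tensor, which has an empty shape and exactly one element, and it is why the two cases may deserve separate tracking.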


2 participants