RFC: clarify broadcast_to semantics #823

Open
adityagoel4512 opened this issue Jul 17, 2024 · 7 comments
Labels
Maintenance, Narrative Content, topic: Broadcasting

@adityagoel4512

adityagoel4512 commented Jul 17, 2024

I'm finding the broadcast_to specification a little underspecified. In the docs we see the following for the shape parameter:

> shape (Tuple[int, ...]) – array shape. Must be compatible with x (see Broadcasting). If the array is incompatible with the specified shape, the function should raise an exception.

The broadcasting link goes on to specify bidirectional broadcasting. That would imply to me that np.broadcast_to(np.asarray([[-1, -1], [-1, -1]]), (2, 1, 2)) should work since shapes (2, 2) and (2, 1, 2) are bidirectionally compatible. Somewhat reasonably in my opinion, NumPy did not interpret this in that way and raises an exception.

Since np.broadcast_to(np.asarray([[-1, -1], [-1, -1]]), (1, 2, 2)) does work, it seems that broadcasting compatibility here is unidirectional, i.e., x.shape must be broadcastable to shape. Is it worth explicitly spelling out the difference between the two, like ONNX does? I couldn't find any explanation in the standard itself.
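To make the distinction concrete, here is a minimal pure-Python sketch of the unidirectional check described above. The helper name `can_broadcast_to` is hypothetical (it is not part of NumPy or the array API standard), and the rule it encodes is my reading of the behaviour reported in this issue rather than normative text.

```python
def can_broadcast_to(src_shape, target_shape):
    """Return True if src_shape can be broadcast to exactly target_shape.

    Hypothetical helper for illustration; not a NumPy or array API function.
    """
    src, target = tuple(src_shape), tuple(target_shape)
    if len(src) > len(target):
        return False  # broadcasting never removes dimensions
    # Left-pad the source with 1s so both shapes have the same rank.
    padded = (1,) * (len(target) - len(src)) + src
    # Each source dimension must either match the target or be 1 (stretchable).
    return all(s == t or s == 1 for s, t in zip(padded, target))


# The two cases from this issue:
print(can_broadcast_to((2, 2), (1, 2, 2)))  # True
print(can_broadcast_to((2, 2), (2, 1, 2)))  # False
```

Only the source shape is allowed to stretch here; the target shape is taken as fixed, which is why the second case fails even though the two shapes share a common broadcast shape.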

It does say the following, although I read the "a specified shape" part as "any shape" rather than simply the shape parameter.

> Returns:
> out (array) – an array having a specified shape. Must have the same data type as x.

If this ambiguity is shared I am happy to contribute a clarification.

@asmeurer
Member

I agree it should be updated. The unidirectional broadcasting is important for in-place operators (see https://data-apis.org/array-api/latest/API_specification/broadcasting.html#in-place-semantics).
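As a rough illustration of why in-place operators need the one-directional rule: `x1 op= x2` can only be carried out if the broadcast result has exactly `x1`'s shape, since `x1` cannot grow. The sketch below assumes that reading of the in-place semantics; `broadcast_result_shape` and `inplace_shape_ok` are hypothetical names, not functions from the standard.

```python
from itertools import zip_longest


def broadcast_result_shape(shape1, shape2):
    """Common broadcast shape of two shapes, or raise ValueError."""
    dims = []
    # Walk both shapes from the trailing dimension; missing dims count as 1.
    for d1, d2 in zip_longest(reversed(shape1), reversed(shape2), fillvalue=1):
        if d1 == d2 or d2 == 1:
            dims.append(d1)
        elif d1 == 1:
            dims.append(d2)
        else:
            raise ValueError(f"shapes {shape1} and {shape2} are not broadcastable")
    return tuple(reversed(dims))


def inplace_shape_ok(x1_shape, x2_shape):
    """True if `x1 op= x2` would leave x1's shape unchanged."""
    return broadcast_result_shape(x1_shape, x2_shape) == tuple(x1_shape)


print(inplace_shape_ok((2, 2), (2,)))       # True
print(inplace_shape_ok((2, 2), (2, 1, 2)))  # False
```

In the second case the shapes are compatible in the two-sided sense, but the result would have shape (2, 2, 2), which cannot be stored back into a (2, 2) array.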

@cbourjau
Contributor

@asmeurer While I'm not aware of a read-only concept in the array API, I just wanted to point out that NumPy's broadcast_to does place that restriction on the returned array: https://numpy.org/doc/stable/reference/generated/numpy.broadcast_to.html#numpy-broadcast-to

@asmeurer
Member

Read-only isn't a concept that's in the array API. Not all libraries might implement it. The array API leaves all mutation with views undefined so this isn't an issue.

@seberg
Contributor

seberg commented Jul 18, 2024

It is a slight issue, since += can't work on read-only arrays, but I agree it's niche enough not to really worry about.

Not sure I like the "bidirectional" term, but happy if Aaron is. It might be good to just say that broadcastable arrays means that there is a common broadcast shape that both arrays can be "broadcast to" (the algorithm describing how to find said broadcast shape).
I feel that "bidirectional" might make you think you could shrink a dimension if the values are identical along it, which isn't a concept (and I don't think that needs to be explained anywhere).

@rgommers
Member

Agreed that it's a slight issue at the moment - more for NumPy than for the array API standard though. The ideal solution would be something like copy-on-write for NumPy, which could be introduced in a backwards compatible way since += & co now raise an exception.

@seberg
Contributor

seberg commented Jul 18, 2024

> more for NumPy than for the array API standard though

I'll have to disagree until I see a clearer plan on how NumPy could introduce CoW (for read-only arrays?) while not generally breaking view semantics in a way that the whole world gets wrong results. (And NumPy isn't the only array library that uses view semantics!)

@asmeurer
Member

I can't say that I find the names "bidirectional" and "unidirectional" particularly appealing. I personally think of broadcasting as a (non-closed) binary operation on shape tuples. "Unidirectional" broadcasting is a special case where the result of the broadcast has to be the same as the second shape.
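One way to write that framing down, as a sketch only: treat broadcasting as a partial binary operation on shape tuples (returning `None` for incompatible pairs, which is what makes it non-closed), and define the "unidirectional" case as the result being equal to the second shape. The names `broadcast` and `broadcasts_to` are hypothetical and chosen for illustration.

```python
from typing import Optional, Tuple

Shape = Tuple[int, ...]


def broadcast(a: Shape, b: Shape) -> Optional[Shape]:
    """Binary operation on shapes; None marks an incompatible pair."""
    rank = max(len(a), len(b))
    # Left-pad the shorter shape with 1s so both have the same rank.
    a = (1,) * (rank - len(a)) + tuple(a)
    b = (1,) * (rank - len(b)) + tuple(b)
    out = []
    for da, db in zip(a, b):
        if da == db or db == 1:
            out.append(da)
        elif da == 1:
            out.append(db)
        else:
            return None  # neither dimension is 1 and they differ
    return tuple(out)


def broadcasts_to(a: Shape, b: Shape) -> bool:
    """Special case: the broadcast result must equal the second shape."""
    return broadcast(a, b) == tuple(b)


print(broadcast((2, 2), (2, 1, 2)))      # (2, 2, 2)
print(broadcasts_to((2, 2), (2, 1, 2)))  # False
print(broadcasts_to((2, 2), (1, 2, 2)))  # True
print(broadcast((2, 3), (4, 3)))         # None
```

The binary operation is symmetric in its arguments, while `broadcasts_to` is not, which matches the asymmetry observed for broadcast_to at the top of this issue.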

kgryte added this to the v2024 milestone on Sep 19, 2024
kgryte added the Maintenance, Narrative Content, and topic: Broadcasting labels on Sep 19, 2024
kgryte changed the title from "broadcast_to semantics" to "RFC: clarify broadcast_to semantics" on Sep 19, 2024
6 participants