Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

API: Add and redefine numpy.bool [Array API] #25080

Merged
merged 3 commits into from
Nov 24, 2023
Merged

Conversation

mtsokol
Copy link
Member

@mtsokol mtsokol commented Nov 6, 2023

Hi @rgommers @ngoldbaum @seberg,

One of the members of Array API is bool. Currently np.bool_ is present and I think we already had some initial conversations in NEP 52 issues about renaming it to np.bool.

Would you like to keep both names in the long term? Or should I remove np.bool_ usages and deprecate it?

@mtsokol mtsokol self-assigned this Nov 6, 2023
@mtsokol mtsokol marked this pull request as draft November 6, 2023 17:01
@ngoldbaum
Copy link
Member

Some context from prior discussions: #24306 (comment)

The deprecation happened in 1.20 (Jan 2021) and was finalized in NumPy 1.24 (Dec 2022).

Is a year enough time for projects to migrate? I'm honestly not sure. Some light github code searches indicate that usages of "np.bool" are mostly in code that hasn't been touched in several years.

@rgommers
Copy link
Member

rgommers commented Nov 6, 2023

I don't see any problem with putting np.bool back. We should never have even removed it I believe; IIRC we discussed restoring it immediately because the deprecation was not a good choice. We just failed to follow through on that.

Either way, a couple more arguments:

  • 2.0 is a major release, so a good time for this
  • This is an essential part of array API support, which we already decided we wanted to do back in the 2.0 developer meeting in April (NEP to expand on that to follow)
  • Code that hasn't been touched in several years isn't really that interesting, it's probably still not going to be touched in 1 or 2 years from now
  • The linked discussion in gh-24306 is mostly about the True_ and False_ singletons, which is related but not the same thing. And 99.x% of usage is through dtype=np.bool keywords, which is compatible with the old usage. Calling np.bool(a_value) is very niche in comparison.

@rgommers
Copy link
Member

rgommers commented Nov 6, 2023

Or should I remove np.bool_ usages and deprecate it?

We should keep that for quite a while, in order to not force people to write if/else stuff depending on the numpy version. In the long run it can go, but there is no hurry I think.

@rgommers rgommers added this to the 2.0.0 release milestone Nov 6, 2023
@ngoldbaum
Copy link
Member

Code that hasn't been touched in several years isn't really that interesting, it's probably still not going to be touched in 1 or 2 years from now

Agreed, I was trying to say that the migration appears mostly complete in downstream code.

I think a year is probably long enough given Ralf's points.

@mtsokol mtsokol force-pushed the numpy-bool branch 2 times, most recently from f546273 to a9180c0 Compare November 7, 2023 11:19
@mtsokol mtsokol marked this pull request as ready for review November 7, 2023 11:20
@mtsokol mtsokol changed the title API: Reintroduce numpy.bool API: Reintroduce numpy.bool [Array API] Nov 7, 2023
Copy link
Member

@BvB93 BvB93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main numpy/__init__.pyi file also needs a bool = bool_ declaration for type checkers to pick up the alias.

This is actually gona be somewhat cumbersome however, as this would result in a naming conflict with the builtin bool type that can only be resolved via replacing all previous mentions of bool in the module with an explicit builtins.bool.

For reference: we've had similar issues before inside the dtype class due to a dtype.str/builtins.str naming conflict (xref #20224)

@seberg
Copy link
Member

seberg commented Nov 7, 2023

If we reintroduce bool the name should maybe be the main name and bool_ the (hidden?) alias? (this only really changes how np.bool_ is printed)
There is also the open question if people prefer np.bool(True) over np.True_.

@mtsokol
Copy link
Member Author

mtsokol commented Nov 7, 2023

If we reintroduce bool the name should maybe be the main name and bool_ the (hidden?) alias? (this only really changes how np.bool_ is printed) There is also the open question if people prefer np.bool(True) over np.True_.

Right now np.bool is an alias for np.bool_. The other way around is fine for me - it will require a bit more refactoring, because in numpy/core there are a few places with from ... import * and numpy's bool would override builtin bool (as described by @BvB93).

@rgommers
Copy link
Member

rgommers commented Nov 7, 2023

Good point, +1 for making np.bool the main/preferred name.

There is also the open question if people prefer np.bool(True) over np.True_.

Slight preference for np.bool(True) here, seems nicer - but also happy if others feel differently.

@ngoldbaum
Copy link
Member

Another +1 for np.bool(True). It also moves us more in the direction of eventually getting rid of np.True_ (which we decided was too disruptive for 2.0, but maybe not for 3.0...) instead of permanently enshrining it in the API.

@seberg
Copy link
Member

seberg commented Nov 7, 2023

One thing about np.True_/np.False_ is that they are true singletons and having them makes more obvious, but no big opinion.

@eric-wieser
Copy link
Member

eric-wieser commented Nov 7, 2023

I don't see any problem with putting np.bool back. We should never have even removed it I believe; IIRC we discussed restoring it immediately because the deprecation was not a good choice. We just failed to follow through on that.

I'm a bit confused by this claim; this PR is not putting np.bool back; it's replacing np.bool with something different to (and better than) what it previously contained, and the only sensible way to achieve that was to remove the old version first via a deprecation.

Perhaps the PR title and release note should be clear that this is not a "reintroduction", but a reallocation of the name to a new thing.

@rgommers
Copy link
Member

rgommers commented Nov 8, 2023

I'm a bit confused by this claim; this PR is not putting np.bool back; it's replacing np.bool with something different to (and better than) what it previously contained, and the only sensible way to achieve that was to remove the old version first via a deprecation.

Yes I understand that, and no it wasn't the only way to go about this. We now effectively made everyone change their code twice, while the vast majority of usage of the old np.bool, which are func(..., dtype=np.bool, would have worked fine when making the change to np.bool is np.bool_ without a deprecation. As would if np.bool(a_value) > 10: type things. In effect, we caused a lot of extra churn for limited benefit. It's a trade-off; making users change code twice is at least a "deprecation smell" though.

I am pretty sure we discussed undoing the np.bool deprecation after it was done, and concluded that we should probably undo that before the then-next release. We just then lost track of that, and never revisited the topic.

@mtsokol
Copy link
Member Author

mtsokol commented Nov 8, 2023

@rgommers @seberg I started working on making np.bool default one and np.bool_ a hidden alias (this will require ~1200 lines of code changed, mostly in typing to use np.bool - all of them simple changes).

I already pushed ~900 of them, I still need to fix a few pyi files - let me know if this scope of changes is fine/too disruptive for you.

@mtsokol mtsokol changed the title API: Reintroduce numpy.bool [Array API] API: Add and redefine numpy.bool [Array API] Nov 8, 2023
@rgommers
Copy link
Member

rgommers commented Nov 8, 2023

If they're all simple replacements then I think it's fine to do now - but it'd be completely fine to not do it as well. In the end what matters is what we write in the docs I think (and the repr etc. - user-visible stuff); the bool_ -> bool replacements can happen later too.

@seberg
Copy link
Member

seberg commented Nov 8, 2023

I honestly don't care if we do it now. I was mostly thinking of what the repr() shows.

@mtsokol mtsokol force-pushed the numpy-bool branch 2 times, most recently from 0c9b293 to 3ae3d00 Compare November 9, 2023 13:16
@mtsokol
Copy link
Member Author

mtsokol commented Nov 9, 2023

Ok, now reprs for False and True are updated. The PR is ready for a review.

@mtsokol
Copy link
Member Author

mtsokol commented Nov 16, 2023

During the triage call we decided that reprs for np.bool should stay as np.True_ and np.False_ (instead of new np.bool(True)). I will revert that change if there is no caveat. [Update] Reverted.

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking quite good. Everything works as far as I can tell. I have a couple of small comments inline. My main one is about the type stubs, where you replaced from numpy import _bool with from numpy import bool. And as a result have a lot of bool -> builtins.bool changes. Those may in some cases be useful anyway for clarity, however wouldn't it be better to not do from numpy import bool?

Then you'd have to write

NDArray[np.bool]

instead of

NDArray[bool]

but that seems useful anyway.

numpy/_core/defchararray.pyi Outdated Show resolved Hide resolved
@@ -8,7 +9,7 @@ from numpy import (
int64,
intp,
float16,
bool_,
bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The diff would be a lot smaller if you'd leave out this import. Using the builtin can then stay unchanged. Seems preferable to me to use np.bool (unless that doesn't work?).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to np.bool.

False_ = bool_(False)
True_ = bool_(True)
False_ = nt.bool(False)
True_ = nt.bool(True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using np.bool would be clearer; this relies on a weird replacement thing in numerictypes.py - it does a from builtins import bool and then overwrites that with:

# Now add the types we've determined to this module
for key in allTypes:
    globals()[key] = allTypes[key]
    __all__.append(key)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed it, but with np.bool importing NumPy fails with:

AttributeError: partially initialized module 'numpy' has no attribute 'bool' (most likely due to a circular import)

as numpy.core module is imported eagerly (__init__.py imports core/numeric.py which imports __init__.py).

I would leave nt.bool, or import it locally with a different name? WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay. In that case, maybe leave it as is, but ensure that there's a test that _core.numerictypes.bool is np.bool to guard against a future issue there?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! I added a test for that.

@mtsokol
Copy link
Member Author

mtsokol commented Nov 20, 2023

And as a result have a lot of bool -> builtins.bool changes. Those may in some cases be useful anyway for clarity, however wouldn't it be better to not do from numpy import bool?

Sure - I can use np.bool for NumPy's bool and bool for builtin, but I will update it this way in all stub files, so it's consistent across the codebase.

@rgommers
Copy link
Member

Sure - I can use np.bool for NumPy's bool and bool for builtin, but I will update it this way in all stub files, so it's consistent across the codebase.

Seems like a good idea to me, but let's ask @BvB93 if he has any preference for these stub files.

@mtsokol
Copy link
Member Author

mtsokol commented Nov 22, 2023

Sure - I can use np.bool for NumPy's bool and bool for builtin, but I will update it this way in all stub files, so it's consistent across the codebase.

Seems like a good idea to me, but let's ask @BvB93 if he has any preference for these stub files.

@rgommers I updated the PR - now NumPy's bool is accessed as np.bool only, and builtin bool as bool.

Could we merge this PR? If we come up with another preference/arrangement for stub files I will adjust it in a separate one.

@rgommers
Copy link
Member

The one CI failure is RuntimeWarning: invalid value encountered in matmul, at first sight that looks unrelated.

@seberg
Copy link
Member

seberg commented Nov 23, 2023

The one CI failure is RuntimeWarning: invalid value encountered in matmul, at first sight that looks unrelated

That has been a heisenbug (probably in openblas) since forever.

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we merge this PR? If we come up with another preference/arrangement for stub files I will adjust it in a separate one.

Agreed. This all looks solid, so let's go with it. Thanks Mateusz & reviewers!

@rgommers rgommers merged commit 94bc564 into numpy:main Nov 24, 2023
57 of 58 checks passed
@mtsokol mtsokol deleted the numpy-bool branch November 24, 2023 11:56
charliermarsh pushed a commit to astral-sh/ruff that referenced this pull request Jun 4, 2024
Hi!

This PR addresses #11093.

It skips `np.bool` and `np.long` replacements as both of these names
were reintroduced in NumPy 2.0 with a different meaning
(numpy/numpy#24922,
numpy/numpy#25080).
With this change `NPY001` will no longer conflict with `NPY201`. For
projects using NumPy 1.x `np.bool` and `np.long` has been deprecated and
removed long time ago, and accessing them yields an informative error
message.
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants