BUG: groupby with dropna=False and pa.dictionary drops NA values #60567

Open
rhshadrach opened this issue Dec 14, 2024 · 0 comments · May be fixed by #60777
Labels
Arrow (pyarrow functionality), Groupby, Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate)

Comments

@rhshadrach
Member

import pandas as pd
import pyarrow as pa

df = pd.DataFrame({'A': ['a1', pd.NA]}, dtype=pd.ArrowDtype(pa.dictionary(pa.int32(), pa.utf8())))
print(df.groupby("A", dropna=False)[["A"]].first())
#      A
# A     
# a1  a1

The result should have a second row for the NA group, since dropna=False.
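For comparison, here is a minimal sketch of the same kind of grouping on a plain object-dtype column, where the NA group is kept. The frame `df_obj`, the extra column `B`, and the use of `size()` are illustrative assumptions and are not part of the original report; they only show what "a row for the NA group" looks like.

```python
import pandas as pd

# Comparison case (an assumption, not from the report): the key column uses
# plain object dtype instead of an Arrow dictionary dtype.
df_obj = pd.DataFrame(
    {"A": pd.Series(["a1", None], dtype=object), "B": [1, 2]}
)

# With dropna=False, the missing key forms its own group.
sizes = df_obj.groupby("A", dropna=False).size()

n_groups = len(sizes)
has_na_group = bool(sizes.index.isna().any())
print(sizes)
```

Here `n_groups` is 2 and `has_na_group` is true, which is the shape of result the report expects for the Arrow dictionary dtype as well.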

rhshadrach added the Groupby, Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate), and Arrow (pyarrow functionality) labels on Dec 14, 2024
asharmalik19 linked a pull request on Jan 23, 2025 that will close this issue