geom_sina fails when index doesn't contain 0 #888

Closed
redst4r opened this issue Nov 12, 2024 · 1 comment

redst4r commented Nov 12, 2024

As suggested in #221, geom_sina fails (as of plotnine 0.14.1) when called on a DataFrame whose index does not contain the label 0:

import pandas as pd
import plotnine as pn

# make up some random data
df_fails = pd.DataFrame({
    'x': ['a', 'a', 'a', 'b', 'b', 'b'],
    'y': [1, 2, 3, 4, 5, 6]
}, index=[1, 2, 3, 4, 5, 6])    # note that 0 is NOT in the index
pn.ggplot(df_fails) + pn.aes(x='x', y='y') + pn.geom_sina()

yields

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File /opt/conda/lib/python3.11/site-packages/pandas/core/indexes/base.py:3805, in Index.get_loc(self, key)
   3804 try:
-> 3805     return self._engine.get_loc(casted_key)
   3806 except KeyError as err:

File index.pyx:167, in pandas._libs.index.IndexEngine.get_loc()

File index.pyx:196, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:2606, in pandas._libs.hashtable.Int64HashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:2630, in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 0

The error traces back to this line:

File /opt/conda/lib/python3.11/site-packages/plotnine/mapping/aes.py:647 in has_groups(data)
    632 """
    633 Check if data is grouped
    634 
   (...)
    643     If True, the data has groups.
    644 """
    645 # If any row in the group column is equal to NO_GROUP, then
    646 # the data all of them are and the data has no groups
--> 647 return data.loc[0, "group"] != NO_GROUP

Unfortunately I don't have time to look at the code more closely, but clearly this hard indexing via data.loc[0, "group"] will fail whenever data doesn't have 0 in its index.
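
To illustrate the failure mode outside of plotnine, here is a minimal sketch contrasting label-based and positional lookup in pandas. The two helper functions and the value of NO_GROUP are assumptions made for illustration; this is not necessarily how the bug was fixed in 6cb4c1d.

```python
import pandas as pd

NO_GROUP = -1  # assumed sentinel value, for illustration only

# A frame whose index starts at 1, like df_fails in the report
data = pd.DataFrame({"group": [1, 1, 2]}, index=[1, 2, 3])


def has_groups_by_label(data):
    # Mirrors the line in the traceback: looks up the row LABELLED 0
    return data.loc[0, "group"] != NO_GROUP


def has_groups_by_position(data):
    # Hypothetical alternative: looks at the FIRST row, whatever its label
    return data["group"].iloc[0] != NO_GROUP


try:
    has_groups_by_label(data)
    label_lookup_failed = False
except KeyError:
    # .loc raises KeyError because no row carries the label 0
    label_lookup_failed = True

positional_result = has_groups_by_position(data)
```

The label-based version depends on the caller's index, while the positional version only requires that the frame has at least one row.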

Current workaround

Reset the index via df_fails = df_fails.reset_index(drop=True), as pointed out by @idavi-bcs in a comment on #221.
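
A short sketch of what the workaround does to the example data, using only pandas. The name df_ok is an assumption introduced here for illustration.

```python
import pandas as pd

# Same data as in the report: the index runs from 1 to 6, so label 0 is missing
df_fails = pd.DataFrame(
    {"x": ["a", "a", "a", "b", "b", "b"], "y": [1, 2, 3, 4, 5, 6]},
    index=[1, 2, 3, 4, 5, 6],
)

# drop=True discards the old index instead of keeping it as a new column
df_ok = df_fails.reset_index(drop=True)

print(df_ok.index)  # a default integer index that starts at 0
```

Since df_ok has a row labelled 0, a lookup such as data.loc[0, "group"] no longer raises KeyError on plotnine 0.14.1, and the frame can be passed to pn.ggplot in place of df_fails. The column values are unchanged; only the row labels differ.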

redst4r changed the title from "geom_sina fails with index doesn't contain 0" to "geom_sina fails when index doesn't contain 0" on Nov 12, 2024
has2k1 added the bug label on Nov 12, 2024
has2k1 closed this as completed in 6cb4c1d on Nov 13, 2024
has2k1 added the "Addressed in Next Version" label (the issue has been fixed, but the fix is not in the current official release) on Nov 13, 2024

has2k1 (Owner) commented Nov 21, 2024

The fix for this bug has been released in v0.14.2.

has2k1 removed the "Addressed in Next Version" label on Nov 21, 2024