Fix bug with join with duplicate obs indices #822

Merged: 4 commits, Jan 13, 2025


Collaborator

@melonora melonora commented Jan 8, 2025

Closes #815

The join on main does not account for duplicate values in table.obs.index. Because the index labels are used to subset the table, the join breaks when those labels are not unique.

The fix takes the indices of the masked elements and applies a boolean mask to the obs dataframe after resetting its index. This yields integer (positional) indices, which are then used to subset the table. The row-matching behaviour is unchanged.
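To illustrate the idea outside of spatialdata, here is a minimal pandas sketch. It is not the code from this PR: the `obs` frame, the `instance_id` column and the selected ids are hypothetical, chosen only to show why label-based subsetting misbehaves with duplicate index labels and why positions obtained after resetting the index do not.

```python
import pandas as pd

# Hypothetical obs table whose index contains a duplicated label ("a").
obs = pd.DataFrame(
    {"instance_id": [1, 2, 3, 4]},
    index=["a", "b", "a", "c"],
)

# Assume the join wants to keep only the rows with instance_id 1 and 2.
mask = obs["instance_id"].isin([1, 2]).to_numpy()

# Label-based subsetting: the duplicated label "a" also matches the row
# with instance_id 3, so an unwanted row is returned.
labels = obs.index[mask]
by_label = obs.loc[labels]

# Positional subsetting: reset the index so it becomes 0..n-1, apply the
# same boolean mask, and use the resulting integers with iloc.
positions = obs.reset_index(drop=True).index[mask].to_numpy()
by_position = obs.iloc[positions]

print(by_label["instance_id"].tolist())
print(by_position["instance_id"].tolist())
```

The positional variant selects exactly the masked rows regardless of whether the index labels are unique.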

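from pathlib import Path

import spatialdata as sd
import spatialdata_plot  # imported for its side effect: registers the .pl accessor

# Local path to the Visium example dataset; adjust to your own setup.
visium_zarr_path = Path("C:/Users/w-mv/PycharmProjects/spatialdata-notebooks/notebooks/examples/visium_brain.zarr")
visium_sdata = sd.read_zarr(visium_zarr_path)

# Render the hires image, overlay the shapes colored by "mt-Co3", and show the plot.
(
    visium_sdata.pl.render_images(elements="ST8059050_hires_image")
    .pl.render_shapes(elements="ST8059050", color="mt-Co3")
    .pl.show()
)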

now gives:
[image: resulting plot]


codecov bot commented Jan 8, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.75%. Comparing base (54e7dd0) to head (4ca78e5).
Report is 11 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #822      +/-   ##
==========================================
+ Coverage   91.73%   91.75%   +0.01%     
==========================================
  Files          46       46              
  Lines        7128     7130       +2     
==========================================
+ Hits         6539     6542       +3     
+ Misses        589      588       -1     
Files with missing lines | Coverage Δ
src/spatialdata/_core/query/relational_query.py | 90.78% <100.00%> (+0.26%) ⬆️

@quentinblampey
Contributor

Thanks @melonora for the quick fix 😊

@LucaMarconato LucaMarconato enabled auto-merge (squash) January 13, 2025 13:04
@LucaMarconato
Member

Thanks @melonora! I just added some more tests. While doing so, I found a case in which left_exclusive fails because of a problem that is related to indices but is not the bug this PR addresses. I tracked this in a separate issue, and we can now merge this PR.

@LucaMarconato LucaMarconato merged commit 526a0a2 into scverse:main Jan 13, 2025
8 checks passed
@LucaMarconato LucaMarconato changed the title from "Join with duplicate obs indices" to "Fix bug with join with duplicate obs indices" Jan 20, 2025
Development

Successfully merging this pull request may close these issues.

Problem with join