Skip to content

BUG: Index.get_indexer_not_unique inconsistent return types vs get_indexer #16819

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
ri938 opened this issue Jul 3, 2017 · 6 comments · Fixed by #16826
Closed

BUG: Index.get_indexer_not_unique inconsistent return types vs get_indexer #16819

ri938 opened this issue Jul 3, 2017 · 6 comments · Fixed by #16826
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves
Milestone

Comments

@ri938
Copy link
Contributor

ri938 commented Jul 3, 2017

See https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Index.get_indexer_non_unique.html where the return value of Index.get_indexer_non_unique indexer is stated to be ndarray whereas actually this is converted into a Index before it is returned

indexer, missing = self._engine.get_indexer_non_unique(tgt_values)
return Index(indexer), missing

@jreback
Copy link
Contributor

jreback commented Jul 4, 2017

this is techincally wrong, it should return a platform ndarray as the indexer, see #16820 , similar to .get_indexer()

@jreback jreback closed this as completed Jul 4, 2017
@jreback jreback added the Indexing Related to indexing on series/frames, not to indexes themselves label Jul 4, 2017
@jreback jreback added this to the No action milestone Jul 4, 2017
@jreback
Copy link
Contributor

jreback commented Jul 4, 2017

actually repurposing this issue to fix the bug.

#16820 (comment)
need some tests in tests/indexes/test_base.py which exercise this comparison for all indexers

e.g. something like this will expose the bug

def get_indexer_consistency(self):

       for name, index in self.indices.items():
             expected = index.get_indexer(index[0:2])
             result, _ = index.get_indexer_non_unique(index[0:2])
             tm.assert_numpy_array_equal(result, expected)

@jreback jreback reopened this Jul 4, 2017
@jreback jreback added the Bug label Jul 4, 2017
@jreback jreback modified the milestones: Next Major Release, No action Jul 4, 2017
@jreback jreback changed the title DOC: Index.get_indexer_not_unique wrong documented return value BUG: Index.get_indexer_not_unique inconsistent return types vs get_indexer Jul 4, 2017
@jreback
Copy link
Contributor

jreback commented Jul 4, 2017

@ri938 love to have a PR!

@ri938
Copy link
Contributor Author

ri938 commented Jul 4, 2017

So you would like for conversion to index in "return Index(indexer)", from get_indexer_non_unique to be removed so this returns the correct type? (and tests)

@jreback
Copy link
Contributor

jreback commented Jul 4, 2017

yes I think that would work (see what breaks)

@ri938
Copy link
Contributor Author

ri938 commented Jul 4, 2017

Created PR #16826

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants