Bug: segfault on clustering #529

Open
2 of 3 tasks
vibl opened this issue Nov 8, 2024 · 2 comments
Labels
bug Something isn't working

Comments

vibl commented Nov 8, 2024

Describe the bug

I get a segmentation fault when calling index.cluster(), regardless of the min_count and max_count values I pass (and also when I pass no parameters at all).

The dataset is small: 80,000 vectors. index.search works great.

Steps to reproduce

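    from usearch.index import Index

    # usearch_index_path: path to the index file (not defined in the original report)
    index = Index(
        ndim=768,
        metric='cos',
        dtype='f32'
    )
    # (presumably the 80,000 vectors are added here; this step is not shown in the report)

    index.save(usearch_index_path)
    index = Index.restore(usearch_index_path)

    # Segfaults here, with or without min_count / max_count
    clustering = index.cluster(min_count=10, max_count=15, log=True)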

Expected behavior

It should return a Clustering instance.

USearch version

2.16.2

Operating System

Ubuntu 24.04

Hardware architecture

x86

Which interface are you using?

Python bindings

Contact Details

No response

Are you open to being tagged as a contributor?

  • I am open to being mentioned in the project .git history as a contributor

Is there an existing issue for this?

  • I have searched the existing issues

Code of Conduct

  • I agree to follow this project's Code of Conduct

vibl added the bug label on Nov 8, 2024

ashvardanian (Contributor) commented

Hi @vibl! Sorry for the delayed response! Can you check out the global clustering functionality as opposed to the one built into the Index Python class? It should work much better.

m12sl commented Nov 29, 2024

Hi!
I have the same problem: I .load the index from a file, and clustering then segfaults.

In my case the segfault occurs on this line:

results = self._compiled.cluster_keys(
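
For anyone trying to pin down the crashing line the same way, here is a minimal sketch that uses only the standard library. It runs a snippet in a child interpreter with faulthandler enabled, so the Python-level traceback is printed to stderr and the parent process survives the crash. It assumes a POSIX system, where a child killed by a signal reports a negative return code. The names run_isolated, crashed_with_segfault and STAND_IN_CRASH are hypothetical helpers, and the stand-in snippet crashes through ctypes rather than through usearch; replace it with your own reproduction code.

```python
import signal
import subprocess
import sys


def run_isolated(snippet: str, timeout: float = 60.0) -> subprocess.CompletedProcess:
    """Run `snippet` in a fresh interpreter with faulthandler enabled.

    With `-X faulthandler`, a fatal signal such as SIGSEGV makes the child
    dump the current Python traceback to stderr before it dies.
    """
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def crashed_with_segfault(result: subprocess.CompletedProcess) -> bool:
    # POSIX only: a child terminated by a signal has returncode == -signum.
    return result.returncode == -signal.SIGSEGV


# Stand-in crash (an assumption for illustration, NOT the usearch bug):
# reading from a NULL pointer through ctypes triggers a segmentation fault.
STAND_IN_CRASH = "import ctypes; ctypes.string_at(0)"


if __name__ == "__main__":
    result = run_isolated(STAND_IN_CRASH)
    print("segfault detected:", crashed_with_segfault(result))
    print(result.stderr)  # faulthandler's traceback names the crashing line
```

To apply this to the report above, pass the reproduction code (load the index, then call cluster) as the snippet instead of STAND_IN_CRASH. The traceback in stderr should then show which Python frame was active when the native code crashed.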

"Can you check out the global clustering functionality as opposed to the one built into the Index Python class?"

Could you please suggest how to do this?
some_index.cluster() gives quite good results, but I don't understand how to use the Clustering class, especially how to construct batch_matches.
