ANN support? #7

Open
astoilkov opened this issue Jun 17, 2023 · 2 comments

Comments

@astoilkov

I see the implementation uses cosine similarity. Performance gains come from normalizing the embeddings and caching them.
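
To make the point above concrete, here is a minimal sketch of why normalizing and caching helps: once embeddings have unit length, cosine similarity is just a dot product, and the normalization cost is paid once per text. All names here (`normalize`, `cachedEmbedding`, `topK`, `toyEmbed`) are hypothetical and are not taken from this library's code; `toyEmbed` is a stand-in for a real embedding model.

```typescript
// Illustrative sketch only; helper names are assumptions, not this library's API.
type Vector = number[];

// Scale a vector to unit length so cosine similarity reduces to a dot product.
function normalize(v: Vector): Vector {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) return v.slice(); // leave zero vectors unchanged
  return v.map((x) => x / norm);
}

function dot(a: Vector, b: Vector): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Cache of already-normalized embeddings, keyed by the source text.
const cache = new Map<string, Vector>();

function cachedEmbedding(text: string, embed: (t: string) => Vector): Vector {
  let v = cache.get(text);
  if (v === undefined) {
    v = normalize(embed(text)); // embed and normalize only on a cache miss
    cache.set(text, v);
  }
  return v;
}

// Exact (brute-force) search: score every stored vector against the query.
function topK(
  query: Vector,
  items: Array<{ key: string; vector: Vector }>,
  k: number
): Array<{ key: string; score: number }> {
  const q = normalize(query);
  return items
    .map((item) => ({ key: item.key, score: dot(q, item.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Toy stand-in for an embedding model: [character count, word count].
const toyEmbed = (text: string): Vector => [text.length, text.split(" ").length];

const corpus = ["hello world", "hi", "a much longer sentence here"].map((t) => ({
  key: t,
  vector: cachedEmbedding(t, toyEmbed),
}));
console.log(topK(toyEmbed("hello there"), corpus, 2));
```

The search is still linear in the number of stored vectors, which is where an ANN index would come in.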

Have you considered approximate nearest neighbor (ANN) search? I guess something like https://github.com/DanielKRing1/Annoy.js?

@MentalGear

Interesting, how would you implement this?
Are there possible drawbacks, for example where recall is traded for speed?

@astoilkov
Author

> Interesting, how would you implement this?

I think the most popular way to implement it is approximate k-nearest-neighbor search. However, I should note that I'm not knowledgeable in that area.

> Are there possible drawbacks, for example where recall is traded for speed?

Yes, the algorithm makes such a tradeoff: slightly less accurate results in exchange for a large speedup when the dataset is large. This is what you can expect from commercial vector databases (example: https://supabase.com/vector).
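
To illustrate the tradeoff, here is a rough sketch that compares exact search against one simple ANN technique, random-hyperplane hashing. This is an assumption chosen for brevity and is not necessarily what Annoy.js or any vector database does internally. Every name below is hypothetical. The approximate search only scores vectors that fall in the same bucket as the query, so it does less work but can miss true neighbors.

```typescript
// Illustrative sketch only: random-hyperplane hashing, a simple ANN technique.
type Vec = number[];

// Small seeded PRNG (mulberry32-style) so runs are repeatable.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function dotProduct(a: Vec, b: Vec): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Random direction scaled to unit length (components drawn from [-1, 1)).
function randomUnitVector(dim: number, rand: () => number): Vec {
  const v = Array.from({ length: dim }, () => rand() * 2 - 1);
  const norm = Math.sqrt(dotProduct(v, v)) || 1;
  return v.map((x) => x / norm);
}

interface HyperplaneIndex {
  vectors: Vec[];
  planes: Vec[];
  buckets: Map<string, number[]>;
}

// One bit per hyperplane: which side of the plane the vector lies on.
function signature(planes: Vec[], v: Vec): string {
  return planes.map((p) => (dotProduct(p, v) >= 0 ? "1" : "0")).join("");
}

function buildIndex(vectors: Vec[], bits: number, rand: () => number): HyperplaneIndex {
  const dim = vectors[0]?.length ?? 0;
  const planes = Array.from({ length: bits }, () => randomUnitVector(dim, rand));
  const buckets = new Map<string, number[]>();
  vectors.forEach((v, id) => {
    const key = signature(planes, v);
    const bucket = buckets.get(key);
    if (bucket) bucket.push(id);
    else buckets.set(key, [id]);
  });
  return { vectors, planes, buckets };
}

// Score the given candidate ids against the query and return the best k ids.
function rank(q: Vec, vectors: Vec[], ids: number[], k: number): number[] {
  return ids
    .map((id) => ({ id, score: dotProduct(q, vectors[id] ?? []) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.id);
}

// Exact search scores every vector; approximate search scores one bucket only.
function exactSearch(q: Vec, vectors: Vec[], k: number): number[] {
  return rank(q, vectors, vectors.map((_, i) => i), k);
}

function approxSearch(index: HyperplaneIndex, q: Vec, k: number): number[] {
  const candidates = index.buckets.get(signature(index.planes, q)) ?? [];
  return rank(q, index.vectors, candidates, k);
}

// Fraction of the exact top-k that the approximate search also returned.
function recall(exact: number[], approx: number[]): number {
  if (exact.length === 0) return 1;
  const found = new Set(approx);
  return exact.filter((id) => found.has(id)).length / exact.length;
}

const rand = mulberry32(42);
const data = Array.from({ length: 2000 }, () => randomUnitVector(16, rand));
const index = buildIndex(data, 6, rand);
const q = randomUnitVector(16, rand);
const exact = exactSearch(q, data, 10);
const approx = approxSearch(index, q, 10);
const scored = index.buckets.get(signature(index.planes, q))?.length ?? 0;
console.log(`scored ${scored} of ${data.length} vectors, recall@10 = ${recall(exact, approx)}`);
```

In this sketch, more bits means smaller buckets, so queries get faster and recall tends to drop. Real ANN libraries typically add ways to win some of that recall back, for example by building several independent indexes and merging their candidates.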
