community[minor]: add mongodb byte store #23876

Merged
merged 14 commits into from
Jul 19, 2024

pprados (Contributor) commented Jul 4, 2024

The existing MongoDBStore can only manage documents,
so it is not possible to use MongoDB as the store behind a CacheBackedEmbeddings.

With this new implementation, it's possible to use:

CacheBackedEmbeddings.from_bytes_store(
    underlying_embeddings=embeddings,
    document_embedding_cache=MongoDBByteStore(
        connection_string=db_uri,
        db_name=db_name,
        collection_name=collection_name,
    ),
)

and use MongoDB to cache the embeddings!
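
To make the caching idea concrete, here is a minimal, dependency-free sketch of what a byte store gives a cache-backed embedder. It is an illustration only: `DictByteStore`, `fake_embed`, and `CachedEmbedder` are hypothetical names, an in-memory dict stands in for MongoDB, and JSON is an assumed serialization. It is not the code added in this PR, nor LangChain's `CacheBackedEmbeddings`.

```python
import hashlib
import json
from typing import Dict, List, Optional, Sequence, Tuple


class DictByteStore:
    """Hypothetical stand-in for a byte store: str keys -> bytes values."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def mget(self, keys: Sequence[str]) -> List[Optional[bytes]]:
        # Missing keys come back as None, in the same order as `keys`.
        return [self._data.get(k) for k in keys]

    def mset(self, pairs: Sequence[Tuple[str, bytes]]) -> None:
        for key, value in pairs:
            self._data[key] = value


def fake_embed(text: str) -> List[float]:
    """Toy deterministic 'embedding' (an assumption, not a real model)."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


class CachedEmbedder:
    """Looks vectors up in the store first and embeds only on a miss."""

    def __init__(self, store: DictByteStore) -> None:
        self.store = store
        self.calls = 0  # how many times the underlying embedder ran

    def embed_documents(self, texts: Sequence[str]) -> List[List[float]]:
        # Hash each text so the cache key has a fixed, safe format.
        keys = [hashlib.sha256(t.encode("utf-8")).hexdigest() for t in texts]
        cached = self.store.mget(keys)
        results: List[List[float]] = []
        for text, key, blob in zip(texts, keys, cached):
            if blob is None:
                # Cache miss: compute the vector, then persist it as bytes.
                vector = fake_embed(text)
                self.calls += 1
                self.store.mset([(key, json.dumps(vector).encode("utf-8"))])
            else:
                # Cache hit: decode the stored bytes back into a vector.
                vector = json.loads(blob.decode("utf-8"))
            results.append(vector)
        return results
```

The point of the sketch is that the cache only ever sees `bytes`, which is why a store limited to documents cannot serve this role and a byte-oriented store can.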

@pprados pprados marked this pull request as ready for review July 4, 2024 14:47
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. community Related to langchain-community 🔌: mongo Primarily related to Mongo integrations 🤖:improvement Medium size change to existing code to handle new use-cases labels Jul 4, 2024
@eyurtsev eyurtsev changed the title from "common[small]: add mongodb byte store" to "community[minor]: add mongodb byte store" Jul 5, 2024
eyurtsev (Collaborator) commented Jul 5, 2024

@pprados could you add an integration test using langchain_standard_tests?

Look at the following unit tests as examples:

Essentially, you only need to import the test suite, inherit from it, and provide a single fixture:

class TestInMemoryStore(BaseStoreSyncTests):
    @pytest.fixture
    def three_values(self) -> Tuple[bytes, bytes, bytes]:  # <-- Provide 3 
        return b"foo", b"bar", b"buzz"

    @pytest.fixture
    def kv_store(self) -> Store:
        # Yield a MongoDB-backed implementation that has no keys stored in it.
        ...
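
The inherit-and-provide pattern described above can be sketched without any dependencies. This is only an illustration of the idea: the real suite relies on pytest fixtures, whereas this sketch swaps them for plain methods, and `ToyByteStore`, `StoreContractChecks`, `make_store`, and `run_all` are hypothetical names rather than part of `langchain_standard_tests`.

```python
from typing import Dict, Iterator, List, Optional, Sequence, Tuple


class ToyByteStore:
    """Hypothetical in-memory store used only to exercise the checks."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def mget(self, keys: Sequence[str]) -> List[Optional[bytes]]:
        return [self._data.get(k) for k in keys]

    def mset(self, pairs: Sequence[Tuple[str, bytes]]) -> None:
        self._data.update(pairs)

    def mdelete(self, keys: Sequence[str]) -> None:
        for key in keys:
            self._data.pop(key, None)

    def yield_keys(self) -> Iterator[str]:
        yield from list(self._data)


class StoreContractChecks:
    """Generic checks; subclasses supply an empty store and sample values."""

    def make_store(self):
        raise NotImplementedError

    def three_values(self) -> Tuple[bytes, bytes, bytes]:
        raise NotImplementedError

    def test_set_then_get(self) -> None:
        store = self.make_store()
        a, b, c = self.three_values()
        store.mset([("k1", a), ("k2", b), ("k3", c)])
        assert store.mget(["k1", "k2", "k3"]) == [a, b, c]

    def test_missing_key_is_none(self) -> None:
        store = self.make_store()
        assert store.mget(["absent"]) == [None]

    def test_delete_removes_key(self) -> None:
        store = self.make_store()
        a, _, _ = self.three_values()
        store.mset([("k1", a)])
        store.mdelete(["k1"])
        assert store.mget(["k1"]) == [None]
        assert list(store.yield_keys()) == []

    def run_all(self) -> List[str]:
        # Run every inherited test_* method and report which ones ran.
        names = sorted(n for n in dir(self) if n.startswith("test_"))
        for name in names:
            getattr(self, name)()
        return names


class TestToyByteStore(StoreContractChecks):
    # The subclass only says how to build an empty store and which values to use.
    def make_store(self) -> ToyByteStore:
        return ToyByteStore()

    def three_values(self) -> Tuple[bytes, bytes, bytes]:
        return b"foo", b"bar", b"buzz"
```

Because every check lives in the base class, a new store implementation gets the whole suite by overriding two small methods, which is the same benefit the shared test suite is meant to provide.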

@eyurtsev eyurtsev self-assigned this Jul 5, 2024
@pprados pprados marked this pull request as draft July 12, 2024 13:27
@pprados pprados marked this pull request as ready for review July 12, 2024 14:03
pprados (Contributor, Author) commented Jul 18, 2024

@eyurtsev
The new version uses the generic tests.

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Jul 19, 2024
@eyurtsev eyurtsev merged commit f585668 into langchain-ai:master Jul 19, 2024
43 checks passed
olgamurraft pushed a commit to olgamurraft/langchain that referenced this pull request Aug 16, 2024
@pprados pprados deleted the pprados/add-mongodb-byte-store branch October 7, 2024 07:08