fix: ID Mismatch Error in VectorDB During Evaluation #1033 #1056
base: main
…217/AutoRAG into fix/id-mismatch-with-vectordb
This PR may not fully align with your intentions for AutoRAG. I tried to consider as many cases as possible, but there may be concerns of yours that I am unaware of. I understand that it might not be approved, but I would appreciate any feedback you can provide. Thank you.
@e7217 Thank you for the PR! Apologies for the late review.
@e7217 Actually, we discussed a structure for AutoRAG that does not use corpus_df at all.
Pros
Cons
description
Hello
I am suggesting some code changes to address issue #1033. The error occurs when an item is retrieved from the vectordb but its ID does not match any ID in the raw_doc corpus. I think the retriever aims to retrieve the item with the highest score. To address this, I have added a `content` key. While this change may require additional storage capacity for the vectordb, it is similar to how LangChain uses a `page_content` key.
I have modified some code, but I have only referred to the documentation and have not run the code in practice, so there may be errors.
I appreciate your review. Thank you.
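To illustrate the idea, here is a minimal sketch of a store that keeps each passage's content next to its vector, so the top-scoring hit can be returned even when its ID is missing from the corpus. It is not AutoRAG's actual code: `TinyVectorStore`, `StoredItem`, and `cosine` are hypothetical names, and the store is a plain in-memory list rather than a real vectordb backend.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import sqrt


@dataclass
class StoredItem:
    doc_id: str
    embedding: list[float]
    content: str  # kept next to the vector, similar in spirit to page_content


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store, used only to illustrate the idea."""

    def __init__(self) -> None:
        self._items: list[StoredItem] = []

    def add(self, doc_id: str, embedding: list[float], content: str) -> None:
        self._items.append(StoredItem(doc_id, embedding, content))

    def query(self, embedding: list[float], top_k: int = 1) -> list[StoredItem]:
        # Rank every stored item by similarity to the query, best first.
        ranked = sorted(
            self._items,
            key=lambda item: cosine(embedding, item.embedding),
            reverse=True,
        )
        return ranked[:top_k]


# The corpus only knows "doc-1"; the store also holds an ID the corpus lacks.
corpus = {"doc-1": "alpha"}
store = TinyVectorStore()
store.add("doc-1", [1.0, 0.0], "alpha")
store.add("stale-9", [0.0, 1.0], "beta")

best = store.query([0.1, 0.9], top_k=1)[0]
# corpus[best.doc_id] would raise KeyError here; the stored content still works.
print(best.doc_id in corpus, best.content)
```

The trade-off is the one noted above: every passage is stored twice (once in the corpus, once in the store), in exchange for retrieval that no longer depends on the two ID sets staying in sync.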
references