Adding new document to RAPTOR environment #27

akesh1235 · 2024-03-30T10:31:36Z

RAPTOR looks interesting, but I see a big limitation when one wants to incrementally add information to a vectorstore (quite common in production scenarios, imo). RAPTOR only works by looking globally at the entire pool of documents, since summaries are computed iteratively over clusters. This produces a sort of "immutable" vectorstore. In other words, if a user simply wants to add a document to an existing vectorstore, the full RAPTOR pipeline has to run again so that the existing summaries take the new information into account, which can become quite expensive with many documents (in both cost and latency). Maybe one could simply replace the most similar summary at each level? I'd love to hear how people would address this.
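
For illustration, here is a minimal sketch of the "update the most similar summary at each level" idea from the post above. It is not RAPTOR's implementation or API: `Node`, `add_document`, `embed`, and `summarize` are hypothetical names, and the tree layout (summary nodes whose lowest level holds the raw document leaves) is an assumption. The new document is routed down the tree by cosine similarity, attached as a leaf, and only the summaries on that path are recomputed.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Node:
    """Hypothetical tree node: a leaf holds a document, an inner node a summary."""
    text: str
    embedding: Vector
    children: List["Node"] = field(default_factory=list)


def add_document(
    root: Node,
    text: str,
    embed: Callable[[str], Vector],            # assumed: text -> embedding
    summarize: Callable[[List[str]], str],     # assumed: child texts -> summary
) -> List[Node]:
    """Attach a new leaf and refresh only the summaries on its path.

    Returns the list of summary nodes that were recomputed (root first).
    """
    leaf = Node(text, embed(text))
    path = [root]
    node = root
    # Descend while the current node's children are themselves summaries,
    # always following the child most similar to the new document.
    while node.children and node.children[0].children:
        node = max(node.children, key=lambda c: cosine(c.embedding, leaf.embedding))
        path.append(node)
    node.children.append(leaf)
    # Recompute summaries bottom-up so each parent sees its child's new text.
    for summary_node in reversed(path):
        summary_node.text = summarize([c.text for c in summary_node.children])
        summary_node.embedding = embed(summary_node.text)
    return path
```

This costs one summarization call per tree level instead of a full rebuild, but it never re-clusters, so after many insertions the tree can drift away from what a fresh run over all documents would produce.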

parthsarthi03 · 2024-03-31T07:54:41Z

Maintainer

Hi, yes, this is a current limitation and one we are working on. The approach described in #20 is quite close to the idea we have in mind. We also happily welcome community contributions and PRs on this!
