Missing implied functionality `add to existing` #20

isConic · 2024-03-20T16:03:47Z

Line 216 in 9d99cbd:

# self.add_to_existing(docs)

Problem.
I want to add documents to a raptor index iteratively, but whenever I call add_documents I am met with an unavoidable prompt:

Warning: Overwriting existing tree. Did you mean to call 'add_to_existing' instead? (y/n):

For the 200 or so documents that I wish to index, typing y each time is not feasible, especially if I scale to thousands or tens of thousands of documents in the future.

Setting aside the inconvenience of having no way to bypass this prompt, I realized on inspecting the code that answering yes or y just makes the function return without doing anything. The code block that handles the yes answer contains this commented-out line, followed by a return statement:

#self.add_to_existing(docs)
return

Still wishing to add to the index iteratively, I searched the codebase and found that the add_to_existing method does not exist anywhere in it.
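To make the behaviour described above concrete, here is a minimal sketch of the control flow. `ToyIndex` and its `ask` parameter are hypothetical stand-ins written for this illustration, not the actual raptor class; only the prompt text and the commented-out call are taken from the code quoted in this issue.

```python
class ToyIndex:
    """Hypothetical stand-in that mirrors only the control flow described above."""

    def __init__(self, ask=input):
        self.tree = None
        self.ask = ask  # injectable so the prompt can be scripted in this sketch

    def add_documents(self, docs):
        if self.tree is not None:
            answer = self.ask(
                "Warning: Overwriting existing tree. "
                "Did you mean to call 'add_to_existing' instead? (y/n): "
            )
            if answer.lower() == "y":
                # self.add_to_existing(docs)  # commented out, so nothing is added
                return
        # Placeholder for the real tree build: the old tree is replaced.
        self.tree = docs


index = ToyIndex(ask=lambda prompt: "y")  # always answer "y"
index.add_documents("first document")
index.add_documents("second document")
print(index.tree)  # prints "first document": the second call was a no-op
```

In this sketch, answering y silently drops the new documents, and any other answer replaces the existing tree, so neither branch adds to it.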

Speculation.

Perhaps the intended approach to add_to_existing proved too difficult. Was the original plan to build one large tree for all documents and then rebalance the already constructed tree whenever new documents are added?

If avoiding that monolithic tree-rebalancing problem is why add_to_existing does not exist, I speculate that raptor would scale better if it instead maintained an index of several smaller, disjoint trees (every file that's indexed gets its own tree). The entry point would then not be the root of a single tree, but any node that is semantically close to the question, found in an embedding space that spans all the trees.

The algorithm would pull in context in both directions (up the hierarchy and down the hierarchy).
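A rough sketch of what I mean, under stated assumptions: `Node`, `Forest`, `add_tree`, and `retrieve` are hypothetical names invented for this illustration, the embeddings are hand-written toy vectors, and none of this is raptor's API or the maintainers' planned design.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    text: str
    embedding: List[float]
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)

    def add_child(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class Forest:
    """Index of disjoint trees; one tree per indexed file."""

    def __init__(self) -> None:
        self.nodes: List[Node] = []  # every node of every tree, in one flat list

    def add_tree(self, root: Node) -> None:
        # Adding a file only appends its nodes; existing trees are untouched.
        stack = [root]
        while stack:
            node = stack.pop()
            self.nodes.append(node)
            stack.extend(node.children)

    def retrieve(self, query_embedding: List[float]):
        # Entry point: the closest node in any tree, not necessarily a root.
        entry = max(self.nodes, key=lambda n: cosine(n.embedding, query_embedding))
        ancestors = []  # context pulled up the hierarchy
        node = entry.parent
        while node is not None:
            ancestors.append(node)
            node = node.parent
        descendants = []  # context pulled down the hierarchy
        stack = list(entry.children)
        while stack:
            node = stack.pop()
            descendants.append(node)
            stack.extend(node.children)
        return entry, ancestors, descendants


forest = Forest()
root_a = Node("A summary", [1.0, 0.0])
root_a.add_child(Node("A leaf", [0.9, 0.1]))
forest.add_tree(root_a)

root_b = Node("B summary", [0.0, 1.0])
section_b = root_b.add_child(Node("B section", [0.3, 0.9]))
section_b.add_child(Node("B leaf", [0.6, 0.8]))
forest.add_tree(root_b)  # added later, without rebuilding tree A

entry, ancestors, descendants = forest.retrieve([0.3, 0.9])
print(entry.text, [n.text for n in ancestors], [n.text for n in descendants])
```

Here the query lands on a mid-level node of the second tree, and the retrieved context is that node plus its ancestors and descendants. A real implementation would need a proper vector index instead of the linear scan, and some limit on how far to walk in each direction.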


parthsarthi03 · 2024-03-21T14:24:59Z

Hey, yes, we are working on this feature (we welcome contributions as well!), and our approach is quite similar to the one you have described!

isConic · 2024-04-10T22:40:28Z

I might be interested in contributing, but I would like to make sure any contribution wouldn't conflict with or contradict your approach to this or your vision. What exactly do you have in mind for add_to_existing?

isConic · 2024-05-17T18:50:44Z

@parthsarthi03 Any updates on this?

tanjaimpens · 2024-08-19T12:40:01Z

Any updates on this functionality?

parthsarthi03 added the enhancement (New feature or request) label on Mar 21, 2024

medxiaorudan mentioned this issue May 23, 2024

multi doc #32

Open
