Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Add ability to force an immediate commit after indexing #2699

Closed
imotov opened this issue Jan 24, 2023 · 8 comments · Fixed by #3271
Closed

Add ability to force an immediate commit after indexing #2699

imotov opened this issue Jan 24, 2023 · 8 comments · Fixed by #3271
Assignees
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@imotov
Copy link
Collaborator

imotov commented Jan 24, 2023

Is your feature request related to a problem? Please describe.
In my experience the simplest way to explain a difficult concept to a search engine user is by demonstrating this concept using a shell script that consists of several curl statements. I have been using this method for many years. You can find some examples of such scripts in imotov/elasticsearch-test-scripts repo.

Trying the same approach with quickwit uncovered 2 issues, one of which is absence of refresh operation that would force all indexed data to be searchable. It is necessary since most of the scripts need to perform the following operations:

  • recreate an index with certain mapping
  • index some test data
  • perform the search operation on such data

Describe the solution you'd like
Add a flag to the bulk indexing operation that would trigger the commit operation and block execution until all data indexed so far is available to searchers.

Describe alternatives you've considered
An alternative is to reduce commit_timeout_secs and add sleep to the script. It is a viable solution if you need to run this script only once, but it is detrimental to quick iterations while debugging the script.

Additional context
I found a relevant discussion in #1674 (comment) but since that issue is stalled and my motivation for this feature is quite different I would like to open a separate feature request.

@imotov imotov added the enhancement New feature or request label Jan 24, 2023
@fulmicoton fulmicoton added the help wanted Extra attention is needed label Jan 24, 2023
@fulmicoton
Copy link
Collaborator

It should be feasible in the pushapi at least...

@imotov
Copy link
Collaborator Author

imotov commented Jan 27, 2023

I will give it a shot.

@guilload
Copy link
Member

Let me give you some pointers then:

  1. request somehow reaches the control plane*
  2. control plane locates the relevant indexers and forwards them the request
  3. some actor on the target node receives the message and forwards it to the indexer actor, which then finally commits the split (I'm voluntarily imprecise here to let you discover on your own the actors that compose an indexing pipeline)

* we don't have a proper routing layer at the moment. That's my goal for February. On Kubernetes, our current solution is to route requests using ingress rules:

`api/v1/<index ID>/ingest*` → `indexers.<namespace>.svc.cluster.local`
`api/v1/<index ID>/search*` → `searchers.<namespace>.svc.cluster.local`
...

@imotov
Copy link
Collaborator Author

imotov commented Jan 27, 2023

Thanks for the pointers! I was actually thinking to piggyback on the ingest command rather than create a dedicated refresh/commit request. I feel like this command is pretty useless on its own without indexing data and more prone to abuse. Adding it as a parameter to the ingest command has clean and localized semantic of "I want all records in this request to be searchable before you come back to me". A standalone refresh command has a more vague semantic of "I want whatever is in flight right now anywhere in the index to be searchable" and has a global impact.

@guilload
Copy link
Member

That sounds much better, indeed.

@fulmicoton
Copy link
Collaborator

Move to quickwit 0.6

imotov added a commit to imotov/quickwit that referenced this issue Mar 14, 2023
Adds an option to force a commit or wait for the next commit while
issuing an ingest command. This commit only implements it for the
ingest endpoint. If all records in the batch fail, the command will
hang until at least one good document is indexed until quickwit-oss#2825 is merged
in or the document parsing is moved to the early stages.

See quickwit-oss#2699
trinity-1686a added a commit that referenced this issue Mar 17, 2023
* Add an option to force and wait for commit

Adds an option to force a commit or wait for the next commit while
issuing an ingest command. This commit only implements it for the
ingest endpoint. If all records in the batch fail, the command will
hang until at least one good document is indexed until #2825 is merged
in or the document parsing is moved to the early stages.

See #2699

* Add comments to the notification triggering logic

* Fix rest tests

* Remove extra current_offset setting

---------

Co-authored-by: trinity-1686a <trinity@quickwit.io>
imotov added a commit to imotov/quickwit that referenced this issue Mar 18, 2023
Adds an option to force a commit or wait for the next commit while issuing a bulk
request command.

See quickwit-oss#2699
imotov added a commit that referenced this issue Mar 18, 2023
Adds an option to force a commit or wait for the next commit while issuing a bulk
request command.

See #2699
@trinity-1686a
Copy link
Contributor

Is there still more work to do on this issue?

@imotov
Copy link
Collaborator Author

imotov commented Mar 21, 2023

@trinity-1686a as you mentioned in your review, such feature could really benefit from a high level integration test. So, I am waiting for #2984 to add it in. Besides this I think this is done.

imotov added a commit to imotov/quickwit that referenced this issue May 5, 2023
Adds integration tests for different commit modes.

Fixes quickwit-oss#2699
imotov added a commit that referenced this issue May 6, 2023
Adds integration tests for different commit modes.

Fixes #2699
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

4 participants