Add high throughput integration test #5655
base: main
Conversation
pub fn merge(self, other: RestIngestResponse) -> Self {
    Self {
        num_docs_for_processing: self.num_docs_for_processing + other.num_docs_for_processing,
        num_ingested_docs: apply_op(self.num_ingested_docs, other.num_ingested_docs, |a, b| {
            a + b
        }),
        num_rejected_docs: apply_op(self.num_rejected_docs, other.num_rejected_docs, |a, b| {
            a + b
        }),
        parse_failures: apply_op(self.parse_failures, other.parse_failures, |a, b| {
            a.into_iter().chain(b).collect()
        }),
        num_too_many_requests: self.num_too_many_requests,
    }
}
I moved this back here, as it makes more sense than in the API model: accumulating responses is quite specific to the REST client.
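A minimal sketch of how the REST client could use this when ingesting several batches, assuming it has collected one RestIngestResponse per batch (the variable names below are illustrative, not the client's actual code):

// Hypothetical accumulation over per-batch responses returned by
// successive ingest requests; `merge` folds them into one summary.
let summary: Option<RestIngestResponse> = batch_responses
    .into_iter()
    .reduce(|acc, next| acc.merge(next));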
// TODO: when using the default 10MiB batch size, we get persist
// timeouts with code 500 on some lower performance machines (e.g.
// Github runners). We should investigate why this happens exactly.
Some(5_000_000),
@guilload I didn't find a good explanation for why this timeout occurs here in the persist call:
quickwit/quickwit/quickwit-ingest/src/ingest_v2/router.rs
Lines 432 to 443 in ce4501f
let persist_result = tokio::time::timeout(
    PERSIST_REQUEST_TIMEOUT,
    ingester.persist(persist_request),
)
.await
.unwrap_or_else(|_| {
    let message = format!(
        "persist request timed out after {} seconds",
        PERSIST_REQUEST_TIMEOUT.as_secs()
    );
    Err(IngestV2Error::Timeout(message))
});
Persisting 10 MB should not take 6 seconds, even on a slow system and in debug mode, should it?
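One way to narrow this down could be to time the persist future separately from the timeout budget. This is only a debugging sketch against the snippet above (reusing its PERSIST_REQUEST_TIMEOUT, ingester, and persist_request), not code from the PR:

// 10 MiB within a ~6 s budget is under ~2 MiB/s, so the time is probably
// not spent on raw I/O; logging the elapsed time shows whether the budget
// is truly exhausted or the future is stuck waiting on something else.
let start = std::time::Instant::now();
let persist_result = tokio::time::timeout(
    PERSIST_REQUEST_TIMEOUT,
    ingester.persist(persist_request),
)
.await;
tracing::debug!(
    elapsed = ?start.elapsed(),
    timed_out = persist_result.is_err(),
    "persist attempt finished"
);
// ...then map a timeout into IngestV2Error::Timeout as in the original code.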
Description
This PR reuses the tests and docs proposed in #5644, which is no longer necessary after the status code was fixed to 429 when shards need scaling up (#5651).
It also adds to the CLI ingest command a small indication of the number of retries that occurred. This is handy for troubleshooting and shows users concretely that retries are often necessary.
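A minimal sketch of the idea (the names below are hypothetical, not the PR's actual code): the CLI counts batches that had to be retried and surfaces that count in its final summary.

// Hypothetical stats carried by the CLI ingest loop.
struct IngestSummary {
    num_batches: u64,
    num_retries: u64,
}

impl IngestSummary {
    fn print(&self) {
        println!("Ingested {} batches.", self.num_batches);
        if self.num_retries > 0 {
            // Make it visible that some batches were retried (e.g. after 429 responses).
            println!("{} batch(es) were retried before succeeding.", self.num_retries);
        }
    }
}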
How was this PR tested?
Integration tests and running the CLI ingest command on the HDFS dataset.