Indexing on EXT Jenkins has become troublesome, possibly because of the increased load on the server from more aggressive security scanning and heavy morning cron jobs.
Two types of errors have interrupted indexing:
These appear to be specific to EXT Jenkins, because our DEV Jenkins instance (zusa) runs the same pipeline code against the same data, at the same time of the morning, and rarely fails to finish.
The S3 interruptions are not too disruptive, because the downloads that succeed won't have to be run again on a retry, and little time is lost.
The timeout error, however, is painful when it hits late in an indexing run. In the last week, a timeout stopped indexing after 4 million complaints had been indexed, which forced the job to start over from 0.
This PR doubles the OpenSearch timeout value, which should not affect most runs, but could save a run from the occasional late timeout.
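For reviewers who want to picture the change, here is a minimal sketch of a doubled timeout with an environment override. It is not the code in this PR: the 30-second starting value, the `OPENSEARCH_TIMEOUT`, `OPENSEARCH_HOST`, and `OPENSEARCH_PORT` variable names, and the `opensearch_client_kwargs` helper are all assumptions made for illustration.

```python
import os

# Hypothetical numbers: the PR description does not state the actual values.
OLD_TIMEOUT_SECONDS = 30
DEFAULT_TIMEOUT_SECONDS = OLD_TIMEOUT_SECONDS * 2  # "doubled" timeout


def opensearch_client_kwargs(env=None):
    """Build keyword arguments for an OpenSearch client.

    The timeout falls back to the doubled default unless the (assumed)
    OPENSEARCH_TIMEOUT environment variable overrides it.
    """
    env = os.environ if env is None else env
    timeout = int(env.get("OPENSEARCH_TIMEOUT", DEFAULT_TIMEOUT_SECONDS))
    host = env.get("OPENSEARCH_HOST", "localhost")
    port = int(env.get("OPENSEARCH_PORT", 9200))
    return {
        "hosts": [{"host": host, "port": port}],
        "timeout": timeout,  # seconds
    }
```

If the pipeline uses the `opensearch-py` client, a dictionary like this could be passed as `OpenSearch(**opensearch_client_kwargs())`, assuming that client's `timeout` argument is the per-request timeout in seconds. Keeping an override available would let a slow morning be handled without another code change.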
I ran this morning's CCDB indexing using this branch, and it succeeded on the first try. That doesn't prove that the new value saved the run, but I think we should see if the new timeout reduces the churn.
Testing
In addition to test-running the new timeout value, I got the unit tests running again by upgrading Python to 3.11 and adjusting the tox configs.