jdbc-input bug: clean_run: true and/or record_last_run: false doesn't work #121
Comments
According to the documentation, this is how it should be used so that :sql_last_value is reset to 0 or to the zero timestamp (1970...) on each pipeline start: clean_run => true. But in here it seems that clean_run has an effect only when record_last_run is true.
Also, this patch makes the file update conditional on record_last_run, which is NOT NEEDED! There is also a redundancy in this code: the if is useless.
So after some digging, it turns out that sql_last_value is cached: the ValueTracking class is instantiated once, and the value is not re-read from last_run_metadata_path unless Logstash is restarted.
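To make the caching described above concrete, here is a toy Python model. It is not the plugin's Ruby code: the `ValueTracker` class, the `simulate` helper, and the plain-integer file format are all assumptions made for illustration. The only behavior it borrows from the thread is that the tracker is built once and that clean_run is consulted only at that moment.

```python
import os
import tempfile


class ValueTracker:
    """Toy stand-in (hypothetical) for a tracker that is built once per pipeline start."""

    def __init__(self, path, clean_run):
        self.path = path
        # clean_run is consulted only here, at construction time.
        if clean_run or not os.path.exists(path):
            self.value = 0
        else:
            with open(path) as fh:
                self.value = int(fh.read().strip())

    def update(self, new_value, record_last_run):
        # The in-memory value always advances ...
        self.value = new_value
        # ... while the file is written only when record_last_run is set.
        if record_last_run:
            with open(self.path, "w") as fh:
                fh.write(str(new_value))


def simulate(executions, clean_run, record_last_run):
    """Return the last value that each scheduled execution would see."""
    path = os.path.join(tempfile.mkdtemp(), "last_run")
    tracker = ValueTracker(path, clean_run)  # instantiated once, never rebuilt
    seen = []
    for max_id in executions:
        seen.append(tracker.value)
        tracker.update(max_id, record_last_run)
    return seen


print(simulate([10, 20, 30], clean_run=True, record_last_run=False))
# prints [0, 10, 20]
```

In this model only the first execution sees 0, even with clean_run set and record_last_run unset, because nothing rebuilds the tracker between executions. That matches the symptom reported in this issue, under the stated assumptions.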
@s137 for cursor pagination with tracking_column => "id" (or any unique column), I think the solution would be: when the pipeline exits because 0 rows were retrieved, and clean_run is true, write 0 or 1970... to the last_run_metadata_path file.
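The rule proposed above could be sketched roughly as follows. This is a Python illustration of the idea only, not a patch against the plugin; `value_to_persist` and its parameters are hypothetical names.

```python
def value_to_persist(rows_retrieved, last_id, previous_value, clean_run):
    """Pick the value to store after one query execution (hypothetical helper)."""
    if rows_retrieved > 0:
        # Still paging: keep the cursor at the last id that was read.
        return last_id
    if clean_run:
        # Zero rows means the pass is complete, so reset for the next start.
        return 0
    # Zero rows and no clean_run: leave the stored value unchanged.
    return previous_value


print(value_to_persist(100, 4200, 4100, clean_run=True))
print(value_to_persist(0, None, 4200, clean_run=True))
print(value_to_persist(0, None, 4200, clean_run=False))
```

The point of the design is that the reset happens at the end of a complete pass rather than at every execution, so cursor pagination can still advance between executions.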
Update: the scheduler spoils everything by running just one query at a time. Without the scheduler, the query is repeated until there are no more rows to ingest.
So the final solution was to not use clean_run => true at all. Instead, I created two pipelines, one with cursor pagination and one with offset pagination.
Offset paginate:
sql: WHERE updated_at IS NOT NULL AND updated_at > DATE_SUB(NOW(), INTERVAL 2 MINUTE) LIMIT :size OFFSET :offset
Cursor paginate:
sql:
This will execute each statement once, then restart Logstash and execute them again, until Logstash is stopped. If I want to re-ingest everything, I just have to manually delete the files at last_run_metadata_path (stop Logstash, delete the index, create the mappings, and then restart Logstash). The Logstash scheduler is not compatible with cursor pagination and :sql_last_value.
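For readers unfamiliar with the two pagination styles mentioned in this comment, here is a small self-contained Python sketch using an in-memory SQLite table. The `items` table, its columns, and both queries are assumptions for illustration. They are not the commenter's actual statements, and the loops only mimic the "repeat until no more rows" behavior described above.

```python
import sqlite3


def cursor_paginate(conn, page_size):
    """Page through the table keyed on the last id seen (cursor style)."""
    last_value = 0  # plays the role of :sql_last_value
    pages = []
    while True:
        rows = conn.execute(
            "SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT ?",
            (last_value, page_size),
        ).fetchall()
        if not rows:
            break  # zero rows retrieved: nothing left to ingest
        pages.append([row[0] for row in rows])
        last_value = rows[-1][0]  # advance the cursor to the last id read
    return pages


def offset_paginate(conn, page_size):
    """Page through the table with LIMIT/OFFSET (offset style)."""
    offset = 0
    pages = []
    while True:
        rows = conn.execute(
            "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
            (page_size, offset),
        ).fetchall()
        if not rows:
            break
        pages.append([row[0] for row in rows])
        offset += page_size
    return pages


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 8)],
)

print(cursor_paginate(conn, 3))  # prints [[1, 2, 3], [4, 5, 6], [7]]
print(offset_paginate(conn, 3))  # prints [[1, 2, 3], [4, 5, 6], [7]]
```

The cursor variant depends on the tracked value advancing between queries and starting again from 0 on a fresh pass, which is why the caching of :sql_last_value matters so much for it.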
Logstash information:
JVM (e.g. java -version): Bundled JDK: openjdk version "17.0.4" 2022-07-19
OpenJDK Runtime Environment Temurin-17.0.4+8 (build 17.0.4+8)
OpenJDK 64-Bit Server VM Temurin-17.0.4+8 (build 17.0.4+8, mixed mode, sharing)
OS version: Windows 10
Description of the problem including expected versus actual behavior:
According to the docs (https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html#_state), setting clean_run to true should set the value of :sql_last_value to 0, or to '1970-01-01 00:00:00' if it's a datetime value, for every execution. But it only works for the first execution. After that, it updates the value to the last execution time, even if I also set record_last_run to false.
Steps to reproduce:
You can reproduce the issue with this input:
This is the same issue that @palin first encountered and put up on the old jdbc-input-plugin repository, see here for more details:
logstash-plugins/logstash-input-jdbc#373