Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Increase retries in BigQueryQueryRunner #23011

Merged
merged 1 commit into from
Aug 19, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -115,10 +115,10 @@ public DistributedQueryRunner build()
Map<String, String> connectorProperties = new HashMap<>(ImmutableMap.copyOf(this.connectorProperties));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These properties are @ConfigHidden and so connector itself to users is vulnerable to temporary service issues like:

2024-08-12T11:32:51.7141753Z io.trino.testing.QueryFailedException: Service is unavailable. Please retry.
2024-08-12T11:32:51.7144073Z 	at io.trino.testing.AbstractTestingTrinoClient.execute(AbstractTestingTrinoClient.java:134)
2024-08-12T11:32:51.7146912Z 	at io.trino.testing.DistributedQueryRunner.executeInternal(DistributedQueryRunner.java:582)
2024-08-12T11:32:51.7149416Z 	at io.trino.testing.DistributedQueryRunner.execute(DistributedQueryRunner.java:565)
2024-08-12T11:32:51.7151698Z 	at io.trino.testing.QueryRunner.execute(QueryRunner.java:85)
2024-08-12T11:32:51.7153609Z 	at io.trino.testing.sql.TestTable.createAndInsert(TestTable.java:52)
2024-08-12T11:32:51.7155422Z 	at io.trino.testing.sql.TestTable.<init>(TestTable.java:47)
2024-08-12T11:32:51.7157266Z 	at io.trino.testing.sql.TestTable.<init>(TestTable.java:39)
2024-08-12T11:32:51.7159866Z 	at io.trino.testing.BaseConnectorTest.testSetFieldTypeInArray(BaseConnectorTest.java:3251)
2024-08-12T11:32:51.7162337Z 	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
2024-08-12T11:32:51.7164279Z 	at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)
2024-08-12T11:32:51.7166075Z 	at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)
2024-08-12T11:32:51.7168139Z 	Suppressed: java.lang.Exception: SQL: CREATE TABLE test_set_field_type_in_array_x9sg74px0p (col array(row(field int)))
2024-08-12T11:32:51.7170791Z 		at io.trino.testing.DistributedQueryRunner.executeInternal(DistributedQueryRunner.java:589)
2024-08-12T11:32:51.7172656Z 		... 9 more
2024-08-12T11:32:51.7174183Z Caused by: com.google.cloud.bigquery.BigQueryException: Service is unavailable. Please retry.
2024-08-12T11:32:51.7176575Z 	at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:116)
2024-08-12T11:32:51.7178935Z 	at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.create(HttpBigQueryRpc.java:202)
2024-08-12T11:32:51.7181151Z 	at com.google.cloud.bigquery.BigQueryImpl$2.call(BigQueryImpl.java:305)
2024-08-12T11:32:51.7183173Z 	at com.google.cloud.bigquery.BigQueryImpl$2.call(BigQueryImpl.java:302)
2024-08-12T11:32:51.7185686Z 	at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:103)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would enable some retrying by default. Maybe even as high in tests. It is better when query is passing slowly, than failing temporarly fast.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can (and should) introduce retries by default using the SDK default values.

It might need change at

return RetrySettings.newBuilder()
.setMaxAttempts(retries)
.setTotalTimeout(org.threeten.bp.Duration.of(timeout.toMillis(), MILLIS))
.setInitialRetryDelay(org.threeten.bp.Duration.of(retryDelay.toMillis(), MILLIS))
.setRetryDelayMultiplier(retryMultiplier)
.setMaxRetryDelay(org.threeten.bp.Duration.of(maxDelay, MILLIS))
.build();
and making the configs "optional". If missing we use SDK defaults (which can change over time).

Defaults are documented at https://cloud.google.com/storage/docs/retry-strategy#tools

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me do it as separate PR

connectorProperties.putIfAbsent("bigquery.views-enabled", "true");
connectorProperties.putIfAbsent("bigquery.view-expire-duration", "30m");
connectorProperties.putIfAbsent("bigquery.rpc-retries", "4");
connectorProperties.putIfAbsent("bigquery.rpc-retries", "10");
connectorProperties.putIfAbsent("bigquery.rpc-retry-delay", "200ms");
connectorProperties.putIfAbsent("bigquery.rpc-retry-delay-multiplier", "1.5");
connectorProperties.putIfAbsent("bigquery.rpc-timeout", "8s");
connectorProperties.putIfAbsent("bigquery.rpc-timeout", "30s");

queryRunner.installPlugin(new BigQueryPlugin());
queryRunner.createCatalog("bigquery", "bigquery", connectorProperties);
Expand Down