Releases: temporalio/temporal

v1.20.0

17 Feb 21:47
b313b7f

Release Highlights

Upgrade action

  • The SERVICE_FLAGS environment variable for Docker images has been removed; use SERVICES instead.
  • The dynamic config setting matching.useOldRouting has been removed, and old routing is no longer available. If you’re upgrading and did not set it earlier, set it to false across your cluster before the upgrade to minimize task dispatch disruption during deployment.
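For reference, a minimal sketch of the pre-upgrade dynamic config entry (the file location depends on your deployment):

```yaml
# Set before upgrading to v1.20; remove once all nodes run v1.20.
matching.useOldRouting:
- value: false
```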

Deprecation heads-up

The following databases will no longer be supported in future versions:

  • MySQL 5.7: planned to be removed by v1.21; please upgrade to MySQL 8.0.17+.
  • PostgreSQL 9.6 to 11: planned to be removed by v1.21; please upgrade to PostgreSQL 12+.

Advanced visibility

We’re launching advanced visibility features for SQL databases (see more below). However, this required a significant change to existing functionality when used with Elasticsearch. Up until v1.19, when you created a custom search attribute of a given type, say Keyword, you could actually store a list of Keyword values. This is no longer supported: if you have a search attribute of type Keyword, you can only store a single Keyword value. If you wish to store a list of values, create a search attribute of type KeywordList. This is the only type that supports lists of values, i.e., there is no equivalent for any other type (Int, Datetime, etc.).

$ temporal operator search-attribute create --name CustomField --type KeywordList

If you are using Elasticsearch, you can change the type of a Keyword search attribute to KeywordList by re-creating the search attribute (warning: this will only work as expected if the stored value is already a list; otherwise, it will lead to unexpected errors):

$ temporal operator search-attribute remove --name CustomField
$ temporal operator search-attribute create --name CustomField --type KeywordList

For backwards compatibility, if you’re using Elasticsearch, lists of values are still supported for all search attribute types. However, we discourage such usage. Furthermore, future SDK releases will deprecate the APIs that accept lists of values and make them strongly typed so that lists are no longer accepted.

Advanced visibility on SQL DB (beta testing)

Advanced visibility features are now available with the following SQL databases (MySQL 8.0.17+, PostgreSQL 12+, and SQLite):

  • Per namespace, you can create up to 3 custom search attributes for each of the types Bool, Datetime, Double, Int, Text and KeywordList, and up to 10 custom search attributes of type Keyword.
  • You’ll be able to call the List API using SQL-like queries (ORDER BY and LIKE operators are not supported, check the documentation for more details).
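For example, a query with the Temporal CLI might look like this (a sketch; assumes the CLI is installed and pointed at your cluster, and the workflow type name is illustrative):

```shell
# List running workflows of a given type using a visibility query.
temporal workflow list \
  --query 'WorkflowType = "OrderWorkflow" AND ExecutionStatus = "Running"'
```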

Here are the steps to enable this feature [warning: for all commands below, replace the database parameters (user, password, host, port) according to your setup]:

  1. Upgrade Temporal Server to v1.20. If you don't want to enable advanced visibility features, then stop here. Everything should work as usual.

  2. Upgrade your database to MySQL 8.0.17+ or PostgreSQL 12+ if you’re using an older version. Check the official documentation on how to upgrade MySQL and PostgreSQL. We also recommend backing up your data before taking any action.

    • [Optional step for MySQL users] If you upgraded from MySQL 5.7, convert the database character set to utf8mb4 (learn more about character sets in the MySQL documentation). The following commands are an example of how to convert both the temporal and temporal_visibility databases (check the MySQL documentation for more details on changing the character set for databases and tables):

      $ DB=temporal; ( echo 'SET foreign_key_checks=0; ALTER DATABASE `'$DB'` CHARACTER SET utf8mb4;'; mysql $DB -u root -proot -e "SHOW TABLES" --batch --skip-column-names | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8mb4;' ) | mysql $DB -u root -proot
      $ DB=temporal_visibility; ( echo 'SET foreign_key_checks=0; ALTER DATABASE `'$DB'` CHARACTER SET utf8mb4;'; mysql $DB -u root -proot -e "SHOW TABLES" --batch --skip-column-names | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8mb4;' ) | mysql $DB -u root -proot
  3. Update the schema (if you need to set TLS cert, check our documentation for more details on the temporal-sql-tool command):

    • MySQL 8.0.17+:

      $ ./temporal-sql-tool --ep localhost -p 3306 -u root -pw root --pl mysql8 --db temporal_visibility update-schema -d ./schema/mysql/v8/visibility/versioned
    • PostgreSQL 12+:

      $ ./temporal-sql-tool --ep localhost -p 5432 -u temporal -pw temporal --pl postgres12 --db temporal_visibility update-schema -d ./schema/postgresql/v12/visibility/versioned
  4. Change the sql.pluginName config to mysql8 or postgres12 accordingly.

  5. Restart Temporal Server.
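Step 4's change lives in the persistence section of your static server config; a minimal sketch, assuming default datastore names and MySQL (all other fields omitted):

```yaml
persistence:
  datastores:
    default:
      sql:
        pluginName: "mysql8"   # was "mysql"
    visibility:
      sql:
        pluginName: "mysql8"   # use "postgres12" for PostgreSQL
```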

If you have previously created custom search attributes while using standard visibility, you’ll need to create them again. Do not delete the existing ones, just execute the command to create a new custom search attribute. For example, if you had created a custom search attribute named CustomField of type Keyword, then execute the following command (make sure you update to the latest version of Temporal CLI tool):

$ temporal operator search-attribute create --namespace default --name CustomField --type Keyword

For SQLite users, we do not support schema updates. Therefore, if you're using SQLite file-based storage, you'll need to reset your database from scratch.

Archival feature

We reimplemented the Archival feature using a separate task queue, providing higher efficiency, better resource isolation, and strong durability guarantees. Most importantly, Archival now retries indefinitely with exponential backoff before deleting any data. Before this change, Archival would only retry for a maximum of five minutes.

Workflow Update (alpha)

Workflow Updates enable gRPC clients to issue requests to workflow executions that mutate internal workflow state and return a response.

  • This is an alpha quality release that must not be used for production workloads. Only workers using the Go SDK are supported. Workflow histories that include update events from this release may not be replayable by future releases.
  • Workflow Updates are disabled by default. To enable the UpdateWorkflowExecution API, change the frontend.enableUpdateWorkflowExecution dynamic config value to true.
  • There is a new example in the samples-go repository demonstrating both how to invoke and how to handle updates.
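A sketch of the dynamic config entry to enable it (remember this is an alpha feature and must not be used for production workloads):

```yaml
frontend.enableUpdateWorkflowExecution:
- value: true
```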

Replication between clusters with different shard counts

Temporal Server 1.20 adds the capability to connect and replicate between two clusters with different shard counts, provided one cluster’s shard count is a multiple of the other’s (Y = NX, where X and Y are the shard counts and N is a natural number; for example, a 512-shard cluster can replicate with a 2048-shard cluster). Shard count is a scaling unit, and a cluster can hit scaling limits if the chosen shard count is too small. This new feature makes it possible to move a live workload to a new cluster with a larger shard count. It works the other way as well if you want to reduce your shard count.

Note: Temporal Server 1.19 and earlier versions require two clusters to have exactly the same shard count in order to be connected.

BatchOperation to support DeleteWorkflow

Delete workflow is now a supported action in the BatchOperation API.

Otel 0.34

OpenTelemetry has been upgraded to 0.34. This release contains a breaking change: all counters now have a ‘_total’ suffix appended.

Default authorizer

If you're using the default authorizer (note that the default authorizer is not used by default, you have to explicitly configure it), you should be aware of a few changes:

  • The special case for temporal-system namespace was removed. This namespace is used by various workflows that run on the system worker service.

    Separately, two recent features, Schedules and Batch v2, require the system worker to connect to namespaces other than temporal-system. If your claim mapper was not properly authenticating the system worker service and giving it a System: RoleAdmin claim, then this special case meant that system workflows would work but Schedules and Batch v2 would not.

    Now that this case is removed, your system workers might have trouble connecting to the frontend unless you make some other changes. Either 1) your claim mapper should properly authenticate the system worker and give it a System: RoleAdmin claim, or 2) use Internal Frontend to bypass the claim mapper for internode connections to frontend. See the Internal Frontend section below.

  • The special case for requests with no namespace was removed, and replaced with a check for two specific requests: gRPC health checks, and GetSystemInfo. All oth...


v1.19.1

14 Jan 06:11
07f758a

Release Highlights

This release fixes a few minor bugs related to task processing and the Elasticsearch bulk processor. It also enables a security check by default that enforces the task token check, prohibiting callers from modifying namespace info in the task token.

All Changes

2022-12-29 - 6b2e448 - Fix NPE in task channelWeightFn (#3766)
2023-01-12 - 23d8cf9 - Fix ES bulk processor commit timeout (#3696)
2023-01-12 - 7b64aa0 - Remove internal bulk processor retries (#3739)
2023-01-12 - c46fccb - Capture task processing panic (#3779)
2023-01-12 - ea8af50 - Disable eager activities for Python 0.1a1, 0.1a2, and 0.1b1 too (#3793)
2023-01-12 - 5237e7f - Prioritize worker pool shutdown (#3795)
2023-01-12 - c221288 - Capture panic in replication task processing (#3799)
2023-01-12 - a8da358 - Fix recover call (#3804)
2023-01-12 - 5e784b4 - Drop task on serialization error (#3803)
2023-01-13 - 96d6b00 - Turn on frontend.enableTokenNamespaceEnforcement by default
2023-01-13 - 95a4094 - Upgrade version to 1.19.1

Details about the v1.19.0 release can be found here.

Helpful links to get you started with Temporal

Temporal Docs
Server
Docker Compose
Helm Chart

Docker images for this release (use tag 1.19.1)

Server
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools

v1.19.0

01 Dec 21:47

Release Highlights

Schema changes

Before upgrading your Temporal Cluster to release v1.19, you must upgrade your storage schema to the following versions:

  • MySQL schema version 1.9
  • PostgreSQL schema version 1.9
  • Cassandra schema version 1.7

Use the appropriate schema upgrade tool to upgrade your schema version.
For details, see https://docs.temporal.io/cluster-deployment-guide#upgrade-server.

Note that schema version is not the same as the storage type (database) version.

Golang version

  • Upgraded golang to 1.19.

Metric

A major refactoring has been done in the metrics code. Both metrics.Client and metrics.Scope are replaced by metrics.metricsHandler.

A customized metrics handler is also supported via temporal.WithCustomMetricsHandler.

Deprecated metrics

All metrics that have the suffix _per_tl are deprecated.

Improvements and fixes

Clean up mutable state when Workflow is closed

When a Workflow is closed, you can trigger a cleanup to remove Memos and Search Attributes from the mutable state. This helps reduce the size of the mutable state for closed Workflows. However, because the cleanup deletes data, it’s best to pair Advanced Visibility with Elasticsearch to ensure that Search Attributes can still be retrieved. This feature is controlled by the dynamic config history.visibilityProcessorEnableCloseWorkflowCleanup.

Count limits for pending Child Workflows, Activities, Signals, and cancellation requests

This change adds a new system protection limit for the maximum number of pending Child Workflows, pending Activities, pending Signals to external Workflows, and pending requests to cancel external Workflows.
Operators can set different values for those new limits via the dynamic config. The default is currently 50,000 for each value.

limit.numPendingChildExecutions.error
limit.numPendingActivities.error
limit.numPendingSignals.error
limit.numPendingCancelRequests.error
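A sketch of overriding one of these limits in a dynamic config file (the value is illustrative):

```yaml
limit.numPendingActivities.error:
- value: 10000
```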

Maximum number of concurrent pollers

The dynamic config frontend.namespaceCount controls how many concurrent pollers are allowed to connect to Temporal Server. Currently, this limit applies to the total number of pollers; that is, Activity plus Workflow pollers. In v1.19.0, this limit now applies to the number of pollers per type; that is, Activity and Workflow pollers won’t compete with each other for a connection to Temporal Server.

Batch deletion

The batch delete operation is now supported in the batch operation API. Use the BatchOperationDeletion option in StartBatchOperationRequest for a batch delete operation.

History task processing

Default implementation

  • Host-level Task scheduler is now enabled by default, meaning the default value of history.[timer|transfer|visibility]ProcessorEnablePriorityTaskScheduler is true.
  • Multi-cursor queue implementation is now enabled by default, meaning the default value of history.[timer|transfer|visibility]ProcessorEnableMultiCursor is true.

Rate limiting

  • Added priority rate limiter for Task processing. By default this rate limiter is disabled but can be enabled by setting history.taskSchedulerEnableRateLimiter to true. The rate limiter should be enabled only when multi-cursor queue implementation is enabled.
  • The rate is controlled by history.taskSchedulerMaxQPS for per-host limit and history.taskSchedulerNamespaceMaxQPS for per-Namespace limit. All queues (timer, transfer, visibility) share the same limit. Please note that deletion Tasks have lower priority when consuming rate limit tokens and do not count against the per-Namespace limit.
  • The default value for the preceding two configurations is 0, which means they will fall back to and use the value specified by history.persistenceMaxQPS and history.persistenceNamespaceMaxQPS.
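As a sketch, enabling the rate limiter with explicit limits might look like this in a dynamic config file (the QPS values are illustrative):

```yaml
history.taskSchedulerEnableRateLimiter:
- value: true
history.taskSchedulerMaxQPS:
- value: 2000
history.taskSchedulerNamespaceMaxQPS:
- value: 500
```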

Metrics

  • task_latency_processing_nouserlatency, task_latency_userlatency, task_latency_nouserlatency, and task_latency_queue_nouserlatency are removed.
  • New task_latency_user metric, which captures the latency for acquiring Workflow lock in a single Task Execution attempt.

History Scavenger

Retention validation for workflows has been added to the history scavenger. This feature is enabled by default via worker.historyScannerVerifyRetention. The default grace period for this retention validation is 90 days plus the namespace retention time; it is configurable via worker.executionDataDurationBuffer.

All changes

2022-09-19 - 9703d33 - Post-release: bump version and upgrade dependencies (#3408)
2022-09-19 - 91c31f1 - Fix merge map of payload (#3412)
2022-09-19 - 1088382 - Fix task reschedule policy (#3413)
2022-09-20 - c672c37 - Adds a retryable error for when we try to delete open executions (#3405)
2022-09-20 - d890a2f - Propagate CloseVisibilityTaskId to DeleteExecutionVisibilityTask (#3404)
2022-09-20 - be727b3 - Retry attempts to delete open visibility records (#3402)
2022-09-21 - 8f24d1f - Revert "Retry attempts to delete open visibility records" (#3420)
2022-09-21 - 9dfdf75 - Rename parameters of MergeMapOfPayload (#3418)
2022-09-22 - ff40b89 - Retry attempts to delete open visibility records (#3402) (#3421)
2022-09-23 - c6fb6b8 - Fixes an issue where last-heartbeat time was set to the first event's timestamp (#3361)
2022-09-26 - f4af2d5 - Fix list batch operation to include division (#3431)
2022-09-27 - b4b61ff - Reorder grpc interceptors (#3423)
2022-09-27 - 85f400e - Add postgres es development script (#3429)
2022-09-27 - e3e1cce - Retry attempts to delete open workflow executions (#3411)
2022-09-27 - 769b865 - Add cluster ID into ringpop app (#3415)
2022-09-27 - 6530847 - Use original branch token instead of deserializing as branch info and then re-serializing (#3384)
2022-09-27 - d27ea89 - Fix action metrics (#3434)
2022-09-28 - 8929def - Namespace replication for failover history (#3439)
2022-09-28 - c150474 - Validate structured calendar specs and improve error messages (#3425)
2022-09-28 - e0bbabe - Per-namespace workers should only run on active cluster (#3426)
2022-09-29 - b497033 - Ensure urfave/cli accepts flag values with comma (#3440)
2022-09-29 - 363d4c0 - Log warning only when there is an error in SA size validation (#3443)
2022-09-29 - 97d0ec3 - Supply optionally configured workflow settings as hints (#3442)
2022-09-29 - 47bb155 - Update dependencies and pin otel (#3444)
2022-09-29 - 3815915 - Implement GetAllHistoryTreeBranches for SQL persistence backends (#3438)
2022-09-30 - 3099274 - Fix reset workflow in replication reapply (#3449)
2022-10-03 - 175b916 - Use safer default TIMESTAMP for MySQL. (#3424)
2022-10-04 - a86d455 - Index history size when workflow closes (#3451)
2022-10-05 - fc37cd4 - Update ns version history in handover state (#3456)
2022-10-06 - 0ae4cc5 - Add config filter by task type (#3455)
2022-10-06 - 0df719d - Update replication use branch token (#3447)
2022-10-10 - fa51f5d - Move record child workflow completed to api package (#3350)
2022-10-10 - 62cd143 - Move verify child workflow completion recorded to api package (#3351)
2022-10-10 - 71a1de3 - Use separate metric for resource exhausted error in task processing (#3463)
2022-10-10 - bc451db - Compare task type enum (#3464)
2022-10-10 - c6b472e - Fix exclude tags with withTags method (#3466)
2022-10-10 - 158737a - Remove old logic for checking workflow deletion task dependencies from delete_manager (#3427)
2022-10-10 - d33559a - Properly handle min task ID > max task ID case during shard re-balancing (#3470)
2022-10-11 - 0e70cf7 - Move get / poll mutable state to api package (#3467)
2022-10-11 - 7278168 - Move describe workflow to api package (#3469)
2022-10-11 - 0dddd3c - Fix timer task visibility timestamp for workflow refresh (#3460)
2022-10-11 - 429c0af - Move replication APIs to api package (#3472)
2022-10-11 - d90f3ab - Move NDC logic to ndc package, move workflow reset to api package (#3465)
2022-10-11 - 740c7a3 - Move reapply events to api package (#3476)
2022-10-11 - d252656 - Move remove signal from mutable state to api package (#3475)
2022-10-11 - b292c22 - Move delete workflow to api package (#3473)
2022-10-11 - 32ea0bf - Move refresh workflow to api package (#3477)
2022-10-12 - 16184da - Fix scheduled queue max read level update for single processor (#3474)
2022-10-12 - ceae241 - Handle sync workflow state task in replication dlq handler (#3482)
2022-10-13 - bf8e1f1 - Bump UI to v2.7.0 (#3480)
2022-10-13 - 46dcb4f - Turns on the history scavenger for SQL backends (#3462)
2022-10-13 - 86fde8e - Move query workflow to api package (#3486)
2022-10-13 - ef21405 - Fix sanitize mutable state after replication (#3479)
2022-10-13 - 457eb05 - Clean up duplicate empty task id (#3490)
2022-10-13 - a0ea177 - Add execution scavenger for retention (#3457)
2022-10-15 - a4fe3a7 - Remove UI v1 from development environment (#3485)
2022-10-17 - dc84e87 - Update replication timestamp with no task (#3487)
2022-10-17 - 8e5ccb5 - schama -> schema (#3501)
2022-10-17 - b8d46a5 - Do not add version 0 to failover history (#3483)
2022-10-17 - cebbde9 - Sync proto API 1.12.0 (#3500)
2022-10-17 - 968c576 - Create visibility GetWorkflowExecution API (#3488)
2022-10-18 - ecde543 - Remove now parameter from task generator interface (#3478)
2022-10-18 - 9cb646b - Rewrite mysql PaginateBranchesFromHistoryTree query (#3509)
2022-10-18 - 8a7538a - Warning log on new event during set workflow (#3508)
2022-10-19 - 8d645e8 - Increase visibility RPS an...


v1.18.5

17 Nov 21:20

Release Highlights

This release fixes minor bugs in history scavenger, gocql client recovery logic, and internal queue processing.

All Changes

2022-11-15 - 00bb513 - Update version to 1.18.5
2022-11-15 - f103b19 - Fix history scavenger bug on delete mutable state (#3588)
2022-11-15 - ec332ef - Wrap gocql.Iter with custom wrapper (#3577)
2022-11-15 - 077c5e0 - Fix resilient shard (#3584)
2022-11-15 - 0f68d26 - Queue processor handle shard ownership lost (#3553)

Details about the v1.18.0 release can be found here.

Helpful links to get you started with Temporal

Temporal Docs
Server
Docker Compose
Helm Chart

Docker images for this release (use tag 1.18.5)

Server
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools

v1.18.4

01 Nov 02:04
@dnr

Release Highlights

This release fixes a few minor bugs and upgrades the gocql dependency. It also upgrades the expat package in the admin-tools Docker image for CVE-2022-40674 (this doesn't affect the Temporal server itself).

All Changes

2022-10-27 - 0616c5c - Port util.CloneMapNonNil to release branch (#3530)
2022-10-31 - 327d46e - Upgrade gocql dependency to v1.2.1
2022-10-31 - 69e2bb9 - Change execution scavenger to call admin delete (#3526)
2022-10-31 - c6cfe6d - Fix replication task nil case (#3531)
2022-10-31 - 26782e9 - Add retention validation in history scavenger (#3541)
2022-10-31 - fe6a873 - Take rpc address from config for local cluster (#3546)
2022-10-31 - 03d6c03 - Update version to 1.18.4

In docker-builds:

2022-10-31 - 85d405a - Use larger GitHub-hosted runners for Docker builds (#74)
2022-10-31 - e85d0cc - Upgrade expat in admin-tools (#75)
2022-10-31 - 07e4b80 - Update temporal submodule for branch release/v1.18.x

Details about the v1.18.0 release can be found here.

Helpful links to get you started with Temporal

Temporal Docs
Server
Docker Compose
Helm Chart

Docker images for this release (use tag 1.18.4)

Server
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools

v1.18.3

20 Oct 19:48
@dnr

Release Highlights

This release fixes a bug introduced in 1.18.2 that prevented the server from starting in a single-node configuration.

All Changes

2022-10-19 - 5709090 - Fix scanner start dep (#3513)
2022-10-20 - c0eb02c - Update version to 1.18.3

Details about the v1.18.0 release can be found here.

Helpful links to get you started with Temporal

Temporal Docs
Server
Docker Compose
Helm Chart

Docker images for this release (use tag 1.18.3)

Server
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools

v1.18.2

19 Oct 01:56
@dnr
Pre-release

This release has a bug that prevents server startup when running in a single-node configuration. Please use v1.18.3 instead.

Release Highlights

This patch release fixes a few small bugs in namespace migration.
It also has two changes to scavengers: the history scavenger is now enabled on SQL persistence, and the execution scavenger now finds mutable state that's past its retention timeout.

All Changes

2022-10-17 - 1b8e28f - Implement GetAllHistoryTreeBranches for SQL persistence backends (#3438)
2022-10-17 - 2127976 - Properly handle min task ID > max task ID case during shard re-balancing (#3470)
2022-10-17 - d602bff - Fix timer task visibility timestamp for workflow refresh (#3460)
2022-10-17 - 8cd481a - Fix scheduled queue max read level update for single processor (#3474)
2022-10-17 - 45bd55d - Turns on the history scavenger for SQL backends (#3462)
2022-10-17 - bb7b1f4 - Fix sanitize mutable state after replication (#3479)
2022-10-17 - ff47b2c - Add execution scavenger for retention (#3457)
2022-10-17 - 78366f1 - Update replication timestamp with no task (#3487)
2022-10-17 - b57b930 - Do not add version 0 to failover history (#3483)
2022-10-18 - 9a427c2 - Rewrite mysql PaginateBranchesFromHistoryTree query (#3509)
2022-10-18 - 07ab3e1 - Warning log on new event during set workflow (#3508)
2022-10-18 - 0aa5948 - Update version to 1.18.2

Details about the v1.18.0 release can be found here.

Helpful links to get you started with Temporal

Temporal Docs
Server
Docker Compose
Helm Chart

Docker images for this release (use tag 1.18.2)

Server
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools

v1.18.1

11 Oct 00:14
@dnr

Release Highlights

This patch release fixes a few minor issues with search attribute/memo upsert, task rescheduling, Schedule API validation, replication, and batch operations. We recommend that everyone upgrade to it.

All Changes

2022-09-28 - 0603535 - Fix merge map of payload (#3412)
2022-09-28 - ac2593c - Fix task reschedule policy (#3413)
2022-09-28 - ca5d0fd - Reorder grpc interceptors (#3423)
2022-09-28 - 88d6fc9 - Validate structured calendar specs and improve error messages (#3425)
2022-09-28 - 02e2edf - Per-namespace workers should only run on active cluster (#3426)
2022-09-28 - 3c0105b - Fix list batch operation to include division (#3431)
2022-09-28 - adc4cc8 - Fix action metrics (#3434)
2022-09-28 - 70ca71b - Namespace replication for failover history (#3439)
2022-09-29 - f84be21 - Log warning only when there is an error in SA size validation (#3443)
2022-10-10 - 082713b - Fix reset workflow in replication reapply (#3449)
2022-10-10 - 1e46237 - Add config filter by task type (#3455)
2022-10-10 - 2f6df41 - Use separate metric for resource exhausted error in task processing (#3463)
2022-10-10 - e2e7474 - Update replication use branch token (#3447)
2022-10-10 - f089fda - Update ns version history in handover state (#3456)
2022-10-10 - cb2f574 - Compare task type enum (#3464)
2022-10-10 - 78a5e36 - Fix exclude tags with withTags method (#3466)
2022-10-10 - 1a6bae2 - Prepare 1.18.1 release

Details about the v1.18.0 release can be found here.

Helpful links to get you started with Temporal

Temporal Docs
Server
Docker Compose
Helm Chart

Docker images for this release (use tag 1.18.1)

Server
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools

v1.18.0

19 Sep 17:43
@dnr
86966c5

Release Highlights

Upgrade action item summary

  • Before upgrade
    • If you’re using ES v6: do not deploy this version until after moving to ES v7 or later
    • If you’re using ES v7+: upgrade schema to include new builtin search attributes (see Schedules section below)
    • If you used the experimental Schedules feature in 1.17: delete all schedules and recreate them after upgrading
  • After upgrade
    • Change dynamic config matching.useOldRouting to false on all nodes as close in time as possible. This will cause temporary matching disruption during the propagation.
    • Consider enabling host-level worker pool and multi-cursor task processing (see Task processing section below)
    • Consider setting persistence rate limits (see Persistence rate limiting section below)

Elasticsearch v6 support is removed

Elasticsearch v7 became the default supported version in the 1.12.0 release. In 1.18.0, Elasticsearch v6 support is completely removed. If you are still using Elasticsearch v6, don't upgrade to 1.18.0; upgrade Elasticsearch first, following the migration guide.

Along with v6 support removal, we added Elasticsearch v8 support. Elasticsearch v8 doesn’t have breaking changes that affect Temporal.

Batch API

This release introduces new batch operation APIs in the Frontend Service. The feature is enabled by default.

  • StartBatchOperation: Start a batch operation.
  • StopBatchOperation: Stop a running batch operation.
  • DescribeBatchOperation: Get detailed information about a batch operation.
  • ListBatchOperations: List all the batch operations.

By default, the APIs support one concurrent operation per Namespace. This value is configurable by dynamic config: frontend.MaxConcurrentBatchOperationPerNamespace.

Use DescribeBatchOperation or metrics batcher_processor_requests and batcher_processor_errors to monitor the progress of batch operations.
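As a sketch, starting a batch terminate operation with tctl might look like this (the query, namespace, and reason are illustrative; check tctl's help output for the exact flags your version supports):

```shell
# Start a batch terminate over all workflows matching a visibility query.
tctl --namespace default batch start \
  --query 'WorkflowType = "OrderWorkflow" AND ExecutionStatus = "Running"' \
  --reason "cleaning up stuck workflows" \
  --batch_type terminate
```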

Cluster API

The following Cluster APIs moved from the admin service to the operator service. Those APIs will be deprecated in the admin service in a future release.

  • AddOrUpdateRemoteCluster: Add or update a connection configuration to a remote Cluster.
  • RemoveRemoteCluster: Remove a connection to a remote Cluster.
  • ListClusters: List the configuration of all connected Clusters.

Task processing

Host level task scheduler

  • Host level task scheduler now uses separate task channels for each namespace for better resource isolation.
  • Deprecated the dynamic config history.[timer|transfer|visibility]TaskHighPriorityRPS.

Multi-cursor queue

New multi-cursor queue implementation for better resource isolation and handling cases not covered by host-level task scheduler: too many pending tasks, stuck queue ack level, etc.

  • The new implementation can be enabled by setting the value of dynamic config history.[timer|transfer|visibility]ProcessorEnableMultiCursor to true.
  • The host-level worker pool for the corresponding queue should also be enabled, otherwise the above dynamic config won’t take effect. This can be done by setting history.[timer|transfer|visibility]ProcessorEnablePriorityTaskScheduler to true.

Task Loading

  • The dynamic config history.[timer|transfer|visibility]ProcessorMaxPollHostRPS can be used to limit the throughput of the queue processor. This is very useful for recovering from a persistence outage that leads to a large task backlog in persistence. Set the value to a small number and gradually increase it to ensure a smooth draining of the backlog. By default, the value for this config is 0, which falls back to 30% of history.persistenceMaxQPS for the transfer and timer queues and 15% for the visibility queue.
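For example, throttling the transfer queue during backlog recovery might look like this in dynamic config (the value is illustrative; start small and raise it gradually):

```yaml
history.transferProcessorMaxPollHostRPS:
- value: 50
```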

Metrics

New task processing related metrics are added for better visibility.

  • task_latency_load: measures the duration from task generation to task loading (task schedule to start latency for persistence queue).
  • task_latency_schedule: measures the duration from task submission (to task scheduler) to processing (task schedule to start latency for in-memory queue).
  • queue_latency_schedule: measures the time it takes to schedule 100 tasks in one task channel in the host-level task scheduler. If there are fewer than 100 tasks in the task channel for 30s, the latency is scaled to 100 tasks upon emission. NOTE: this is still an experimental metric and is subject to change.

Retry behavior

Consolidated retry logic for API calls into two places: the service handler (via an interceptor) and the service client used for calling other Temporal services or persistence. Limited the maximum retry attempts and removed retry logic elsewhere to avoid potential retry storms.

Persistence rate limiting

Persistence layer rate limiter prioritizes user requests (e.g. start workflow, signal workflow, etc.) over system background requests (e.g. task loading, requests incurred by task processing, replication, etc.)

Persistence max QPS for each namespace can be set by tuning the dynamic config [frontend|history|matching|worker].persistenceNamespaceMaxQPS. By default, this value is 0 and falls back to the overall host persistence QPS set by [frontend|history|matching|worker].persistenceMaxQPS.
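A sketch of a per-namespace override for the history service in dynamic config (the value is illustrative):

```yaml
history.persistenceNamespaceMaxQPS:
- value: 500
```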

Persistence metrics persistence_requests, persistence_latency, and persistence_error* now also contain a namespace tag for better observability.

UpsertMemo

This release adds UpsertMemo to modify an existing Workflow Memo. It works similarly to UpsertSearchAttributes. However, on the visibility side, the Memo is updated only when used with advanced visibility (i.e., Elasticsearch). When used with standard visibility (SQL databases), the Memo is currently not updated, and the Workflow Memo can be retrieved correctly only from the mutable state (that is, you can call DescribeWorkflowExecution to retrieve it).

Matching routing

Task queues are now distributed among matching nodes based on namespace and type, for better load balancing. This change is disabled by default to avoid disruption during the upgrade. After upgrading to 1.18, you should make a dynamic config change on all nodes simultaneously to set matching.useOldRouting to false. You can use the following snippet:

# Remove this after upgrading to 1.19:
matching.useOldRouting:
- value: false

This change will cause a short disruption to Task dispatch as nodes reload the dynamic config and then Task Queues get moved between matching nodes. You’ll also see some persistence errors in logs for a few minutes.

We plan to make this routing the default in 1.19, so if you don’t make the dynamic config change, it’ll happen during the next upgrade.

Schedules

A few incompatible changes were made to the Schedules feature, which was introduced as experimental in 1.17. If you have created any Schedules, you should delete them before upgrading to this release, and recreate them after the upgrade.

The Schedule feature is now enabled by default.

If you're using Advanced Visibility (i.e. Elasticsearch), Schedules don’t appear in Workflow lists anymore. If you’re not using Elasticsearch, Schedules continue to appear in Workflow lists for now.

There is one new builtin search attribute to support Schedule visibility. You can add it to Elasticsearch by running the upgrade script from the server repo. See the top of the script for environment variables to use to point it at your Elasticsearch. You can do this before or after the upgrade, but before creating any Schedules. (If you didn’t perform the v2 upgrade when upgrading to 1.17, which also added search attributes for Schedules, also do that now.)

schema/elasticsearch/visibility/versioned/v3/upgrade.sh
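A sketch of running the script (the environment variable names here are assumptions; check the top of the script for the exact ones your version expects):

```shell
# Point the upgrade script at your Elasticsearch instance and run it.
# ES_SCHEME/ES_SERVER/ES_PORT are assumed variable names; verify in the script.
ES_SCHEME=http ES_SERVER=127.0.0.1 ES_PORT=9200 \
  ./schema/elasticsearch/visibility/versioned/v3/upgrade.sh
```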

Frontend connections

Update: these release notes previously described a new method of making internal frontend connections. This new method works in many, but not all, server configurations (specifically related to mTLS and custom authorizers). To make it work for all configurations, we're going to make some more changes to the feature. In the meantime, there's no need to change any configuration. (If you've already made the config change that was suggested here and it's working, there's also no need to change it back.)

Maximum retention

The maximum Namespace retention limit of 30 days has been removed. Namespaces can now use any retention period, as long as the persistence store has enough capacity.

Dynamic config interface

The dynamicconfig.Client interface was changed and simplified.

If you’re using Temporal with our pre-built container images or binaries, there’s nothing to do.

If you’re building your own Temporal binary and only refer to dynamicconfig.Client or dynamicconfig.NewFileBasedClient in your ServerOptions, you should be able to rebuild with no code changes.

If you’ve written a custom dynamic config implementation, you’ll need to change it to the new interface. This should be pretty straightforward, but if you have any questions, please contact us; we can help.

All changes

2022-06-21 - 231655c - Rename queryTermination to queryCompletion (#3000)
2022-06-22 - 73881a3 - Prepare for 1.18 release (#3009)
2022-06-22 - f86b8d2 - Per-service fx-ified OTEL tracing (#2896)
2022-06-22 - c7d831b - Explicitly specify timezone for TIMESTAMP values (#3012)
2022-06-22 - 958bde7 - Update mysql image version to support arm64 (#3013)
2022-06-23 - 6cab7e5 - Add developer do...


v1.17.6

15 Sep 17:40

Release Highlights

This release includes a fix for an issue where a task queue could fail to process backlogged tasks if it's loaded or reloaded while persistence is unavailable.

All changes

2022-09-13 - 0fcc1b6 - Prepare 1.17.6 patch
2022-09-13 - f4b94d2 - Move build proto dependencies to separate go.mod (#3377)
2022-09-13 - 6e592f6 - Wait to acquire lease in matchingEngine (#3033)

Details about v1.17.0 release can be found here.

Helpful links to get you started with Temporal

Temporal Docs
Server
Docker Compose
Helm Chart

Docker images for this release (use tag 1.17.6)

Server
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools