Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

External client configuration proposal #94

Closed
wants to merge 4 commits into from

Conversation

cretz
Copy link
Member

@cretz cretz commented Aug 15, 2024

⚠️ The round of review in this PR is closed and a new round has started at #95

Summary

  • This is the proposal for external client configuration (i.e. config files and profiles)
  • See it rendered
  • This is the first round of review
  • Like other proposals, all feedback welcome. If the PR discussion gets too large or the review round comes to an end w/ more review needed, it will be closed and a new one will be created referencing this one.

* The format is a JSON object with the following fields:
* `profiles` - Object with keys as profile names. The default profile should be named `default`.
* `<name>` - Profile object with configuration values. Profile names are case-insensitive. 💭 Why? They are needed
for environment variables which are case insensitive. It is a validation error to have a config file with two

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for environment variables which are case insensitive. It is a validation error to have a config file with two
for environment variables which are case insensitive on Windows. It is a validation error to have a config file with two

Copy link
Member Author

@cretz cretz Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For our use case, environment variable names are considered case insensitive everywhere (see the next section).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree we want the same behavior across all platforms. I have a hybrid Mac/Windows/Linux environment at home and I've lost count of the number of headaches I've had to deal with because one filesystem is case-sensitive and another isn't.

* `clientKeyPath` - File path to mTLS client key. Mutually exclusive with `clientKeyData`.
* `clientKeyData` - Key data. Mutually exclusive with `clientKeyPath`.
* `serverCaCertPath` - File path to server CA cert. Mutually exclusive with `serverCaCertData`.
* `serverCaCertData` - CA cert data. Mutually exclusive with `serverCaCertPath`.
Copy link

@dandavison dandavison Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Camel case is getting awkward and hard to predict here (people might think it's "CA"). Is there a strong reason for it? server-ca-cert-data, api-key, client-cert-path look nice to me.

Copy link
Member Author

@cretz cretz Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is proper JSON formatting of key names. And especially for our case, since it will be proto JSON, there is no real way to do hyphens well. Many users probably won't use the file directly, but if we are concerned with how this looks in the CLI, maybe the CLI can translate to hyphens in either direction? It is important that any user with any lang can just read this in as proto JSON and immediately have a well-typed object to mutate, and we can't get that with hyphens.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If hyphens don't work well then let's consider underscores. This is of course subjective and up to preference, but it's a design decision we have to make. I don't usually see camel case in config files -- can you give examples?

Yes, this is proper JSON formatting of key names.

Not sure what you meant by "proper".

Copy link
Member Author

@cretz cretz Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you meant by "proper".

JSON formatting, specifically for protobuf in our case but as a best practice for JSON in general, expects camel case key names.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I was certainly about to respond with "Citation?" :) I do see that in Google's style guide and some others, which I'd forgotten. But there are multiple entire language communities that don't use camel case for anything -- there's no blanket statement to be made about JSON keys.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even beyond best practice, in this case I'm specifically proposing using a proto model to define the contract for the configuration, which has strict JSON expectations: https://protobuf.dev/programming-guides/proto3/#json.

all-sdk/external-client-configuration.md Outdated Show resolved Hide resolved
* `serverName` - Override SNI name when connecting to server.
* `disableHostVerification` - Boolean to disable verifying TLS server host. May not be available in all Temporal
clients. 💭 Why? Not all SDKs offer this feature today.
* `codec` - Remote codec information to use for all encoding/decoding of payloads. May not be available in all
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call this remoteCodec to be specific

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have chosen to call this codec in CLI args which this is (kinda) modeled off of. I think if we want to call this external codec argument "remote codec" we should do it there too.

* `endpoint` - Endpoint to the remote codec.
* `auth` - Authorization header for the remote codec.
* `grpcMeta` - Object representing HTTP headers. Values must be strings for now (can discuss arrays later).
❓ Should we call this `headers`? We do in some cases and not others.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷 I kinda lean towards headers since it's maybe slightly more what people are used to, but, meta is more accurate

Copy link
Member Author

@cretz cretz Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am leaning there as well, but I would need to suggest a CLI update for that too I think for consistency. We are inconsistent in our SDKs on this part too.

Comment on lines +68 to +69
* Because YAML and TOML and others require a dependency in our SDKs. AWS for example has a homegrown INI-ish
implementation they have had to build in to every language.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth seeing if all of our SDKs actually already include some YAML/TOML dep transitively. If we do, I'd say let's go for one of those, but, JSON is also fine.

Copy link
Member Author

@cretz cretz Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can say I know multiple don't. In fact, it's even worse, not all SDKs even have a JSON parser. The only thing they all have is protobuf (which includes protobuf JSON). Here's my understanding:

  • Go - Proto yes, JSON stdlib, YAML yes transitively (stupid testify), TOML no
  • Java - Proto yes, JSON no (in serviceclient where we need it), YAML no, TOML no
  • TypeScript - Proto yes, JSON stdlib, YAML doubtful, TOML doubtful
  • Python - Proto yes, JSON stdlib, YAML no, TOML no
  • .NET - Proto yes, JSON stdlib, YAML no, TOML no

Sure we can have Core bake in these things for the last 3, but Java is still hurt here, and I think proto JSON works well here anyways.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I suggested above, we should consider vendoring also, so it would be up to the SDK/CLI whether they wish to vendor or use a dependency.

To make it concrete, here's an example of what it might look like in TOML. Not supporting multiple profiles, since whether or not we want that is a separate question that we're discussing. As I've said above, I think there's a strong argument for single-profile per file.

[client]
address = "localhost:7233"
namespace = "default"
api_key = "<my-key>"
codec = { auth = "<my-headers>", endpoint = "https://my-codec-server.com" }

[client.tls]
client_cert_path = "/path/to/certs/client/cert.pem"
client_key_path = "/path/to/certs/client/key.pem"
server_ca_cert_path = "/path/to/certs/ca/cert.pem"

[client.grpc_meta]
my_grpc_header_1 = "some-value"
my_grpc_header_2 = "some-value"

Comment on lines +72 to +73
* ❓ Should we consider "json with comments" and some kind of very basic pre-parser in each language that strips
them? This could come later and not part of MVP.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so because then we can provide a template. We could also provide a JSON schema def

Copy link
Member Author

@cretz cretz Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, can come later as I don't expect anyone to hand edit these anymore than they do AWS config files. We'd have to hand-write the JSON comment remover but it's not a big deal.

Comment on lines +139 to +141
* There's no obvious env var approach to array of strings, and it can't be completely comma-delimited because header
values often contain commas. Would rather not force users to have an index in the env var name. Should we just not
support gRPC meta from env var until we think this through?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could just pass an escaped json object? I'm fine w/ delaying that until later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that can work but is a bit rough, might be better to just put header name in env var, treat env var underscores as hyphens in header name, and lowercase them all since gRPC and HTTP/2 do that anyways.

Possible language-specific ideas are below. Note, they are not exact and the naming is all subject to change, they are
simply ideas for discussion. Many of the approaches below are based on several rewritings after seeing how they don't
work in some SDKs. The overall approach is to just make client and connection options loadable from configuration
without overthinking shortcuts or merging or hierarchies.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not mentioned here but the actual loading and parsing of the file in Core SDKs should be done by Core I think, so that we can centralize:

  • Logic of default profile / file location
  • Merging of env vars
  • Read/write the files

Since it's already a proto core can just return that after loading / merging.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, but I don't think there's a writing step to centralize, and I still expect users to be able to use the protos themselves if they want. And I think people want to manipulate the results, so I think we should centralize all of those things in Core, but not the conversion of the proto result to the client options, that should remain in lang.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah agree


* Common format for declarative workers, schedules, etc. This is only for client options.
* File-based hierarchy. We don't need many levels of file merging for these options at this time.
* Extreme focus on human authoring of this format. This is because it'd likely go against the "0 new dependencies" rule.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prioritize authoring DX / YAML over 0 deps

Copy link
Member Author

@cretz cretz Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really bad for libraries to force a dependency on all users of a library just for an optional side thing like this, that's really rough DX (we already suffer it so much with proto versions and that's an actual needed dependency). It's not just avoiding taking on a dependency, it's not applying the dependency's version constraints transitively to a user's entire project that may use these very common libs themselves. We do have a precedent of making completely separate Go modules, Java JARs, TypeScript packages, Python extras, and .NET extensions in some cases though, but not usually for something so small.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the scenarios Candace highlighted is for folks who are migrating production and dev environment code to cloud. (We should align on whether that's a goal (P0/P1/P2)).

For such a use case, we might want to provide--and/or allow users to provide--extensions that allow configuration in whatever their system is, whether yaml or otherwise.
Also, if it's opt in as you say below, I don't see taking dependencies as a P0 problem.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, please reword this. What "extreme focus on human authoring of this format" means is in the eye of the beholder. Might be better to put something about human authoring features as a goal with a lower priority. Then we can align on degrees of importance rather than stark goal/non-goal status.

Copy link
Member Author

@cretz cretz Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if it's opt in as you say below, I don't see taking dependencies as a P0 problem.

I think you're confusing opt-in from a caller POV and opt-in from a dependency POV. Someone can depend on our library and still have to opt-in to calling something on it, but just the act of depending on the library brings in dependencies. Dependencies are not opt-in, only the use of the call.

Might be better to put something about human authoring features as a goal with a lower priority. Then we can align on degrees of importance rather than stark goal/non-goal status.

The idea of the non-goal was to be a never priority. For this project there just happened to be no "future goals" like there are in other projects because meeting the goals is so simple as to not have any. In this case, they are meant to be P∞, heh. I will clarify that there are no future goals identified at this time. I will also remove the "Extreme focus on human authoring of this format" non-goal and clarify the goal for advanced manual configuration editing.

EDIT: Removed this line about human authoring, clarified in the goals that the configuration can be manually edited, added "Future Goals" section, and added section for how manual configuration creation/editing can be done by users.

Copy link

@josh-berry josh-berry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this took some time to get to.

* Common, well-specified configuration format usable by Temporal tooling for client options
* Implementation in all Temporal SDKs for reading and writing of this configuration
* All Temporal CLI/SDKs updated to have easy client creation using this configuration
* Common on-disk location in every platform for reading/writing this format

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to be idiomatic per language? e.g. Java probably has its own configuration solution that's different from Python, that's different from .NET, etc.

I know this would be sub-optimal from the standpoint of running samples, so I'm not actually suggesting we go this route, but I do want to make sure we've explicitly considered (and thus written down) the trade-offs.

Copy link
Member Author

@cretz cretz Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to be idiomatic per language? e.g. Java probably has its own configuration solution that's different from Python, that's different from .NET, etc.

Not IMO. Most runtimes actually don't have this that I'm aware of, including Java, Python, .NET, etc. Also, we want the same config to work everywhere. If you're aware of somewhere where this is not idiomatic, can discuss.

Copy link

@drewhoskins-temporal drewhoskins-temporal Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these goals seem more must-have than others. Can we categorize them into P0/P1/P2, and then maybe P3=Non goal?

Copy link
Member Author

@cretz cretz Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these can be separate as they apply to the system as a whole. I don't think there's a clear split point with these goals that can be done in the future. The MVP is meant to be a very simple approach to accomplishing the goals, and everything else can be done in the future. The only place where I think work can be split is by language. Otherwise, these goals all amount to basically one feature. For this project, due to its simplicity in implementation, there were no "future goals" like there are in other projects. I will add a "Future Goals" section to clarify.

EDIT: Added "Future Goals" section, but it has nothing there right now. But we can add anything we think of (I can't think of anything here).

* Ability to provide environment variables to represent values in the configuration
* Well-specified hierarchy for on-disk overridable by environment variables overridable by in-CLI/SDK values
* 0 new dependencies in SDKs
* Profile names within the configuration

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this? Can we not achieve the same thing in a simpler fashion by providing a way to set the configuration file path from the environment (for example)?

The only value I can think of for having them is in the CLI itself, where it's convenient to be able to refer to different environments by short names rather than full config file paths. But I'm not sure if that use case needs to bleed into the SDKs.

Your example below of downloading a premade config from the cloud illustrates this point—profiles don't help you here, because your config is in a separate file anyway, and you'd have to go thru extra steps to merge the downloaded config with your local config if you wanted to use it as a profile.

Copy link
Member Author

@cretz cretz Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this?

Multiple profiles in the same configuration makes it easier to edit from CLI and have a single config to ship around with multiple profiles if you want.

Can we not achieve the same thing in a simpler fashion by providing a way to set the configuration file path from the environment (for example)?

We offer this too, but the default is a single place.

The only value I can think of for having them is in the CLI itself, where it's convenient to be able to refer to different environments by short names rather than full config file paths. But I'm not sure if that use case needs to bleed into the SDKs.

CLI and SDKs need to share the same approach here. A single config file that contains multiple profiles helps all clients IMO.

Your example below of downloading a premade config from the cloud illustrates this point—profiles don't help you here

That they don't help you there doesn't mean they don't help you anywhere. That's not a multi-profile use case. Having every profile be a different file I think is a bit rough for multi-profile use cases and sharing of multi-profile configuration. EDIT: I added a clarification on why multi-profile-single-file.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only value I can think of for having them is in the CLI itself, where it's convenient to be able to refer to different environments by short names rather than full config file paths. But I'm not sure if that use case needs to bleed into the SDKs.

CLI and SDKs need to share the same approach here.

Why?

The CLI has some use cases/scenarios that don't apply to the (other) SDKs, like being used for operational/admin tasks, possibly in multiple environments.

Your example below of downloading a premade config from the cloud illustrates this point—profiles don't help you here

That they don't help you there doesn't mean they don't help you anywhere. That's not a multi-profile use case. Having every profile be a different file I think is a bit rough for multi-profile use cases and sharing of multi-profile configuration. EDIT: I added a clarification on why multi-profile-single-file.

I am struggling to come up with a "multi-profile use case" that can't also easily be handled with multi-file. Can you provide an example?

Copy link
Member Author

@cretz cretz Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI and SDKs need to share the same approach here.

Why

Because the same configuration needs to work in all Temporal clients (SDKs and CLIs) consistently.

The CLI has some use cases/scenarios that don't apply to the (other) SDKs, like being used for operational/admin tasks, possibly in multiple environments.

Sure, and Java has scenarios that don't apply to Python and so on. But this configuration should apply to both. I think AWS CLI vs SDK is a decent model here. I think consistency should be the default and we need a really good reason to be inconsistent.

I am struggling to come up with a "multi-profile use case" that can't also easily be handled with multi-file. Can you provide an example?

Sure, technically everything you can do with a single file you can do with multi file, so you will struggle if you're trying to see a situation where we must force multi file. Same goes with any config that has separate sections. For instance, you could say you're struggling to come up with a use case why the tls config option can't also easily be handled with a separate TLS file.

The reason we have nested keys in a configuration is for convenience of not having to ship around multiple files or deal with multiple files. You can load up an entire file at once and mutate multiple profiles. You can onboard a new dev by giving a single file with all profiles in it pointing to different clusters. It is easier to glance at all configuration in a file. The entire configuration for use in an SDK can be one set of bytes (some users may not use files, they may provide the bytes some other way). You can load configuration for all profiles from a single database or S3 blob. Etc. The same reasons you may want more than one profile in a file extend to wanting more than one key in a file.

I think justifying not using a separate file for separate keys (whether those keys are profiles or just nested options like tls) is more important. AWS and others put all profiles in a single file and I think we should too. When Temporal says "configuration" it should be a single thing, whether that's a proto object, a set of bytes in memory, a set of bytes on disk, a DB blob, a result of an API call, etc. To have separate profiles be separate objects, separate sets of bytes, separate disk files, etc is a bit rough.

I think you're a bit too hung up on files here, files are just one way bytes are stored. Configuration is loaded from bytes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's classify the contexts in which the config file will be used into deployment contexts (whether prod or staging environments) vs development contexts.

In deployment contexts, a single file with a single profile will probably be what's wanted, with client connection secrets injected via env vars.

Regarding development contexts, consider an organization with a large monorepo containing many different projects that make use of Temporal. The different projects in the monorepo will want the ability to commit independent Temporal config files in their own subdirectories of the monorepo. In this context, the single XDG_CONFIG system config location will not be used, so projects are going to make use of a TEMPORAL_CONFIG env var or other mechanisms to specify their config file location.

So in such a codebase, multiple config files is an inevitable reality. Some questions are:

  1. Will they want upwards file-system traversal and/or hierarchical merge functionality? Perhaps, but we can add that later if we think it makes sense.

  2. Will they want multiple profiles within a single file? They're already using mechanisms to specify the location of their config file, so it's very possible that the answer is "no, they can use different files if they need that".

Open-source projects such as git and uv have chosen:

  • not to support multiple profiles in a single file
  • yes to support upwards file-system traversal to discover the applicable config file
  • yes to support multiple file locations with hierarchical merge.
  • yes to TOML

I think you're a bit too hung up on files here, files are just one way bytes are stored. Configuration is loaded from bytes.

Files are how developers look at profiles: cat /path/to/my/config. Multiple profiles within a single file are hard to read. AWS have solved that by using a configuration language that supports references, but we're not currently considering that option. Multiple profiles in JSON will be very hard to read, and it's well-known that no-one can remember how to use jq :)

Copy link

@dandavison dandavison Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, now we're making progress! So (correct me if I'm getting you wrong), you're basically using "environment config" with the following meaning:

The SDKs and CLI all require certain resources to be specified (e.g. server address), together with things like encryption settings and credentials for those resources. The key point is that these environment choices (kind of) don't change the behavior of the SDK/CLI -- they're about choosing from among a range of possible resource providers.
This is in contrast to "non-environment" configuration settings that change the behavior of the SDKs and CLIs.

And I agree, this is a distinction we should make.

I've given several reasons above to think that non-environment config is something we will want, whether sooner or later: I've given concrete suggestions for what those might be today, and I've also appealed to the fact that Temporal has a large product surface area, ambitions, and a lot of time in which it will exist in the future hence is likely to end up wanting this at some point.

So unless we can identify reasons why we in fact do not think we'll ever want non-environment config, the challenge here is to design file-based configuration in a way that supports the following requirements (but not limited to these):

  1. It is possible to define non-environment config.
  2. It is possible to add new top-level non-environment config sections in the future.
  3. It is possible to define multiple alternative environment profiles.
  4. It is possible to use env vars to supply values for all environment config values where an env var makes sense.

Copy link
Member Author

@cretz cretz Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've given concrete suggestions for what those might be today

But I think they should be rejected for config except for environment-specific suggestions, no matter how far in the future non-environment config is suggested. It doesn't mean other avenues for non-environment things can't be explored, I just don't think we need to account for them as part of this config which should be environment config only.

I think we just have to decide if this config is environment-config only or not. I propose so, forever into the future.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just don't think we need to account for them as part of this config which should be environment config only.

Right, I just don't think we can use the name temporal.{json,toml} if it's never going to have general config in it. So maybe

  • temporal-profiles.{json,toml}
  • temporal-client-config.{json,toml}

Alternatively, perhaps we should be less strict and call it temporal.toml and allow that it might acquire non-environment config in the future, as other top-level keys that are siblings of the profiles key.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear: I don't really like the idea of multiple files. I think users will think of it all as "config", and will be annoyed that Temporal has "multiple different config files", and so it will make the product seem more complicated. So, even though we might like to avoid a difficult design process at this stage, that might not be the right decision for the product.

Copy link
Member Author

@cretz cretz Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can change the default filename, no problem. I have no strong preference on which alternative there is chosen. I suspect people won't really use it directly anyways very much because if they want to hand-edit a file they might put it somewhere more reasonable and provide that filename when loading or as an env var. Or they'll use the CLI. Similar to AWS config.

I would be concerned however if we need to change the env var names or property names or the CLI commands or anything. I do think from a user POV we still need to treat this as "the config" and force qualifying "client" in those places.

all-sdk/external-client-configuration.md Show resolved Hide resolved
all-sdk/external-client-configuration.md Outdated Show resolved Hide resolved
Now running the same code again unmodified will run against cloud. Or in a Temporal future with API keys and a
known/fixed API endpoint:

temporal config set --key cloudApiKey --value my-api-key

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yuck, I do not like having cloud and non-cloud config keys. Cloud should just be a different profile that works the same as on-prem IMO. I'm wary of even providing "buillt-in" defaults for cloud because I think they will have more confidence/comfort in being able to see exactly what's different.

If we must have different API keys because of talking to the control plane vs. data plane (or whatever), we should clarify the function here (e.g. controlPlaneApiKey and dataPlaneApiKey or whatever better names we come up with).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yuck, I do not like having cloud and non-cloud config keys

This does not exist today FYI. But it's an example because this is not just a cloud config key, this is saying "default to all the other cloud things".

I'm wary of even providing "buillt-in" defaults for cloud because I think they will have more confidence/comfort in being able to see exactly what's different.

A single known API domain and that TLS defaults to true are reasonable cloud defaults.

If we must have different API keys because of talking to the control plane vs. data plane (or whatever)

Again this does not exist today, and that's not why we need different API keys. The reason we want different config values is because there are things you can assume about cloud that you may not be able to assume about others. Regardless, this is just a sample of something we could do in the future. In the meantime, you can use apiKey (and always will be able to, even for cloud).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not following what you're saying here. Are you saying that setting cloudApiKey (or any other cloud-specific setting) would hypothetically auto-magically say "you're running against cloud now" and change other config values (e.g. apiKey) automatically? Because as a user, that would be extremely surprising to me—I hate it when libraries do that.

Now, having a special option like cloud: true or something, which is just a (well-documented) synonym for {tls: true, address: "cloud.temporal.io"} or whatever would make much more sense (although I personally still wouldn't use it). Just don't give me cloud and non-cloud variants of something like apiKey, because this is a distinction without a difference and will create a ton of user confusion.

Copy link
Member Author

@cretz cretz Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying that setting cloudApiKey (or any other cloud-specific setting) would hypothetically auto-magically say "you're running against cloud now" and change other config values (e.g. apiKey) automatically?

Yes

I hate it when libraries do that.

I mean we could have just removed api key and made you hand-write the grpc meta. But we recognize shortcuts are important for ergonomics. We may in the future be able to accept a single value from a user to run on cloud (whether in the CLI, SDK, config, etc) and we should consider it.

Now, having a special option like cloud: true or something

You turned one option into two. What's the difference between cloud: true + apiKey: somevalue and cloudApiKey: somevalue except forcing someone to make two values as if API keys were usable across environments and were some generic thing.

Comment on lines +243 to +246
* 💭 Why deprecate some instead of making all our clients support these variables that exist in CLI?
* It is more sane/reasonable to have a simple to understand config-field-to-env-var format than it is to have these
special environment variables be inconsistent outliers.
* For non-CLI uses (i.e. SDKs), it is unreasonable to burden them with past CLI choices.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, I dislike being inconsistent with CLI, which means we will need to deprecate these variables in CLI as well. I think maintaining different variants of different env variables will be confusing and frustrating for users, especially since we cannot do strong validation of TEMPORAL_* env vars.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to be inconsistent with the CLI than let a CLI's inconsistencies affect new code in SDKs IMO. Are you saying we should make all SDKs work with old CLI env vars?

I think maintaining different variants of different env variables will be confusing and frustrating for users

The older ones should probably not be documented. But unless you suggest burdening all SDKs ever with some just-went-GA-CLI env vars, I don't see a better solution. Can you suggest your solution?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to be inconsistent with the CLI than let a CLI's inconsistencies affect new code in SDKs IMO. Are you saying we should make all SDKs work with old CLI env vars?

No I'm not.

Can you suggest your solution?

Your solution is fine but incomplete—all I am saying is we need to deprecate the old variables in CLI so CLI is consistent with everything else.

Copy link
Member Author

@cretz cretz Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all I am saying is we need to deprecate the old variables in CLI so CLI is consistent with everything else.

I agree. The lines you are commenting on in the proposal are specifically there to explain why we are deprecating them.


## Implementation

### CLI

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, we also must deprecate CLI env variables that differ from the new scheme.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, this was discussed above (and if we happen to have any other clients that use their own made-up/inconsistent env vars, but luckily I think it's only CLI)

Comment on lines +149 to +151
## Specification

### Values and File Format

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you touch on this briefly elsewhere, but IMO it would be good to explicitly write the (very simple) algorithm showing the hierarchy of how different sources of config are merged, e.g.

Suggested change
## Specification
### Values and File Format
## Specification
### Combining Different Sources of Config
1. Start with a "blank" config with built-in defaults for every value
2. Read the config file (default path [...], or as specified in the `TEMPORAL_CONFIG_FILE` env var), and override any values specified there
3. Read the environment variables, and override any config values specified there
4. Apply explicit config (e.g. flags passed to the CLI, options set explicitly in code in the application)
### Values and File Format

Super rough but hopefully you get the idea.

Copy link
Member Author

@cretz cretz Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, can add. We are starting to bloat this proposal, but I expected it to be the case as separate reviewers need their own additional sections before approval.

EDIT: Added section. It does not match exactly what you have here, but close.

#### TypeScript SDK Idea

* New `client` module function `loadConfig(options?)` where the `options` type is
`{ configFile?: string, disableFile?: bool, disableEnv?: bool }` and it returns a `proto.Config`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I'm trying to stay out of lang-specific discussions, but this one I can't resist. ;)

I understand disableX is quite common in Go, but it's an anti-pattern in almost every other language IME, and is definitely-so in TypeScript. You're better-off doing something like enableFile?: bool and then checking it with if (options.enableFile !== false).

Copy link
Member Author

@cretz cretz Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I agree even in TS. I think it's clearer to have the field name clarify you are changing the default than expecting people to set things as false. Also, this is where you get into confusing per-SDK differences without adding any real value to the difference (the existing approach here is plenty idiomatic).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the TypeScript APIs and I see no disableFoo identifiers anywhere in the API docs—to the contrary, I see things like reuseV8Context which are optional and default to true. Not sure about other languages, but to align with existing conventions in TypeScript, you do need to avoid disableX nomenclature.

Copy link
Member Author

@cretz cretz Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That the TypeScript SDK doesn't have it is not emblematic of whether being able to disable a default behavior via a disable flag is idiomatic. And definitely not good enough justification to make TypeScript unique in the option to disable default things while loading.

I'm ok delegating to the TS owners here, knowing that at least the other SDKs are allowed to have clear disable options.

work in some SDKs. The overall approach is to just make client and connection options loadable from configuration
without overthinking shortcuts or merging or hierarchies.

#### Go SDK Idea

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to explicitly avoid commenting on individual APIs (to the extent I can help myself :D) but I'd generally put in a plug for composability—as an advanced user, I'd want to be able to use individual building blocks like newConfigWithDefaults(), mergeConfig(), readFromEnv(), readFromFile() etc. in addition to "just-works / out-of-the-box" config builders, so I can take just the parts I need to fit into my existing config regime. It would be nice to offer a middle ground between "use our out-of-the-box algorithm for config-building" and "build the config yourself".

Copy link
Member Author

@cretz cretz Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessarily expanding the scope at first IMO...

newConfigWithDefaults()

That's just creating a new Go config

mergeConfig()

We have explicitly avoided any merging logic exposed to users because we do not support merging of configs because earlier versions of this proposal tried and it became too complicated

readFromEnv()

This is what loading while disabling file does

readFromFile()

This is what loading while disabling env does. And regardless, this is why we chose proto JSON. So people can use proto reading, writing, merging, etc.

"build the config yourself"

Yes, this is one of the motivators behind proto JSON. You have a typesafe representation of the config you can build yourself or do anything you want with. EDIT: Added section about manual configuration mgmt.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree this is expanding scope—you have just said we already have 4/5 of the things I'm asking for but presented in a different (IMO less composable) form. And the 5th, mergeConfig(), is something we will have to implement internally anyway if we want consistent behavior when an env variable overrides a file, or when a file overrides built-in defaults. It just needs to be exposed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have just said we already have 4/5 of the things I'm asking for

I was afraid of the "etc" there and other potential things that may be requested before an initial version is delivered

And the 5th, mergeConfig(), is something we will have to implement internally anyway if we want consistent behavior when an env variable overrides a file, or when a file overrides built-in defaults. It just needs to be exposed.

The initial version did have advanced merging logic where you could merge two configs, but this was hard to reason about. Specific variables override specific variables, this is not the same as merging though. With configuration options, especially in programmatic form, there is not a difference between unset and false. So how do you know if someone turned off TLS? Can you write an example implementation of mergeConfig using Go structures or proto objects? You may find it is challenging because merging programmatic objects is not the same as overwriting specifically set values that are specifically provided. The difference is subtle but important IMO. This is why we require that you create a new config from the configuration instead of "apply" configuration to an existing object.

What are the reasons for building this?

* Users want to switch environments without changing their code, but today they cannot.
* Our samples should be able to be run on multiple environments including cloud, but today they are localhost only.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do users care about running samples against cloud? Personally, I've always been fine with localhost

Copy link
Member Author

@cretz cretz Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For several reasons. For one, I suspect because it's part of demonstrating transition. Also, if we have an environment we promote for easy use, it seems reasonable to make samples run there easily. Also as I understand it, for many "digital natives" (or saas-first or cloud-first or whatever you'd like to call them), they'd go straight to cloud without a local step as they do with many other products they use. Also, running a sample against cloud allows them to confirm they have their cloud namespace setup properly.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @cretz.

  • We may want to make it possible for someone to check out the repo in a VSCode cloud workspace and run samples. (E.g. because they like to work that way, or they are working on a non-conventional development environment such as a tablet or phone).
  • We may want to create a playground app where users can run and edit our samples.
  • An organization may be testing out their cloud namespace and want to use samples for that.
  • An organization may be onboarding Temporal users to cloud namespaces and want them to try running samples against their cloud namespace.

Copy link

@drewhoskins-temporal drewhoskins-temporal Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I definitely buy and like some of these! (Though, others are written from our perspective rather than the user's perspective) The meta-point I want to make is that we should incorporate some of these user needs into our goals to make sure we understand priority and are solving their complete needs. (i.e. Let's not just discuss it in a thread but build it into the doc.)

Here are some possible stories in addition to the ones y'all provided.

  • A user figuring out how to use Temporal copies code from these samples which uses these client configs, as well as the config setups themselves, into their production code and expects it to work well.
  • A user has a multi-region setup and wants to vary some of their client connection config depending on the region.
  • User is migrating from self-hosted to cloud and wants to dynamically flip between the two environments, e.g. using a feature flag.

We should err on the side of writing more stories down and then, if we don't want to cater to them, saying so explicitly. This will make sure we're aligned.

* Common, well-specified configuration format usable by Temporal tooling for client options
* Implementation in all Temporal SDKs for reading and writing of this configuration
* All Temporal CLI/SDKs updated to have easy client creation using this configuration
* Common on-disk location in every platform for reading/writing this format
Copy link

@drewhoskins-temporal drewhoskins-temporal Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these goals seem more must-have than others. Can we categorize them into P0/P1/P2, and then maybe P3=Non goal?

* Common on-disk location in every platform for reading/writing this format
* Ability to provide environment variables to represent values in the configuration
* Well-specified hierarchy for on-disk overridable by environment variables overridable by in-CLI/SDK values
* 0 new dependencies in SDKs

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is broad and seems like it precludes using any prebuilt configuration packages, which seems like it should probably be part of the discussion. What are you really trying to avoid, how critical is that (P0/P1/P2), and why?
Can there be layering where we have an extension that has dependencies?

Copy link
Member Author

@cretz cretz Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are you really trying to avoid, how critical is that (P0/P1/P2), and why?

Trying to avoid any new dependencies in our libraries that then forces uses to depend on them transitively and locks them into our version constraints. We have seen this issue with the libraries we do have to depend on (e.g. protobuf) where by using our libraries, we are forcing our users' apps to be bound by version constraints we have. Libraries like ours should avoid dependencies they don't need because they are transitively infectious to a user's entire application.

Can there be layering where we have an extension that has dependencies?

Yes, and this is supported in all SDKs. In Python it's just an "extra" you have to enable, but for all others these extensions with extra dependencies are entirely separate libraries they have to depend on. The tradeoff is that they now have to add a separate dependency just for configuration support which I was hoping to avoid. Ideally this is as easy as possible for users to consume, and adding a separate library for them to depend on (or force an extra upon install) is not as easy as not having to do it. To compare with AWS SDK, their configuration format they write parsers for in every language, but we don't have to go that far.

I will add some clarity to the document explaining this. I think the burden is on justifying the addition of a dependency, not maintaining the lack of one.

EDIT: Added clarity on why 0 dependencies and added open question for using extensions instead.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not want optional "you can use this configuration format" extensions; I think that would be an unnecessary scope expansion and not good value for the complexity required.

As much as I hate JSON as a config format, I have to agree with Chad that avoiding new dependencies, and avoiding optional extensions for something this small, is probably the correct call so long as we continue to support people who are opinionated enough to roll their own.

Copy link

@dandavison dandavison Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TOML is becoming very popular among respected open source projects for configuration (e.g. Python and Rust have both chosen it). To make sure we've considered all possibilities, let's add vendoring TOML parsers as something to consider. We'd have to keep up with any security patches, but otherwise these should be stable.

The most important thing here is DX when using the config files. And we've got to assume that new settings will be added to these config files over time, as we add new product features. As mentioned elsewhere in the proposal, the lack of comments in JSON is extremely undesirable. And of course, the comma-requirements result in very bad DX for anyone who is editing directly and not using an autoformatting editor-extension for JSON.

Copy link
Member Author

@cretz cretz Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Vendoring is an option.

Another challenge is type-safe contracts. With JSON we have a form of type-safe contract across every SDK - protobuf. It supports nesting, documentation, merging, serialization/deserialization, unhandled field support, requires 0 new dependencies, and there's a precedent w/ our usage of it in similar ways (e.g. workflow history). With TOML, the models will have to be separately maintained in every language. Obviously totally doable, just a series of tradeoffs.

Also, I think it's worth discussing how much we expect configuration to be hand-edited. I don't think many people hand-edit AWS SDK/CLI configuration (I have admittedly been using them as a reference here because there's not a lot of SDK configurations out there). Why do we think more people will hand-edit our SDK/CLI config vs that SDK/CLI config? Are we making blind assumptions here? Also, I do know that vscode and other similar tools have no problem with hand-editing JSON (with comments), is their hand-edited configuration story unacceptable for us?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I do know that vscode and other similar tools have no problem with hand-editing JSON (with comments), is their hadn-edited configuration story unacceptable for us?

IMO JSON works as a choice for VSCode because they wrapped the parser to allow comments and tolerate trailing commas, which we could do, but that's more technical hassle than vendoring a TOML parser. (Also in VSCode they know that many users will have an autoformatter extension such as prettier installed, which makes the experience even better).

Copy link
Member Author

@cretz cretz Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's more technical hassle than vendoring a TOML parser

I am not sure this is true. Remember, vendoring includes throwing our our ability to have protobuf model. But still, I would be ok w/ JSON w/ no comments.

Copy link

@drewhoskins-temporal drewhoskins-temporal Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to avoid any new dependencies in our libraries that then forces uses to depend on them transitively and locks them into our version constraints. We have seen this issue with the libraries we do have to depend on (e.g. protobuf) where by using our libraries, we are forcing our users' apps to be bound by version constraints we have. Libraries like ours should avoid dependencies they don't need because they are transitively infectious to a user's entire application.

This a good flag--let's state that as the goal rather than the implementation choice of not taking any dependencies at all. For example, we could rely on older versions of certain technologies so we're not forcing people to upgrade. Or we could provide a fallback plan so that most users get nice things.

Also, I also if, in some languages, we can make dependencies dynamic and only loaded, for example, when a certain file is present.

+1 that the ability to comment is worth listing among our goals and deciding whether it's a goal.

Copy link
Member Author

@cretz cretz Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This a good flag--let's state that as the goal rather than the implementation choice of not taking any dependencies at all. For example, we could rely on older versions of certain technologies so we're not forcing people to upgrade. Or we could provide a fallback plan so that most users get nice things.

No, version constraints go in all directions for compatibility because a newer version may break compatibility in a major version. There is no such thing as "depending on every version ever into the future", you have to have some constraint. If we say "I depend on toml 1.x", when "toml 2.x" comes about our users come complaining to us saying we're holding them back from upgrading. And you can't say "toml x.x" because newer ones may break compat so we have to pick one. And you can't just upgrade to "toml 2.x" when released because many users don't want to upgrade. These are not hypothetical, we've seen all of these complaints and situations in practice. This is why Dan mentioned vendoring (or "shading" if you're a JVM person (kinda)...this has its own issues but still doable, granted it may not have enough value IMO).

It is important for us not to impose constraints on our users unless we absolutely have to. I suppose I can change "0 dependencies" to "0 external library version constraints imposed on our users" if the former is unacceptable as a stated goal, but they are saying the same things.

+1 that the ability to comment is worth listing among our goals and deciding whether it's a goal.

I don't think it is a goal, but that's where this discussion comes in, because if what I think are the limited goals mismatch with what other people think the goals must be, we can reach consensus/decision.


* Common format for declarative workers, schedules, etc. This is only for client options.
* File-based hierarchy. We don't need many levels of file merging for these options at this time.
* Extreme focus on human authoring of this format. This is because it'd likely go against the "0 new dependencies" rule.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the scenarios Candace highlighted is for folks who are migrating production and dev environment code to cloud. (We should align on whether that's a goal (P0/P1/P2)).

For such a use case, we might want to provide--and/or allow users to provide--extensions that allow configuration in whatever their system is, whether yaml or otherwise.
Also, if it's opt in as you say below, I don't see taking dependencies as a P0 problem.

* Common format for declarative workers, schedules, etc. This is only for client options.
* File-based hierarchy. We don't need many levels of file merging for these options at this time.
* Extreme focus on human authoring of this format. This is because it'd likely go against the "0 new dependencies" rule.
* Use external configuration by default. Users need to opt-in to this for compatibility and clarity reasons (but it is

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for the SDK, right? Our samples would use it by default?

Copy link
Member Author

@cretz cretz Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I will clarify. The samples will be written to opt-in to configuration. Also, the CLI will use external configuration by default. EDIT: Clarified.

Comment on lines +68 to +71
export TEMPORAL_ADDRESS=my-ns.a1b2c.tmprl.cloud:7233
export TEMPORAL_NAMESPACE=my-ns.a1b2c
export TEMPORAL_TLS_CLIENT_CERT_PATH=path/to/my/client.pem
export TEMPORAL_TLS_CLIENT_KEY_PATH=path/to/my/client.key

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused how these relate to "default"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could users get sniped by the default lack of isolation by using these bash commands or the commands below? (I know I personally have.) Would rather see a file whose impacts are scoped by default rather than the equivalent of global variables.

Copy link
Member Author

@cretz cretz Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused how these relate to "default"

As @josh-berry requested, I will add a section clarifying how configuration is read from environment variables and the file. In this case, TEMPORAL_ADDRESS is the env var equivalent of TEMPORAL_DEFAULT_ADDRESS which is the equivalent of setting the address for profile default which is the equivalent of providing --address on the CLI or setting it in the SDK client options. EDIT: Added section on configuration loaded.

Could users get sniped by the default lack of isolation by using these bash commands or the commands below? Would rather see a file whose impacts are scoped by default rather than the equivalent of global variables.

This is your choice as a user, and we even allow env vars to be disabled when loading in an SDK, but we can't just eschew environment variables for this use case because people may set them accidentally.


* Common format for declarative workers, schedules, etc. This is only for client options.
* File-based hierarchy. We don't need many levels of file merging for these options at this time.
* Extreme focus on human authoring of this format. This is because it'd likely go against the "0 new dependencies" rule.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, please reword this. What "extreme focus on human authoring of this format" means is in the eye of the beholder. Might be better to put something about human authoring features as a goal with a lower priority. Then we can align on degrees of importance rather than stark goal/non-goal status.

Comment on lines +93 to +96
temporal config set --profile dev --key address --value my-dev-ns.a1b2c.tmprl.cloud:7233
temporal config set --profile dev --key namespace --value my-dev-ns.a1b2c
temporal config set --profile dev --key tls.clientCertPath --value path/to/my/dev-cert.pem
temporal config set --profile dev --key tls.clientKeyPath --value path/to/my/dev-cert.key

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nicer if users weren't required to run imperative commands to do config for customers who lack such hooks. I.e. they could just declare the values statically and then the SDK can parse that.

I also see a lot of "it worked on my box" bugs with these sorts of imperative configs where somebody has already run script A and then runs script B, which works fine. Whereas another dev checks out the code and only runs script B and it doesn't work.

Copy link
Member Author

@cretz cretz Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nicer if users weren't required to run imperative commands to do config for customers who lack such hooks. I.e. they could just declare the values statically and then the SDK can parse that.

They aren't required. This is just configuration file editing, which they can do manually.

I also see a lot of "it worked on my box" bugs with these sorts of imperative configs where somebody has already run script A and then runs script B, which works fine. Whereas another dev checks out the code and only runs script B and it doesn't work.

Yes, if software that uses a configuration file works on one machine, you need the configuration file on the other machine. This is similar to how other SDKs like AWS work and just how software with configuration works (and one of the benefits of requiring opt-in on SDKs so it's clear). The key is that the configuration can change without the software changing. So "works on my box" with localhost can become "works on production box" with no code changes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the advantage of configuration files is that they simply need to be present--nothing needs to be explicitly run beforehand. Thus we eliminate a class of bugs where the script didn't get run.

Comment on lines 292 to 302
* Try to load profile from configuration file
* Only if file loading is enabled
* Can If user provides specific config file, use that, otherwise use the default location
* If user specifies the profile, use that, otherwise use the default
* Try to overwrite with specific configuration environment variables
* Only if env loading is enabled
* If user specifies the profile, use that, otherwise use the default
* If the `default` profile is being loaded, try the profile-specific environment variable first, e.g.
`TEMPORAL_DEFAULT_ADDRESS` is used over `TEMPORAL_ADDRESS` if it's present.
* The `default` env vars are only for the default profile, meaning a profile of `foo` cannot use `TEMPORAL_ADDRESS`,
only `TEMPORAL_FOO_ADDRESS` if present.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor clarity cleanups:

Suggested change
* Try to load profile from configuration file
* Only if file loading is enabled
* Can If user provides specific config file, use that, otherwise use the default location
* If user specifies the profile, use that, otherwise use the default
* Try to overwrite with specific configuration environment variables
* Only if env loading is enabled
* If user specifies the profile, use that, otherwise use the default
* If the `default` profile is being loaded, try the profile-specific environment variable first, e.g.
`TEMPORAL_DEFAULT_ADDRESS` is used over `TEMPORAL_ADDRESS` if it's present.
* The `default` env vars are only for the default profile, meaning a profile of `foo` cannot use `TEMPORAL_ADDRESS`,
only `TEMPORAL_FOO_ADDRESS` if present.
1. Try to load profile from configuration file
* Only if file loading is enabled
* If user provides specific config file, use that, otherwise use the default location
* If user specifies the profile, use that, otherwise use the `default` profile
2. Try to overwrite with specific configuration environment variables
* Only if env loading is enabled
* If user specifies the profile, use that, otherwise use the `default` profile
* If the `default` profile is being loaded, try the profile-specific environment variable first, e.g.
`TEMPORAL_DEFAULT_ADDRESS` is used over `TEMPORAL_ADDRESS` if it's present.
* The `default` env vars are only for the default profile, meaning a profile of `foo` cannot use `TEMPORAL_ADDRESS`,
only `TEMPORAL_FOO_ADDRESS` if present.
4. Applications can further override config values if they want. For example, the CLI will override config values based on flags passed into the command.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, will incorporate on my next commit

* Common on-disk location in every platform for reading/writing this format
* Ability to provide environment variables to represent values in the configuration
* Well-specified hierarchy for on-disk overridable by environment variables overridable by in-CLI/SDK values
* 0 new dependencies in SDKs

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not want optional "you can use this configuration format" extensions; I think that would be an unnecessary scope expansion and not good value for the complexity required.

As much as I hate JSON as a config format, I have to agree with Chad that avoiding new dependencies, and avoiding optional extensions for something this small, is probably the correct call so long as we continue to support people who are opinionated enough to roll their own.

* Ability to provide environment variables to represent values in the configuration
* Well-specified hierarchy for on-disk overridable by environment variables overridable by in-CLI/SDK values
* 0 new dependencies in SDKs
* Profile names within the configuration

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only value I can think of for having them is in the CLI itself, where it's convenient to be able to refer to different environments by short names rather than full config file paths. But I'm not sure if that use case needs to bleed into the SDKs.

CLI and SDKs need to share the same approach here.

Why?

The CLI has some use cases/scenarios that don't apply to the (other) SDKs, like being used for operational/admin tasks, possibly in multiple environments.

Your example below of downloading a premade config from the cloud illustrates this point—profiles don't help you here

That they don't help you there doesn't mean they don't help you anywhere. That's not a multi-profile use case. Having every profile be a different file I think is a bit rough for multi-profile use cases and sharing of multi-profile configuration. EDIT: I added a clarification on why multi-profile-single-file.

I am struggling to come up with a "multi-profile use case" that can't also easily be handled with multi-file. Can you provide an example?

Now running the same code again unmodified will run against cloud. Or in a Temporal future with API keys and a
known/fixed API endpoint:

temporal config set --key cloudApiKey --value my-api-key

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not following what you're saying here. Are you saying that setting cloudApiKey (or any other cloud-specific setting) would hypothetically auto-magically say "you're running against cloud now" and change other config values (e.g. apiKey) automatically? Because as a user, that would be extremely surprising to me—I hate it when libraries do that.

Now, having a special option like cloud: true or something, which is just a (well-documented) synonym for {tls: true, address: "cloud.temporal.io"} or whatever would make much more sense (although I personally still wouldn't use it). Just don't give me cloud and non-cloud variants of something like apiKey, because this is a distinction without a difference and will create a ton of user confusion.

#### TypeScript SDK Idea

* New `client` module function `loadConfig(options?)` where the `options` type is
`{ configFile?: string, disableFile?: bool, disableEnv?: bool }` and it returns a `proto.Config`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the TypeScript APIs and I see no disableFoo identifiers anywhere in the API docs—to the contrary, I see things like reuseV8Context which are optional and default to true. Not sure about other languages, but to align with existing conventions in TypeScript, you do need to avoid disableX nomenclature.

work in some SDKs. The overall approach is to just make client and connection options loadable from configuration
without overthinking shortcuts or merging or hierarchies.

#### Go SDK Idea

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree this is expanding scope—you have just said we already have 4/5 of the things I'm asking for but presented in a different (IMO less composable) form. And the 5th, mergeConfig(), is something we will have to implement internally anyway if we want consistent behavior when an env variable overrides a file, or when a file overrides built-in defaults. It just needs to be exposed.

all-sdk/external-client-configuration.md Outdated Show resolved Hide resolved
Comment on lines +196 to +197
* We want the tools that mutate this file to have strong validation on the keys being added, so hopefully this
avoids the concern of key typos.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the consequences of a silent failure (e.g. due to mistyped key), and are we sure that we're not going to add some key in future that might have significant consequences if missing / if a default is inadvertently used?

Depends on the key. It's hard to speculate what an older SDK not recognizing a newer config key is because we can't predict what future keys will be added here.

Right, so we can't know if an unknown key is safe to ignore or not, which makes me think that strict validation might actually be quite important. I will preemptively say that I understand we can't protect users against every misconfiguration, but protecting against a mis-typed field name, or recognizing and flagging some unknown config that might be critical, both feel like pretty common issues that can easily be surfaced/avoided.

Also, adding strict validation and relaxing it later is much easier than skipping it now and adding it later. We're much less likely to get complaints about error messages that don't happen.

Also, we should consider providing optional strict validation even if we decide not to do it by default, so users can at least discover problems in a straightforward way.

This adds surface area to loading APIs. Can you suggest the APIs for how this looks? Should we add an option to every way that a config is loaded that fails on unrecognized key just for this use case? Are we sure this feature is worth it on MVP?

Sure, for each LoadConfig() function, accept an optional strict flag that defaults to true (or DisableStrict flag that defaults to false, if you must, for languages that don't support defaulting to true). Though I think I may have talked myself into adding strict validation as part of the MVP, and removing it later if users complain, in which case no new API surface is necessary.

Comment on lines +206 to +207
* Existing CLI used `~/.config/temporalio/temporal.yaml`
* ❓ Is this confusing for them? Are we concerned about them vs what is best moving forward?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I have no opinion on what form the migration should take, merely that it should exist.


### Non-goals

* Common format for declarative workers, schedules, etc. This is only for client options.
Copy link

@dandavison dandavison Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go slowly here. In the future, if we want to support declarative configuration of other aspects of Temporal, then we will not want multiple config files. The natural design decision at this point is to say

We're currently designing for client configuration only. But the design we've chosen allows the possibility that other top-level keys will be added in the future.

In which case client config should be under its own top-level key, allowing other top-level keys in the future for all the things we're deliberately not designing or thinking about today: schedules, etc

As long as we place the client config behind a top-level client key, then we can use temporal config ... and TEMPORAL_CONFIG etc below. This is what we want.

But if we were to make the config file client-specific, then these would need to be temporal config --client or temporal client-config, and TEMPORAL_CLIENT_CONFIG, etc. There's no reason to inflict that pain upon our users and ourselves: the top-level key in the config file has no cost (especially since users are encouraged to edit the file automatically) and it has great benefits (the config file can grow to support whatever file-based config we decide to add to the product in the future).

Copy link
Member Author

@cretz cretz Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think declarative resources should be tackled separately. Having said that, we can probably add non-client things in the future. But I'm not sure we need to qualify client things. The reason this is CLI/SDK configuration and not client configuration is because you need these things to basically do anything with the CLI/SDK. From a user POV it's just "config". If we want to add things that do not apply to clients (e.g. worker configuration) that may make sense, but I still think these general options can be top level.

I think asking all users to add this client. prefix for our own personal future proofing is a bit rough. IMO it is a worse developer experience to make users use TEMPORAL_CLIENT_ADDRESS and client.address and what not over TEMPORAL_ADDRESS and address. These latter have been in use for years already. I don't think we should harm developer experience here over a perceived future that may not come about.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think declarative resources should be tackled separately. Having said that, we can probably add non-client things in the future.

The proposed file name is <app-config-dir>/temporalio/temporal.json, so this file clearly must include all temporal config. And that's a good thing! We don't want multiple files.

But I'm not sure we need to qualify client things.

In the proposed schema, all the keys related to client config are already nested below profiles, and name. There are currently 6 of those keys: address, namespace, api_key, tls, codec, grpcMeta.

But 6 is just the number in the first iteration of the proposal; it's bound to grow over time, and the result will be that the other sections we add to the file will be their own root nodes, swimming oddly among a sea of client config details. If we look at file formats such as TOML and TOML-ish things like ~/.gitconfig, there are no top-level keys.

I can only see advantage and no disadvantage in grouping the client config under its own key (e.g. clientConfig), especially since we're already two levels in. Can you list the disadvantages you see with that?

Copy link
Member Author

@cretz cretz Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will be that the other sections we add to the file will be their own root nodes, swimming oddly among a sea of client config details.

But there are no other sections today. I don't agree we should harm developer experience today for some may-never-happen unrealized future.

Can you list the disadvantages you see with that?

I mentioned this in the last paragraph above:

I think asking all users to add this client. prefix for our own personal future proofing is a bit rough. IMO it is a worse developer experience to make users use TEMPORAL_CLIENT_ADDRESS and client.address and what not over TEMPORAL_ADDRESS and address. These latter have been in use for years already. I don't think we should harm developer experience here over a perceived future that may not come about.

I don't want everyone to qualify these things as client-specific config just because there is technically the possibility of some non-client future.

Looking at a previous comment, I think we have different ideas on the purposes of this configuration. This is environment/endpoint-and-auth-like configuration only. It is not code configuration, it is not meant to be committed, it is only for things that may change from one environment to the next. There are dozens of "options" that Temporal accepts for all sorts of things, but these are not what is meant by this environment configuration. I can try to clarify this in the document.


### Future Goals

None identified at this time. The goals are simple enough for the MVP and the non-goals are ones we never want.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the file is going to be named .../temporal.toml or .../temporal.json, let's add

  • Support non-client connection configuration

to the "Future Goals" section. It's important that this be future-proof: we don't want a future where a second Temporal config file needs to be introduced.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add that, sure. Granted I can't think of any non-client configuration at this time, but that's fine.

* Ability to provide environment variables to represent values in the configuration
* Well-specified hierarchy for on-disk overridable by environment variables overridable by in-CLI/SDK values
* 0 new dependencies in SDKs
* Profile names within the configuration

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's classify the contexts in which the config file will be used into deployment contexts (whether prod or staging environments) vs development contexts.

In deployment contexts, a single file with a single profile will probably be what's wanted, with client connection secrets injected via env vars.

Regarding development contexts, consider an organization with a large monorepo containing many different projects that make use of Temporal. The different projects in the monorepo will want the ability to commit independent Temporal config files in their own subdirectories of the monorepo. In this context, the single XDG_CONFIG system config location will not be used, so projects are going to make use of a TEMPORAL_CONFIG env var or other mechanisms to specify their config file location.

So in such a codebase, multiple config files is an inevitable reality. Some questions are:

  1. Will they want upwards file-system traversal and/or hierarchical merge functionality? Perhaps, but we can add that later if we think it makes sense.

  2. Will they want multiple profiles within a single file? They're already using mechanisms to specify the location of their config file, so it's very possible that the answer is "no, they can use different files if they need that".

Open-source projects such as git and uv have chosen:

  • not to support multiple profiles in a single file
  • yes to support upwards file-system traversal to discover the applicable config file
  • yes to support multiple file locations with hierarchical merge.
  • yes to TOML

I think you're a bit too hung up on files here, files are just one way bytes are stored. Configuration is loaded from bytes.

Files are how developers look at profiles: cat /path/to/my/config. Multiple profiles within a single file are hard to read. AWS have solved that by using a configuration language that supports references, but we're not currently considering that option. Multiple profiles in JSON will be very hard to read, and it's well-known that no-one can remember how to use jq :)

this as much as possible.
* ❓ Can we just have an "extension" that has new dependencies?
* Yes, but ideally users don't have to add dependencies on extension libraries just to get this functionality.
* Profile names within the configuration

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Profile names within the configuration

This is the Goals section, but that should go in the Proposal section. The Goal should say what functional requirement we want to achieve: i.e. what experience/use cases we are aiming to enable for the user (or what bad experience we are aiming to avoid). For example "An SDK or CLI-based project can select one of multiple profiles".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is "single file configuration", I just worded it differently here

Comment on lines +68 to +69
* Because YAML and TOML and others require a dependency in our SDKs. AWS for example has a homegrown INI-ish
implementation they have had to build in to every language.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I suggested above, we should consider vendoring also, so it would be up to the SDK/CLI whether they wish to vendor or use a dependency.

To make it concrete, here's an example of what it might look like in TOML. Not supporting multiple profiles, since whether or not we want that is a separate question that we're discussing. As I've said above, I think there's a strong argument for single-profile per file.

[client]
address = "localhost:7233"
namespace = "default"
api_key = "<my-key>"
codec = { auth = "<my-headers>", endpoint = "https://my-codec-server.com" }

[client.tls]
client_cert_path = "/path/to/certs/client/cert.pem"
client_key_path = "/path/to/certs/client/key.pem"
server_ca_cert_path = "/path/to/certs/ca/cert.pem"

[client.grpc_meta]
my_grpc_header_1 = "some-value"
my_grpc_header_2 = "some-value"

* `clientKeyPath` - File path to mTLS client key. Mutually exclusive with `clientKeyData`.
* `clientKeyData` - Key data. Mutually exclusive with `clientKeyPath`.
* `serverCaCertPath` - File path to server CA cert. Mutually exclusive with `serverCaCertData`.
* `serverCaCertData` - CA cert data. Mutually exclusive with `serverCaCertPath`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If hyphens don't work well then let's consider underscores. This is of course subjective and up to preference, but it's a design decision we have to make. I don't usually see camel case in config files -- can you give examples?

Yes, this is proper JSON formatting of key names.

Not sure what you meant by "proper".

* Use external configuration by default in SDKs (it is default in CLI and samples). Users need to opt-in to this for
compatibility and clarity reasons (but it is of course easy).

### High-level Usage Examples
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add: a more obvious example of using a config file and configuration for vscode extension

@cretz
Copy link
Member Author

cretz commented Sep 5, 2024

Closing in favor of a new round

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants