TypeSpec to Java
DPG 2.0 requires TypeSpec as input if a service would like to generate models.
Part of the reason is that TypeSpec supports versioning, which is hard to support from OpenAPI, or from OpenAPI generated from TypeSpec.
Resources:
The data-plane specs in TypeSpec would be used for validation during development, until we have the first real TypeSpec for an SDK release.
AutoRest CLI currently does not support the pipeline from TypeSpec to the code generator.
TypeSpec Java is integrated as a plugin to the TypeSpec compiler.
The Java.emitter and the JAR of the code generator are packaged into a single NPM package.
- Java.emitter first communicates with the TypeSpec compiler/rest/versioning to generate a `code-model.yaml` for the code generator.
- Java.emitter then executes the JAR with the necessary information.
- The JAR of the code generator parses the `code-model.yaml` and generates Java code.
AutoRest
```mermaid
flowchart LR
    Swagger-->m4
    m4-->preprocessor
    preprocessor-->javagen
    javagen-->postprocessor
    postprocessor-->Java.SDK
    preprocessor-->androidgen
    androidgen-->Android.SDK
```
TypeSpec
```mermaid
flowchart LR
    TypeSpec-->Java.emitter-- yaml -->preprocessor
    subgraph JAR
    preprocessor-->javagen
    end
    javagen-->Java.SDK
```
Source:
The `code-model.yaml` is compatible with the output of the current Modeler Four.
It will be enhanced for TypeSpec features.
Candidates of enhancement:
- Summary on each type, operation, property
- Namespace on type (if different from the global namespace)
- Versioning information (`addedOn`, `removedOn`, `renamedFrom`, `madeOptional`)
`preprocessor` and `javagen` are packaged together in one JAR to form the code generator.
`postprocessor` is temporarily left out, but it can be included without much effort.
Logs are written to `stdout`, which is connected to Java.emitter.
Files are written directly to the file system.
```yaml
language:
  default:
    name: Confidential Ledger Service
    description: ''
    namespace: Azure.Security.ConfidentialLedger
  java:
    namespace: com.azure.security.confidentialledger
```
The code generator will do further processing, such as replacing `Azure.Core.Operation.Error` with `com.azure.core.models.ResponseError`.
Literal types (`StringLiteralType`, `NumericLiteralType`, `BooleanLiteralType`) map to Constant.
A union (`UnionType`) of literal types maps to `Enum`.
Enum (`EnumType`) maps to `ExpandableStringEnum`.
An enum with the `@fixed` decorator maps to `Enum`; without it, it maps to `ExpandableStringEnum`.
A union of `int64 | null` maps to `Long` (object), while plain `int64` maps to `long` (primitive).
This difference only applies to Java primitive data types. There is no difference for Java object data types, as they are always nullable.
Nullable could be handled differently in the Patch model for "application/merge-patch+json".
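As a rough illustration, a hypothetical generated model (class and property names are made up for this sketch) could carry the two mappings as follows:

```java
// Hypothetical generated model, for illustration only.
public final class SampleModel {
    // TypeSpec "int64 | null" -> nullable object type.
    private Long nullableCount;

    // TypeSpec "int64" -> primitive type, never null.
    private long count;

    public Long getNullableCount() {
        return nullableCount;
    }

    public long getCount() {
        return count;
    }
}
```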
`foo?: string = "bar"` maps to an optional parameter in the API or an optional property in the model.
The default value is for the service (when the parameter or property is not provided, the service takes that value); the SDK does not use it.
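For instance, a hypothetical generated model (names are illustrative) would expose the optional property without applying the default on the client side:

```java
// Hypothetical generated model, for illustration only.
public final class SampleOptions {
    // Optional property; the SDK does not apply the service-side default "bar".
    private String foo;

    public String getFoo() {
        // Returns null when unset; the service applies the default "bar" on its side.
        return foo;
    }

    public SampleOptions setFoo(String foo) {
        this.foo = foo;
        return this;
    }
}
```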
Union is supported as input.
For example, `input: string | string[]` maps to the following classes:
```java
public abstract class InputModelBase {
    protected InputModelBase()
}

@Immutable
public final class StringInputModel extends InputModelBase {
    public StringInputModel(String value)
    @JsonValue public String getValue()
}

@Immutable
public final class StringListInputModel extends InputModelBase {
    public StringListInputModel(List<String> value)
    @JsonValue public List<String> getValue()
}
```
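A hypothetical convenience call using these classes might look like this (the client and method names are made up):

```java
// Hypothetical usage, for illustration only.
SampleClient client = new SampleClientBuilder()
    .endpoint("https://sample.example.com")
    .buildClient();

// Pass a single string.
client.process(new StringInputModel("foo"));

// Or pass a list of strings.
client.process(new StringListInputModel(List.of("foo", "bar")));
```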
If a property has the `@visibility` decorator but without the input context in it, it is read-only.
If there is no parameter, the SDK uses the `url` of the `@server` as host, similar to `host` in OpenAPI.
If there are parameters, the SDK takes the parameters to populate the host (the `url` would then be a template, e.g. `https://{region}.foo.com`), similar to `x-ms-parameterized-host`.
If there is no `@server`, the SDK falls back to taking a single `{endpoint}` parameter as host.
All these parameters are treated as client parameters.
Multiple `@server` (applied to namespaces) is supported. Different servers would have to be on different clients.
Multiple api-versions map to multiple enum values in the ServiceVersion class. The last api-version is treated as the latest.
```java
public enum FooServiceVersion implements ServiceVersion {
    V2022_06_01_PREVIEW("2022-06-01-preview"),
    V2022_12_01_PREVIEW("2022-12-01-preview");

    private final String version;
    FooServiceVersion(String version) { this.version = version; }
    @Override public String getVersion() { return version; }
}
```
One can use the `service-name` emitter option to change the name of the class.
Different versions for different clients are supported as a preview feature. It would result in one ServiceVersion class per client.
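As a usage sketch, the service version could then be selected on a hypothetical client builder (the builder name is made up):

```java
// Hypothetical builder usage, for illustration only.
FooClient client = new FooClientBuilder()
    .endpoint("https://sample.example.com")
    .serviceVersion(FooServiceVersion.V2022_12_01_PREVIEW)
    .buildClient();
```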
The service is recommended to use `op ResourceList<>` from @azure-tools/typespec-azure-core.
Method signatures:

```java
PagedFlux<BinaryData> list(...)
PagedIterable<BinaryData> list(...)
```
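A hypothetical call to the sync protocol method could iterate the results like this (the client name is made up):

```java
// Hypothetical paging usage, for illustration only.
PagedIterable<BinaryData> items = client.list(new RequestOptions());
for (BinaryData item : items) {
    System.out.println(item.toString());
}
```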
```typespec
@useAuth(OAuth2Auth<[AuthFlow]> | ApiKeyAuth<ApiKeyLocation.header, "x-ms-api-key">)
namespace ...;

model AuthFlow {
  type: OAuth2FlowType.clientCredentials;
  tokenUrl: "https://api.example.com/oauth2/token";
  refreshUrl: "https://api.example.com/oauth2/refresh";
  scopes: [
    "https://api.example.com/.default"
  ]
}
```
Only OAuth2 (with scopes) and ApiKey (with header) are supported.
They produce the traits `TokenCredentialTrait` and `AzureKeyCredentialTrait` in the builder, respectively.
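These traits surface as `credential` overloads on the builder; a hypothetical usage (the builder name is made up):

```java
// Hypothetical builder usage, for illustration only.
// OAuth2, via TokenCredentialTrait.
FooClient oauthClient = new FooClientBuilder()
    .endpoint("https://api.example.com")
    .credential(new DefaultAzureCredentialBuilder().build())
    .buildClient();

// API key in the "x-ms-api-key" header, via AzureKeyCredentialTrait.
FooClient keyClient = new FooClientBuilder()
    .endpoint("https://api.example.com")
    .credential(new AzureKeyCredential("<api-key>"))
    .buildClient();
```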
A PUT method is usually defined as `ResourceCreateOrReplace<>`, for example:

```typespec
op createOrUpdate is ResourceCreateOrReplace<Project>;
```

The model of the request body is `ResourceCreateOrReplaceModel<TResource>`, which passes through multiple templates/decorators.
Hence, its definition is no longer the same as `TResource`.
The SDK is still required to have the same model for the request body and the response body, for example:

```java
Project createOrUpdate(String projectName, Project project);
```
In design.
Convenience API is not generated for JSON Merge Patch.
The service is recommended to use `op LongRunningResourceCreateOrReplace<>` etc. from @azure-tools/typespec-azure-core.
At present, the emitter recognizes the `@pollingOperation` decorator on the operation (for now, also the `@pollingLocation` decorator on response headers).
Method signatures:

```java
PollerFlux<BinaryData, BinaryData> beginCreateOrUpdate(...)
SyncPoller<BinaryData, BinaryData> beginCreateOrUpdate(...)
```
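A hypothetical usage of the sync protocol method (the client name and the request payload are made up):

```java
// Hypothetical LRO usage, for illustration only.
SyncPoller<BinaryData, BinaryData> poller =
    client.beginCreateOrUpdate("project1", BinaryData.fromString("{}"), new RequestOptions());
poller.waitForCompletion();
BinaryData finalResult = poller.getFinalResult();
```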
The convenience API takes the response type of the `@pollingOperation` API as the poll response type, and the response type of the `@finalOperation` API as the final result type.
If there is no `@finalOperation`, it deduces the final result type as the response type of this LRO API (actually the activation API), which could be incorrect.
The SDK uses exception classes from azure-core, e.g. `HttpResponseException`, `ClientAuthenticationException`, `ResourceNotFoundException`, `ResourceModifiedException`.
TypeSpec is not yet able to specify whether a particular status code is expected or not. Therefore, at present, any status code equal to or larger than 400 is treated as unexpected.
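For example, a call could handle these exceptions as follows (the client call is hypothetical):

```java
// Hypothetical error handling, for illustration only.
try {
    client.getProject("project1", new RequestOptions());
} catch (ResourceNotFoundException e) {
    // 404 from the service.
    System.err.println("Project does not exist: " + e.getMessage());
} catch (HttpResponseException e) {
    // Any other unexpected status code (>= 400).
    System.err.println("Service error, status code: " + e.getResponse().getStatusCode());
}
```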
The service uses the decorator `@convenientAPI` from @azure-tools/typespec-client-generator-core.
The operation would then have:
```yaml
convenienceApi:
  language:
    default:
      name: <convenience-api-name>
```
And all related models (object and enum) would be annotated with usage:

```yaml
usage:
  - convenience-api
```
And only those models having `convenience-api` in `usage` would be generated as Java files.
The model used as the response body of a pageable operation is generated in the `implementation/models` package, as the class does not need to be accessed by the user.
Options for cadl-java can be specified in `tspconfig.yaml`.
For instance:
```yaml
emit:
  - "@azure-tools/cadl-java"
options:
  "@azure-tools/cadl-java":
    emitter-output-dir: "{project-root}/azure-ai-language-authoring"
    namespace: "com.azure.ai.language.authoring"
    service-name: "Authoring"
    partial-update: false
    service-versions:
      - "2022-05-15-preview"
    namer: false
    generate-samples: true
    generate-tests: true
```
A few dev options are reserved for developers:

```yaml
dev-options:
  generate-code-model: true
```
The service uses the decorators `@client` and `@operationGroup` from @azure-tools/typespec-client-generator-core.
As the CADL compiler is Node.js and the code generator is Java, some kind of IPC is required.
Candidates (brainstorm):
- IPC supported by the CADL package
- A daemon service for IPC (e.g. Codegen calls `getAllRoutes` via a REST API, the daemon makes the same call to the CADL compiler, then sends the response to Codegen as JSON)
- Compile both to binary, e.g. WebAssembly or GraalVM
- Java runs a JavaScript engine, e.g. J2V8
A standard flow without much advanced tech stack would be the following (which is what Python does):

```mermaid
flowchart LR
    CADL.compiler-->Java.emitter-- yaml -->Codegen-->Java
```

The yaml is the intermediate data for communication between the CADL compiler and the code generator.
It is in the format of the internal ClientModel of the code generator.
The `Java.emitter` is a TypeScript library that interacts with the CADL compiler and outputs the yaml.
Sample:
Design and improvements (brainstorm):
- Limit the amount of Java.emitter code that is in TypeScript, as we are Java developers. But it might still end up covering what we had in the preprocess module and the mapper package of the javagen module.
- Should we use YAML or JSON? The difference is that snakeyaml in Java is not easy to use, but YAML supports anchors and references natively.
- Should we directly aim for ClientModel, or for a data format more aligned with the CodeModel from Modeler Four?
- Should we generate only the essential part of the ClientModel, and let the code generator fill in the rest? E.g. only include `ProxyMethod` in the YAML and generate `ClientMethod` from it; only include `ServiceClient` in the YAML and generate `ClientBuilder` from it.
- One difficulty is that the classes instantiated by snakeyaml are not compatible with the existing Builder pattern. In the PoC the workaround is many additional setter methods (see the sketch after this list).
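A minimal sketch of that workaround, assuming a hypothetical ClientModel class: the Builder-based class gains a no-arg constructor and setters so that snakeyaml can instantiate and populate it.

```java
// Hypothetical ClientModel class, illustrating the snakeyaml workaround only.
public class SampleClientMethod {
    private String name;
    private String description;

    // Existing builder-style construction used by the code generator.
    public static class Builder {
        private final SampleClientMethod model = new SampleClientMethod();

        public Builder name(String name) {
            model.name = name;
            return this;
        }

        public Builder description(String description) {
            model.description = description;
            return this;
        }

        public SampleClientMethod build() {
            return model;
        }
    }

    // Additional no-arg constructor and setters, so that snakeyaml can
    // instantiate the object and populate it from YAML.
    public SampleClientMethod() {
    }

    public void setName(String name) {
        this.name = name;
    }

    public void setDescription(String description) {
        this.description = description;
    }
}
```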
Current state:
- Builder pattern (and immutability of basic ClientModel objects) is a major source of incompatibility with YAML.
- Singleton pattern (e.g. a single `ClassType.UserDefinedModel` as the `IType` for a single model) and multiple references (e.g. `ProxyMethod` referenced from both `Proxy` and `ClientMethod`) are a major source of incompatibility with JSON, which does not support anchors and references (see `*ref_` in the YAML).
- Duplication (e.g. lots of `ClientMethod` objects for a single request in an operation) is a manageable issue.
- Some code in Mapper would need to be rewritten in TypeScript, or in Java but based on ClientModel.
The CodeModel from Modeler Four is much easier to analyze and manipulate than what we have now in ClientModel. For example, management-plane does lots of analysis and modification based on CodeModel.
On the contrary, ClientModel has more duplication in its data representation. E.g. data about a model could be in `IType`, `ClientModel`, and maybe in other classes that hold a reference to the model.
A few DPG features, like selectively generating models for operations, would require analyzing the operation and the models used in its parameters and response, and then the hierarchy/references of the models. We might either put the logic in TypeScript, or make ClientModel easy to analyze.
Another direction to explore, with the standard flow, is to let `Java.emitter` output a simplified version of CodeModel.
One advantage in development is that this almost completely de-couples work on TypeScript and work on Java. Work on TS would focus on generating a correct CodeModel from CADL. And work on Java would focus on consuming data from existing swagger for down-stream development, and later switch to CADL when the emitter is completed and tested.
In the long term, a language-agnostic, domain-specific data format (such as the CodeModel and its evolution) helps developers think about what essential information needs to pass from CADL to the code generator, not what the Java code needs.
For example, when encountering the `removedOn` decorator, we might be tempted to jump in and think about method overloads or model deserialization in Java.
An apparent drawback is that CADL is already language-agnostic, and there is no better representation than CADL itself. However, if we need data exchange between TypeScript and Java, CodeModel might still be the right compromise between what we are familiar with (and what is known to work) and what is optimal (as we cannot output CADL itself).
Another drawback is that having another abstraction layer could have some cost on the speed of design and implementation. If we need to support a new feature from CADL, we have to think about how to represent it in a language-agnostic way, and how to transform it into the ClientModel that is best for Java code.
Another thought is to make the Java.emitter a daemon providing RPC for Codegen (this one is likely not going to fit in the Aug schedule).
When the code generator calls `getAllRoutes` (which may route to `/localhost/getAllRoutes`), the emitter in TypeScript would in turn call `getAllRoutes` from `@cadl-lang/rest` and reply with the response as JSON.
This way, the code generator works almost directly with the CADL compiler, and the JSON in the response serves as the intermediate data. There is no need for any other language-agnostic, domain-specific data format.
There is a lot to verify on this approach.
- Is the response of every CADL API representable in JSON?
- How does the code generator handle the raw JSON? Do we still use a model in Java to deserialize it? Does it affect the feasibility of evolution if CADL decides to change the response data?
Current state:
- Response of `getAllRoutes` cannot be serialized to JSON, due to circular references.