TypeSpec to Java
DPG 2.0 requires TypeSpec as input, if the service would like to generate models.
Part of the reason is that TypeSpec supports versioning, which is hard to support from OpenAPI, or from OpenAPI generated from TypeSpec.
Resources:
The Data-plane in TypeSpec would be used for validation during development, until we have knowledge of the first real TypeSpec for an SDK release.
AutoRest CLI currently does not support the pipeline from TypeSpec to code generator.
TypeSpec Java is integrated as a plugin to the TypeSpec compiler.
The Java.emitter and the JAR of the code generator are packed into a single NPM package.
- Java.emitter first communicates with TypeSpec compiler/rest/versioning, to generate a code-model.yaml for the code generator.
- Java.emitter then executes the JAR, with the necessary information.
- The JAR of the code generator parses the code-model.yaml and generates Java code.
AutoRest
flowchart LR
Swagger-->m4
m4-->preprocessor
preprocessor-->javagen
javagen-->postprocessor
postprocessor-->Java.SDK
preprocessor-->androidgen
androidgen-->Android.SDK
TypeSpec
flowchart LR
TypeSpec-->Java.emitter-- yaml -->preprocessor
subgraph JAR
preprocessor-->javagen
end
javagen-->Java.SDK
Source:
The code-model.yaml is compatible with the output of the current Modeler Four.
It will be enhanced for TypeSpec features.
Candidates for enhancement:
- Summary on each type, operation, property
- Namespace on type (if different from the global namespace)
- Versioning information (addedOn, removedOn, renamedFrom, madeOptional)
preprocessor and javagen are packaged together in one JAR to form the code generator.
postprocessor is temporarily left out, but it can be included without much effort.
Logs are written to stdout, which is connected to Java.emitter.
Files are directly written to the file system.
language:
  default:
    name: Confidential Ledger Service
    description: ''
    namespace: Azure.Security.ConfidentialLedger
  java:
    namespace: com.azure.security.confidentialledger

The code generator will do further processing, like replacing Azure.Core.Operation.Error with com.azure.core.models.ResponseError.
Literal types (StringLiteralType, NumericLiteralType, BooleanLiteralType) map to Constant.
Union (UnionType) of literal types maps to Enum; it maps to ExpandableStringEnum.
Enum (EnumType) maps to ExpandableStringEnum.
Enum with the @fixed decorator maps to Enum.
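For illustration, a minimal sketch of such an expandable enum (WidgetColor is a hypothetical name), following the azure-core ExpandableStringEnum pattern:

import com.azure.core.util.ExpandableStringEnum;
import java.util.Collection;

public final class WidgetColor extends ExpandableStringEnum<WidgetColor> {
    // Known values from the TypeSpec definition.
    public static final WidgetColor RED = fromString("red");
    public static final WidgetColor BLUE = fromString("blue");

    // Creates or finds a WidgetColor from its string representation; unknown values are allowed.
    public static WidgetColor fromString(String name) {
        return fromString(name, WidgetColor.class);
    }

    // Gets the known WidgetColor values.
    public static Collection<WidgetColor> values() {
        return values(WidgetColor.class);
    }
}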
Union of int64 | null maps to Long (object), while a plain int64 in the model maps to long (primitive).
This difference only applies to Java primitive data types. There is no difference for Java object data types, as they are always nullable.
Nullable could be handled differently in the Patch model for "application/merge-patch+json".
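A minimal sketch of the resulting Java surface, assuming a hypothetical model Sample with both kinds of properties:

public final class Sample {
    // TypeSpec "count: int64" maps to primitive long (never null).
    private long count;
    // TypeSpec "size: int64 | null" maps to boxed Long (nullable).
    private Long size;

    public long getCount() { return this.count; }
    public Sample setCount(long count) { this.count = count; return this; }

    public Long getSize() { return this.size; }
    public Sample setSize(Long size) { this.size = size; return this; }
}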
foo?: string = "bar" maps to an optional parameter in the API or an optional property in the model.
The default value is for the service (when the parameter or property is not provided, the service takes that value); the SDK does not use it.
Union is supported as input.
input: string | string[] maps to the classes:
public abstract class InputModelBase {
protected InputModelBase()
}
@Immutable
public final class StringInputModel extends InputModelBase {
public StringInputModel(String value)
@JsonValue public String getValue()
}
@Immutable
public final class StringListInputModel extends InputModelBase {
public StringListInputModel(List<String> value)
@JsonValue public List<String> getValue()
}

If a property has the @visibility decorator but without the input context in it, the property is read-only.
If there is no parameter, the SDK uses the url of @server as the host, similar to host in OpenAPI.
If there are parameters, the SDK takes the parameters to populate the host (the url would then be a template, e.g. https://{region}.foo.com), similar to x-ms-parameterized-host.
If there is no @server, the SDK falls back to a single {endpoint} parameter as the host.
All these parameters are treated as client parameters.
Multiple @server (on different namespaces) is supported. Different servers would have to be on different clients.
Multiple api-versions map to multiple enum values in the ServiceVersion class. The last api-version is treated as the latest.
public enum FooServiceVersion implements ServiceVersion {
V2022_06_01_PREVIEW("2022-06-01-preview"),
V2022_12_01_PREVIEW("2022-12-01-preview");
}

One can use the service-name emitter option to change the name of the class.
Different versions for different clients are supported as a preview feature. This results in one ServiceVersion class per client.
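A hedged usage sketch (FooClient and FooClientBuilder are hypothetical generated names), showing how a caller pins an api-version through the builder:

FooClient client = new FooClientBuilder()
    .endpoint("https://example.azure.com")
    // Pin a specific api-version; when omitted, the latest service version is used.
    .serviceVersion(FooServiceVersion.V2022_06_01_PREVIEW)
    .buildClient();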
The service is recommended to use op ResourceList<> from @azure-tools/typespec-azure-core.
Method signature:
PagedFlux<BinaryData> list(...)
PagedIterable<BinaryData> list(...)

@useAuth(OAuth2Auth<[AuthFlow]> | ApiKeyAuth<ApiKeyLocation.header, "x-ms-api-key">)
namespace ...;
model AuthFlow {
type: OAuth2FlowType.clientCredentials;
tokenUrl: "https://api.example.com/oauth2/token";
refreshUrl: "https://api.example.com/oauth2/refresh";
scopes: [
"https://api.example.com/.default"
]
}

Only OAuth2 (with scopes) and ApiKey (with header) are supported.
They produce the traits TokenCredentialTrait and AzureKeyCredentialTrait in the builder, respectively.
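A hedged sketch of the resulting builder surface (FooClientBuilder is a hypothetical generated builder; the credential types are from azure-core and azure-identity):

import com.azure.core.credential.AzureKeyCredential;
import com.azure.identity.DefaultAzureCredentialBuilder;

// TokenCredentialTrait: OAuth2 with the scope from AuthFlow.
FooClient oauthClient = new FooClientBuilder()
    .endpoint("https://api.example.com")
    .credential(new DefaultAzureCredentialBuilder().build())
    .buildClient();

// AzureKeyCredentialTrait: the key is sent in the "x-ms-api-key" header.
FooClient keyClient = new FooClientBuilder()
    .endpoint("https://api.example.com")
    .credential(new AzureKeyCredential("<api-key>"))
    .buildClient();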
PUT method is usually defined as ResourceCreateOrReplace<>, for example:
op createOrUpdate is ResourceCreateOrReplace<Project>;

The model of the request body is ResourceCreateOrReplaceModel<TResource>, which passes through multiple templates/decorators.
Hence, its definition is no longer the same as TResource.
The SDK is still required to use the same model for the request body and the response body, for example:
Project createOrUpdate(String projectName, Project project);

In design. Convenience API is not generated for JSON Merge Patch.
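As a hedged sketch of the current alternative (the update method name is hypothetical), callers would use the protocol method and pass the merge-patch document as raw JSON:

import com.azure.core.http.rest.RequestOptions;
import com.azure.core.http.rest.Response;
import com.azure.core.util.BinaryData;

// Clear "description" via merge-patch semantics by sending an explicit null.
BinaryData patch = BinaryData.fromString("{\"description\": null}");
Response<BinaryData> response = client.updateWithResponse("project1", patch, new RequestOptions());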
The service is recommended to use op LongRunningResourceCreateOrReplace<> etc. from @azure-tools/typespec-azure-core.
At present, the emitter recognizes the @pollingOperation decorator on the operation (for now, also the @pollingLocation decorator in response headers).
Method signature:
PollerFlux<BinaryData, BinaryData> beginCreateOrUpdate(...)
SyncPoller<BinaryData, BinaryData> beginCreateOrUpdate(...)

The convenience API takes the response type of the @pollingOperation API as the poll response type, and the response type of the @finalOperation API as the final result type.
If there is no @finalOperation, it deduces the final result type from the response type of this LRO API (actually the activation API), which could be incorrect.
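A hedged usage sketch of the sync signature above (the client, body, and argument names are hypothetical):

// Start the long-running operation and block until it reaches a terminal state.
SyncPoller<BinaryData, BinaryData> poller = client.beginCreateOrUpdate("project1", body, new RequestOptions());
PollResponse<BinaryData> terminalResponse = poller.waitForCompletion();
// Final result; its type comes from @finalOperation (or is deduced, as described above).
BinaryData result = poller.getFinalResult();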
The SDK uses exception classes from azure-core, e.g. HttpResponseException, ClientAuthenticationException, ResourceNotFoundException, ResourceModifiedException.
TypeSpec is not yet able to specify whether a particular status code is expected or not. Therefore, at present, any status code equal to or larger than 400 is treated as unexpected.
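A hedged sketch of handling these exceptions (getWidgetWithResponse is a hypothetical operation; the exception types come from com.azure.core.exception):

try {
    client.getWidgetWithResponse("id1", new RequestOptions());
} catch (ResourceNotFoundException e) {
    // Mapped from status code 404.
    System.err.println("Not found: " + e.getResponse().getStatusCode());
} catch (HttpResponseException e) {
    // Any other status code of 400 or above.
    System.err.println("Service error: " + e.getResponse().getStatusCode());
}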
The service uses the @convenientAPI decorator from @azure-tools/typespec-client-generator-core.
The operation would then have:
convenienceApi:
  language:
    default:
      name: <convenience-api-name>

All related models (object and enum) would be annotated with usage:
usage:
  - convenience-api

Only those models having convenience-api in usage would be generated as Java files.
A model used as the response body of a pageable operation is generated in the implementation/models package, as the class does not need to be accessed by the user.
Options to the cadl-java can be specified in tspconfig.yaml.
For instance:
emit:
  - "@azure-tools/cadl-java"
options:
  "@azure-tools/cadl-java":
    emitter-output-dir: "{project-root}/azure-ai-language-authoring"
    namespace: "com.azure.ai.language.authoring"
    service-name: "Authoring"
    partial-update: false
    service-versions:
      - "2022-05-15-preview"
    namer: false
    generate-samples: true
    generate-tests: true

A few dev options are reserved for developers:
dev-options:
  generate-code-model: true

The service uses the decorators @client and @operationGroup from @azure-tools/typespec-client-generator-core.
As the CADL compiler is Node.js, and the code generator is Java, some kind of IPC is required.
Candidates (brainstorm):
- IPC supported by the CADL package
- A daemon service for IPC (e.g. Codegen calls getAllRoutes via REST API, the daemon makes the same call to the CADL compiler, then sends the response back to Codegen as JSON)
- Compile both to binary, e.g. WebAssembly or GraalVM
- Java runs a JavaScript engine, e.g. J2V8
A standard flow without much advanced tech stack would be (which is what Python does):
flowchart LR
CADL.compiler-->Java.emitter-- yaml -->Codegen-->Java
The yaml is the intermediate data for communication between CADL compiler and code generator.
It is in the format of internal ClientModel of the code generator.
The Java.emitter is a TypeScript library that interacts with the CADL compiler and outputs the yaml.
Sample:
Design and improvements (brainstorm):
- Limit the amount of Java.emitter code that is in TypeScript, as we are Java developers. But it might still end up covering what we had in the preprocess module and the mapper package in the javagen module.
- Should we use YAML or JSON? The difference is that snakeyaml in Java is not easy to use, but YAML supports anchor and reference natively.
- Should we directly aim for ClientModel, or some data format more aligned with the CodeModel from Modeler Four?
- Should we generate only the essential part of the ClientModel, and let the code generator fill in the rest? E.g. only include ProxyMethod in the YAML, and get ClientMethod generated from it; only include ServiceClient in the YAML, and get ClientBuilder generated from it.
- One difficulty is that the classes initialized by snakeyaml are not compatible with the existing Builder pattern. In the PoC the workaround is many additional setter methods.
Current state:
- Builder pattern (and immutability of basic ClientModel objects) is a major source of incompatibility with YAML.
- Singleton pattern (e.g. a single ClassType.UserDefinedModel as the IType for a single model) and multiple references (e.g. ProxyMethod referenced from both Proxy and ClientMethod) are a major source of incompatibility with JSON, which does not support anchors and references (see *ref_ in the YAML).
- Duplication (e.g. lots of ClientMethod for a single request in an operation) is a manageable issue.
- Some code in Mapper would need to be re-written in TypeScript, or in Java but based on ClientModel.
The CodeModel from Modeler Four is much easier to analyze and manipulate than what we have now in ClientModel. For example, management-plane does lots of analysis and modification based on CodeModel.
On the contrary, ClientModel has more duplication in its data representation. E.g. data about a model could be in IType, ClientModel, and maybe in other classes that hold a reference to the model.
A few DPG features, like selectively generating models for operations, would require analyzing the operation and the models used in its parameters and response, and then the hierarchy/references of the models. We might either put the logic in TypeScript, or make ClientModel easy to analyze.
Another direction to explore, with standard flow, is to let Java.emitter output a simplified version of CodeModel.
One advantage in development is that this almost completely de-couples work on TypeScript and work on Java. Work on TS would focus on generating a correct CodeModel from CADL. And work on Java would focus on consuming data from existing swagger for down-stream development, and later switch to CADL when the emitter is completed and tested.
In the long term, a language-agnostic domain-specific data format (such as the CodeModel and its evolution) helps developers think about what essential information we need to pass from CADL to the code generator, not what the Java code needs.
For example, when encountering the removedOn decorator, we might be tempted to jump in and think about method overloads or model de-serialization in Java.
An apparent drawback is that CADL is already language-agnostic, and there is no better representation than CADL itself. However, if we need data exchange between TypeScript and Java, CodeModel might still be the right compromise between what we are familiar with (and known to work), and what is optimal (as we cannot output CADL itself).
Another drawback is that having another abstraction layer could have some cost on the speed of design and implementation. If we need to support a new feature from CADL, we have to think about how to represent it in a language-agnostic way, and how to transform it into the ClientModel that is best for Java code.
Another thought is to make the Java.emitter a daemon providing RPC for Codegen. (This one is likely not going to fit in the Aug schedule.)
When the code generator calls getAllRoutes (which may route to /localhost/getAllRoutes), the emitter in TypeScript would in turn call getAllRoutes from @cadl-lang/rest and reply with the response as JSON.
This way, code generator almost directly works with CADL compiler, and the JSON in the response serves as intermediate data. There is no need to have any other language-agnostic domain specific data format.
There is a lot to verify on this approach.
- Is the response of every CADL API representable in JSON?
- How does the code generator handle the raw JSON? Do we still use a model in Java to de-serialize it? Does it affect the feasibility of evolution if CADL decides to change the response data?
Current state:
- The response of getAllRoutes cannot be serialized to JSON, due to circular references.