While using the Kotlin Jupyter Kernel, I found that there is no 3D plotting library, so data can only be plotted with JS. The JS has to be written through the HTML(...) function, which is very inconvenient. So I wrote the kotlin-jupyter-js plugin to support %js line magics. The core problem for the kotlin-jupyter-js plugin is compiling JS code into an AST on the JVM.
My idea is to implement a JVM binding for SWC to solve this problem. SWC itself already provides a Node binding, so a JVM binding is not that difficult to implement. Moreover, SWC supports TS/JSX compilation, which allows kotlin-jupyter-js to support TypeScript and React.
The SWC JVM binding implementation is divided into two parts: 1) compiling the SWC Rust code into a JNI dynamic library, and 2) the JVM side, which implements the configuration classes and AST classes.
SWC targets the JS ecosystem and its native binding is written for Node only, so the JVM binding has to be modeled on the Node binding.
The API exposed by the SWC Node binding takes and returns JSON strings. In Node a JSON string converts easily to an object, but on the JVM the corresponding classes have to be declared.
SWC also provides a WASM binding, so SWC could be wrapped through WASM instead. The advantage is that no JNI binding is needed, but it would require bringing in a WASM runtime, so this option is not considered.
Compiling Rust into a JNI dynamic library requires JNI FFI support on the Rust side, which the jni crate provides.
This crate offers an easy way to bridge Rust and Java; see the official jni example.
The JVM side of the code:
class HelloWorld {
    init {
        System.loadLibrary("mylib")
    }
    external fun hello(input: String): String
}
In Rust code it's just a matter of writing the glue code.
#[no_mangle]
pub extern "system" fn Java_HelloWorld_hello<'local>(
    mut env: JNIEnv<'local>,
    class: JClass<'local>,
    input: JString<'local>,
) -> jstring {
    let input: String = env
        .get_string(&input)
        .expect("Couldn't get java string!")
        .into();
    // your business logic
    let output = env
        .new_string(format!("Hello, {}!", input))
        .expect("Couldn't create java string!");
    output.into_raw()
}
Calling HelloWorld().hello("JNI") goes through JNI into the Rust code and returns Hello, JNI!.
The bridge function declaration in the Rust code above is quite long. The jni_fn macro can generate the declaration automatically and keep it short.
#[jni_fn("HelloWorld")]
pub fn hello<'local>(...) -> jstring
With jni and jni_fn we can compile Rust code into JNI dynamic libraries.
SWC Node binding offers the following methods.
- transform
  - transform
  - transformSync
  - transformFile
  - transformFileSync
- parse
  - parse
  - parseSync
  - parseFile
  - parseFileSync
- minify
  - minify
  - minifySync
- print
  - print
  - printSync
The SWC Node binding provides both synchronous and asynchronous methods via napi. However, jni, the FFI used on the JVM side, does not support asynchronous calls, so we only implement the synchronous APIs: transformSync, transformFileSync, parseSync, parseFileSync, minifySync, printSync.
Below, parse_sync is used as an example to explain the implementation.
SWC itself only considers the Node binding. swc_core implements the logic for binding to Node and aggregates the other SWC sub-packages as dependencies. The npm package @swc/core also wraps swc_core. We cannot use the swc_core library directly; calls into it have to be replaced with calls to the individual SWC sub-packages.
For example, Compiler imported from swc_core:
use swc_core::{
base::{
Compiler,
},
}
needs to be imported from swc instead:
use swc::Compiler;
All SWC-related dependencies after replacing swc_core:
[dependencies]
# ...
swc = "0.270.25"
swc_common = "0.33.9"
swc_ecma_ast = { version ="0.110.10", features = ["serde-impl"] }
swc_ecma_transforms = "0.227.19"
swc_ecma_transforms_base = "0.135.11"
swc_ecma_visit = "0.96.10"
swc_ecma_codegen = "0.146.39"
# ...
In theory, what needs to be done is simple: replace all napi-related logic with jni. How SWC implements the actual functionality does not need to change.
For the parse_sync implementation, see SWC's binding_core_node: [binding_core_node/src/parse.rs#L168](https://github.com/swc-project/swc/blob/828190c035d61e6521280e2260c511bc02b81327/bindings/binding_core_node/src/parse.rs#L168). parseSync copies most of that logic directly, but the handling of incoming and outgoing parameters has to change.
The parse_sync implementation in binding_core_node:
#[napi]
pub fn parse_sync(src: String, opts: Buffer, filename: Option<String>) -> napi::Result<String> {
// ...
Ok(serde_json::to_string(&program)?)
}
The signature and the handling of input and output parameters need to change:
#[jni_fn("dev.yidafu.swc.SwcNative")]
pub fn parseSync(mut env: JNIEnv, _: JClass, code: JString, options: JString, filename: JString) -> jstring {
// process parameter
let src: String = env
.get_string(&code)
.expect("Couldn't get java string!")
.into();
let opts: String = env
.get_string(&options)
.expect("Couldn't get java string!")
.into();
let filename: String = env
.get_string(&filename)
.expect("Couldn't get java string!")
.into();
// ...
// process return value
let output = env
.new_string(ast_json)
.expect("Couldn't create java string!");
output.into_raw()
}
Reading a string passed in from the JVM requires calling get_string on JNIEnv.
Returning a Rust string to Java likewise requires calling new_string on JNIEnv before converting the result to the jstring type.
If SWC fails to process the JS code (for example, the code has a syntax error), an exception has to be thrown to the JVM and handled on the JVM side.
The error raised in Rust is first caught and then converted into an exception that is thrown on the JVM.
binding_core_node implements the MapErr<T> trait for Result, which converts the Rust error into a napi error via the convert_err method, and the error is finally thrown in Node.
Exception handling in SWC: [swc/bindings/binding_core_node/src/parse.rs#L179](https://github.com/swc-project/swc/blob/828190c035d61e6521280e2260c511bc02b81327/bindings/binding_core_node/src/parse.rs#L179)
let program = try_with(c.cm.clone(), false, ErrorFormat::Normal, |handler| {
// ....
}).convert_err()?;
We need to throw JVM exceptions instead, so we implement our own MapErr<T> trait that turns Rust errors into errors that jni can throw to the JVM.
First, copy SWC's MapErr<T> trait:
pub trait MapErr<T>: Into<Result<T, anyhow::Error>> {
    fn convert_err(self) -> SwcResult<T> {
        self.into().map_err(|err| SwcException::SwcAnyException {
            msg: format!("{:?}", err),
        })
    }
}
Then implement MapErr<T> for Result:
impl<T> MapErr<T> for Result<T, anyhow::Error> {}
When jni throws an exception, note that the native function still has to return a value, usually an empty string. jni-rs#76 explains why:
You still have to return to the JVM, even if you've thrown an exception. Remember that unwinding across the ffi boundary is always undefined behavior, so any panics need to be caught and recovered from in your extern functions.
The final exception handling looks like this:
let result = try_with(c.cm.clone(), false, ErrorFormat::Normal, |handler| {
    // ...
}).convert_err();
match result {
    Ok(program) => {
        // ...
    }
    Err(e) => {
        match e {
            SwcException::SwcAnyException { msg } => {
                env.throw(msg).unwrap();
            }
        }
        return JString::default().into_raw();
    }
}
With the Rust side compiled into a dynamic library, the next step is the glue code on the JVM side. Below is the Kotlin implementation.
class SwcNative {
    init {
        System.loadLibrary("swc_jni")
    }
    @Throws(RuntimeException::class)
    external fun parseSync(code: String, options: String, filename: String?): String
}
When the JVM loads swc_jni, it only searches for the dynamic library on the filesystem, not in the resources directory of the jar. So System.loadLibrary("swc_jni") fails if no swc_jni dynamic library is installed locally, and a user who installs the package from Maven will certainly not have one.
The solution follows the answer Load Native Library from Class path: if System.loadLibrary("swc_jni") fails, copy the dynamic library out of the jar into a temporary directory and load it from there.
init {
    try {
        System.loadLibrary("swc_jni")
    } catch (e: UnsatisfiedLinkError) {
        // Loading failed: copy the dynamic library to a temporary directory
        val dllPath = DllLoader.copyDll2Temp("swc_jni")
        // Load again from the copied path
        System.load(dllPath)
    }
}
The other methods are implemented in the same way as parse_sync.
At this point we can compile JS on the JVM.
SwcNative().parseSync(
    "var foo = 'bar'",
    """{"syntax": "ecmascript"}""",
    "test.js",
)
Output string:
{
"type": "Module",
"span": {
"start": 0,
"end": 15,
"ctxt": 0
},
"body": [
{
"type": "VariableDeclaration",
"span": {
"start": 0,
"end": 15,
"ctxt": 0
},
"kind": "var",
"declare": false,
"declarations": [
{
"type": "VariableDeclarator",
"span": {
"start": 4,
"end": 15,
"ctxt": 0
},
"id": {
"type": "Identifier",
"span": {
"start": 4,
"end": 7,
"ctxt": 2
},
"value": "foo",
"optional": false,
"typeAnnotation": null
},
"init": {
"type": "StringLiteral",
"span": {
"start": 10,
"end": 15,
"ctxt": 0
},
"value": "bar",
"raw": "'bar'"
},
"definite": false
}
]
}
],
"interpreter": null
}
We now have the AST as a JSON string, but a string is still inconvenient for manipulating the AST. The JSON needs to be converted into class instances so that traversing and modifying the tree becomes easy.
Also, the second parameter of parseSync, options, is an untyped string and should be constrained to a configuration type.
So how do we describe the types of SWC ASTs and configuration parameters in Kotlin?
I tried converting Rust to Kotlin with AI and it works pretty well. The only problem is that it costs money, and I admit that being short of money is my own problem.
Write the SWC class definitions by hand? That would be a lot of work: SWC has 200+ AST and configuration types.
The best solution is to generate the Kotlin classes with a script. As it happens, SWC provides the TS declaration file @swc/types.
When you open the declaration file of @swc/types, it is full of type and interface declarations with a very simple structure.
They can be divided into the following cases.
- type alias
  - literal union type: type T = 'foo' | 'bar'
  - primitive union type: type T = string | number
  - type alias intersected with an object literal type: type T = S & { foo: string }
  - type alias union type: type T = S | E
- interface
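To make the cases concrete, here is a minimal sketch of a classifier for the right-hand side of a type alias. It works on source text with simple string checks, which is an assumption made for brevity; a real generator would more likely inspect the TS AST. The names AliasKind and classifyAlias are hypothetical and not taken from the swc-binding source.

```typescript
// Hypothetical helper: classifies the right-hand side of `type T = ...`
// (given as source text) into the type alias cases listed above.
type AliasKind =
  | 'literal-union'
  | 'primitive-union'
  | 'intersection'
  | 'alias-union'
  | 'other';

const PRIMITIVE_NAMES = ['string', 'number', 'boolean'];

function isStringLiteralType(part: string): boolean {
  return /^'[^']*'$/.test(part) || /^"[^"]*"$/.test(part);
}

function classifyAlias(rhs: string): AliasKind {
  // Intersections are checked first: `S & { foo: string }`.
  if (rhs.indexOf('&') !== -1) return 'intersection';
  // Split the union and drop the empty entry produced by a leading `|`.
  const parts = rhs
    .split('|')
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  if (parts.length < 2) return 'other';
  if (parts.every((p) => isStringLiteralType(p))) return 'literal-union';
  if (parts.every((p) => PRIMITIVE_NAMES.indexOf(p) !== -1)) return 'primitive-union';
  if (parts.every((p) => /^[A-Z]\w*$/.test(p))) return 'alias-union';
  return 'other';
}
```

Each kind can then be routed to its own conversion rule, which keeps the special handling for type aliases in one place.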
The type alias case is relatively complex, mainly because of the flexibility of JS.
For some special cases we need to make the types less dynamic so that they are easier to work with.
A type like T | T[] can be converted to T[], because the union itself cannot be expressed directly in Kotlin.
For example:
export interface Config {
test?: string | string[];
// ...
}
Just convert:
class Config {
var test: Array<String>? = null
}
A literal union type such as props: 'foo' | 'bar' is converted directly to its base type: val props: String?.
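The two simplifications above (T | T[] becomes an array, and a literal union becomes String) can be sketched as a small type mapper. This is only an illustration working on type text; toKotlinType and toKotlinProperty are hypothetical names, and mapping number to Int is an assumption rather than something the original script is known to do.

```typescript
// Hypothetical mapping of TS primitive names to Kotlin types.
// `number -> Int` is an assumption made for this sketch.
const KOTLIN_PRIMITIVES: Record<string, string> = {
  string: 'String',
  number: 'Int',
  boolean: 'Boolean',
};

function mapSingleType(t: string): string {
  if (/^'[^']*'$/.test(t)) return 'String'; // a string literal type
  if (Object.prototype.hasOwnProperty.call(KOTLIN_PRIMITIVES, t)) {
    return KOTLIN_PRIMITIVES[t];
  }
  return t; // otherwise keep the declared type name
}

function toKotlinType(tsType: string): string {
  const parts = tsType
    .split('|')
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  if (parts.length === 0) throw new Error('empty type');
  // `T | T[]` (in either order) collapses to an array of T.
  if (parts.length === 2) {
    const a = parts[0];
    const b = parts[1];
    if (b === a + '[]') return 'Array<' + mapSingleType(a) + '>';
    if (a === b + '[]') return 'Array<' + mapSingleType(b) + '>';
  }
  // A union made only of string literals collapses to String.
  if (parts.every((p) => /^'[^']*'$/.test(p))) return 'String';
  if (parts.length === 1) {
    const only = parts[0];
    if (/\[\]$/.test(only)) return 'Array<' + mapSingleType(only.slice(0, -2)) + '>';
    return mapSingleType(only);
  }
  throw new Error('unsupported union: ' + tsType);
}

// Renders a nullable Kotlin property, matching the style used in this post.
function toKotlinProperty(propName: string, tsType: string): string {
  return 'var ' + propName + ': ' + toKotlinType(tsType) + '? = null';
}
```

With this sketch, toKotlinProperty('test', 'string | string[]') yields the same line as the Config example above.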
A type T = S & { foo: string } requires extracting the object literal type into a separate type, with T inheriting from both S and the newly extracted type. Converted to Kotlin, it should look like this:
interface BaseT {
val foo: String;
}
class T : S, BaseT {}
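A minimal sketch of how a generator could emit this pair of declarations is shown below. The input shape IntersectionAlias and the function renderIntersection are hypothetical, and the "Base" prefix simply follows the BaseT naming of the example above.

```typescript
// Hypothetical description of `type T = S & { foo: string }`.
interface IntersectionAlias {
  aliasName: string; // T
  baseName: string; // S
  members: { propName: string; kotlinType: string }[]; // object literal fields
}

function renderIntersection(alias: IntersectionAlias): string {
  // The object literal part becomes its own interface, named Base<T> here.
  const extracted = 'Base' + alias.aliasName;
  const fields = alias.members.map(
    (m) => '    val ' + m.propName + ': ' + m.kotlinType
  );
  return ['interface ' + extracted + ' {']
    .concat(fields, [
      '}',
      'class ' + alias.aliasName + ' : ' + alias.baseName + ', ' + extracted + ' {}',
    ])
    .join('\n');
}

const renderedIntersection = renderIntersection({
  aliasName: 'T',
  baseName: 'S',
  members: [{ propName: 'foo', kotlinType: 'String' }],
});
```

Here renderedIntersection holds the interface BaseT followed by the class T declaration, one per line.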
Processing an interface is divided into three parts: 1. converting the TS interface to a Kotlin class; 2. inheritance; 3. serialization.
Define a KotlinClass to represent the Kotlin class to be generated. Its toString() renders the Kotlin source code.
export class KotlinClass {
klassName: string = '';
headerComment: string = ''
annotations: string[] = []
modifier: string = ''
parents: string[] = []
properties: KotlinClassProperty[] = []
}
The KotlinClass is generated by traversing the AST of the TS interface.
When traversing the interface properties, the properties of the parent interfaces have to be traversed recursively as well. Properties inherited from a parent type set KotlinClassProperty.isOverride to true so that the generated Kotlin property carries the override modifier.
class KotlinClassProperty {
modifier: string = 'var'
name: string = ''
type: string = ''
comment: string = ''
defaultValue: string = ''
isOverride: boolean = false
discriminator: string = 'type'
}
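The post does not show how toString() renders the class, so the following is only a sketch of one possible rendering, written as a standalone function over plain objects with the same fields. It assumes that KotlinClass.modifier holds the declaration keyword (for example "class" or "interface"); the function names are hypothetical.

```typescript
// Plain shapes mirroring the fields of KotlinClassProperty and KotlinClass.
interface PropertyShape {
  modifier: string;
  name: string;
  type: string;
  comment: string;
  defaultValue: string;
  isOverride: boolean;
}

interface ClassShape {
  klassName: string;
  headerComment: string;
  annotations: string[];
  modifier: string; // assumption: "class", "interface", ...
  parents: string[];
  properties: PropertyShape[];
}

function renderProperty(p: PropertyShape): string {
  const prefix = p.isOverride ? 'override ' : '';
  const init = p.defaultValue !== '' ? ' = ' + p.defaultValue : '';
  return prefix + p.modifier + ' ' + p.name + ': ' + p.type + init;
}

function renderKotlinClass(k: ClassShape): string {
  const lines: string[] = [];
  if (k.headerComment !== '') lines.push('// ' + k.headerComment);
  for (const annotation of k.annotations) lines.push(annotation);
  const parents = k.parents.length > 0 ? ' : ' + k.parents.join(', ') : '';
  lines.push(k.modifier + ' ' + k.klassName + parents + ' {');
  for (const p of k.properties) lines.push('    ' + renderProperty(p));
  lines.push('}');
  return lines.join('\n');
}

const renderedClass = renderKotlinClass({
  klassName: 'ArrayExpressionImpl',
  headerComment: '',
  annotations: ['@Serializable', '@SerialName("ArrayExpression")'],
  modifier: 'class',
  parents: ['ArrayExpression'],
  properties: [
    {
      modifier: 'var',
      name: 'span',
      type: 'Span?',
      comment: '',
      defaultValue: 'null',
      isOverride: true,
    },
  ],
});
```

The isOverride flag only changes the property prefix, which is why it is enough to set it while traversing parent interfaces.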
A parent interface that a TS interface extends directly is simply added to the KotlinClass.parents array.
However, type T = S | E needs to be handled separately.
As an example:
export interface VariableDeclarator extends Node, HasSpan {
init?: Expression;
// other props...
}
export type Expression =
| ThisExpression
| ArrayExpression
| ....
export interface ArrayExpression extends ExpressionBase {
// ...
}
Here Expression acts as the parent of every XxxExpression. This is what makes assignments such as variableDeclarator.init = thisExpression or variableDeclarator.init = arrayExpression legal.
Because Expression is a type alias in TS, it becomes an empty interface in Kotlin. The converted code looks something like this:
interface Expression {}
class VariableDeclarator : Node, HasSpan {
    var init: Expression? = null
    // other props...
}
class ArrayExpression : ExpressionBase, Expression {
    // ...
}
So, for type T = S | E, T is the parent of S and E, and T needs to be added to the KotlinClass.parents arrays of S and E.
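The rule for union aliases can be sketched as follows. The ParentsByClass map and addUnionParents are hypothetical stand-ins for the KotlinClass.parents bookkeeping described above.

```typescript
// Maps a generated class name to the parents it should implement.
type ParentsByClass = Record<string, string[]>;

// For `type T = S | E`, registers T as a parent of every union member.
function addUnionParents(
  parents: ParentsByClass,
  aliasName: string,
  members: string[]
): void {
  for (const member of members) {
    const list = parents[member] || (parents[member] = []);
    if (list.indexOf(aliasName) === -1) list.push(aliasName);
  }
}

// ArrayExpression already extends ExpressionBase through its interface.
const parentsByClass: ParentsByClass = { ArrayExpression: ['ExpressionBase'] };
addUnionParents(parentsByClass, 'Expression', ['ThisExpression', 'ArrayExpression']);
```

After the call, both ThisExpression and ArrayExpression list Expression among their parents, and repeated calls do not add duplicates.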
Serializing AST nodes runs into the problem of polymorphic serialization.
For example, when serializing an Expression, which is an empty interface, toJson does not know about the properties of ThisExpression or ArrayExpression. It can only throw an exception or output an empty object, neither of which is what we want.
val thisExpression: ThisExpression = ThisExpression()
val arrayExpression: ArrayExpression = ArrayExpression()
var expression: Expression = thisExpression
toJson(expression)
expression = arrayExpression
toJson(expression)
The same goes for deserialization: parseJson does not know whether to convert a string to ThisExpression or ArrayExpression.
val thisExpression = """ {"type":"ThisExpression", "props": "any value" } """
val arrayExpression = """ {"type":"ArrayExpression", "elements": [] } """
var expression: Expression = parseJson(thisExpression)
expression = parseJson(arrayExpression)
Serialization is done with kotlinx serialization, which supports polymorphic serialization but requires some changes to the Kotlin code.
Annotate the class with JsonClassDiscriminator to indicate which field distinguishes the type, and with SerialName to indicate the name of the type after serialization. Deserialization can then find the concrete type from this name.
interface ArrayExpression : ExpressionBase, Expression {
// ....
}
@Serializable
@JsonClassDiscriminator("type")
@SerialName("ArrayExpression")
class ArrayExpressionImpl : ArrayExpression {
// ...
}
interface ThisExpression : ExpressionBase, Expression {
// ....
}
@Serializable
@JsonClassDiscriminator("type")
@SerialName("ThisExpression")
class ThisExpressionImpl : ThisExpression {
// ....
}
For serialization and deserialization to find the concrete types correctly, a SerializersModule also has to be defined.
val swcSerializersModule = SerializersModule {
// ...
polymorphic(Expression::class) {
subclass(ThisExpressionImpl::class)
subclass(ArrayExpressionImpl::class)
// ...
}
polymorphic(ThisExpression::class) {
subclass(ThisExpressionImpl::class)
}
polymorphic(ArrayExpression::class) {
subclass(ArrayExpressionImpl::class)
}
// ...
}
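Since the classes are generated by a script, the entries of this module can plausibly be generated too. The sketch below assumes that every concrete node is named after its interface plus an "Impl" suffix, as in the examples above; renderPolymorphic and renderModuleEntries are hypothetical helpers, not functions from the swc-binding generator.

```typescript
// Renders one `polymorphic(Base::class) { subclass(...) }` entry.
function renderPolymorphic(base: string, impls: string[]): string {
  const lines = ['polymorphic(' + base + '::class) {'];
  for (const impl of impls) lines.push('    subclass(' + impl + '::class)');
  lines.push('}');
  return lines.join('\n');
}

// For `type Alias = A | B`, emits an entry for the alias and one per member.
function renderModuleEntries(aliasName: string, members: string[]): string {
  const impls = members.map((m) => m + 'Impl');
  const blocks = [renderPolymorphic(aliasName, impls)];
  for (const member of members) {
    blocks.push(renderPolymorphic(member, [member + 'Impl']));
  }
  return blocks.join('\n');
}

const moduleEntries = renderModuleEntries('Expression', [
  'ThisExpression',
  'ArrayExpression',
]);
```

The resulting text matches the three polymorphic entries shown in the module above.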
This allows polymorphic types to be serialized normally:
val json = Json {
    classDiscriminator = "type"
    serializersModule = swcSerializersModule
}
json.decodeFromString<Expression>(""" {"type":"ArrayExpression", "elements": [] } """)
val arrayExpression: Expression = ArrayExpressionImpl()
json.encodeToString<Expression>(arrayExpression)
We now have generated class definitions for the AST and the configuration, but building a configuration or an AST directly with these classes is neither elegant nor convenient. Take this JS code:
const foo = 'bar'
SWC compile output string:
{
"type": "VariableDeclaration",
"span": {
"start": 0,
"end": 17,
"ctxt": 0
},
"kind": "const",
"declare": false,
"declarations": [
{
"type": "VariableDeclarator",
"span": {
"start": 6,
"end": 17,
"ctxt": 0
},
"id": {
"type": "Identifier",
"span": {
"start": 6,
"end": 9,
"ctxt": 2
},
"value": "foo",
"optional": false,
"typeAnnotation": null
},
"init": {
"type": "StringLiteral",
"span": {
"start": 12,
"end": 17,
"ctxt": 0
},
"value": "bar",
"raw": "'bar'"
},
"definite": false
}
]
}
Building the AST for the JS code above in Kotlin looks like this:
VariableDeclarationImpl().apply {
    span = Span(0, 17, 0)
    kind = "const"
    declare = false
    declarations = arrayOf(
        VariableDeclaratorImpl().apply {
            span = Span(6, 17, 0)
            id = IdentifierImpl().apply {
                span = Span(6, 9, 2)
                value = "foo"
            }
            init = StringLiteralImpl().apply {
                span = Span(12, 17, 0)
                value = "bar"
                raw = "'bar'"
            }
        }
    )
}
Using apply simplifies setting the properties and is already cleaner than assigning them one by one, but it can be made more succinct with a DSL:
variableDeclaration {
    span = span(0, 17, 0)
    kind = "const"
    declare = false
    declarations = arrayOf(
        variableDeclarator {
            span = span(6, 17, 0)
            id = identifier {
                span = span(6, 9, 2)
                value = "foo"
            }
            init = stringLiteral {
                span = span(12, 17, 0)
                value = "bar"
                raw = "'bar'"
            }
        }
    )
}
The DSL now closely resembles the AST JSON output and is simple and straightforward to write.
Classes used in the DSL need to be marked with the SwcDslMarker annotation. SwcDslMarker mainly restricts the scope so that members of an outer receiver cannot be accessed implicitly.
@DslMarker
annotation class SwcDslMarker
@SwcDslMarker
class VariableDeclarationImpl {
// ...
}
fun variableDeclaration(block: VariableDeclaration.() -> Unit): VariableDeclaration {
return VariableDeclarationImpl().apply(block)
}
For how to implement this, refer to the official Kotlin documentation: Type-safe builders.
interface VariableDeclarator : Node, HasSpan {
val init: Expression?;
// other props...
}
For the VariableDeclarator interface, the type of the init field is Expression, meaning the assigned value can be any Expression subtype, such as ArrayExpression or ThisExpression.
variableDeclarator {
init = arrayExpression { ... }
// or
init = thisExpression { ... }
}
So VariableDeclarator should have methods for creating every Expression subclass. These creation methods are added through extension functions.
When parsing the @swc/types declaration file, we check the type of each property. If the type maps to a Kotlin class, we find all of its non-intermediate (leaf) subclasses and generate an extension function for each of them.
fun VariableDeclarator.arrayExpression(block: ArrayExpression.() -> Unit): ArrayExpression {
return ArrayExpressionImpl().apply(block)
}
This allows an Expression to be built with the arrayExpression {} function inside variableDeclarator {}.
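The generation step described above can be sketched in two parts: collecting the leaf subtypes of a property type, then rendering one extension function per leaf. The SubtypeMap input, the trimmed hierarchy, and both function names are assumptions for illustration only.

```typescript
// Direct subtypes per type name; a trimmed, illustrative hierarchy.
type SubtypeMap = Record<string, string[]>;

// Collects the subtypes that have no subtypes of their own (leaves).
function leafSubtypes(typeName: string, children: SubtypeMap): string[] {
  const direct = children[typeName] || [];
  if (direct.length === 0) return [typeName];
  const leaves: string[] = [];
  for (const child of direct) {
    for (const leaf of leafSubtypes(child, children)) {
      if (leaves.indexOf(leaf) === -1) leaves.push(leaf);
    }
  }
  return leaves;
}

// Renders the Kotlin extension function that builds `leaf` inside `receiver`.
function renderBuilder(receiver: string, leaf: string): string {
  const fnName = leaf.charAt(0).toLowerCase() + leaf.slice(1);
  return (
    'fun ' + receiver + '.' + fnName + '(block: ' + leaf + '.() -> Unit): ' + leaf + ' {\n' +
    '    return ' + leaf + 'Impl().apply(block)\n' +
    '}'
  );
}

const sampleHierarchy: SubtypeMap = {
  Expression: ['ThisExpression', 'Literal'],
  Literal: ['StringLiteral', 'BooleanLiteral'],
};
const expressionLeaves = leafSubtypes('Expression', sampleHierarchy);
const generatedBuilders = expressionLeaves.map((leaf) =>
  renderBuilder('VariableDeclarator', leaf)
);
```

In this sample, Literal is an intermediate type, so builders are generated only for ThisExpression, StringLiteral and BooleanLiteral.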
There is another special case to deal with here. TemplateLiteral conflicts with TsTemplateLiteralType, whose type is also "TemplateLiteral". This makes ASTs built with the DSL impossible to serialize. See the struct definitions in Rust:
// https://github.com/swc-project/swc/blob/828190c035d61e6521280e2260c511bc02b81327/crates/swc_ecma_ast/src/typescript.rs#L823
#[ast_node("TemplateLiteral")]
#[derive(Eq, Hash, EqIgnoreSpan)]
#[cfg_attr(feature = "arbitrary", derive(arbitrary::Arbitrary))]
pub struct TsTplLitType {
// ...
}
// https://github.com/swc-project/swc/blob/828190c035d61e6521280e2260c511bc02b81327/crates/swc_ecma_ast/src/expr.rs#L1060
#[ast_node("TemplateLiteral")]
#[derive(Eq, Hash, EqIgnoreSpan)]
#[cfg_attr(feature = "arbitrary", derive(arbitrary::Arbitrary))]
pub struct Tpl {
pub span: Span,
#[cfg_attr(feature = "serde-impl", serde(rename = "expressions"))]
pub exprs: Vec<Box<Expr>>,
pub quasis: Vec<TplElement>,
}
These two types need to be handled separately and not generated by a script.
Both TemplateLiteral and TsTemplateLiteralType are implemented by a single class, which is then upcast to TemplateLiteral or TsTemplateLiteralType where it is used.
// ignore annotation
interface TemplateLiteral : ExpressionBase, Expression {
var expressions: Array<Expression>?
var quasis: Array<TemplateElement>?
override var span: Span?
}
interface TsTemplateLiteralType : Node, HasSpan, TsLiteral {
var types: Array<TsType>?
var quasis: Array<TemplateElement>?
override var span: Span?
}
class TemplateLiteralImpl : TemplateLiteral, TsTemplateLiteralType {
override var types: Array<TsType>? = null
override var expressions: Array<Expression>? = null
override var quasis: Array<TemplateElement>? = null
override var span: Span? = null
}
typealias TsTemplateLiteralTypeImpl = TemplateLiteralImpl
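A generator can spot this kind of conflict before emitting any classes by grouping node types by their serialized type name. The sketch below is a hypothetical check; the NodeTypeInfo shape and findSerialNameConflicts are not part of the swc-binding source.

```typescript
// Hypothetical record: a Kotlin type and the value of its JSON "type" field.
interface NodeTypeInfo {
  kotlinName: string;
  serialName: string;
}

// Returns every serial name that is shared by more than one Kotlin type.
function findSerialNameConflicts(types: NodeTypeInfo[]): Record<string, string[]> {
  const bySerial: Record<string, string[]> = {};
  for (const t of types) {
    const list = bySerial[t.serialName] || (bySerial[t.serialName] = []);
    list.push(t.kotlinName);
  }
  const conflicts: Record<string, string[]> = {};
  for (const key of Object.keys(bySerial)) {
    if (bySerial[key].length > 1) conflicts[key] = bySerial[key];
  }
  return conflicts;
}

const serialConflicts = findSerialNameConflicts([
  { kotlinName: 'TemplateLiteral', serialName: 'TemplateLiteral' },
  { kotlinName: 'TsTemplateLiteralType', serialName: 'TemplateLiteral' },
  { kotlinName: 'ArrayExpression', serialName: 'ArrayExpression' },
]);
```

Types reported here are the ones that need a hand-written class like the one above instead of a generated one.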
Now we can upgrade the parseSync signature.
@Throws(RuntimeException::class)
fun parseSync(code: String, options: ParserConfig, filename: String?): Program
Type safety and type hints are now guaranteed when used.
val program = SwcNative().parseSync(
    """
    function App() {
        return <div>App</div>
    }
    """.trimIndent(),
    esParseOptions {
        jsx = true
        target = "es5"
    },
    "temp.js"
)
if (program is Module) {
if (program.body?.get(0) is FunctionDeclaration) {
// ...
}
}
This covers the idea and the core implementation points of the SWC JVM binding: 1. exposing SWC through JNI; 2. deserializing the AST JSON into Kotlin classes; 3. describing ASTs and configurations through a DSL.
Some details are not covered, such as edge cases in the Kotlin generation script and Rust cross-compilation. If you are interested in those details, you can read the source code at yidafu/swc-binding.
If you need to compile JS on the JVM, the SWC JVM binding has been published to the Maven Central repository as dev.yidafu.swc:swc-binding:0.5.0.
For other questions, feel free to open an issue.
Thinking never ends.