Background
We want to converge the current tile-level DSL and vector-level DSL into one unified TileLang DSL stack, instead of maintaining two partially overlapping authoring systems.
Today the split is roughly this:
- pto-dsl carries a substantial amount of tile-level semantics and a JIT-oriented workflow
- tilelang_dsl carries the current vector-level / VPTO-oriented frontend path inside PTOAS and already participates in TileOp expansion / VPTO authoring
This split is increasingly awkward for both users and compiler engineering:
- tile-level and vector-level authoring are not expressed in one coherent DSL architecture
- similar semantics are spread across two different codebases and two different programming models
- the long-term backend direction is unclear because the two DSLs were built on different technical foundations
- frontend, lowering backend, and runtime/JIT concerns are not cleanly separated today
The goal of this proposal is therefore not just “add one more frontend feature” or “add JIT”.
The goal is to define a unified architecture where:
- tile-level and vector-level semantics live in the same DSL stack
- there is one clear public programming model
- frontend work and backend work can proceed independently
- JIT can later be added on top of the same compilation pipeline instead of becoming a separate system
Main difficulties
There are two architectural difficulties that must be resolved first.
1. pto-dsl and tilelang_dsl use different technical routes
The two DSLs were built with different assumptions:
- pto-dsl is fundamentally IR-builder oriented
- tilelang_dsl is fundamentally AST-oriented
This means we do not just have two implementations of similar semantics; we have two competing frontend models.
Before we can merge tile-level and vector-level DSL capabilities, we must make a deliberate technical choice about the unified route:
- Should the long-term public DSL be builder-first?
- Or should it be AST-first, with builder as an internal implementation backend?
Without resolving this, any “merge” would only move code around while preserving the underlying split.
2. current tilelang_dsl still uses a text-emission backend, and its frontend/backend coupling is too strong
Current tilelang_dsl lowering is still centered on emitting authoring-form VPTO MLIR text.
That creates two related problems:
- there is no proper pybinding / MLIR-builder backend yet
- frontend semantics and backend emission logic are too tightly coupled
In practice, this means:
- frontend and lowering evolution interfere with each other
- changing the backend implementation is expensive because the frontend is not isolated from emission details
- JIT cannot be cleanly layered on top, because there is not yet a stable compile-driver boundary
So even before discussing JIT, we need a clearer split between:
- frontend language / semantics
- backend MLIR construction
- compile driver / runtime launch
Why choose an AST-first route
The proposed direction is:
- keep AST as the only primary public frontend
- use MLIR builder / pybinding as the target lowering backend
This is not a rejection of builder infrastructure. It is a decision about the user-facing language boundary.
Why AST is the better unification route
If we adopt a builder-first public route, then the user-facing DSL becomes tightly coupled to IR-construction mechanics:
- authoring style starts to resemble “writing a Python script that builds compiler IR”
- frontend evolution becomes tied to backend builder APIs
- syntax sugar, template-style authoring, type-directed specialization, and source diagnostics become harder to manage as language features
- merging tile-level and vector-level authoring still leaves us with a compiler-API-shaped user model
An AST-first route gives us a cleaner long-term unification point:
- users write kernel semantics, not IR construction steps
- tile-level and vector-level constructs can be expressed in one source language
- templates, polymorphism, symbolic analysis, and source-level diagnostics stay in the frontend
- the frontend can lower into a backend-neutral semantic representation
- the backend can evolve from text emission to pybinding builder without changing the public programming model
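To make the "users write kernel semantics, not IR construction steps" point concrete, here is a minimal sketch of an AST-first frontend using Python's standard `ast` module. The kernel body and the op names (`load`, `store`, `add`, `add_kernel`) are hypothetical and are not the actual tilelang_dsl surface; the point is only that the source is analyzed, never executed, and that source line numbers stay available for diagnostics.

```python
import ast

# Hypothetical kernel source; the op names are illustrative only.
KERNEL_SOURCE = '''
def add_kernel(a, b, out):
    x = load(a)
    y = load(b)
    store(out, add(x, y))
'''


class OpCollector(ast.NodeVisitor):
    """Collect (op_name, line_number) pairs in evaluation order."""

    def __init__(self):
        self.ops = []

    def visit_Call(self, node):
        # Visit arguments first so nested calls are recorded before their parent.
        self.generic_visit(node)
        if isinstance(node.func, ast.Name):
            self.ops.append((node.func.id, node.lineno))


def collect_ops(source):
    """Parse kernel source and return its ops without running the kernel."""
    collector = OpCollector()
    collector.visit(ast.parse(source))
    return collector.ops


ops = collect_ops(KERNEL_SOURCE)
```

Because the frontend sees the whole function before any IR exists, features such as templates or type-directed specialization can be implemented as analyses over this tree rather than as builder-API conventions.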
What AST does not mean
Choosing AST does not mean:
- moving optimization logic into Python
- replacing MLIR/PTO/VPTO/LLVM passes with frontend logic
- devaluing builder-based infrastructure
Most real optimization should still remain in:
- PTO / VPTO passes
- MLIR canonicalization / cleanup
- LLVM / Bisheng backend
The value of AST here is mainly:
- one stable public language surface
- frontend/backend decoupling
- a better basis for unifying tile-level and vector-level authoring
Proposal: how to solve the coupling problem
The key architectural move is to introduce a backend-neutral semantic layer between the AST frontend and all backend/runtime work.
Target architecture:
User API
-> tilelang_dsl AST surface
-> Frontend Analyzer
-> Canonical Semantic Module
-> MLIR Text Emitter (legacy / fallback)
-> MLIR Builder Emitter (target)
-> Template Instantiation Service
-> Compile Driver
-> ptoas VPTO LLVM path
-> Bisheng / cce-ld
-> Runtime Launcher / JIT
Core rule
The Canonical Semantic Module becomes the only internal contract shared by all workstreams.
That gives us a clean separation of responsibilities:
- Frontend owns parsing, semantic analysis, specialization, source diagnostics, and language sugar
- Emitters own conversion from semantic IR to MLIR
- Compile driver owns artifact generation through the current VPTO LLVM path
- Runtime/JIT owns caching, load, argument bridging, stream handling, and launch
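The separation above can be sketched as one immutable data structure with interchangeable emitters behind it. This is a minimal illustration under stated assumptions: `SemanticOp`, `SemanticModule`, `TextEmitter`, and `RecordingBuilderEmitter` are hypothetical names, the emitted text is pseudo-IR rather than real VPTO MLIR, and the "builder" emitter only records calls instead of using real MLIR Python bindings.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, Tuple


@dataclass(frozen=True)
class SemanticOp:
    name: str
    operands: Tuple[str, ...]
    result: Optional[str] = None


@dataclass(frozen=True)
class SemanticModule:
    """Hypothetical backend-neutral contract produced by the frontend."""
    kernel_name: str
    params: Tuple[str, ...]
    ops: Tuple[SemanticOp, ...]


class Emitter(Protocol):
    def emit(self, module: SemanticModule) -> str: ...


class TextEmitter:
    """Legacy-style path: prints illustrative pseudo-IR text."""

    def emit(self, module: SemanticModule) -> str:
        lines = [f"kernel @{module.kernel_name}({', '.join(module.params)}) {{"]
        for op in module.ops:
            lhs = f"%{op.result} = " if op.result else ""
            args = ", ".join(f"%{o}" for o in op.operands)
            lines.append(f"  {lhs}{op.name} {args}")
        lines.append("}")
        return "\n".join(lines)


class RecordingBuilderEmitter:
    """Stand-in for a builder backend: records the calls it would make."""

    def __init__(self):
        self.calls = []

    def emit(self, module: SemanticModule) -> str:
        for op in module.ops:
            self.calls.append((op.name, op.operands, op.result))
        return f"<built {len(self.calls)} ops for {module.kernel_name}>"


module = SemanticModule(
    kernel_name="vadd",
    params=("a", "b", "out"),
    ops=(
        SemanticOp("load", ("a",), "x"),
        SemanticOp("load", ("b",), "y"),
        SemanticOp("add", ("x", "y"), "z"),
        SemanticOp("store", ("out", "z")),
    ),
)
text = TextEmitter().emit(module)
```

Both emitters consume the same `SemanticModule`, so the text path can serve as a reference to compare against while the builder path is brought up.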
Practical workstream split
A. frontend/backend decoupling first
This is the highest-priority step.
We should explicitly split tilelang_dsl into:
- frontend_ast
- semantic_model
- emitters
- compile_driver/runtime
This is the prerequisite for every later step.
B. import pto-dsl semantics into the AST + semantic layer
pto-dsl should contribute:
- tile-level semantic coverage
- module organization ideas
- entry metadata / multi-function module patterns
But it should not become the primary public builder-style frontend.
C. build a real pybinding backend behind the semantic layer
The current text emitter can remain temporarily as:
- a reference path
- a fallback path
- a debugging aid
But the target direction should be:
SemanticModule -> MLIR builder / pybinding as the official lowering backend
D. add JIT after the boundary is stable
JIT should be layered on top of the compile driver, not fused into the frontend.
That way:
- JIT reuses the same frontend and lowering stack
- JIT does not become a second architecture
- compile-only and compile-and-run can share the same artifact model
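As a rough illustration of JIT being a thin layer over the compile driver, the sketch below wraps an arbitrary compile function in a content-addressed cache. Everything here is an assumption for illustration: `JitCache`, `fake_compile`, and the idea of keying on a serialized semantic module are hypothetical, and the real driver would invoke the actual toolchain instead of returning placeholder bytes.

```python
import hashlib
from typing import Callable, Dict


def fake_compile(semantic_text: str) -> bytes:
    """Placeholder for the compile driver; returns stand-in artifact bytes."""
    return b"ARTIFACT:" + semantic_text.encode("utf-8")


class JitCache:
    """Caches artifacts by the hash of the serialized semantic module."""

    def __init__(self, compile_fn: Callable[[str], bytes]):
        self._compile_fn = compile_fn
        self._artifacts: Dict[str, bytes] = {}
        self.compile_count = 0

    def get(self, semantic_text: str) -> bytes:
        key = hashlib.sha256(semantic_text.encode("utf-8")).hexdigest()
        if key not in self._artifacts:
            # Cache miss: go through the same driver a compile-only flow would use.
            self._artifacts[key] = self._compile_fn(semantic_text)
            self.compile_count += 1
        return self._artifacts[key]


cache = JitCache(fake_compile)
first = cache.get("kernel vadd")
second = cache.get("kernel vadd")
```

The JIT layer never touches frontend or emitter code; it only decides whether to call the driver, which is what keeps it from turning into a second architecture.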
Real compile chain we should design for
The architecture must align with the actual current PTOAS VPTO LLVM path:
AST / semantic frontend
-> MLIR
-> ptoas --pto-backend=vpto --vpto-emit-hivm-bc|llvm
-> LLVM IR / bitcode
-> Bisheng device object
-> fatobj / kernel.so packaging
-> Python load + launch
This proposal does not assume the old kernel.cpp / caller.cpp path as the long-term primary model.
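One way to keep the driver honest about this chain is to model each step as a stage with a declared input and output artifact kind, and to check that adjacent stages line up. The stage names and artifact-kind labels below are hypothetical labels for the steps listed above, not real tool options, and no external tool is invoked.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Stage:
    name: str
    consumes: str
    produces: str


# Hypothetical labels mirroring the chain described in this section.
VPTO_CHAIN = [
    Stage("frontend", "python-source", "mlir"),
    Stage("ptoas", "mlir", "llvm-ir-or-bitcode"),
    Stage("bisheng", "llvm-ir-or-bitcode", "device-object"),
    Stage("package", "device-object", "kernel-so"),
    Stage("launch", "kernel-so", "launched-kernel"),
]


def validate_chain(stages: Sequence[Stage]) -> List[str]:
    """Check each stage consumes what the previous one produces."""
    for prev, nxt in zip(stages, stages[1:]):
        if prev.produces != nxt.consumes:
            raise ValueError(
                f"{nxt.name} expects {nxt.consumes!r} but "
                f"{prev.name} produces {prev.produces!r}"
            )
    return [stage.produces for stage in stages]


def compile_only(stages: Sequence[Stage]) -> List[Stage]:
    """Compile-only is the same chain with the launch stage dropped."""
    return [stage for stage in stages if stage.name != "launch"]


artifacts = validate_chain(VPTO_CHAIN)
```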
Expected outcome
If we follow this route, we get:
- one unified DSL architecture for tile-level and vector-level authoring
- one stable public programming model based on AST
- one backend-neutral semantic contract for frontend/backend parallel work
- one clean migration path from text emission to pybinding builder lowering
- one future JIT path built on top of the same compile driver instead of a separate runtime stack
Non-goals
This proposal is not trying to:
- rewrite everything in one patch
- remove the current text emitter immediately
- promise full public ptodsl API compatibility
- define every runtime ABI detail in this issue
- claim that AST itself is the main optimization engine
Discussion points for the team
- Do we agree that the real background problem is unifying tile-level DSL and vector-level DSL into one architecture, not merely adding another frontend feature?
- Do we agree that the first architectural conflict to resolve is IR-builder route vs AST route?
- Do we agree that the second architectural blocker is the lack of a proper pybinding backend and the current frontend/backend coupling inside tilelang_dsl?
- Do we agree that AST should remain the only primary public frontend, while builder becomes the internal lowering backend?
- Do we agree that the semantic module should be the only contract between frontend, backend, and future JIT/runtime work?
Related issues
This issue is intended to act as the top-level architecture discussion that explains how these tracks fit together.
- [Architecture] Incrementally migrate the TileLang DSL lowering backend to a pybinding builder
- [Architecture] The VPTO IR cache mechanism should move to the DSL framework side
- epic(tilelang-dsl): complete cube matrix tileop templates and ST coverage