Jtlc concepts

MaxMotovilov edited this page Jun 17, 2011 · 10 revisions

How does the compiler work?

Javascript Template Language Compiler (acronym jtlc is always in lowercase to compensate for this pomposity of a name) is the low-level facility underlying CHT, Q+ and JXL. A CHT user need not be concerned with the jtlc API since higher level facilities obviate the need to access it, yet jtlc design decisions permeate the entire architecture of the library and make understanding of the compiler’s operation a prerequisite for success.

Usage model

jtlc is designed around a two-stage model of execution: first, the template is compiled into a Javascript function; next, the resulting evaluator function is executed as many times as necessary on different inputs. The nature of the input and output data for the template depends on the language, although most of the available facilities operate on what is essentially JSON data: anonymous Javascript objects, arrays and values.

The compiler API is represented by a single function, dojox.jtlc.compile(), that expects an abstract syntax tree as its input. CHT (and Q+) have to be parsed to produce such a tree, while JXL “programs” are consumed by the jtlc in their original raw form.
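The two-stage model can be illustrated with a deliberately tiny stand-in for a compiler. This is only a sketch under stated assumptions: `compilePath` is a hypothetical helper, not part of jtlc, and an array of property names stands in for a real abstract syntax tree.

```javascript
// Hypothetical toy compiler, NOT jtlc: turns a list of property names
// into a Javascript function that reads that path from its argument.
function compilePath(path) {
  // Stage 1: build the source text once...
  var body = "return input" + path.map(function (key) {
    return "[" + JSON.stringify(key) + "]";
  }).join("") + ";";
  // ...and turn it into an evaluator function.
  return new Function("input", body);
}

// Stage 2: run the evaluator as many times as needed on different inputs.
var getCity = compilePath(["address", "city"]);
var first = getCity({ address: { city: "Oslo" } });
var second = getCity({ address: { city: "Lima" } });
```

The cost of generating and parsing the code is paid once in `compilePath`; each later call to `getCity` is an ordinary function call.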

Dataflow paradigm

The functionality of jtlc and the template languages it supports is easiest to describe in terms of a dataflow execution model where values are generated by the leaves of the AST and flow towards the root of the tree through functional nodes to end up in sinks. The root of the tree serves as a top-level sink, the value of which becomes the return value of the entire template (i.e. of its evaluator function).

A jtlc subtree may have one of two distinct modes of execution, singleton or iterative, determined unambiguously from the context in which the subtree appears. Some of the functional nodes (referred to as tags in JXL) may function in either mode; others require a specific mode. The context is defined by certain tags that control the mode of execution for their child subtrees. Some of these tags (group, each) may be thought of as looping constructs.

Singleton contexts

A subtree appearing in a singleton context produces a single value. Every node in such a tree behaves as a function — insofar as its execution has no side effects. The entire template is implicitly in a singleton context with the final return statement of the evaluator as its sink.

Iterative contexts

Iterative contexts are equivalent to loops over the entire sequence of values generated by a single node (typically the bottom-left corner of the subtree in question, as the trees are usually drawn, or the first stage of the Q+ pipeline). The sink above the subtree is responsible for the ultimate disposal of processed values — usually pushing them into an array, though it is also possible to populate dictionaries or aggregate the sequence into a single value. Sinks may even have multiple subtrees connected to them, some of them iterative and others singletons.

Value generation is achieved by running a query on the input data or, in simpler cases, enumerating a specific array or dictionary contained within.
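As a rough illustration of the two kinds of sinks mentioned above, the following handwritten functions approximate what a template in an iterative context corresponds to. They are a sketch, not actual compiler output; the function names and the shape of the input data are assumptions made for the example.

```javascript
// Handwritten stand-ins for compiled templates; all names are hypothetical.

// Iterative context with an array sink: the generator enumerates
// input.items, each value is transformed, and the sink pushes the result.
function evaluateToArray(input) {
  var out = [];
  var source = input.items;
  for (var i = 0; i < source.length; ++i) {
    out.push(source[i].price * source[i].qty);
  }
  return out;
}

// The same iteration with an aggregating sink that yields a single value.
function evaluateToTotal(input) {
  var total = 0;
  var source = input.items;
  for (var i = 0; i < source.length; ++i) {
    total += source[i].price * source[i].qty;
  }
  return total;
}

var order = { items: [{ price: 5, qty: 2 }, { price: 3, qty: 4 }] };
var lineTotals = evaluateToArray(order);
var grandTotal = evaluateToTotal(order);
```

In both functions the loop body plays the role of the iterative subtree, while the statement that consumes each value (`out.push` or `total +=`) plays the role of the sink.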

Current input

Yet another context-dependent parameter is current input: the root of the data hierarchy that the current subtree operates upon. Current input is initially set to the value of the first argument passed to the compiled template, but tags may locally change it for some of their child subtrees. Note that compiled templates may accept and operate upon multiple arguments, with the first one serving as current input for the template.

Access to the value associated with current input is gained via a leaf tag current ($ in Q+) which also serves as a generator in iterative contexts. Most tags default to using current when no explicit arguments are provided.
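The handwritten function below sketches how current input behaves inside an evaluator. It is an illustration only: the function, its second argument `options` and the data layout are hypothetical and do not come from jtlc.

```javascript
// Handwritten stand-in for a compiled template; all names are hypothetical.
function evaluateBook(input, options) {
  // At the top level, current input is the first argument.
  var current = input;
  var title = current.title;

  // A looping tag may rebind current input for its child subtree: here
  // the child subtree sees each element of current.authors in turn.
  var names = [];
  for (var i = 0; i < current.authors.length; ++i) {
    var inner = current.authors[i]; // current input for the child subtree
    // Additional arguments remain reachable regardless of current input.
    names.push(options.upper ? inner.name.toUpperCase() : inner.name);
  }
  return { title: title, names: names };
}

var book = { title: "Notes", authors: [{ name: "Ann" }, { name: "Bo" }] };
var summary = evaluateBook(book, { upper: true });
```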

Inline expressions

JXL and Q+ contain a facility (tag expr) for embedding parameterized Javascript expressions into the template. Their use and parameter substitution syntax are very similar to those of JSON Query. Note, however, that actual JSON Queries embedded into a JXL program compile into evaluators of their own, while expr tags inject code directly into the compiled template; this may change in subsequent versions of the library.

Important implementation traits

Advanced material follows; feel free to skip it on a first reading.

Extensibility

At its core, compilation by jtlc is performed as a parent-first recursive descent of the input tree. Every tag node provides a compile() method, which is called as a visitor on the object encapsulating the compiler state: current output, expression stack, dictionaries for local and global variables etc.; tags are free to inject additional properties into it as they see fit. Additional tags can generally be created without any impact on the rest of the compiler and do not have to be specially registered with jtlc. The language description object (the second argument of dojox.jtlc.compile()) determines how literal values (strings, numbers, functions and objects without the compile method) are interpreted and carries global settings that certain tags may depend upon; the compiler state uses this object as a prototype. Constructors for language description objects usually accept a bag of properties to be mixed into the instance, providing an additional shortcut for customization.
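The scheme described above can be approximated in a few lines. The sketch below is a toy written for this page: `toyCompile`, `toyLanguage`, `literal` and the `add` tag are all hypothetical names, and none of this is the actual jtlc API or implementation.

```javascript
// Toy compiler illustrating the visitor scheme; every name is hypothetical.
function toyCompile(ast, language) {
  // The compiler state uses the language description object as its prototype.
  var state = Object.create(language);
  state.stack = []; // expression stack
  state.compile = function (node) {
    if (node && typeof node.compile === "function") {
      node.compile(this); // tag node: it visits the compiler state
    } else {
      this.stack.push(this.literal(node)); // literal: the language decides
    }
  };
  state.compile(ast);
  return new Function("$", "return " + state.stack.pop() + ";");
}

// A language description: literal strings name properties of the input,
// any other literal is emitted as a constant.
var toyLanguage = {
  literal: function (value) {
    return typeof value === "string"
      ? "$[" + JSON.stringify(value) + "]"
      : JSON.stringify(value);
  }
};

// A new tag needs no registration: it only has to provide compile().
function add(left, right) {
  return {
    compile: function (state) {
      state.compile(left);  // the parent is visited first, then descends
      state.compile(right);
      var r = state.stack.pop(), l = state.stack.pop();
      state.stack.push("(" + l + "+" + r + ")");
    }
  };
}

var sum = toyCompile(add("a", add("b", 10)), toyLanguage);
var sumValue = sum({ a: 1, b: 2 });
```

Because the state object inherits from the language description, a tag can reach language-wide settings through the same object it uses for the expression stack.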

Performance

The primary reason behind the two-stage execution model is performance: compiled templates should generally compare well to handwritten Javascript code in the efficiency of execution. Care is taken to avoid or minimize spurious copying, use efficient loop constructs, evaluate complex subexpressions once etc. All data structures maintained during the execution of the template remain under direct control of the template’s programmer: the compiler introduces only local variables and no additional arrays or objects (CHT also injects its own framing code into the evaluator, making it somewhat more complex).

In order to minimize re-evaluation or even replication of complex fragments, the compiler maintains a dictionary of “global variables”: in fact, those variables are injected into a closure around the resulting compiled function so that their values are effectively evaluated only once during the compilation. For example, all JSON Queries used by the template become compiled functions referenced from within the closure.
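The closure technique can be sketched as follows. This is an assumption-laden illustration rather than jtlc code: `compileMatcher`, `makeRegExp` and `buildCount` are hypothetical names, and a regular expression stands in for a compiled JSON Query.

```javascript
// Hypothetical sketch: wrap the generated body in a closure so that a
// costly value is created once, at compile time, not on every call.
var buildCount = 0; // counts how often the costly value is built

function compileMatcher(pattern) {
  var source =
    "var re = makeRegExp(pattern);\n" + // the template's 'global variable'
    "return function (input) { return re.test(input); };";
  var factory = new Function("makeRegExp", "pattern", source);
  return factory(function (p) {
    ++buildCount;
    return new RegExp(p);
  }, pattern);
}

var isNumber = compileMatcher("^[0-9]+$");
var checks = [isNumber("42"), isNumber("4x2"), isNumber("7")];
```

After three calls to `isNumber`, `buildCount` is still 1: the regular expression lives in the closure and is shared by every execution of the evaluator.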

An extra optimization step is provided after the compilation is finished; it is performed on the resulting Javascript code while it is still in its string representation. Any tag may register an optimization callback with the compiler instance in order to transform the string that is about to become the body of the compiled template. These callbacks usually remove redundancies that can be pinpointed with a regex search. Surprisingly enough, this simple procedure proves very effective in reducing spurious copying and compacting the code.
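A minimal sketch of such a callback is shown below. The function name, the `tN` naming of temporaries and the shape of the generated code are assumptions made for the example; the actual patterns used by jtlc tags are not documented here.

```javascript
// Hypothetical optimization callback: collapses
//   var tN=<expr>;return tN;
// into
//   return <expr>;
// removing one spurious copy from the generated code.
function dropTempBeforeReturn(code) {
  return code.replace(/var (t\d+)=([^;]+);\s*return \1;/g, "return $2;");
}

var generated = "var t1=$.items.length;return t1;";
var optimized = dropTempBeforeReturn(generated);

// Only after optimization does the string become a function body.
var countItems = new Function("$", optimized);
var itemCount = countItems({ items: [1, 2, 3] });
```

The backreference `\1` ensures the temporary being returned is the one just assigned, so unrelated code passes through the callback unchanged.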