This repository contains three projects:
monads
: a small library providing a number of monads used by the Sail backend.nanosail
: relies onmonads
and contains all of the actual translation logic.sail_plugin
: a very small project that importsnanosail
and registers itself with Sail as a plugin.
Note regarding dependencies
When Sail loads the plugin, it does not automatically also load the dependencies needed by this plugin.
It is therefore necessary to statically link nanosail's dependencies.
See the dune
file for the plugin project (embed_in_plugin_libraries
).
The translation takes place in two phases:
- Sail is translated into nanosail, an intermediate language bridging Sail and muSail.
- Nanosail is translated into muSail: user-provided templates describe which translation belongs where in which file.
This top level logic resides in the sail_plugin
project.
The Ast
module contains all definitions related to the nanosail intermediate language.
When Sail and muSail disagree in their representations, nanosail tends to side with muSail.
- Nanosail makes the distinction between expressions and statements the same way muSail does.
- Nanosail keeps track of some extra information that cannot be directly translated to muSail, such as numeric expressions and constraints, which come in handy for monomorphization.
- Similarly to Sail, a nanosail program (see
Ast.Program.t
) consists of a series of definitions. SeeAst.Definition
for which types of definitions exist.
All Sail to nanosail translation logic is grouped in the SailToNanosail
module.
To assist in the translation, we defined a monad TranslationContext
, which is basically a combination of the State and the Result monads.
It provides the following functionality:
- It can generate unique identifiers using
generate_unique_identifier
andgenerate_unique_identifiers
. - It keeps track of all translated definitions.
store_definition
adds a new definition to the list.lookup_definition
/lookup_definition_opt
looks for a definition that satisfies a predicate. Useful predicates are readily available inAst.Definition.Select
.- Other helper functions such as
lookup_register_type
,is_register
,lookup_variant_by_constructor
are also available.
- It keeps track of calls made to polymorphic functions (
register_polymorphic_function_call_type_arguments
), which is useful for reporting them to the user so that they know which monomorphizations to ask for. - There are two types of failures:
NotYetImplemented
andFailure
- A
Failure
causes the translation to be halted and should be used sparingly. - Each Sail definition is translated in turn.
When a
NotYetImplemented
is signaled, the current definition being translated is given up on: aAst.Definition.UntranslatedDefinition
gets registered and the translation process moves on to the next definition.
- A
All nanosail to muSail functionality can be found in the NanosailToSail
module.
Similarly to the first translation phase, we defined a monad, named GenerationContext
, which provide us with some useful functionality:
- It holds a list of all definitions gathered during the first phase.
These can be looked up with helper functions such as
select_definitions
andlookup_definition_opt
. - Translations can be annotated in multiple ways:
- First, generated code can be annotated with information about which functions
were involved in the generation of it (a bit like a stack trace).
This information is not inherent to
GenerationContext
, but is functionality provided byDocument
, see below. - Generated code can be grouped in blocks. A block can be annotated with comments and annotations. See section below.
- First, generated code can be annotated with information about which functions
were involved in the generation of it (a bit like a stack trace).
This information is not inherent to
Generated code can be annotated with comments:
(* Some comments *)
Some generated code
This can be achieved by defining a block and making a call to add_comment
inside of it:
GenerationContext.block begin
let* () = add_comment @@ PP.string "Some comments"
in
PP.string "Some generated code"
end
add_comment
will attach comments to the current block.
Blocks can be nested.
Comments and annotations are always attached to the current innermost block.
One can also add annotations, which are basically numbered comments:
(*
Some comments
[1] An annotation
[2] Another annotation
*)
Some generated code, a reference to annotation 1 and annotation 2.
is generated by
GenerationContext.block begin
let* () = add_comment @@ PP.string "Some comments"
and* ref1 = GenerationContext.add_annotation @@ PP.string "An annotation"
and* ref2 = GenerationContext.add_annotation @@ PP.string "Another annotation"
in
PP.format "Some generated code, a reference to annotation %d and annotation %2" ref1 ref2
end
These numbered annotations are often used to leave "not-yet-implemented holes" in the translation:
let translate_bool (b : bool) : PP.t GenerationContext.t =
match b with
| true -> GenerationContext.return @@ P.string "true"
| false -> GenerationContext.not_yet_implemented [%here]
The Logging
module can be used to log messages.
We currently support four levels (error, warning, info, debug).
Using the environment variable VERBOSE
one can choose
up to which level log messages should be shown:
VERBOSE=0
: quiet mode, no logging takes place.VERBOSE=1
: only errors are shown.- ...
VERBOSE=4
: all messages are printed.
OCaml does not provide a standard way of converting values to strings.
Originally we intended to implement all kinds of string_of_
functions
but the one dimensional nature of strings impeded readability.
F-Expressions are more structured than plain strings and allow for better pretty printing.
Apart from integers, booleans, strings and lists,
F-expressions have function application written
Head[pos1, pos2, ..., keyword1=val1, keyword2=val2], ...
.
See the FExpr
module for more details.
We defined to_fexpr
functions for most types, which can be helpful for debugging.
A FExpr.pp_diff
function is available to highlight differences
between two F-expressions, which is useful while testing.
The PP
module offers limited pretty print functionality.
Originally, the more powerful PPrint
module was used, but we wanted to be able
to annotate each generated piece of text with information about which
functions were responsible for its generation.
Later, we also added support for decoration: foreground/background colors, bold, underline, ...
horizontal [
string "Charles";
space;
vertical [ string "Jason"; string "Henry" ]
]
gives
Charles Jason
Henri
This module was originally meant to group all string_of_
functionality
but ultimately ended up only providing this functionality for Sail types.
Most of the functionality has been taken straight from Libsail
.
Having our own module, however, allowed us to add string conversions
for types neglected by Libsail
(e.g. aval
) and customize the conversion
if we felt the Libsail
implementation fell short.
For our own types, we preferred defining to_fexpr
functions in their respective modules,
e.g., Ast.Type.to_fexpr
for nanosail types.
Below we justify our coding style, which may be considered to be rather unidiomatic.
We shied away from using wildcards in match expressions, instead opting to handle each case explicitly. Whenever a type is extended with a new case (something that was expected to occur frequently as development progressed), a simple rebuild would lead the compiler to point out all the locations in the code that required an update to deal with this new case.
Note This has somewhat been thwarted by what seems to be a bug in the OCaml compiler: it happened more than once that the build process simply got stuck until typing errors were fixed.
Many types are defined in their own personal module. We felt this led to much cleaner code:
- No prefixes necessary for constructor:
E_Variable
but simplyVariable
. - Clearer names:
Ast.Expression.Variable
instead ofE_Variable
. - Typing context often allows us to omit the
Ast.Expression.
prefix in many cases such as in match patterns. open Ast.Expression
is always available if one gets tired of the long names.- Related functionality can be bundled in the same module, e.g.,
Ast.Variable.equal
. - More consistent naming for related functions, i.e., all equality functions can be called
equal
.
Given that the standard equality operator =
cannot be customized for specific types,
it seemed better to eschew it completely.
Instead, we chose to define equal
function for each type, accepting
that large amounts boilerplate code is the cost for more robust code.
We chose to annotate most of our code with types.
- In our opinion, it leads to clearer code.
- It avoids issues with records with share field names.
- Our functions often receive "too many" parameters, i.e., they were made to receive all information that was available, not just information that was actually required. We chose this approach due to the very incremental/iterative nature of the development process: the first implementation of a function would contain only very specific functionality, and it was unclear what information would be necessary when the function would later be fleshed out more. Having functions receive all available information from the get-go made it much easier when we later had to revisit the function to extend its functionality, as we had a clear view of all available puzzle pieces.