Skip to content

Commit 365663f

Browse files
Provide a draft specification using the Hindley-Milner algorithm.
Co-authored-by: Henrik Tidefelt <[email protected]>
1 parent 7524b11 commit 365663f

File tree

1 file changed

+336
-0
lines changed

1 file changed

+336
-0
lines changed

RationaleMCP/0027/HindleyMilner.md

+336
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,336 @@
1+
# Unit checking
2+
3+
Unit checking is the process of inferring the units of Modelica variables, and ensuring the variables are used consistently in expressions and equations.
4+
5+
It takes place after any evaluation of evaluable parameters and no evaluable parameters shall be evaluated during this process.
6+
(This must be true as long as tools allow the users to opt-out from unit-checking.)
7+
8+
It is performed by using a type inference algorithm, albeit adapted to work with units.
9+
The algorithm proposed here is based on the Hindley-Milner algorithm.
10+
At a high level the algorithm collects a set of constraints on the units of expressions.
11+
It extracts a set of substitutions from these constraints which once applied determine whether a variable has a unit and what that unit would be.
12+
Any inconsistency discovered during this process is a unit error, and any unit expression that contains remaining unknowns at the end of the process is considered unknown.
13+
14+
The document is organized as follows:
15+
* [Introduction of what is meant by the unit of a variable](#the-unit-of-a-modelica-variable)
16+
* [Description of the objects used in the inference algorithm](#unit-constraints-and-meta-expressions)
17+
* [Definition of unit equivalence and convertibility](#unit-equivalence-and-convertibility)
18+
* [Unit inference algorihtm](#unit-inference)
19+
* [Application to Modelica](#unit-constraints-of-a-modelica-model)
20+
* [Modelica operators on units](#modelica-operators-on-units)
21+
* [Extensions and limitations](#possible-extensions-and-current-limits)
22+
23+
## The unit of a Modelica variable
24+
25+
The unit of a Modelica variable is the unit associated with the variable after unit-checking has been performed, possibly remaining unknown.
26+
27+
Note that the unit of a variable is not the same concept as the `unit`-attribute of the variable, but in a unit-consistent model, the unit will be equal to the `unit`-attribute when present.
28+
29+
## Unit constraints and meta-expressions
30+
31+
Unit constraints are introduced in the form of an [equivalence](#equivalence) of two unit _meta-expressions_.
32+
After an evaluation stage, a unit constraint of the form _`u1` is equivalent to `u2`_ will be represented by the _single-sided form_ `u1*u2^-1`.
33+
34+
This section outlines the different constructs that comprise meta-expressions.
35+
36+
### Variables
37+
The unit of the Modelica variable `var` is represented by the unit variable `var.unit`.
38+
These are unknowns for the inference algorithm to determine.
39+
40+
### Literals
41+
42+
#### `well-formed` unit
43+
The `well-formed` units correspond to [Modelica unit expressions](https://specification.modelica.org/master/unit-expressions.html).
44+
45+
For the purposes of unit checking, they can be uniquely represented by their base unit factorization and scaling, up to the order of the base units.
46+
47+
(We discuss why we ignore unit offsets later.)
48+
49+
#### `empty` unit
50+
An expression with `empty` unit won't contribute to the unit of a parent expression, and will meet any constraints imposed on it.
51+
In practice, these are used to represent the unit of literal values.
52+
They provide a more flexible solution than considering any literal value to have unit `"1"`.
53+
They also provide a stricter solution than letting the unit of literals be inferred, which would effectively make literals wildcard from the perspective of the unit system.
54+
55+
#### `undefined` unit
56+
An expression with `undefined` unit is used to represent unspecified or error cases, and will meet any constraints imposed on it.
57+
58+
### Operators
59+
60+
#### *
61+
Consider the meta-expression `e1 * e2`.
62+
* If both `e1` and `e2` are `well-formed`, it evaluates to a `well-formed` unit resulting from multiplying `e1` by `e2`.
63+
* If one of them is `empty`, it evaluates to the other one. (In effect, the `empty` unit becomes unit `"1"`)
64+
* If one of them is `undefined`, it evaluates to `undefined`.
65+
66+
#### ^
67+
Consider the meta-expression `b ^ e`, where `e` is a rational.
68+
* If `b` is `well-formed`, it evaluates to a `well-formed` unit resulting from raising `b` to the power `e`.
69+
* If `b` is `empty`, it evaluates to `empty`.
70+
* If `b` is `undefined`, it evaluates to `undefined`.
71+
72+
#### der
73+
Consider the meta-expression `der(e)`.
74+
* In a constraint where a `well-formed` unit is present, it evaluates to `e * ("s" ^ -1)`.
75+
* If `e` is `empty`, it evaluates to `empty`.
76+
* If `e` is `undefined`, it evaluates to `undefined`.
77+
78+
This definition keeps `der(e)` in symbolic form in a model that does not contain any units.
79+
Otherwise, `der(x) = x` would lead to unit errors even if `x` has no declared unit.
80+
This also means that in `k*der(x) = -x`, the unit of `k` won't be inferred to be `"s"`.
81+
82+
### Evaluation
83+
Unit variables can't be found to be `empty` or `undefined` unit.
84+
This means that after evaluation, a unit meta-expression is either `undefined`, `empty` or does not contain any `undefined` or `empty` unit.
85+
86+
After evaluation a constraint will not contain both a well-formed unit and a `der` meta-expression.
87+
This means that all constraints can be simplified to have at most one well-formed unit.
88+
89+
## Unit equivalence and convertibility
90+
91+
### Equivalence
92+
Two well-formed units are said to be equivalent if both of the following are true:
93+
* Their base unit factorizations are equal.
94+
* Their base unit scalings are equal.
95+
96+
Examples:
97+
- `"N"` and `"m/s2"` have different base unit factorizations, they are not equivalent.
98+
- `"s"` and `"ms"` have different base unit scaling, they are not equivalent.
99+
- `"K"` and `"degC"` are equivalent.
100+
- `"kN"` and `"W.s/mm"` are equivalent.
101+
102+
It is a quality of implementation how equality of scalings are checked.
103+
104+
`empty` units and `undefined` units are always equivalent to each other, themselves and any `well-formed` units.
105+
106+
### Convertibility
107+
A well-formed unit `u1` is said to be convertible to another well-formed unit `u2` if:
108+
* Their base unit factorizations are equal.
109+
110+
Examples:
111+
- `"N"` is not convertible to `"m/s2"`, since they have different base unit factorizations.
112+
- `"s"` is convertible to `"ms"`, since they differ by base unit scaling.
113+
- `"K"` is convertible to `"degC"`.
114+
- `"kN"` is convertible to `"W.s/mm"`.
115+
116+
`empty` units and `undefined` units are never convertible to each other, themselves or any `well-formed` units.
117+
118+
## Unit inference
119+
120+
### Modified Hindley-Milner
121+
122+
The original Hindley-Milner algorithm is designed to infer types, and consists of building a set of substitutions that returns the type of any expression of a given program.
123+
Applying this algorithm to perform unit inference in Modelica comes with two important differences:
124+
* Not all variables must have a unit in a given model.
125+
* There are operations on meta-expressions, that is, not all constraints are of the form _`var1.unit` is equivalent to `var2.unit`_.
126+
127+
#### Solvability of constraints
128+
129+
After meta-expression evaluation and gathering of variables in the single-sided form, a constraint's solvability can be determined using:
130+
* A set `V` of unit variables present with non-zero exponent.
131+
* A set `D` of unit variables, present inside a `der` meta-expression.
132+
133+
With this information, a variable `var.unit` can be solved from a given constraint if:
134+
* `var.unit` is present in `V`, but not in `D`.
135+
136+
If a constraint has such a variable, the constraint is solvable.
137+
138+
#### Conditional constraints
139+
140+
One notion that doesn't exist in the original Hindley-Milner algorithm is that of a conditional constraint.
141+
In Modelica, errors that are in unused part of a simulation model should typically be ignored (e.g., removed conditional components).
142+
In the context of unit inference, some Modelica constructs generate constraints that are predicated on certain expressions having been evaluated or not.
143+
The constraint conditions are conjunctive, in other words, if even a single condition evaluates to `false` the constraint should be discarded.
144+
145+
#### The algorithm
146+
147+
The Hindley-Milner algorithm modified to perform unit inference in Modelica:
148+
* Traverse the flattened equation system and collect constraints before any evaluation has occured.
149+
* After evaluable parameters have been evaluated, and expressions have been reduced to values when possible, evaluate the constraints, as well as their conditions. (This propagates `empty` and `undefined` units to the top of the constraints.)
150+
* Discard any constraints of the form _`u` must be equivalent to empty unit_, or _`u` must be equivalent to undefined unit_, as they are trivially satisfied.
151+
* Discard any constraints having at least a condition that was evaluated to `false`.
152+
* Represent all constraints in their single-sided form.
153+
* Start with an empty set `S` of substitutions.
154+
* Loop:
155+
* Select a solvable constraint or one that can be checked, if no such constraint exists exit the loop.
156+
* If the constaint is in the form of just a `well-formed` unit, check that it is equivalent to `"1"`, it is an error otherwise.
157+
* If it can be solved for `var.unit`:
158+
* Replace the constraint by the substitution _`var.unit` => `u`_, where `var.unit` is not present in `u`.
159+
* Apply the substitution to the right-hand side of the substitutions already in `S`.
160+
* Add the substitution to `S`.
161+
* Apply the substitution to all remaining constraints.
162+
* The unit of a variable for which there is no substitution in `S` is unknown.
163+
* For a variable `var` that has a substitution `var.unit` => `u` in `S`:
164+
* If `u` is a `well-formed` unit, then this is the unit of `var`. (Given that no unit errors have been detected, a variable with a `unit`-attribute will get the given unit.)
165+
* Otherwise, the unit of `var` is unknown.
166+
167+
## Unit constraints of a Modelica model
168+
169+
The following rules describe the unit meta-expression of a given Modelica expression, where it makes sense.
170+
They also provide the constraints that arise from different Modelica constructs.
171+
172+
When it is specified that _`u1` must be equivalent to `u2`_, the corresponding constraint should be collected.
173+
174+
In this section, the unit meta-expression of expression `e` is called `e.unit` and conditions of conditional constraints are highlighted with [].
175+
176+
### Variables
177+
If `var` has a declared `unit`-attribute, `var.unit` must be equivalent to it.
178+
179+
`var.unit` must be equivalent to the unit of the following attributes if present:
180+
* `start`
181+
* `min`
182+
* `max`
183+
* `nominal`
184+
185+
(`var.unit` should be convertible to the `displayUnit`-attribute of `var` if it is present, but there is no corresponding constraints for the unit inference.)
186+
187+
Consider the expression `e` of the form `var`, where `var` is a Modelica variable.
188+
* If `var` doesn't have a declared unit attribute, and has constant component variablity, `e.unit` is the `empty` unit.
189+
* Otherwise, `e.unit` is `var.unit`.
190+
191+
### Literals
192+
`Real` literals have `empty` unit.
193+
194+
### Binary operations
195+
Consider an expression `e` of the form `e1 op e2`.
196+
197+
#### Additive operators
198+
* `e.unit` must be equivalent to `e1.unit`.
199+
* `e.unit` must be equivalent to `e2.unit`.
200+
201+
Note that this requires introducing an intermediate variable to represent `e.unit`.
202+
This allows unit propagation when one of the operand has `empty` unit.
203+
204+
#### Multiplication
205+
`e.unit` is `e1.unit * e2.unit`.
206+
207+
#### Division
208+
`e.unit` is `e1.unit * (e2.unit ^ -1)`.
209+
210+
#### Relational operators
211+
`e1.unit` must be equivalent to `e2.unit`
212+
213+
#### Power operator
214+
* `e.unit` must be equivalent to `e1.unit ^ e2`.
215+
* If [`e2` wasn't evaluated or is a `Real` expression], `e1.unit` must be equivalent to `"1"`.
216+
217+
Note that this requires introducing an intermediate variable to represent `e.unit`.
218+
Technically, `e2` is a Modelica expression rather than a rational number, which means that `e1.unit ^ e2` cannot be represented as a unit meta-expression.
219+
In practice, if the condition of the second constraint is verified, `e1.unit ^ e2` is `"1"` and if it isn't verified, then it can be represented as a meta-expression.
220+
221+
(If the exponent was not evaluated or is a `Real`, the unit of the base should be `"1"`.)
222+
(It is up for discussion, when `e2` is `Real`, whether `e2.unit` should be equivalent to `"1"`. It could be handled in a similar way as transcendental functions. See below.)
223+
224+
### Function calls
225+
Consider the expression `e` of the form `f(e1, e2, …, en)`.
226+
* If the output of the function has a declared unit, `e.unit` is the declared unit.
227+
* `e.unit` is `undefined` otherwise.
228+
229+
If function input `i` has a declared unit, `ei.unit` must be equivalent to this unit.
230+
231+
The natural generalization to functions with multiple outputs applies.
232+
233+
### Transcendental functions
234+
As examples of transcendental functions, consider `sin`, `sign`, and `atan2`.
235+
Other transcendental functions can be handled similarly.
236+
237+
Consider expression `e` of the form `sin(e1)`.
238+
* If `e1.unit` is `empty`, then so is `e.unit`.
239+
* If `e1.unit` is not `empty`, then it must be equivalent to `"1"` and `e.unit` is `"1"`.
240+
241+
Note that some of these constraints can only be processed once `e1.unit` has been determined.
242+
This avoids introducing a unit `"1"` in the inference algorithm.
243+
244+
Implementation note: The rule above can be implemented as making `e.unit` equivalent to `e1.unit`, and if `e1.unit` is `well-formed` after completed unit inference, then verify that it is equivalent to `"1"`.
245+
246+
Consider an expression `e` of the form `sign(e1)`.
247+
* If `e1.unit` is `empty`, then so is `e.unit`.
248+
* If `e1.unit` is not `empty`, then `e.unit` is `"1"`.
249+
250+
Consider an expression `e` of the form `atan2(e1, e2)`.
251+
* If `e1.unit` must be equivalent to `e2.unit`.
252+
* If `e1.unit` or `e2.unit` is `undefined`, then so is `e.unit`.
253+
* If `e1.unit` and `e2.unit` are `empty`, then so is `e.unit`.
254+
* If `e1.unit` or `e2.unit` is not `empty`, then `e.unit` is `"1"`.
255+
256+
Implementation note: The rule above can be implemented as making `e.unit` equivalent to both `e1.unit * e1.unit^-1` and `e2.unit * e2.unit^-1`.
257+
258+
### Operator der
259+
The unit of the expression `der(e)` is `der(e.unit)`.
260+
261+
### If-expressions
262+
Consider the expression `e` of the form `if cond then e1 else e2`.
263+
If [`cond` was not evaluated]:
264+
* `e.unit` must be equivalent to `e1.unit`.
265+
* `e.unit` must be equivalent to `e2.unit`.
266+
267+
If [`cond` was evaluated], `e.unit` must be equivalent to the unit of the selected branch.
268+
269+
Note that this requires introducing a intermediate variable to represent `e.unit`.
270+
271+
### Equations
272+
Both sides of binding equations, equality equations and connect equations must be unit equivalent.
273+
274+
## Modelica operators on units
275+
276+
The following operators facilitate writing unit-consistent models.
277+
278+
### `withUnit($value$, $unit$)`
279+
Creates an expression with unit `$unit$`, and value `$value$`.
280+
Argument $value$ needs to be a `Real` expression, with `empty` unit.
281+
282+
One application of this operator is to attach a unit to a literal.
283+
284+
### `withoutUnit($value$, $unit$)`
285+
Creates an expression with empty unit, whose value is the numerical value of `$value$` expressed in `$unit$`.
286+
Argument `$value$` needs to be a `Real` expression, with a well-formed unit.
287+
The unit of `$value$` must be convertible to `$unit$`.
288+
289+
### `inUnit($value$, $unit$)`
290+
Creates an expression with unit `$unit$`, whose value is the conversion of `$value$` to `$unit$`.
291+
Argument `$value$` needs to be a `Real` expression, with a well-formed unit.
292+
The unit of `$value$` must be convertible to `$unit$`.
293+
294+
## Possible extensions and current limits
295+
296+
### Units with offset
297+
The current proposal doesn't allow distinguishing `"K"` and `"degC"`.
298+
In order to do that one needs to keep track of whether a variable is meant to represent a difference or not.
299+
We have a proof of concept for such a feature, and the mechanism is similar and orthogonal to the one described here.
300+
301+
### Builtin operators and functions
302+
The current state of this proposal does not cover the unit checking of all Modelica builtin operators and functions.
303+
304+
### Arrays
305+
We don't try to specify how this would work on arrays, although the current proposal works straight-forwardly with element-wise operators.
306+
We are confident that it could be extended to matrix computations too, but haven't a proof of concept for that part.
307+
308+
### Constants
309+
This proposal takes a stance regarding how the unit of constants should be handled but we recognise that this is still very much up for discussion.
310+
It is also somewhat orthogonal to the rest of the proposal.
311+
312+
### Functions
313+
The Hindler-Milner algorithm provides a way to infer unit attributes of the inputs and outputs of functions.
314+
Concretely, it would imply computing the bundles of constraints from the body of each functions and adding them to the algorithm, after adequate substitutions, when processing a function call.
315+
316+
### Variable initialisation pattern
317+
A common pattern used in the MSL is the following:
318+
```
319+
model M
320+
Real d(unit = "s") = 12;
321+
Real v(unit = "m/s") = 0.1 / (3 * d);
322+
end M;
323+
```
324+
where a modeler knows how to express a variable with unit as a function of another, with help of literal values that should really carry a unit.
325+
The way to fix that, with this proposal, is to attach a unit to the literal values using `withUnit`.
326+
327+
### Syntax extension
328+
We could extend the Modelica language to add syntactic sugar for `withUnit`.
329+
The following variants have all been successfully test implemented:
330+
* `0.1["m"]`
331+
* `0.1 'm'` (or as `0.1'm'` to show more clearly that the quoted unit belongs to the literal)
332+
* `0.1."m"`
333+
334+
### unsafeUnit operator
335+
Another way to deal with the pattern described above would be to provide an unsafe operator that effectively drops any constraints generated in its subexpression.
336+
The whole expression would have `undefined` unit.

0 commit comments

Comments
 (0)