Adjust the BSATN doc to fit reality better (#216)

Centril · web-flow · commit 05a112c5ceb9 · 2025-03-12T14:08:47.000+01:00
adjust BSATN doc to fit reality better
diff --git a/docs/bsatn.md b/docs/bsatn.md
@@ -1,4 +1,4 @@
-# SATN Binary Format (BSATN)
+# Binary SATN Format (BSATN)
 
 The Spacetime Algebraic Type Notation binary (BSATN) format defines
 how Spacetime `AlgebraicValue`s and friends are encoded as byte strings.
@@ -29,16 +29,26 @@ To do this, we use inductive definitions, and define the following notation:
 | [`AlgebraicValue`](#algebraicvalue) | A value of any type.                                                  |
 | [`SumValue`](#sumvalue)             | A value of a sum type, i.e. an enum or tagged union.                  |
 | [`ProductValue`](#productvalue)     | A value of a product type, i.e. a struct or tuple.                    |
-| [`BuiltinValue`](#builtinvalue)     | A value of a builtin type, including numbers, booleans and sequences. |
 
 ### `AlgebraicValue`
 
 The BSATN encoding of an `AlgebraicValue` defers to the encoding of each variant:
 
 ```fsharp
-bsatn(AlgebraicValue) = bsatn(SumValue) | bsatn(ProductValue) | bsatn(BuiltinValue)
+bsatn(AlgebraicValue)
+    = bsatn(SumValue)
+    | bsatn(ProductValue)
+    | bsatn(ArrayValue)
+    | bsatn(String)
+    | bsatn(Bool)
+    | bsatn(U8) | bsatn(U16) | bsatn(U32) | bsatn(U64) | bsatn(U128) | bsatn(U256)
+    | bsatn(I8) | bsatn(I16) | bsatn(I32) | bsatn(I64) | bsatn(I128) | bsatn(I256)
+    | bsatn(F32) | bsatn(F64)
 ```
 
+Algebraic values include sums, products, arrays, strings, and primitive types.
+The primitive types include booleans, unsigned and signed integers up to 256 bits, and floats, both single and double precision.
+
 ### `SumValue`
 
 An instance of a sum type, i.e. an enum or tagged union.
@@ -60,44 +70,58 @@ bsatn(elems) = bsatn(elem_0) ++ .. ++ bsatn(elem_n)
 
 Field names are not encoded.
 
-### `BuiltinValue`
+### `ArrayValue`
+
+The encoding of an `ArrayValue` is:
+
+```fsharp
+bsatn(ArrayValue(a))
+    = bsatn(len(a) as u32)
+   ++ bsatn(normalize(a)_0)
+   ++ ..
+   ++ bsatn(normalize(a)_n)
+```
+
+where `normalize(a)` for `a: ArrayValue` converts `a` to a list of `AlgebraicValue`s.
 
-An instance of a buil-in type.
-Built-in types include booleans, integers, floats, strings and arrays.
-The BSATN encoding of `BuiltinValue`s defers to the encoding of each variant:
+### Strings
 
+For strings, the encoding is defined as:
 ```fsharp
-bsatn(BuiltinValue)
-    = bsatn(Bool)
-    | bsatn(U8) | bsatn(U16) | bsatn(U32) | bsatn(U64) | bsatn(U128)
-    | bsatn(I8) | bsatn(I16) | bsatn(I32) | bsatn(I64) | bsatn(I128)
-    | bsatn(F32) | bsatn(F64)
-    | bsatn(String)
-    | bsatn(Array)
+bsatn(String(s)) = bsatn(len(s) as u32) ++ bsatn(utf8_to_bytes(s))
+```
+That is, the BSATN encoding is the concatenation of
+- the BSATN encoding of the string's length as a `u32` integer
+- the UTF-8 representation of the string as a byte array
 
-bsatn(Bool(b)) = bsatn(b as u8)
+### Primitives
+
+For the primitive variants of `AlgebraicValue`, the BSATN encodings are:
+
+```fsharp
+bsatn(Bool(false)) = [0]
+bsatn(Bool(true)) = [1]
 bsatn(U8(x)) = [x]
 bsatn(U16(x: u16)) = to_little_endian_bytes(x)
 bsatn(U32(x: u32)) = to_little_endian_bytes(x)
 bsatn(U64(x: u64)) = to_little_endian_bytes(x)
 bsatn(U128(x: u128)) = to_little_endian_bytes(x)
+bsatn(U256(x: u256)) = to_little_endian_bytes(x)
 bsatn(I8(x: i8)) = to_little_endian_bytes(x)
 bsatn(I16(x: i16)) = to_little_endian_bytes(x)
 bsatn(I32(x: i32)) = to_little_endian_bytes(x)
 bsatn(I64(x: i64)) = to_little_endian_bytes(x)
 bsatn(I128(x: i128)) = to_little_endian_bytes(x)
+bsatn(I256(x: i256)) = to_little_endian_bytes(x)
 bsatn(F32(x: f32)) = bsatn(f32_to_raw_bits(x)) // lossless conversion
 bsatn(F64(x: f64)) = bsatn(f64_to_raw_bits(x)) // lossless conversion
 bsatn(String(s)) = bsatn(len(s) as u32) ++ bsatn(bytes(s))
-bsatn(Array(a)) = bsatn(len(a) as u32)
-               ++ bsatn(normalize(a)_0) ++ .. ++ bsatn(normalize(a)_n)
 ```
 
 Where
 
-- `f32_to_raw_bits(x)` is the raw transmute of `x: f32` to `u32`
-- `f64_to_raw_bits(x)` is the raw transmute of `x: f64` to `u64`
-- `normalize(a)` for `a: ArrayValue` converts `a` to a list of `AlgebraicValue`s
+- `f32_to_raw_bits(x)` reinterprets the raw bits of `x: f32` as a `u32`
+- `f64_to_raw_bits(x)` reinterprets the raw bits of `x: f64` as a `u64`
 
 ## Types
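
The value encodings touched by this diff can be illustrated with a minimal Rust sketch. It is not the SpacetimeDB implementation: `Value`, `encode`, and `to_bsatn` are hypothetical names invented for this example, and only a handful of variants are covered. It also assumes that `len(s)` counts UTF-8 bytes and that the string bytes are written raw, with no second length prefix.

```rust
/// Hypothetical, simplified stand-in for `AlgebraicValue` (illustration only).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    U8(u8),
    U32(u32),
    I16(i16),
    F32(f32),
    String(String),
    Array(Vec<Value>),
}

/// Appends the encoding of `value` to `out`, following the rules in the doc.
fn encode(value: &Value, out: &mut Vec<u8>) {
    match value {
        // Booleans are a single byte: 0 for false, 1 for true.
        Value::Bool(b) => out.push(*b as u8),
        Value::U8(x) => out.push(*x),
        // Multi-byte integers are written in little-endian order.
        Value::U32(x) => out.extend_from_slice(&x.to_le_bytes()),
        Value::I16(x) => out.extend_from_slice(&x.to_le_bytes()),
        // Floats are encoded through their raw bit pattern (lossless).
        Value::F32(x) => out.extend_from_slice(&x.to_bits().to_le_bytes()),
        // Assumption: the length is the number of UTF-8 bytes.
        Value::String(s) => {
            out.extend_from_slice(&(s.len() as u32).to_le_bytes());
            out.extend_from_slice(s.as_bytes());
        }
        // Arrays: u32 element count, then each element in order.
        Value::Array(elems) => {
            out.extend_from_slice(&(elems.len() as u32).to_le_bytes());
            for elem in elems {
                encode(elem, out);
            }
        }
    }
}

/// Convenience wrapper returning a fresh byte vector.
fn to_bsatn(value: &Value) -> Vec<u8> {
    let mut out = Vec::new();
    encode(value, &mut out);
    out
}
```

Under these assumptions, `to_bsatn(&Value::String("hi".to_string()))` yields the four little-endian length bytes `2, 0, 0, 0` followed by the bytes of `h` and `i`. Nothing marks which variant was written, so a decoder has to know the expected type up front.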