-
Notifications
You must be signed in to change notification settings - Fork 113
Design of the HNix code base
HNix uses the most clean prelude it can, relude
with our Utils
on top is used.
-
Prelude
has built-in:-
Text
support.-
Show
is polymorphic on return type (aka useshow
to produceText
). - Please, if
String
can not be avoided - usetoString
fromString
- they indicate whereString
is used. - Currently (
2021-11-06
),String
newtypePath
& its set of functions are used to indicate filepaths. This simple move made the type information & code much understandable.
-
-
NonEmpty
list support. - eliminators (
bool
,whenTrue
(when
),whenFalse
(unless
),maybe
,whenJust
,either
) are used widely. -
pass = pure ()
) is replaced with more useful & intuitive namestub = pure mempty
.
-
Project strives to gradually standardize on the GHC2021
language.
-
Currently (
2021-11-06
) language by default is:OverloadedStrings , DeriveGeneric , DeriveDataTypeable , DeriveFunctor , DeriveFoldable , DeriveTraversable , DeriveLift , FlexibleContexts , FlexibleInstances , ScopedTypeVariables , StandaloneDeriving , TypeApplications , TypeSynonymInstances , InstanceSigs , MultiParamTypeClasses , TupleSections , LambdaCase , BangPatterns , ViewPatterns
Which are set project-wide through the Cabal description.
Module connections give the idea of the project design & structure:
Shows the main real semantic dependencies between modules. Update: (2021-07-23)
Update: (2021-07-23)
Update: (2021-02-02)
Update: (2021-02-02)
Welcome to the HNix code! You may notice some strange things as you venture into this realm, so this document is meant to prepare you, dear reader, for the secrets of the labyrinth as we've designed them.
The first thing to note is that HNix was primarily designed so that Haskell authors could use it to craft custom tooling around the Nix ecosystem. Thus, it was never fully intended for just the end user. As a result, we use a great deal of abstraction so that these enterprising Haskell authors may add new behavior at various points: within the type hierarchy, the value representation, and the evaluator.
To this end, you'll see a lot of type variables floating around, almost everywhere. These provide many of the "injection points" mentioned above. There is a strict convention followed for the naming of these variables, the lexicon for which is stated here.
-- | Atoms are values that evaluate to themselves.
-- In other words - this is a constructors that are literals in Nix.
-- This means that
-- they appear in both the parsed AST (in the form of literals) and
-- the evaluated form as themselves.
-- Once HNix parsed or evaluated into atom - that is a literal
-- further after, for any further evaluation it is in all cases stays
-- constantly itself.
-- "atom", Ancient Greek \( atomos \) - "indivisible" particle,
-- indivisible expression.
data NAtom
-- | An URI like @https://example.com@.
= NURI Text
-- | An integer. The c nix implementation currently only supports
-- integers that fit in the range of 'Int64'.
| NInt Integer
-- | A floating point number
| NFloat Float
-- | Booleans. @false@ or @true@.
| NBool Bool
-- | Null values. There's only one of this variant: @null@.
| NNull
deriving (Eq, Ord, Generic, Typeable, Data, Show, Read, NFData, Hashable)
HNix uses recursion schemes in the core part of the project to express and administer the recursive nature of the Nix language in the most effective implementation.
Recursion schemes DSL engine is a hassle to implement, but after it is done, the further handling is quite easy.
*F
type is F
/f
(Functor) for its particular F-algebra.
Literally, the full description of what is F-algebra and its parts, translated into Haskell:
The recursion-schemes perspective is to view recursive types as fixed points of functors.
(quoting Oleg GrenRus)
The core idea ... is that instead of writing recursive functions on a recursive datatype, we prefer to write non-recursive functions on a related, non-recursive datatype we call the "base functor".
(quoting Gabriel Gonzales)
Actually, even in Haskell recursion is not completely first-class because the compiler does a terrible job of optimizing recursive code. This is why F-algebras and F-coalgebras are pervasive in high-performance Haskell libraries like vector, because they transform recursive code to non-recursive code, and the compiler does an amazing job of optimizing non-recursive code.
2018-04: Johns "Program Reduction: A Win for Recursion Schemes during HNix creation.
So the core of the Nix DSL type system:
-
NValueF a
is the "base functor data type" for Nix language as a whole. . -
NValue' = (NValueF (NValue))
which during evaluation runtime becomes: -
NValue = Free NValue'
- Nix types.
Since:
-
NValueF (NValue)
produces: NValueF (NValue) -> NValue' -> NValue
- Transitivity:
NValueF (NValue) -> NValue
- Which is a F-algebra:
-
NValue
- Carrier of the F-algebra. -
NValueF (NValue)
- Base functor of F-algebra. This F-algebra is "the initial" algebra in the project.
-
NValueF
and NValue'
are used only to implement the NValue
, and all other project works with NValue
exclusively.
NValue
can be met everywhere in execution code: primOps, builtins, effect, language conversion.
-- | 'NValue' is the most reduced form of a 'NExpr' after the evaluation is
-- completed. 's' is related to the type of errors that might occur during
-- construction or use of a value.
data NValueF p m r
= NVConstantF NAtom
-- | A string has a value and a context, which can be used to record what a
-- string has been build from
| NVStrF NixString
| NVPathF FilePath
| NVListF [r]
| NVSetF (AttrSet r) (AttrSet SourcePos)
| NVClosureF (Params ()) (p -> m r)
-- ^ A function is a closed set of parameters representing the "call
-- signature", used at application time to check the type of arguments
-- passed to the function. Since it supports default values which may
-- depend on other values within the final argument set, this
-- dependency is represented as a set of pending evaluations. The
-- arguments are finally normalized into a set which is passed to the
-- function.
--
-- Note that 'm r' is being used here because effectively a function
-- and its set of default arguments is "never fully evaluated". This
-- enforces in the type that it must be re-evaluated for each call.
| NVBuiltinF String (p -> m r)
-- ^ A builtin function is itself already in normal form. Also, it may
-- or may not choose to evaluate its argument in the production of a
-- result.
-- | At the time of constructor, the expected arguments to closures are values
-- that may contain thunks. The type of such thunks are fixed at that time.
newtype NValue' t f m a =
NValue
{
-- | Applying F-algebra functor (@NValueF@) to the F-algebra carrier (forming the \( F(A) \)).
_nValue :: f (NValueF (NValue t f m) m a)
}
-- | 'NValue t f m' is
-- a value in head normal form (it means only the tip of it has been
-- evaluated to the normal form, while the rest of it is in lazy
-- not evaluated form (thunk), this known as WHNF).
--
-- An action 'm (NValue t f m)' is a pending evaluation that
-- has yet to be performed.
--
-- An 't' is either:
-- * a pending evaluation.
-- * a value in head normal form.
--
-- The 'Free' structure is used here to represent the possibility that
-- Nix language allows cycles that may appear during normalization.
type NValue t f m = Free (NValue' t f m) t
So it is real-deal elegant F-algebra. It implements Nix language in the most direct way from Haskell, Nix becomes a DSL of Haskell from now on. With Haskell type-checking system. Moreover, F-algebra is very convenient implementation - it is an endlessly conveniently transformable thing through just several basic operations, as would be seen further. The transformable thing that is fully recursive in its nature, but from now on that does not bother us anymore, since it is under control.
There are NValue* t f m
and such, where:
t
is the type of thunks. It turns out that HNix doesn't actually need to
know how thunks are represented, at all. It only needs to know that the
interface can be honored: pending action that yield values may be turned into
thunks, and thunks can later be forced into values.
f
is the type of a comonadic and applicative functor that gets injected at
every level of a value's recursive structure. In the standard evaluation
scheme, this is used to provide "provenance" information to track which
expression context a value originated from (e.g., this 10 came from that
expression "5 + 5" over in this file, here).
m
is the "monad of evaluation", which must support at least the features
required by the evaluator. The fact that the user can evaluate in his own base
monad makes it possible to create custom builtins that make use of arbitrary
effects.
v
is the type of values, which is almost always going to be NValue t f m
,
though it could be NValueNF t f m
, the type of normal form values. Very few
points in the code are generic over both.
Base funtor data types (NValueF a
) also call "unfixed".
Morphim that is transformations between 2 F-algebras called recursion scheme.
So recursion schemes are just simply different operations on initial F-algebra. Why it is said as just any operations? Because any f a
(of kind >1) - is a functor, so any F a b c d..
to infinite complexity - is also a functor. And the only remained case is kind 1 type. And surprise - it is isomorphic to the Id a
, and that is a functor, so from 1-type F-algebra also can be constructed: Id a -> a
, and as regular F-algebra they are the first-class citizen for recursion schemes, there is no longer any difference for us.
Recursion scheme - transformation takes the initial(term) F-algebra and returns terminal(term) F-algebra. Folds, unfolds, traverses are just F-algebra transformations that are particular recursion scheme type. For folds, unfolds, traverses we need to provide the implementation ("algebra" *term) of how particularly fold, unfold, traverse, and we do for the recursion scheme just the same.
The possible operations: Cheat-sheet. Put in the other words: Table.
For example in parsers - are so called "unfixed" types for/from recursion schemes (common recursive operations), these types are a generalization of the recursive data types into open (non-explicitly recursive, but allowed through polymorphism) form. So Fix
ed data type versions - are (equivalent to/are) regular data types to work with (aka list), and the *F
- are their generalizations (one layer open for possible recursion (through self-application) - but without explicit recursion). *F
is a type that both can become recursive structure by applying Fix
to it, and as a single layer of the structure - internal value/data can be extracted from it in the initial form. For example - from the list cell the value extracts in the initial form, and it is impossible to extract initial data from the list (recursive structure) without referring to list cells (without referring to building blocks - the recursive structure can transform only into the recursive structure - which never terminates into receiving initial data), so addressing the cell in a list - is a unFix
of a list to the outermost cell, and from the cell, the data can be extracted.
Fixed types - are the actual presentation of the data types.
The project currently uses relude
, which is a straightforward helper prelude that currently helps to reduce the default Haskell technical debts in the code.
The project enables OverloadedStrings
by default.
The project enables ApplicativeDo
by default, so maybe think about wrapping the (=<<)
& (<=<)
into do
block, if there may some optimization be possible.
Essentially, ADI allows for evaluation what recursion schemes allow for representation: allows threading layers through existing structure, only in this case through behavior. Roughly: ADI allows 1:1 map evaluation of Nix language to Haskell language evaluation, inherit & passthough the nature of GHC Haskell evaluation into HNix Nix evaluation.
Some of the history is at the end of The Haskell Cast: Episode 13 - John Wiegley on Categories and Compilers.