3 changes: 2 additions & 1 deletion colibri/doc/sphinx/source/available-models/index.rst
@@ -13,4 +13,5 @@ To implement your own model, follow :ref:`this tutorial <in_les_houches>`.
:maxdepth: 1

linear-model
grid-pdf-model
grid-pdf-model
n3fit
85 changes: 85 additions & 0 deletions colibri/doc/sphinx/source/available-models/n3fit.rst
@@ -0,0 +1,85 @@
.. _n3fit-model:

===========
n3fit Model
===========

**Model Repository:** https://github.com/HEP-PBSP/colibri-n3fit/tree/main/colibri_n3fit .

This model is based on the ``n3fit`` model used in the NNPDF framework :cite:`NNPDF:2021uiq`,
which is open source and available here: https://github.com/NNPDF/nnpdf .


What is this model for?
-----------------------

This model is mostly intended to run :ref:`Monte Carlo fits <running_mc_replica>`,
and can be used to compare NNPDF's n3fit model to other models/parametrisations or
fitting methodologies.

Model description
-----------------

This model parametrises PDFs using the following functional form:

.. math::
:label: eq:nnpdf-parametrisaton

   f_{j}(x) = A_{j} \, \mathrm{NN}(x)_{j} \, x^{1-\alpha_{j}} \, (1-x)^{\beta_{j}},

where the PDFs are defined in the evolution basis, as described in Ref. :cite:alp:`NNPDF:2021uiq`,
and :math:`\mathrm{NN}(x)_{j}` is the output of a Neural Network.

The free parameters of this model are:

* The preprocessing parameters, :math:`\alpha` and :math:`\beta`, which are sampled for each replica from uniform distributions, as defined by ``FLAV_INFO_NNPDF40`` in ``colibri-n3fit/colibri_n3fit/utils.py``. These values are fixed during training.
* The NN weights, which are found by minimising the :math:`\chi^2` (see :ref:`this section <likelihood>`).
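
As a minimal numerical sketch of this functional form (with an illustrative stand-in for the network and made-up exponent values, not the model's trained parameters):

.. code-block:: python

   import numpy as np

   def toy_nn(x):
       # Illustrative stand-in for the neural-network output NN(x)_j;
       # the real model uses a trained multi-layer network.
       return np.tanh(x) + 1.0

   def pdf(x, A=1.0, alpha=0.5, beta=3.0):
       # f_j(x) = A_j * NN(x)_j * x**(1 - alpha_j) * (1 - x)**beta_j
       return A * toy_nn(x) * x ** (1.0 - alpha) * (1.0 - x) ** beta

   x = np.linspace(0.01, 0.99, 5)
   values = pdf(x)  # finite, and vanishing as x -> 1 because of (1 - x)**beta

Note how the :math:`(1-x)^{\beta}` factor enforces the vanishing of the PDF at :math:`x = 1` regardless of the network output.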

Settings that are specific to this model are described :ref:`below <n3fit-model-settings>`.


How to use this model
---------------------

Before running a fit, you will have to clone the repository:

.. code-block:: bash

git clone https://github.com/HEP-PBSP/colibri-n3fit.git

and install the dependencies and executable:

.. code-block:: bash

cd colibri-n3fit
conda env create -f environment.yml
conda activate example-colibri-n3fit
pip install -e .

You can then run a fit with the available example runcard:

.. code-block:: bash

colibri_n3fit colibri_n3fit/runcards/example_pdf_fit_monte_carlo.yaml -r 1

To analyse the results of this fit, you can follow the instructions given in
:ref:`this section <mc_fit_folders>`.


.. _n3fit-model-settings:

Model-specific settings
^^^^^^^^^^^^^^^^^^^^^^^

The neural network architecture can be defined in the runcard through the following parameters:

.. code-block:: yaml

nodes: [25,20,8] # The number of nodes in each layer of the neural network. The output should be set to the number of PDF flavours
activations: [tanh, tanh, linear]
nnseed: 945709987 # Seed used for the sampling of preprocessing factors

- ``nodes``: the number of nodes in each hidden layer. The last layer should have a number of nodes equal to the number of PDF flavours being fitted. All flavours in the evolution basis are fitted by default in this model, but you can choose to fit a subset of these with ``flavour_mapping`` if and only if you are running a :ref:`closure test <lh-closure-test>`.
- ``activations``: the activation function to be used in each hidden layer, e.g. ``tanh`` or ``linear``. You can read about other options in the `Keras documentation <https://keras.io/api/layers/activations/>`_ .
- ``nnseed``: The random seed used to sample the preprocessing factors :math:`\alpha` and :math:`\beta` from uniform distributions for each replica.
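
A rough sketch of how such per-replica sampling could work (the ranges and seeding scheme below are illustrative assumptions; the actual ranges are defined in ``FLAV_INFO_NNPDF40``):

.. code-block:: python

   import numpy as np

   def sample_preprocessing(nnseed, replica_index,
                            alpha_range=(0.0, 1.0), beta_range=(1.0, 5.0)):
       # Draw alpha and beta from uniform ranges, deterministically per replica.
       # The ranges here are placeholders, not the FLAV_INFO_NNPDF40 values.
       rng = np.random.default_rng([nnseed, replica_index])
       alpha = rng.uniform(*alpha_range)
       beta = rng.uniform(*beta_range)
       return alpha, beta

   alpha, beta = sample_preprocessing(945709987, replica_index=1)

Seeding the generator with both ``nnseed`` and the replica index makes each replica's draw reproducible while keeping the draws independent across replicas.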

You can read more about other relevant settings, such as Monte Carlo and training settings, in the :ref:`section on how to run Monte Carlo fits <running_mc_replica>`.
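
The training settings interact in a simple way: training runs for at most ``max_epochs`` epochs, and ``patience`` stops it early once the validation loss stalls. A schematic sketch of that stopping rule (not Colibri's actual training loop):

.. code-block:: python

   def epochs_run(val_losses, max_epochs, patience):
       # Stop once the validation loss has not improved for `patience`
       # consecutive epochs, or after max_epochs, whichever comes first.
       best = float("inf")
       stalled = 0
       for epoch, loss in enumerate(val_losses[:max_epochs]):
           if loss < best:
               best, stalled = loss, 0
           else:
               stalled += 1
               if stalled >= patience:
                   return epoch + 1
       return min(len(val_losses), max_epochs)
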
2 changes: 1 addition & 1 deletion colibri/doc/sphinx/source/theory/likelihood.rst
@@ -15,7 +15,7 @@ and a likelihood function, :math:`\mathcal{l}(\mathbf{D} | \boldsymbol{\theta})`
In this section, we will discuss the form of the likelihood function and its
implementation in Colibri. Note that a complementary discussion of the likelihood
function can be found in the
`NNPDF documentation <https://docs.nnpdf.science/figuresofmerit/index.html>`_
`NNPDF documentation <https://docs.nnpdf.science/figuresofmerit/index.html>`_.

In general, it is more convenient to work in terms of the *log-likelihood*,
:math:`\mathcal{L}(\mathbf{D} | \boldsymbol{\theta}) = \log (\mathcal{l}(\mathbf{D} | \boldsymbol{\theta}))`.
@@ -109,7 +109,10 @@ The following runcard can be used to run a Hessian fit with Colibri.
# transition_steps: 10000

# Training settings
max_epochs: 30000
use_gen_t0: True # Whether the t0 covariance is used to generate pseudodata.
max_epochs: 30000 # The max number of epochs in Monte Carlo training.
patience: 1000 # The number of epochs to wait for an improvement in the validation loss before stopping the training.


param_initialiser_settings:
type: uniform
@@ -134,7 +137,7 @@ The following runcard can be used to run a Hessian fit with Colibri.

Note that the Hessian fit uses the same ``param_initialiser_settings`` as a
Monte Carlo fit, and so any of the initialisation options discussed in the
:ref:`Monte Carlo fit tutorial <running_mc_replica>` can be used for these
:ref:`Monte Carlo fit tutorial <param-initialiser-settings>` can be used for these
settings (e.g. Gaussian initialisation, global bounds for all parameters, ...).
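
As an illustration of what these initialisation options amount to (a sketch with assumed setting names mirroring the ``type: uniform`` form in the runcard above, not Colibri's actual implementation):

.. code-block:: python

   import numpy as np

   def init_params(n_params, settings, seed=0):
       # Sketch: initialise parameters either uniformly within bounds or
       # from a Gaussian; the key names here are illustrative.
       rng = np.random.default_rng(seed)
       if settings["type"] == "uniform":
           low, high = settings.get("bounds", (0.0, 1.0))
           return rng.uniform(low, high, size=n_params)
       if settings["type"] == "gaussian":
           # Fallbacks of 0.0 and 1.0 echo the documented default values.
           return rng.normal(settings.get("mean", 0.0),
                             settings.get("std", 1.0), size=n_params)
       raise ValueError(f"unknown initialiser type: {settings['type']}")
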

``hessian_settings``
@@ -66,6 +66,7 @@ executable.

theoryid: 40000000 # The theory from which the predictions are drawn.
use_cuts: internal # The kinematic cuts to be applied to the data.
mcseed: 519562661 # Seed used for the production of pseudodata

#####################
# Loss function specs
@@ -109,6 +110,12 @@ executable.

# Training settings
max_epochs: 300 # The max number of epochs in Monte Carlo training.
patience: 1000 # The number of epochs to wait for an improvement in the validation loss before stopping the training.

# Monte Carlo settings
use_gen_t0: True # Whether the t0 covariance is used to generate pseudodata.
positive_pseudodata: False # If set to True, the pseudodata are resampled until all pseudodata points are positive.

mc_validation_fraction: 0.2 # The fraction of the data used for validation in Monte Carlo training.

param_initialiser_settings: # The initialiser for Monte Carlo training.
@@ -139,6 +146,8 @@ of the Optax optimizers and settings, which you can read more about
Learning schedulers are also supported, and you can find the available options
`here <https://optax.readthedocs.io/en/latest/api/optimizer_schedules.html#>`_.
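
For instance, an exponential-decay schedule follows the standard formula below (a plain-Python sketch; the ``init_value``, ``transition_steps``, and ``decay_rate`` names mirror Optax's parameters, and the ``transition_steps: 10000`` value echoes the commented runcard entry above):

.. code-block:: python

   def exponential_decay(init_value, transition_steps, decay_rate, step):
       # lr(step) = init_value * decay_rate ** (step / transition_steps),
       # the same formula Optax's exponential_decay schedule implements
       # (without its optional staircase/clipping extras).
       return init_value * decay_rate ** (step / transition_steps)

   lr_start = exponential_decay(1e-3, 10_000, 0.9, step=0)
   lr_later = exponential_decay(1e-3, 10_000, 0.9, step=10_000)
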

.. _param-initialiser-settings:

``param_initialiser_settings``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -196,6 +205,7 @@ can do so:
means or standard deviations, default values of **0.0** and **1.0**
will be used respectively.


Using data batching
^^^^^^^^^^^^^^^^^^^
In Monte Carlo replica fits, it is possible to use data batching during training.
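
Schematically, batching splits the shuffled data indices into fixed-size chunks each epoch (an illustrative sketch, not Colibri's implementation):

.. code-block:: python

   import numpy as np

   def make_batches(n_data, batch_size, seed, shuffle=True):
       # Return a list of index arrays covering every data point exactly once;
       # the final batch may be smaller than batch_size.
       indices = np.arange(n_data)
       if shuffle:
           np.random.default_rng(seed).shuffle(indices)
       return [indices[i:i + batch_size] for i in range(0, n_data, batch_size)]

   batches = make_batches(n_data=10, batch_size=3, seed=0)
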
2 changes: 1 addition & 1 deletion colibri/mc_utils.py
@@ -28,7 +28,7 @@ def mc_pseudodata(
central_covmat_index,
replica_index,
trval_seed,
mcseed,
mcseed=519562661,
shuffle_indices=True,
positive_pseudodata=False,
mc_validation_fraction=0.2,