Conversation
Hi @achiefa, thanks for starting this. From a quick look, may I suggest not having this write to disk directly, but rather adding things to the GradientDescentResult? That way the writing is delegated to dedicated functions and you don't need to modify much here. Similarly for the MonteCarloFit class.
Hi @LucaMantani, thanks for your comment. We did consider this option, and I agree it is the more solid design. However, I was worried that storing the parameters for all recorded epochs could cause memory issues during training. If instead we use a buffer that is saved to disk and freed at the end of each epoch, we avoid any potential memory issue. Maybe this is not a problem at all, and we can simply store all parameters in a big array and add it to GradientDescentResult.
Just to quantify the problem: for a neural network with 763 float64 parameters, a single snapshot is 763 × 8 bytes ≈ 6 kB. This is multiplied by the number of epochs for which we save the parameters: with 100 recorded epochs, it adds up to ~0.6 MB per replica. Probably we can afford this in favour of a better code design. What do you think?
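For reference, the estimate above can be reproduced with a quick back-of-envelope calculation (the parameter count and epoch count are taken from the comment, not from the code):

```python
# Back-of-envelope check of the memory footprint of parameter snapshots.
n_params = 763                              # parameters in the network (float64)
bytes_per_snapshot = n_params * 8           # 8 bytes per float64 -> 6104 bytes
mb_per_snapshot = bytes_per_snapshot / 1e6  # ~0.006 MB per recorded epoch

n_epochs = 100                              # recorded epochs for one replica
total_mb = mb_per_snapshot * n_epochs       # ~0.6 MB per replica

print(f"{mb_per_snapshot:.4f} MB per snapshot, {total_mb:.2f} MB per replica")
```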
I think 1 MB is nothing; we already load several GB into memory for the data and FK tables. Even a model with 1000 parameters, saved 1000 times, would only take 8 MB. So I would say memory is far from being an issue?
I agree. Let's put it in GradientDescentResult then.
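A minimal sketch of what this could look like. The field name `training_parameters`, the `record` helper, and the dataclass shape are illustrative assumptions, not Colibri's actual `GradientDescentResult` API:

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class GradientDescentResult:
    """Sketch of the result container discussed above.

    The existing fields of the real GradientDescentResult are not
    reproduced here; only the proposed addition is shown.
    """

    # Hypothetical new field: one parameter snapshot per recorded epoch,
    # keyed by epoch number so the recording frequency can vary.
    training_parameters: dict = field(default_factory=dict)

    def record(self, epoch: int, params: np.ndarray) -> None:
        # Copy so later in-place optimiser updates cannot alias the snapshot.
        self.training_parameters[epoch] = np.asarray(params).copy()


# Example: record every 10th epoch during a mock training loop.
result = GradientDescentResult()
params = np.zeros(763)
for epoch in range(100):
    params = params + 0.01  # stand-in for an optimiser step
    if epoch % 10 == 0:
        result.record(epoch, params)

print(len(result.training_parameters))  # 10 snapshots: epochs 0, 10, ..., 90
```

Keeping the snapshots in the result object, rather than writing them during training, delegates all disk I/O to the dedicated output functions, as suggested above.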
I had to change this directory from colibri/results to NNDF/results because the former is not created when installing Colibri. Is this a bug or a leftover from the previous implementation? @LucaMantani @comane
Up to standards ✅

🟢 Issues

| Category | Results |
|---|---|
| UnusedCode | 1 medium |
| ErrorProne | 2 high |
| Security | 1 high |
| CodeStyle | 15 minor |

🟢 Metrics

| Metric | Results |
|---|---|
| Complexity | 256 |
| Duplication | 6 |
This PR implements the Neural Tangent Kernel (NTK) in Colibri. The idea is to compute the NTK for any PDF model that is trained using the Monte Carlo replica method and gradient-based optimisers.
To compute the NTKs, the model parameters are stored on disk during training with a user-specified recording frequency. This creates a new directory called `parameters` in each replica folder, which contains the set of parameters for each recorded epoch.