First attempt to add conditional NFs to jester infrastructure by haukekoehn · Pull Request #125 · nuclear-multimessenger-astronomy/jester

haukekoehn · 2026-04-10T15:48:48Z

Introduce extra functions to accommodate conditional flows in jester.

Summary by CodeRabbit

New Features
- Added conditional model support with per-parameter standardization, plus an exported conditional model type.
- Training and sampling now accept conditional inputs and include conditional dimensions in plots and outputs.
Configuration
- New optional config field to name conditional parameters; names appear in plots and are saved to metadata.
Bug Fixes
- Improved likelihood and sampling handling for standardized conditional data.

coderabbitai · 2026-04-10T15:49:16Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds conditional-flow support: exports ConditionalFlow; introduces cond_parameter_names in config; implements ConditionalFlow with conditional standardization, directory loader, conditional log_prob and sample; extends training pipeline to accept, validate, standardize, and persist conditional samples and stats.

Changes

Cohort / File(s)	Summary
Module exports `jesterTOV/inference/flows/__init__.py`	Imported and exported `ConditionalFlow` in `__all__`, making it part of the public API.
Configuration `jesterTOV/inference/flows/config.py`	Added `cond_parameter_names: list[str]
Conditional flow implementation `jesterTOV/inference/flows/flow.py`	Added `ConditionalFlow(Flow)` with constructor storing `flow`, `metadata`, and `flow_kwargs`; `standardize_cond_data()` to standardize conditionals (z-score or min-max); `from_directory()` loader; conditional `log_prob(x, y)` that standardizes inputs, calls underlying `flow.log_prob`, and applies Jacobian correction using conditional stats; conditional `sample(key, shape, y)` that standardizes conditionals, samples from underlying flow, and destandardizes outputs (implementation currently contains an ordering bug where `y_std` is referenced before assignment).
Training pipeline & utilities `jesterTOV/inference/flows/train_flow.py`	Added `standardize_data()` helper (z-score/min-max) returning transformed data and stats; `train_flow()` accepts `np.ndarray

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CF as ConditionalFlow
    participant Flow as UnderlyingFlow
    participant JC as JacobianCorrection

    User->>CF: log_prob(x, y)
    CF->>CF: standardize_input(x) -> x_std
    CF->>CF: standardize_cond_data(y) -> y_std
    CF->>Flow: flow.log_prob(x_std, y_std)
    Flow-->>CF: log_prob_std
    CF->>JC: apply correction using cond stats (std or range)
    JC-->>CF: log_prob_corrected
    CF-->>User: return log_prob_corrected

    User->>CF: sample(key, shape, y)
    CF->>CF: standardize_cond_data(y) -> y_std
    CF->>Flow: flow.sample(key, shape, condition=y_std)
    Flow-->>CF: samples_std
    CF->>CF: destandardize_output(samples_std) -> samples
    CF-->>User: return samples

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I nibbled stats and named each field,
I shuffled scales until they kneeled,
I taught the flow to heed a cue,
then hopped away with samples new. ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'First attempt to add conditional NFs to jester infrastructure' directly and specifically describes the main change: introducing conditional normalizing flow support to the jester infrastructure.
Docstring Coverage	✅ Passed	Docstring coverage is 91.67% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch conditional_flows

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

haukekoehn · 2026-04-10T15:49:27Z

The code runs, but it's not thoroughly tested yet. I will experiment a little more with it and then we can see how to proceed.

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (1)

jesterTOV/inference/flows/flow.py (1)
361-364: Render the new conditional-density equations as Sphinx math.

The plain-text log p(x|y) formulas will not format consistently in generated docs.

As per coding guidelines, "All mathematical expressions in docstrings must use Sphinx/reStructuredText formatting with :math: role for inline math and .. math:: directive for display equations."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jesterTOV/inference/flows/flow.py` around lines 361 - 364, Update the
docstring describing the Jacobian correction in flow.py to use Sphinx math
formatting: replace plain-text formulas like "log p(x|y) = log p(x_std|y_std) -
sum(log(std))" with inline :math: roles and display equations using the ..
math:: directive; specifically render the three cases (Z-score, Min-max, None)
as proper Sphinx math blocks so they appear correctly in generated docs and keep
the surrounding explanatory text intact (look for the Jacobian correction
section in the docstring near the comment block in flow.py).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@jesterTOV/inference/flows/config.py`:
- Line 91: Validate conditional config fields immediately when constructing the
configuration: in the config class where cond_dim and cond_parameter_names are
declared, add a sanity check that raises ValueError for invalid combinations —
require cond_dim to be None or an int >= 1 (reject 0 and negatives), require
cond_parameter_names to be either None or a list[str], forbid
cond_parameter_names when cond_dim is None or <=0, and require
len(cond_parameter_names) == cond_dim when both are provided; remove reliance on
the weaker presence-only check in train_flow.py and let train_flow.py assume
config validation has already enforced consistency.

In `@jesterTOV/inference/flows/flow.py`:
- Around line 414-416: The code calls self.flow.sample with the raw conditional
y but training standardized conditionals when config.standardize is true; update
the method so that if self.config.standardize (or equivalent flag) is enabled
you first transform y using the same routine used during training (e.g., call
the existing standardization helper such as self.standardize_input(y) or
self.standardize_condition(y) — if that helper doesn't exist, implement the same
standardization logic used during fit), then pass the standardized conditional
into self.flow.sample(key, shape, condition=standardized_y), and finally keep
the existing destandardize_output(samples) step.
- Around line 314-328: Fix standardize_cond_data in ConditionalFlow: for the
"zscore" branch divide by self.cond_data_std (not cond_data_mean); for the
"minmax" branch guard against zero ranges by using a safe_range =
jnp.where(self.cond_data_range == 0, 1, self.cond_data_range) and divide by
that; and add an explicit "none" case that returns the input data unchanged so
it doesn't reference min/max attributes that weren't initialized. Update the
standardize_cond_data method accordingly (referencing standardize_cond_data,
self.cond_data_mean, self.cond_data_std, self.cond_data_min,
self.cond_data_range).

In `@jesterTOV/inference/flows/train_flow.py`:
- Line 567: The corner-plot logic assumes `data` is an ndarray and calls
`sample()` without conditions, but earlier `data = (data, cond_samples)` for
conditional runs; update the plotting branch in the training flow to detect when
`data` is a tuple (e.g. check `isinstance(data, tuple)` or presence of
`cond_samples`) and either (a) derive an unconditional array for plotting or (b)
skip/disable the corner plot when conditions are present, or if plotting is
desired call the model's `sample(condition=...)`/`sample(cond_samples)` variant
so sampling respects the condition; ensure the check prevents the current
fallback warning by handling `cond_samples` explicitly and referencing the
`data` tuple, `cond_samples`, and the `sample()` call sites in the flow.
- Around line 651-658: The code reads cond_data_statistics unconditionally when
populating metadata in train_flow.py, which causes an UnboundLocalError when
cond_dim is None; update the block that adds conditional data stats to first
check that cond_data_statistics (or cond_dim) is defined/True before accessing
it—e.g., wrap the existing if config.standardization_method == "zscore": ...
else: ... inside a guard like if cond_data_statistics is not None (or if
cond_dim is not None and cond_data_statistics) so metadata population only runs
when cond_data_statistics exists (references: cond_data_statistics,
config.standardization_method, metadata).

---

Nitpick comments:
In `@jesterTOV/inference/flows/flow.py`:
- Around line 361-364: Update the docstring describing the Jacobian correction
in flow.py to use Sphinx math formatting: replace plain-text formulas like "log
p(x|y) = log p(x_std|y_std) - sum(log(std))" with inline :math: roles and
display equations using the .. math:: directive; specifically render the three
cases (Z-score, Min-max, None) as proper Sphinx math blocks so they appear
correctly in generated docs and keep the surrounding explanatory text intact
(look for the Jacobian correction section in the docstring near the comment
block in flow.py).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9d16684a-19f2-4060-9497-f45ddfd31655

📥 Commits

Reviewing files that changed from the base of the PR and between 174baa9 and f618d38.

📒 Files selected for processing (4)

jesterTOV/inference/flows/__init__.py
jesterTOV/inference/flows/config.py
jesterTOV/inference/flows/flow.py
jesterTOV/inference/flows/train_flow.py

coderabbitai · 2026-04-10T16:07:22Z

    flow_layers: int = 1
    invert: bool = True
    cond_dim: int | None = None
+    cond_parameter_names: list[str] | None = None


⚠️ Potential issue | 🟠 Major

Fail fast on inconsistent conditional configs.

This addition exposes cond_parameter_names, but the model still accepts cond_dim=0, negative cond_dim, cond_parameter_names without cond_dim, and lists whose length does not match cond_dim. The runtime guard in jesterTOV/inference/flows/train_flow.py only checks presence, so some bad configs are silently ignored while others fail much later inside flow creation/training.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@jesterTOV/inference/flows/config.py` at line 91, Validate conditional config fields immediately when constructing the configuration: in the config class where cond_dim and cond_parameter_names are declared, add a sanity check that raises ValueError for invalid combinations — require cond_dim to be None or an int >= 1 (reject 0 and negatives), require cond_parameter_names to be either None or a list[str], forbid cond_parameter_names when cond_dim is None or <=0, and require len(cond_parameter_names) == cond_dim when both are provided; remove reliance on the weaker presence-only check in train_flow.py and let train_flow.py assume config validation has already enforced consistency.

coderabbitai · 2026-04-10T16:07:22Z

+        if self.standardization_method=="zscore":
+            self.cond_data_mean = jnp.array(metadata["cond_data_mean"])
+            self.cond_data_std = jnp.array(metadata["cond_data_std"])
+
+        elif self.standardization_method=="minmax":
+            self.cond_data_min = jnp.array(metadata["cond_data_min"])
+            self.cond_data_max = jnp.array(metadata["cond_data_max"])
+            self.cond_data_range = self.cond_data_max - self.cond_data_min
+
+    def standardize_cond_data(self, data: Array) -> Array:
+
+        if self.standardization_method == "zscore":
+            return (data - self.cond_data_mean) / self.cond_data_mean
+        else:
+            return (data - self.cond_data_min) / self.cond_data_range


⚠️ Potential issue | 🔴 Critical

Fix the conditional standardization math.

The z-score branch divides by cond_data_mean instead of cond_data_std, the "none" case falls through to min-max attributes that were never initialized, and the min-max path does not recreate the zero-range guard from training. ConditionalFlow.log_prob() will therefore be wrong for standardized models and can crash for unstandardized ones.

🛠️ Suggested fix

if self.standardization_method=="zscore": self.cond_data_mean = jnp.array(metadata["cond_data_mean"]) self.cond_data_std = jnp.array(metadata["cond_data_std"]) + self.cond_data_std = jnp.where( + self.cond_data_std == 0, 1.0, self.cond_data_std + ) elif self.standardization_method=="minmax": self.cond_data_min = jnp.array(metadata["cond_data_min"]) self.cond_data_max = jnp.array(metadata["cond_data_max"]) self.cond_data_range = self.cond_data_max - self.cond_data_min + self.cond_data_range = jnp.where( + self.cond_data_range == 0, 1.0, self.cond_data_range + ) def standardize_cond_data(self, data: Array) -> Array: if self.standardization_method == "zscore": - return (data - self.cond_data_mean) / self.cond_data_mean - else: + return (data - self.cond_data_mean) / self.cond_data_std + if self.standardization_method == "minmax": return (data - self.cond_data_min) / self.cond_data_range + return data

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@jesterTOV/inference/flows/flow.py` around lines 314 - 328, Fix standardize_cond_data in ConditionalFlow: for the "zscore" branch divide by self.cond_data_std (not cond_data_mean); for the "minmax" branch guard against zero ranges by using a safe_range = jnp.where(self.cond_data_range == 0, 1, self.cond_data_range) and divide by that; and add an explicit "none" case that returns the input data unchanged so it doesn't reference min/max attributes that weren't initialized. Update the standardize_cond_data method accordingly (referencing standardize_cond_data, self.cond_data_mean, self.cond_data_std, self.cond_data_min, self.cond_data_range).

coderabbitai · 2026-04-10T16:07:22Z

+        samples = self.flow.sample(key, shape, condition=y)
+
+        samples = self.destandardize_output(samples)


⚠️ Potential issue | 🔴 Critical

sample() is conditioning on the wrong scale.

Training standardizes conditional inputs before fitting when config.standardize is enabled, but this method passes raw y straight into the flow. With the default config, generated samples are conditioned on mismatched inputs.

🛠️ Suggested fix

- samples = self.flow.sample(key, shape, condition=y) + y_std = self.standardize_cond_data(y) + samples = self.flow.sample(key, shape, condition=y_std)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@jesterTOV/inference/flows/flow.py` around lines 414 - 416, The code calls self.flow.sample with the raw conditional y but training standardized conditionals when config.standardize is true; update the method so that if self.config.standardize (or equivalent flag) is enabled you first transform y using the same routine used during training (e.g., call the existing standardization helper such as self.standardize_input(y) or self.standardize_condition(y) — if that helper doesn't exist, implement the same standardization logic used during fit), then pass the standardized conditional into self.flow.sample(key, shape, condition=standardized_y), and finally keep the existing destandardize_output(samples) step.

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

jesterTOV/inference/flows/train_flow.py (1)

687-696: ⚠️ Potential issue | 🟠 Major

Inverse standardization incorrectly applied to conditional columns.

When config.cond_dim is set, flow_samples_np contains both primary samples and appended conditional columns (line 680). The inverse standardization at lines 690-696 applies data_statistics (computed for primary data) to all columns, corrupting the already-original-scale conditional values.

Either inverse-transform before concatenation or slice out the conditional columns before applying the inverse.

🛠️ Suggested fix (transform before concatenation)

             if config.cond_dim:
                 flow_samples = trained_flow.sample(sample_key, (cond_samples.shape[0],), condition=cond_samples)
-                flow_samples_np = np.hstack((flow_samples, original_data[:, -config.cond_dim:]))
+                flow_samples_np = np.array(flow_samples)
+                # Inverse transform primary samples before concatenation
+                if config.standardize and data_statistics is not None:
+                    if config.standardization_method == "zscore":
+                        flow_samples_np = inverse_standardize_data_zscore(
+                            flow_samples_np, data_statistics
+                        )
+                    else:
+                        flow_samples_np = inverse_standardize_data_minmax(
+                            flow_samples_np, data_statistics
+                        )
+                flow_samples_np = np.hstack((flow_samples_np, original_cond_samples))
                 labels = [*config.parameter_names, *config.cond_parameter_names]
             else:
                 flow_samples = trained_flow.sample(sample_key, (n_plot_samples,))
                 flow_samples_np = np.array(flow_samples)
-
-            # Inverse transform samples if data was standardized
-            if config.standardize and data_statistics is not None:
-                if config.standardization_method == "zscore":
-                    flow_samples_np = inverse_standardize_data_zscore(
-                        flow_samples_np, data_statistics
-                    )
-                else:  # minmax
-                    flow_samples_np = inverse_standardize_data_minmax(
-                        flow_samples_np, data_statistics
-                    )
+                # Inverse transform samples if data was standardized
+                if config.standardize and data_statistics is not None:
+                    if config.standardization_method == "zscore":
+                        flow_samples_np = inverse_standardize_data_zscore(
+                            flow_samples_np, data_statistics
+                        )
+                    else:  # minmax
+                        flow_samples_np = inverse_standardize_data_minmax(
+                            flow_samples_np, data_statistics
+                        )
                 labels = parameter_names

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@jesterTOV/inference/flows/train_flow.py` around lines 687 - 696,
flow_samples_np is being inverse-standardized across all columns even when
config.cond_dim > 0, which applies data_statistics (computed for primary
features) to appended conditional columns and corrupts them; fix by either
inverse-transforming the primary-sample portion before concatenating
conditionals or by slicing out the last config.cond_dim columns from
flow_samples_np, applying inverse_standardize_data_zscore /
inverse_standardize_data_minmax to only the primary slice (using
data_statistics), and then re-attaching the conditional columns; locate this
logic around flow_samples_np, config.cond_dim, inverse_standardize_data_zscore,
inverse_standardize_data_minmax and data_statistics and ensure conditional
columns are excluded from the inverse transform.

♻️ Duplicate comments (4)

jesterTOV/inference/flows/flow.py (3)

323-328: ⚠️ Potential issue | 🔴 Critical

standardize_cond_data crashes when standardization_method="none".

When standardization is disabled, the else branch tries to access cond_data_min and cond_data_range, which are never initialized for the "none" case. This will raise AttributeError at runtime.

🛠️ Suggested fix

     def standardize_cond_data(self, data: Array) -> Array:

         if self.standardization_method == "zscore":
             return (data - self.cond_data_mean) / self.cond_data_std
-        else:
+        elif self.standardization_method == "minmax":
             return (data - self.cond_data_min) / self.cond_data_range
+        else:
+            # No standardization - identity transform
+            return data

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@jesterTOV/inference/flows/flow.py` around lines 323 - 328, The method
standardize_cond_data currently assumes either "zscore" or min-max and will
access self.cond_data_min/self.cond_data_range even when
standardization_method=="none"; update standardize_cond_data to explicitly
handle the "none" case (or unknown methods) by returning data unchanged (or
raising a clear ValueError for unsupported methods), and avoid accessing
cond_data_min/cond_data_range unless they are initialized; reference the method
name standardize_cond_data and the attributes cond_data_mean, cond_data_std,
cond_data_min, cond_data_range when making the change.

414-415: ⚠️ Potential issue | 🔴 Critical

y_std referenced before assignment; raw y passed to flow.

Line 414 uses y_std on the right-hand side before it is defined, causing a NameError. Additionally, line 415 passes the raw y to self.flow.sample instead of the standardized version, producing samples conditioned on the wrong scale.

🛠️ Suggested fix

-        y_std = self.standardize_cond_data(y_std)
-        samples = self.flow.sample(key, shape, condition=y)
+        y_std = self.standardize_cond_data(y)
+        samples = self.flow.sample(key, shape, condition=y_std)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@jesterTOV/inference/flows/flow.py` around lines 414 - 415, The code
references y_std before it is created and passes raw y into the sampler; fix by
first assigning y_std = self.standardize_cond_data(y) and then call
self.flow.sample(key, shape, condition=y_std) so the sampler receives the
standardized conditioning; look for usages in the method containing
standardize_cond_data and self.flow.sample to update the order and the condition
argument (y -> y_std).

314-322: ⚠️ Potential issue | 🟠 Major

Missing zero-division guards for conditional statistics.

The z-score and min-max branches load conditional statistics but do not apply the same zero-division protection that the parent Flow.__init__ applies to primary data statistics. If any conditional feature has zero variance (z-score) or zero range (min-max), standardize_cond_data will produce inf/nan.

🛠️ Suggested fix

         if self.standardization_method=="zscore":
             self.cond_data_mean = jnp.array(metadata["cond_data_mean"])
             self.cond_data_std = jnp.array(metadata["cond_data_std"])
+            # Avoid division by zero
+            self.cond_data_std = jnp.where(self.cond_data_std == 0, 1.0, self.cond_data_std)

         elif self.standardization_method=="minmax":
             self.cond_data_min = jnp.array(metadata["cond_data_min"])
             self.cond_data_max = jnp.array(metadata["cond_data_max"])
             self.cond_data_range = self.cond_data_max - self.cond_data_min
+            # Avoid division by zero
+            self.cond_data_range = jnp.where(self.cond_data_range == 0, 1.0, self.cond_data_range)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@jesterTOV/inference/flows/flow.py` around lines 314 - 322, The
conditional-statistics loading branches for standardization_method ("zscore" and
"minmax") can produce divide-by-zero when a conditional feature has zero
variance/range; update the zscore branch to set self.cond_data_std =
jnp.where(self.cond_data_std == 0, 1, self.cond_data_std) (or add a small
epsilon) after loading metadata["cond_data_std"], and update the minmax branch
to compute self.cond_data_range = self.cond_data_max - self.cond_data_min and
then set self.cond_data_range = jnp.where(self.cond_data_range == 0, 1,
self.cond_data_range) (or add epsilon) so that standardize_cond_data uses safe
denominators; refer to standardization_method, cond_data_std, cond_data_range,
and standardize_cond_data when making the change.

jesterTOV/inference/flows/train_flow.py (1)

652-659: ⚠️ Potential issue | 🔴 Critical

cond_data_statistics may be None when accessed.

When config.cond_dim is set but config.standardize is False, cond_data_statistics remains None (initialized at line 560). The code then tries to access cond_data_statistics["mean"] or cond_data_statistics["min"], raising a TypeError.

🛠️ Suggested fix

     # Add conditional data statistics to metadata if standardization was used
-    if config.cond_dim:
+    if config.cond_dim and cond_data_statistics is not None:
         if config.standardization_method == "zscore":
             metadata["cond_data_mean"] = cond_data_statistics["mean"].tolist()
             metadata["cond_data_std"] = cond_data_statistics["std"].tolist()
         else: 
             metadata["cond_data_min"] = cond_data_statistics["min"].tolist()
             metadata["cond_data_max"] = cond_data_statistics["max"].tolist()

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@jesterTOV/inference/flows/train_flow.py` around lines 652 - 659, The code
reads cond_data_statistics when config.cond_dim is true but cond_data_statistics
can be None; update the block that adds conditional stats to metadata to first
ensure cond_data_statistics is not None (or compute it from cond_data if
available) before accessing keys. Specifically, in the block guarded by
config.cond_dim and config.standardization_method, check cond_data_statistics is
truthy and raise a clear error or skip adding stats if it's None; reference the
cond_data_statistics variable, config.standardization_method, and the metadata
dict when making the change so you don't accidentally access
["mean"/"std"/"min"/"max"] on None.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@jesterTOV/inference/flows/flow.py`:
- Around line 331-334: The docstring for the classmethod from_directory in
ConditionalFlow has a typo ("conditinal"); update the docstring text to read
"Load a trained conditional flow from a directory." so the wording is correct
and consistent with the ConditionalFlow class name and purpose.
- Around line 374-377: The docstring example for flow.log_prob is missing the
conditional argument y; update the example to call flow.log_prob(x, y) (i.e.,
replace flow.log_prob(data) with something like flow.log_prob(x, y) where x is
the input array and y is the corresponding conditional array), ensuring the
example uses the correct variable names from the method signature (log_prob(x,
y)) and matching shapes/types for x and y in the illustrated call.

In `@jesterTOV/inference/flows/train_flow.py`:
- Around line 198-207: The logging message "Data bounds saved for inverse
transformation" is inside the for-loop that iterates over parameter_names,
causing it to be emitted once per parameter; move that logger.info call out of
the loop so it runs once after the loop that logs per-parameter ranges in the
minmax branch (the block using standardize_data_minmax, data, data_statistics,
parameter_names, and logger). Ensure the loop only logs the per-parameter ranges
and then, immediately after the loop completes, call logger.info("Data bounds
saved for inverse transformation") once.
- Around line 679-680: The conditional call to trained_flow.sample is missing
the required shape argument; update the call to trained_flow.sample(sample_key,
shape, condition=cond_samples) by passing the desired sample shape (e.g.,
matching the number of samples and latent/event dim) — derive shape from
existing variables like cond_samples.shape[0] (batch size) and the expected
output dimensionality (use config or the flow's output dim) so that flow_samples
has the correct shape before hstack with original_data[:, -config.cond_dim:].

---

Outside diff comments:
In `@jesterTOV/inference/flows/train_flow.py`:
- Around line 687-696: flow_samples_np is being inverse-standardized across all
columns even when config.cond_dim > 0, which applies data_statistics (computed
for primary features) to appended conditional columns and corrupts them; fix by
either inverse-transforming the primary-sample portion before concatenating
conditionals or by slicing out the last config.cond_dim columns from
flow_samples_np, applying inverse_standardize_data_zscore /
inverse_standardize_data_minmax to only the primary slice (using
data_statistics), and then re-attaching the conditional columns; locate this
logic around flow_samples_np, config.cond_dim, inverse_standardize_data_zscore,
inverse_standardize_data_minmax and data_statistics and ensure conditional
columns are excluded from the inverse transform.

---

Duplicate comments:
In `@jesterTOV/inference/flows/flow.py`:
- Around line 323-328: The method standardize_cond_data currently assumes either
"zscore" or min-max and will access self.cond_data_min/self.cond_data_range even
when standardization_method=="none"; update standardize_cond_data to explicitly
handle the "none" case (or unknown methods) by returning data unchanged (or
raising a clear ValueError for unsupported methods), and avoid accessing
cond_data_min/cond_data_range unless they are initialized; reference the method
name standardize_cond_data and the attributes cond_data_mean, cond_data_std,
cond_data_min, cond_data_range when making the change.
- Around line 414-415: The code references y_std before it is created and passes
raw y into the sampler; fix by first assigning y_std =
self.standardize_cond_data(y) and then call self.flow.sample(key, shape,
condition=y_std) so the sampler receives the standardized conditioning; look for
usages in the method containing standardize_cond_data and self.flow.sample to
update the order and the condition argument (y -> y_std).
- Around line 314-322: The conditional-statistics loading branches for
standardization_method ("zscore" and "minmax") can produce divide-by-zero when a
conditional feature has zero variance/range; update the zscore branch to set
self.cond_data_std = jnp.where(self.cond_data_std == 0, 1, self.cond_data_std)
(or add a small epsilon) after loading metadata["cond_data_std"], and update the
minmax branch to compute self.cond_data_range = self.cond_data_max -
self.cond_data_min and then set self.cond_data_range =
jnp.where(self.cond_data_range == 0, 1, self.cond_data_range) (or add epsilon)
so that standardize_cond_data uses safe denominators; refer to
standardization_method, cond_data_std, cond_data_range, and
standardize_cond_data when making the change.

In `@jesterTOV/inference/flows/train_flow.py`:
- Around line 652-659: The code reads cond_data_statistics when config.cond_dim
is true but cond_data_statistics can be None; update the block that adds
conditional stats to metadata to first ensure cond_data_statistics is not None
(or compute it from cond_data if available) before accessing keys. Specifically,
in the block guarded by config.cond_dim and config.standardization_method, check
cond_data_statistics is truthy and raise a clear error or skip adding stats if
it's None; reference the cond_data_statistics variable,
config.standardization_method, and the metadata dict when making the change so
you don't accidentally access ["mean"/"std"/"min"/"max"] on None.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2815381c-fe1c-4f7b-acf0-afa64357dbe4

📥 Commits

Reviewing files that changed from the base of the PR and between f618d38 and fd31ffd.

📒 Files selected for processing (2)

jesterTOV/inference/flows/flow.py
jesterTOV/inference/flows/train_flow.py

coderabbitai · 2026-04-12T22:35:27Z

+    else:  # minmax
+        logger.info("Standardizing data using min-max [0, 1] scaling...")
+        data, data_statistics = standardize_data_minmax(data)
+        logger.info("Standardized data ranges:")
+        for i, name in enumerate(parameter_names):
+            logger.info(
+                f"  {name}: [{data[:, i].min():.3f}, {data[:, i].max():.3f}]"
+            )
+            logger.info("Data bounds saved for inverse transformation")
+


⚠️ Potential issue | 🟡 Minor

Logging statement inside loop causes repeated output.

Line 206 is indented inside the for loop, so "Data bounds saved for inverse transformation" will be logged once per parameter instead of once total. Compare with the z-score branch where the equivalent message (line 197) is correctly outside the loop.

🛠️ Suggested fix

else: # minmax logger.info("Standardizing data using min-max [0, 1] scaling...") data, data_statistics = standardize_data_minmax(data) logger.info("Standardized data ranges:") for i, name in enumerate(parameter_names): logger.info( f" {name}: [{data[:, i].min():.3f}, {data[:, i].max():.3f}]" ) - logger.info("Data bounds saved for inverse transformation") + logger.info("Data bounds saved for inverse transformation")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@jesterTOV/inference/flows/train_flow.py` around lines 198 - 207, The logging message "Data bounds saved for inverse transformation" is inside the for-loop that iterates over parameter_names, causing it to be emitted once per parameter; move that logger.info call out of the loop so it runs once after the loop that logs per-parameter ranges in the minmax branch (the block using standardize_data_minmax, data, data_statistics, parameter_names, and logger). Ensure the loop only logs the per-parameter ranges and then, immediately after the loop completes, call logger.info("Data bounds saved for inverse transformation") once.

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (3)

jesterTOV/inference/flows/flow.py (2)

416-417: ⚠️ Potential issue | 🔴 Critical

Use the standardized condition in sample().

Line 416 reads y_std before it exists, and Line 417 still sends raw y into the flow. So this method currently fails immediately, and it would still condition on the wrong scale after fixing the NameError.

Suggested fix

-        y_std = self.standardize_cond_data(y_std)
-        samples = self.flow.sample(key, shape, condition=y)
+        y_std = self.standardize_cond_data(y)
+        samples = self.flow.sample(key, shape, condition=y_std)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@jesterTOV/inference/flows/flow.py` around lines 416 - 417, The code calls
self.standardize_cond_data but then passes the unstandardized y into
self.flow.sample and also reads y_std before it's defined; fix by assigning the
result of self.standardize_cond_data(y) to y_std (or rename the variable
consistently) and then pass that standardized variable into self.flow.sample as
the condition argument; update the block around standardize_cond_data and
flow.sample in the same method so sample(...) receives y_std instead of y.

314-328: ⚠️ Potential issue | 🔴 Critical

Handle the non-standardized conditional path explicitly.

standardize_cond_data() still assumes min/max stats exist whenever the method is not "zscore". With standardize=False, ConditionalFlow never initializes cond_data_min / cond_data_range, so the first conditional log_prob() or sample() call will crash. The min-max path also rebuilds cond_data_range without the zero-range guard used during training, which can reintroduce inf/nan for constant conditional features.

Suggested fix

         if self.standardization_method=="zscore":
             self.cond_data_mean = jnp.array(metadata["cond_data_mean"])
             self.cond_data_std = jnp.array(metadata["cond_data_std"])
+            self.cond_data_std = jnp.where(self.cond_data_std == 0, 1.0, self.cond_data_std)

         elif self.standardization_method=="minmax":
             self.cond_data_min = jnp.array(metadata["cond_data_min"])
             self.cond_data_max = jnp.array(metadata["cond_data_max"])
             self.cond_data_range = self.cond_data_max - self.cond_data_min
+            self.cond_data_range = jnp.where(
+                self.cond_data_range == 0, 1.0, self.cond_data_range
+            )

     def standardize_cond_data(self, data: Array) -> Array:

         if self.standardization_method == "zscore":
             return (data - self.cond_data_mean) / self.cond_data_std
-        else:
+        if self.standardization_method == "minmax":
             return (data - self.cond_data_min) / self.cond_data_range
+        return data

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@jesterTOV/inference/flows/flow.py` around lines 314 - 328,
standardize_cond_data currently assumes a minmax path when
standardization_method != "zscore", causing crashes if ConditionalFlow was
created with standardize=False and also reintroducing divide-by-zero for
constant features; update the metadata-loading and standardize_cond_data logic
to explicitly handle a non-standardized case (e.g., standardization_method is
None or "none") by returning data unchanged in standardize_cond_data, and for
the "minmax" branch ensure cond_data_min/cond_data_max/cond_data_range are
initialized from metadata (or computed) and clamp cond_data_range with a small
eps (or replace zeros with 1) to avoid division by zero—refer to symbols
standardization_method, standardize_cond_data, cond_data_min, cond_data_max,
cond_data_range, cond_data_mean, cond_data_std to locate and modify the code.

jesterTOV/inference/flows/train_flow.py (1)

652-659: ⚠️ Potential issue | 🔴 Critical

Only persist conditional stats when they were actually computed.

This block runs for every conditional training job, but cond_data_statistics is only populated when config.standardize is true. With cond_dim set and standardization disabled, saving the model crashes here on cond_data_statistics["..."].

Suggested fix

-    if config.cond_dim:
+    if config.cond_dim and config.standardize and cond_data_statistics is not None:
         if config.standardization_method == "zscore":
             metadata["cond_data_mean"] = cond_data_statistics["mean"].tolist()
             metadata["cond_data_std"] = cond_data_statistics["std"].tolist()
         else: 
             metadata["cond_data_min"] = cond_data_statistics["min"].tolist()
             metadata["cond_data_max"] = cond_data_statistics["max"].tolist()

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@jesterTOV/inference/flows/train_flow.py` around lines 652 - 659, The code
writes conditional stats unconditionally when config.cond_dim is set but
cond_data_statistics is only created when standardization was enabled; guard
these writes by checking the standardization flag and existence of
cond_data_statistics (e.g., only proceed if config.standardize is True and
cond_data_statistics is not None), then branch on config.standardization_method
("zscore") to persist mean/std or min/max into metadata using the existing keys
(cond_data_mean, cond_data_std, cond_data_min, cond_data_max) so you never index
cond_data_statistics when it wasn't computed.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@jesterTOV/inference/flows/flow.py`:
- Around line 354-365: Update the ConditionalFlow.log_prob() docstring to use a
raw string (r""") and convert the probability change-of-variable lines into
Sphinx math: use :math: for inline symbols and a ``.. math::`` block for the
display equations showing Z-score, Min-max, and None corrections (e.g. .. math::
\\log p(x\\mid y) = \\log p(x_{std}\\mid y_{std}) - \\sum \\log(\\mathrm{std}),
etc.), ensuring each formula is valid LaTeX inside the math directive and the
prose references those math expressions via :math: where needed.

In `@jesterTOV/inference/flows/train_flow.py`:
- Around line 679-694: The sampled primary parameters are being flattened with
reshape(-1, 1) which collapses multi-dimensional samples and breaks alignment
with labels; instead keep flow_samples_np as an (n_samples, dim) matrix (i.e.,
do not force a single-column reshape) and horizontally concatenate it with the
matching conditional rows from original_data (use original_data[:
flow_samples_np.shape[0], -config.cond_dim:] to ensure row counts match). Update
the hstack call that builds flow_samples_np and ensure labels =
[*config.parameter_names, *config.cond_parameter_names] still matches the
resulting column count.
- Around line 545-550: The check using len(config.cond_parameter_names) can
raise TypeError if cond_parameter_names is None; update the validation in
train_flow.py (around the block that references
FlowTrainingConfig.cond_parameter_names and config.cond_dim) to first guard that
config.cond_parameter_names is not None/empty and if config.cond_dim > 0 then
raise a ValueError when names are missing, and otherwise if it is present
validate len(config.cond_parameter_names) == config.cond_dim and raise the
existing ValueError with the same message; ensure you reference
FlowTrainingConfig.cond_parameter_names and config.cond_dim in the new guard
logic.

---

Duplicate comments:
In `@jesterTOV/inference/flows/flow.py`:
- Around line 416-417: The code calls self.standardize_cond_data but then passes
the unstandardized y into self.flow.sample and also reads y_std before it's
defined; fix by assigning the result of self.standardize_cond_data(y) to y_std
(or rename the variable consistently) and then pass that standardized variable
into self.flow.sample as the condition argument; update the block around
standardize_cond_data and flow.sample in the same method so sample(...) receives
y_std instead of y.
- Around line 314-328: standardize_cond_data currently assumes a minmax path
when standardization_method != "zscore", causing crashes if ConditionalFlow was
created with standardize=False and also reintroducing divide-by-zero for
constant features; update the metadata-loading and standardize_cond_data logic
to explicitly handle a non-standardized case (e.g., standardization_method is
None or "none") by returning data unchanged in standardize_cond_data, and for
the "minmax" branch ensure cond_data_min/cond_data_max/cond_data_range are
initialized from metadata (or computed) and clamp cond_data_range with a small
eps (or replace zeros with 1) to avoid division by zero—refer to symbols
standardization_method, standardize_cond_data, cond_data_min, cond_data_max,
cond_data_range, cond_data_mean, cond_data_std to locate and modify the code.

In `@jesterTOV/inference/flows/train_flow.py`:
- Around line 652-659: The code writes conditional stats unconditionally when
config.cond_dim is set but cond_data_statistics is only created when
standardization was enabled; guard these writes by checking the standardization
flag and existence of cond_data_statistics (e.g., only proceed if
config.standardize is True and cond_data_statistics is not None), then branch on
config.standardization_method ("zscore") to persist mean/std or min/max into
metadata using the existing keys (cond_data_mean, cond_data_std, cond_data_min,
cond_data_max) so you never index cond_data_statistics when it wasn't computed.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 42186b54-53b8-4748-b4df-e16239bd559c

📥 Commits

Reviewing files that changed from the base of the PR and between fd31ffd and 41640ae.

📒 Files selected for processing (2)

jesterTOV/inference/flows/flow.py
jesterTOV/inference/flows/train_flow.py

coderabbitai · 2026-04-13T09:41:50Z

+        """
+        Evaluate log probability of samples under the flow.
+
+        If standardization was used, input data is automatically standardized
+        before evaluation and Jacobian correction is applied. If not, operations
+        are identity (no-op).
+
+        The Jacobian correction accounts for the change of variables:
+        - Z-score: log p(x|y) = log p(x_std|y_std) - sum(log(std))
+        - Min-max: log p(x|y) = log p(x_std|y_std) - sum(log(max - min))
+        - None: log p(x|y) = log p(x_std|y_std) (no correction)
+


⚠️ Potential issue | 🟡 Minor

Format the new probability equations as Sphinx math.

The ConditionalFlow.log_prob() docstring adds math-heavy text, but it is plain prose right now. Please switch it to Sphinx math markup so the docs render consistently.

As per coding guidelines "All mathematical expressions in docstrings must use Sphinx/reStructuredText formatting with :math: role for inline math and .. math:: directive for display equations" and "Use raw strings (r""") for docstrings containing LaTeX to avoid Python escape sequence warnings".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@jesterTOV/inference/flows/flow.py` around lines 354 - 365, Update the ConditionalFlow.log_prob() docstring to use a raw string (r""") and convert the probability change-of-variable lines into Sphinx math: use :math: for inline symbols and a ``.. math::`` block for the display equations showing Z-score, Min-max, and None corrections (e.g. .. math:: \\log p(x\\mid y) = \\log p(x_{std}\\mid y_{std}) - \\sum \\log(\\mathrm{std}), etc.), ensuring each formula is valid LaTeX inside the math directive and the prose references those math expressions via :math: where needed.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

jesterTOV/inference/flows/train_flow.py (1)

170-208: ⚠️ Potential issue | 🟡 Minor

Mutable default argument and logging placement issues.

Mutable default argument: Using [] as default for parameter_names is a Python anti-pattern that can cause unexpected behavior if the list is mutated. Use None instead.
Logging inside loop (Line 206): The "Data bounds saved" message is inside the for loop, causing it to log once per parameter instead of once total.

🛠️ Suggested fix

 def standardize_data(
         data: np.ndarray,
         standardization_method: str,
-        parameter_names: list[str] = []
+        parameter_names: list[str] | None = None
 ) -> Tuple[np.ndarray, Dict[str, np.ndarray]]:
     """
     Standardize data based on the selected standardiation method.
     
     Args:
         data: Array of shape (n_samples, n_features)
         standardization_method: str
             Which method to standardize with. 
             Can either be 'zscore' for zero mean and unit std,
             otherwise min-max scaling will be applied.
         parameter_names:
             Parameter names for which the rescaled data range should be printed.
             Defaults to [].
+    
+    Returns:
+        standardized_data: Data standardized according to method
+        data_statistics: Dictionary with standardization parameters
     """
+    if parameter_names is None:
+        parameter_names = []

     if standardization_method == "zscore":
         logger.info("Standardizing data using z-score (mean=0, std=1)...")
         data, data_statistics = standardize_data_zscore(data)
         logger.info("Standardized data statistics:")
         for i, name in enumerate(parameter_names):
             logger.info(
                 f"  {name}: mean={data[:, i].mean():.3f}, std={data[:, i].std():.3f}"
             )
         logger.info("Data mean and std saved for inverse transformation")
     else:  # minmax
         logger.info("Standardizing data using min-max [0, 1] scaling...")
         data, data_statistics = standardize_data_minmax(data)
         logger.info("Standardized data ranges:")
         for i, name in enumerate(parameter_names):
             logger.info(
                 f"  {name}: [{data[:, i].min():.3f}, {data[:, i].max():.3f}]"
             )
-            logger.info("Data bounds saved for inverse transformation")
+        logger.info("Data bounds saved for inverse transformation")

     return data, data_statistics

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@jesterTOV/inference/flows/train_flow.py` around lines 170 - 208, The function
standardize_data uses a mutable default for parameter_names and logs the "Data
bounds saved for inverse transformation" inside the per-parameter loop; change
the signature to use parameter_names: Optional[list[str]] = None and at the
start of standardize_data set parameter_names = parameter_names or [] so callers
get a fresh list, and move the logger.info("Data bounds saved for inverse
transformation") call out of the for i, name in enumerate(parameter_names) loop
(place it after the loop, analogous to the zscore branch) so it logs once per
operation; keep references to data_statistics, standardize_data_zscore, and
standardize_data_minmax intact.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@jesterTOV/inference/flows/train_flow.py`:
- Around line 652-660: The metadata population block assumes
cond_data_statistics is set; guard against it by only reading
cond_data_statistics when it is not None (or when config.standardize is True)
before accessing keys like cond_data_statistics["mean"]/["std"]/["min"]/["max"];
update the conditional around config.cond_dim to also require
cond_data_statistics is not None (or config.standardize) and only then assign
metadata["cond_data_mean"/"cond_data_std"/"cond_data_min"/"cond_data_max"],
otherwise skip or set safe defaults—this change should be applied to the block
that references config.cond_dim and cond_data_statistics in the train_flow.py
flow.
- Around line 680-697: The conditional sampling uses a fixed batch size of (1,)
which mismatches cond_samples' batch size and causes the hstack with
original_data to fail; change the trained_flow.sample call to generate the same
number of samples as cond_samples (use cond_samples.shape[0] or n_conditions) so
flow_samples_np has shape (n_conditions, n_params), then keep the existing
inverse_standardize_data_zscore/inverse_standardize_data_minmax and hstack
logic; reference trained_flow.sample, cond_samples, flow_samples_np,
inverse_standardize_data_zscore/inverse_standardize_data_minmax,
config.parameter_names, config.cond_dim, original_data, and labels to locate
where to adjust the sampling shape and subsequent reshape.

---

Duplicate comments:
In `@jesterTOV/inference/flows/train_flow.py`:
- Around line 170-208: The function standardize_data uses a mutable default for
parameter_names and logs the "Data bounds saved for inverse transformation"
inside the per-parameter loop; change the signature to use parameter_names:
Optional[list[str]] = None and at the start of standardize_data set
parameter_names = parameter_names or [] so callers get a fresh list, and move
the logger.info("Data bounds saved for inverse transformation") call out of the
for i, name in enumerate(parameter_names) loop (place it after the loop,
analogous to the zscore branch) so it logs once per operation; keep references
to data_statistics, standardize_data_zscore, and standardize_data_minmax intact.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 22cf5c49-35be-4110-9939-23e133b0adaf

📥 Commits

Reviewing files that changed from the base of the PR and between 41640ae and 65ece2e.

📒 Files selected for processing (1)

jesterTOV/inference/flows/train_flow.py

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

jesterTOV/inference/flows/train_flow.py (1)

199-207: ⚠️ Potential issue | 🟡 Minor

Move the min-max “bounds saved” log outside the loop.

On Line 206, "Data bounds saved for inverse transformation" is logged once per parameter instead of once per standardization step.

💡 Suggested change

     else:  # minmax
         logger.info("Standardizing data using min-max [0, 1] scaling...")
         data, data_statistics = standardize_data_minmax(data)
         logger.info("Standardized data ranges:")
         for i, name in enumerate(parameter_names):
             logger.info(
                 f"  {name}: [{data[:, i].min():.3f}, {data[:, i].max():.3f}]"
             )
-            logger.info("Data bounds saved for inverse transformation")
+        logger.info("Data bounds saved for inverse transformation")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@jesterTOV/inference/flows/train_flow.py` around lines 199 - 207, The log
"Data bounds saved for inverse transformation" is inside the loop that iterates
parameter_names and thus gets emitted once per parameter; move that logger.info
call out of the for-loop in train_flow.py so it runs once after calling
standardize_data_minmax (after the loop that logs each parameter's min/max).
Update the block around logger.info("Standardizing data using min-max [0, 1]
scaling..."), the call to standardize_data_minmax(data), and the for i, name in
enumerate(parameter_names): loop to place the single "Data bounds saved..."
message immediately after the loop (or immediately after data_statistics is set)
so it logs once per standardization step.

🧹 Nitpick comments (1)

jesterTOV/inference/flows/train_flow.py (1)

678-697: Apply n_plot_samples cap in the conditional plotting path too.

n_plot_samples is computed on Line 678 but ignored when config.cond_dim is set; the conditional branch always uses all conditions. Capping here keeps plot generation bounded and consistent with the non-conditional path.

💡 Suggested change

             if config.cond_dim:
                 # get flow samples and untransform them
-                n_cond = cond_samples.shape[0]
-                flow_samples = trained_flow.sample(sample_key, (n_cond,), condition=cond_samples) # generate one sample per condition point
+                n_cond = min(n_plot_samples, cond_samples.shape[0])
+                cond_samples_plot = cond_samples[:n_cond]
+                flow_samples = trained_flow.sample(
+                    sample_key, (n_cond,), condition=cond_samples_plot
+                )  # generate one sample per condition point
                 flow_samples_np = np.array(flow_samples)
@@
                 flow_samples_np = np.hstack(
                     (flow_samples_np.reshape(-1, len(config.parameter_names)), original_data[:n_cond, -config.cond_dim:])
                 )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@jesterTOV/inference/flows/train_flow.py` around lines 678 - 697, The
conditional plotting branch ignores the n_plot_samples cap; modify the code
around the config.cond_dim path to limit n_cond to at most n_plot_samples by
computing n_cond = min(cond_samples.shape[0], n_plot_samples) and slice
cond_samples and original_data accordingly before calling
trained_flow.sample(sample_key, (n_cond,), condition=cond_samples) and before
stacking into flow_samples_np so that flow_samples_np and original_data[:n_cond,
...] use the capped size.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@jesterTOV/inference/flows/train_flow.py`:
- Around line 175-187: Update the docstring for the function that standardizes
data (references: parameter names 'standardization_method' and
'parameter_names') to use Sphinx math roles: replace inline math like "mean=0"
and "std=1" with :math:`mean = 0` and :math:`std = 1`, and replace range
notation "[0, 1]" with :math:`[0, 1]`; if any expression should be displayed on
its own line convert it to a block using the .. math:: directive. Ensure the
textual description still mentions 'zscore' and min-max scaling but uses
:math:`...` or .. math:: for all mathematical fragments.

---

Duplicate comments:
In `@jesterTOV/inference/flows/train_flow.py`:
- Around line 199-207: The log "Data bounds saved for inverse transformation" is
inside the loop that iterates parameter_names and thus gets emitted once per
parameter; move that logger.info call out of the for-loop in train_flow.py so it
runs once after calling standardize_data_minmax (after the loop that logs each
parameter's min/max). Update the block around logger.info("Standardizing data
using min-max [0, 1] scaling..."), the call to standardize_data_minmax(data),
and the for i, name in enumerate(parameter_names): loop to place the single
"Data bounds saved..." message immediately after the loop (or immediately after
data_statistics is set) so it logs once per standardization step.

---

Nitpick comments:
In `@jesterTOV/inference/flows/train_flow.py`:
- Around line 678-697: The conditional plotting branch ignores the
n_plot_samples cap; modify the code around the config.cond_dim path to limit
n_cond to at most n_plot_samples by computing n_cond =
min(cond_samples.shape[0], n_plot_samples) and slice cond_samples and
original_data accordingly before calling trained_flow.sample(sample_key,
(n_cond,), condition=cond_samples) and before stacking into flow_samples_np so
that flow_samples_np and original_data[:n_cond, ...] use the capped size.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f7961070-4ff9-4245-b2ab-cf89f104a5e2

📥 Commits

Reviewing files that changed from the base of the PR and between 65ece2e and 0dd578a.

📒 Files selected for processing (1)

jesterTOV/inference/flows/train_flow.py

coderabbitai · 2026-04-13T14:28:04Z

+    """
+    Standardize data based on the selected standardiation method.
+
+    Args:
+        data: Array of shape (n_samples, n_features)
+        standardization_method: str
+            Which method to standardize with. 
+            Can either be 'zscore' for zero mean and unit std,
+            otherwise min-max scaling will be applied.
+        parameter_names:
+            Parameter names for which the rescaled data range should be printed.
+            Defaults to [].
+    """


⚠️ Potential issue | 🟡 Minor

Format math in docstrings using Sphinx math roles.

This new docstring includes mathematical expressions (mean=0, std=1, [0, 1]) but not Sphinx math markup.

As per coding guidelines, "All mathematical expressions in docstrings must use Sphinx/reStructuredText formatting with :math: role for inline math and .. math:: directive for display equations".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@jesterTOV/inference/flows/train_flow.py` around lines 175 - 187, Update the docstring for the function that standardizes data (references: parameter names 'standardization_method' and 'parameter_names') to use Sphinx math roles: replace inline math like "mean=0" and "std=1" with :math:`mean = 0` and :math:`std = 1`, and replace range notation "[0, 1]" with :math:`[0, 1]`; if any expression should be displayed on its own line convert it to a block using the .. math:: directive. Ensure the textual description still mentions 'zscore' and min-max scaling but uses :math:`...` or .. math:: for all mathematical fragments.

First attempt to add conditional NFs to jester infrastructure

f618d38

coderabbitai Bot reviewed Apr 10, 2026

View reviewed changes

A few fixes for the conditional flows

fd31ffd

coderabbitai Bot reviewed Apr 12, 2026

View reviewed changes

haukekoehn added 2 commits April 13, 2026 11:11

Fix corner plot for conditional flow training

84a4bfb

Typo fixes in doc strings

41640ae

coderabbitai Bot reviewed Apr 13, 2026

View reviewed changes

Fix corner plot generation for multi-dimensional cond. flows

65ece2e

coderabbitai Bot reviewed Apr 13, 2026

View reviewed changes

Comment thread jesterTOV/inference/flows/train_flow.py

Comment thread jesterTOV/inference/flows/train_flow.py

Fixes in the corner plot generation 2

0dd578a

coderabbitai Bot reviewed Apr 13, 2026

View reviewed changes

Final correct shape for corner plots

09c5741

		samples = self.flow.sample(key, shape, condition=y)

		samples = self.destandardize_output(samples)

Conversation

haukekoehn commented Apr 10, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated Code Review Effort

Poem

Uh oh!

haukekoehn commented Apr 10, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

haukekoehn commented Apr 10, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 10, 2026 •

edited

Loading