Skip to content

Conversation

Mohammad-Tayyab-Frequenz
Copy link
Contributor

No description provided.

@Mohammad-Tayyab-Frequenz Mohammad-Tayyab-Frequenz requested a review from a team as a code owner September 24, 2025 10:48
@github-actions github-actions bot added part:docs Affects the documentation part:tooling Affects the development tooling (CI, deployment, dependency management, etc.) labels Sep 24, 2025
@Mohammad-Tayyab-Frequenz Mohammad-Tayyab-Frequenz force-pushed the add-schema-mapping branch 2 times, most recently from 4afd9b8 to de48ef9 Compare September 24, 2025 13:17
Copy link
Contributor

@cwasicki cwasicki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just from this PR it's not clear what this is about because it is missing PR and commit descriptions and the doc is not sufficient. This is missing a motivation why this class is needed (in the future) and how it is intended to be used.

I am also not sure about a couple of name matchings. This stresses the point that we need well-defined metric definitions. In fact I think we should add the module with the definitions of all of these metrics used here either before this PR or add them here.

"""Column mapping utilities for energy reporting.

Provides the `ColumnMapper` dataclass to manage renaming between raw,
canonical, and localized display column names. Supports loading schema
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are raw, canonical and display names? Where are these used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update this in the documentation.

Raw - directly from reporting api
Canonical - used in our codebase
Display - To be displayed in the notebooks

@cyiallou
Copy link
Contributor

When I added EN/DE support for the solar notebook, I faced a similar challenge. At the time I went for a simple TranslationManager class with an internal dictionary of text + translations (link). It’s admittedly not the most elegant solution, but it avoids introducing an extra dependency in this repo (which is itself consumed elsewhere). That trade-off was one of the reasons I kept it simple.

I see the YAML approach here is more structured (which is definitely a plus for maintainability). Just wanted to flag the existing translator.py so you can consider whether it makes sense to adapt/reuse it, or whether your case really needs a separate schema + class.

@Mohammad-Tayyab-Frequenz
Copy link
Contributor Author

The translator.py script looks well-structured and offers a wide range of functionalities. My current solution is more narrowly focused, handling only the translation of column display names into the required language. It might be worthwhile to combine both approaches and define a comprehensive configuration within a .yaml file. Do you think this would be a feasible path forward?

Updates the requirements on [kaleido](https://github.com/plotly/kaleido) to permit the latest version.
- [Release notes](https://github.com/plotly/kaleido/releases)
- [Commits](plotly/Kaleido@v0.2.1...v1.1.0)

---
updated-dependencies:
- dependency-name: kaleido
  dependency-version: 1.1.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
@Mohammad-Tayyab-Frequenz Mohammad-Tayyab-Frequenz force-pushed the add-schema-mapping branch 2 times, most recently from e659ea3 to dae7d8c Compare October 2, 2025 10:08
df_with_pv_flows["pv_self"] = (
df_with_pv_flows["pv_prod"] - df_with_pv_flows["pv_excess"]
).clip(lower=0)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if calculating pv_self as shown here the correct way of doing this. I have picked this directly from the previous reporting NBs.

Signed-off-by: Mohammad Tayyab <[email protected]>
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a column mapping utility for energy reporting notebooks to insulate them from upstream schema changes. The implementation provides locale-aware column name translations and timezone handling through a YAML configuration file.

Key changes:

  • New ColumnMapper class that translates between raw API headers, canonical names, and localized display labels
  • YAML schema configuration file defining column mappings, descriptions, and locale-specific labels
  • Helper functions for PV energy flow calculations

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/frequenz/lib/notebooks/reporting/utils/column_mapper.py Core ColumnMapper class with YAML loading and locale-aware column translation
src/frequenz/lib/notebooks/reporting/schema_mapping.yaml Schema configuration defining column mappings and display labels
src/frequenz/lib/notebooks/reporting/utils/helpers.py PV energy flow calculation utilities
src/frequenz/lib/notebooks/reporting/utils/__init__.py Package initialization file
pyproject.toml Added PyYAML dependencies and updated kaleido version
RELEASE_NOTES.md Documentation of the new ColumnMapper feature

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Args:
df: Input DataFrame. If present, uses columns ``pv_neg``,
``consumption``, and ``battery_pos``. Missing columns are
treated as zeros.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would raise if any of the expected columns is missing, otherwise it's easy to create typo bugs.

Copy link
Contributor Author

@Mohammad-Tayyab-Frequenz Mohammad-Tayyab-Frequenz Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I picked it from the notebooks. I think previously sometimes consumption column would be missing so this logic was implemented but now we should raise if any column name is missing.
But one question would be whether pv_columns and battery columns will be present in the data even if the component is missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think I need to add the logic to check whether the component type is present in the data. Then forward with the checks/calculations otherwise pass.

Newly created/updated columns:
- ``pv_prod``: PV production as a positive series (negated/clipped from
``pv_neg``).
- ``pv_excess``: Excess PV after subtracting household consumption.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

positive or negative?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, currently the logic is to multiply pv_neg * -1 to get the pv prod in positive float.

import pandas as pd


def _add_pv_energy_flows(df: pd.DataFrame) -> pd.DataFrame:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is problematic if there is more than one source for production, e.g. from CHP. Options to deal with this could be to

  • Make this to work not with PV but "production" in general.
  • Enforce that PV is the only source and otherwise raise. Then we cannot use it on certain (already existing) microgrids though, so this can only be a temporary solution.
  • Take into account all sources. This requires some accounting logic though, which can be case-dependent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

part:docs Affects the documentation part:tooling Affects the development tooling (CI, deployment, dependency management, etc.)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants