Add data_processing module to reporting #102

flora-hofmann-frequenz · 2025-06-12T14:50:57Z

An effort to take out all data processing functions from the Reporting NB in Deepnote.

Example NB

Copilot

Pull Request Overview

This PR extracts and consolidates data processing routines from the Reporting notebook into a dedicated data_processing module.

- Adds a new `data_processing.py` with timezone conversion, grid/PV/battery metrics, and analysis functions.
- Updates `RELEASE_NOTES.md` to announce the new module.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `src/frequenz/lib/notebooks/reporting/data_processing.py` | New module containing extracted data processing functions for reporting |
| `RELEASE_NOTES.md` | Added entry for `data_processing` module |

Comments suppressed due to low confidence (3)

src/frequenz/lib/notebooks/reporting/data_processing.py:45

[nitpick] The parameter name mcfg is ambiguous; consider renaming it to something more descriptive like config or providing a proper type alias for clarity.

def rename_component_columns(df: pd.DataFrame, component_types: List[str], mcfg: Any) -> pd.DataFrame:

src/frequenz/lib/notebooks/reporting/data_processing.py:317

The bat_filter parameter is a string but handled like a list in code; consider unifying filter parameter types (e.g., List[str]) for consistency with the PV analysis API.

def create_battery_analyse_df(master_df: pd.DataFrame, bat_filter: str) -> pd.DataFrame:
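One way to reconcile the string-versus-list handling noted above is to normalize the filter argument at the function boundary. The sketch below is only an illustration: `normalize_filter` is a hypothetical helper and is not part of the PR.

```python
from typing import Iterable, List, Optional, Union


def normalize_filter(value: Optional[Union[str, Iterable[str]]]) -> List[str]:
    """Return the filter as a list of labels.

    Hypothetical helper (not from the PR): accepts a single string,
    any iterable of strings, or None, and always returns a list.
    """
    if value is None:
        return []
    if isinstance(value, str):
        # A bare string is one label, not a sequence of characters.
        return [value]
    return list(value)
```

With a helper like this, the battery and PV analysis functions could accept either form while working with a list internally.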

src/frequenz/lib/notebooks/reporting/data_processing.py:4

[nitpick] Expand the module docstring to describe expected input DataFrame schemas (required column names and types) and provide a high-level overview of the functions included.

"""Data processing functions for the reporting module."""

cwasicki · 2025-06-12T15:41:16Z

> An effort to take out all data processing functions from the Reporting NB in Deepnote.

Are these all copied as-is from the reporting notebook?

cyiallou · 2025-06-12T19:35:08Z

> Are these all copied as-is from the reporting notebook?

If so, do you plan to clean them up in this PR, or in a follow-up? Just asking because I noticed a few spots that could probably use a bit of tidying.

flora-hofmann-frequenz · 2025-06-13T07:33:02Z

> If so, do you plan to clean them up in this PR, or in a follow-up? Just asking because I noticed a few spots that could probably use a bit of tidying.

Happy to clean them up either in this PR or in the next one. What would you recommend?

They are not exactly as in the reporting notebook, as most blocks in the notebook are not even functions yet and have been added by many different programmers (Malte, Noah, me...). As a first step, I turned all of them into functions and made sure the notebook runs smoothly with them as they are here now, testing different use cases.

cyiallou

Thanks for the implementation. I have left a few comments mainly around consistency, maintainability, and minor architectural points. Nothing blocking, but worth addressing where possible I think. Some general comments:

- DataFrame mutation: Some functions modify the input DataFrame in place. Consider calling `.copy()` early or documenting the mutation clearly if this is intended.
- Hardcoded column names: Strings like "PV Produktion" and "Netzanschluss" are repeated throughout. Suggest extracting them as constants to reduce errors and improve maintainability. There are more hardcoded strings throughout the code.
- Timezone inconsistency: `convert_timezone()` uses "Europe/Berlin", while `compute_peak_usage()` uses "CET". Consider unifying this with a shared constant.
- Docstrings are missing Args and Returns. Adding them would improve clarity.
- Consider whether adding tests for some functions makes sense.
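The first four points could be combined roughly as in the sketch below. It is an assumption-laden illustration, not the module's actual code: the constant names, `to_local_time`, `add_total_supply`, and the "total" column are all hypothetical, and only the two column labels and the "Europe/Berlin" zone come from the review.

```python
import pandas as pd

# Hypothetical shared constants; the labels themselves are quoted in the review.
COL_PV_PRODUCTION = "PV Produktion"
COL_GRID_CONNECTION = "Netzanschluss"
DEFAULT_TIMEZONE = "Europe/Berlin"


def to_local_time(df: pd.DataFrame, tz: str = DEFAULT_TIMEZONE) -> pd.DataFrame:
    """Convert the index of a DataFrame to a local timezone.

    Args:
        df: DataFrame with a DatetimeIndex (naive timestamps are assumed to be UTC).
        tz: Target timezone name.

    Returns:
        A new DataFrame with a timezone-aware index; the input is not modified.
    """
    out = df.copy()  # work on a copy so the caller's DataFrame is untouched
    if out.index.tz is None:
        out.index = out.index.tz_localize("UTC")  # assumption: naive means UTC
    out.index = out.index.tz_convert(tz)
    return out


def add_total_supply(df: pd.DataFrame) -> pd.DataFrame:
    """Add a hypothetical "total" column summing PV production and grid connection.

    Args:
        df: DataFrame containing the two columns named by the constants above.

    Returns:
        A new DataFrame with the extra column; the input is not modified.
    """
    out = df.copy()
    out["total"] = out[COL_PV_PRODUCTION] + out[COL_GRID_CONNECTION]
    return out
```

Functions written this way are also straightforward to unit test, since each one takes a DataFrame and returns a new one without side effects.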

src/frequenz/lib/notebooks/reporting/data_processing.py

Signed-off-by: Flora <[email protected]>

flora-hofmann-frequenz · 2025-07-11T11:12:37Z

Addressed in PR #106

Copilot AI review requested due to automatic review settings June 12, 2025 14:50

flora-hofmann-frequenz requested a review from a team as a code owner June 12, 2025 14:50

flora-hofmann-frequenz requested a review from phillip-wenig-frequenz June 12, 2025 14:50

github-actions bot added the part:docs Affects the documentation label Jun 12, 2025

Copilot AI reviewed Jun 12, 2025

cyiallou reviewed Jun 17, 2025

Add data_processing module to reporting

9bbc8fd

Signed-off-by: Flora <[email protected]>

flora-hofmann-frequenz force-pushed the add_dataprocessing branch from efcfa64 to 9bbc8fd Compare June 18, 2025 15:03

cwasicki reviewed Jun 18, 2025

Improvements on documentation and naming

136d5c6

Signed-off-by: Flora <[email protected]>

flora-hofmann-frequenz mentioned this pull request Jun 20, 2025

Add refactored data processing module to reporting #106

Open

flora-hofmann-frequenz closed this Jul 11, 2025
