@SamuelBrand1
Collaborator

This PR closes #780.

This pull request introduces several enhancements and refactors to:

  • EpiAutoGP. These changes make the model flexible in doing daily or weekly forecast strides and format the output to work more smoothly with post-processing.
  • pipelines/epiautogp. The Python module that integrates running EpiAutoGP with the pyrenew-hew pipeline. The entrypoint script for this is forecast_epiautogp.py, which is similar to forecast_pyrenew.py and forecast_timeseries.py.

forecast_epiautogp.py

Where possible I have reused functionality that already exists in pipelines. To make development easier, however, I have introduced some classes and functions that wrap multiple steps. forecast_epiautogp.py runs the following steps:

  • A preliminary step that generates a contextual model name and groups the parameters and exe flags into dicts, etc.
  • A pipeline setup step, setup_forecast_pipeline, which runs the credential step and the existing data wrangling code and puts the pipeline information into a ForecastPipelineContext dataclass object. This reduces the number of parameters that need passing around.
  • A data setup step, which calls a method on the pipeline context to set the data up for model usage and returns a ModelPaths dataclass object to hold the various paths that get passed around.
  • An EpiAutoGP-specific data setup step that creates a new JSON data file for EpiAutoGP to use.
  • A step that runs the EpiAutoGP model.
  • A post-processing step, which calls a method on the pipeline context. This does the output formatting, plotting and hubverse table creation. For this I had to write some specific functions for EpiAutoGP, but it remains based on the current post-processing functions.
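
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: setup_forecast_pipeline, ForecastPipelineContext, ModelPaths and prepare_model_data are names used in the PR, but every field, argument, file name and the model name shown here are assumptions made for the example, and the credential step, data wrangling and model run are left out.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ModelPaths:
    """Paths passed between steps (field names are assumptions)."""

    model_output_dir: Path
    data_json_path: Path


@dataclass
class ForecastPipelineContext:
    """Run-level settings collected once at setup (fields are assumptions)."""

    disease: str
    loc: str
    report_date: str
    output_dir: Path

    def prepare_model_data(self, model_name: str) -> ModelPaths:
        # The real method also runs the data wrangling; this sketch only
        # works out where the files for one model run would live.
        model_dir = self.output_dir / self.loc / model_name
        return ModelPaths(
            model_output_dir=model_dir,
            data_json_path=model_dir / "data" / "epiautogp_input.json",
        )


def setup_forecast_pipeline(
    disease: str, loc: str, report_date: str, output_dir: str
) -> ForecastPipelineContext:
    # Credential handling and data loading are omitted from this sketch.
    return ForecastPipelineContext(disease, loc, report_date, Path(output_dir))


context = setup_forecast_pipeline("COVID-19", "CA", "2025-12-12", "output")
paths = context.prepare_model_data("epiautogp_nhsn_epiweekly")
print(paths.data_json_path)
```

Bundling the run settings into one context object means later steps take `context` and `paths` rather than a long list of individual arguments.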

End to end testing

I've added the integration test pipelines/tests/test_epiautogp_end_to_end.sh, which matches the structure of the existing end-to-end tests but currently only covers COVID. Model options cover:

  • Weekly NHSN data
  • Weekly NSSP % ED visits
  • Daily ED visit counts
  • Daily other ED visit counts
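
One way to picture the four cases is as a small table of options with a consistency check. This is only a sketch: the keys mirror the frequency, use_percentage and ed_visit_type inputs mentioned in this PR, but the exact values (for example, that the weekly NSSP case uses the 'observed' visit type) and the check_case helper are assumptions, not what the shell test does.

```python
# Hypothetical encoding of the four end-to-end cases; values are assumed.
CASES = [
    {"target": "nhsn", "frequency": "epiweekly", "use_percentage": False, "ed_visit_type": None},
    {"target": "nssp", "frequency": "epiweekly", "use_percentage": True, "ed_visit_type": "observed"},
    {"target": "nssp", "frequency": "daily", "use_percentage": False, "ed_visit_type": "observed"},
    {"target": "nssp", "frequency": "daily", "use_percentage": False, "ed_visit_type": "other"},
]


def check_case(case: dict) -> None:
    """Raise ValueError if a case combines options that do not fit together."""
    if case["frequency"] not in {"daily", "epiweekly"}:
        raise ValueError(f"unknown frequency: {case['frequency']!r}")
    if case["target"] == "nhsn":
        # Hospital admissions have no ED visit type and are not percentages.
        if case["use_percentage"] or case["ed_visit_type"] is not None:
            raise ValueError("NHSN cases take neither use_percentage nor ed_visit_type")
    elif case["target"] == "nssp":
        if case["ed_visit_type"] not in {"observed", "other"}:
            raise ValueError(f"unknown ed_visit_type: {case['ed_visit_type']!r}")
    else:
        raise ValueError(f"unknown target: {case['target']!r}")


for case in CASES:
    check_case(case)
```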

@codecov

codecov bot commented Dec 12, 2025

Codecov Report

❌ Patch coverage is 68.15068% with 93 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (epiautogp@1b3bf63). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pipelines/epiautogp/forecast_epiautogp.py | 0.00% | 41 Missing ⚠️ |
| hewr/R/process_loc_forecast.R | 0.00% | 30 Missing ⚠️ |
| pipelines/epiautogp/prep_epiautogp_data.py | 4.34% | 22 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             epiautogp     #783   +/-   ##
============================================
  Coverage             ?   45.78%           
============================================
  Files                ?       35           
  Lines                ?     3191           
  Branches             ?        0           
============================================
  Hits                 ?     1461           
  Misses               ?     1730           
  Partials             ?        0           

| Flag | Coverage Δ |
|---|---|
| hewr | 36.77% <0.00%> (?) |
| pipelines | 45.48% <75.95%> (?) |
| pyrenew_hew | 62.29% <ø> (?) |

@damonbayer damonbayer requested a review from Copilot December 12, 2025 22:51
Contributor

Copilot AI left a comment

Pull request overview

This PR adds a comprehensive forecasting pipeline for the EpiAutoGP model, enabling it to forecast both daily and weekly ED visits (NSSP) and hospital admissions (NHSN). The implementation follows the existing pipeline patterns while introducing new utilities for data conversion, model execution, and post-processing specific to EpiAutoGP's Julia-based workflow.

Key Changes

  • Introduced forecast_epiautogp.py as the main entry point for the EpiAutoGP forecasting pipeline
  • Added shared pipeline utilities (ForecastPipelineContext, ModelPaths, setup_forecast_pipeline) to reduce code duplication
  • Enhanced Julia EpiAutoGP model to support flexible daily/weekly forecast strides with improved parameter naming

Reviewed changes

Copilot reviewed 25 out of 26 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| pipelines/tests/test_prep_epiautogp_data.py | Removed (tests moved or consolidated) |
| pipelines/tests/test_forecast_utils.py | New unit tests for forecast pipeline utilities using mocks |
| pipelines/tests/test_epiautogp_prep_script.py | Removed preparation script tests |
| pipelines/tests/test_epiautogp_prep.sh | Removed shell test for data preparation |
| pipelines/tests/test_epiautogp_fit.sh | New shell script to run single EpiAutoGP forecasts |
| pipelines/tests/test_epiautogp_end_to_end.sh | New comprehensive end-to-end integration test |
| pipelines/forecast_pyrenew.py | Fixed import organization (moved to absolute imports) |
| pipelines/epiautogp/process_epiautogp_forecast.py | New post-processing utilities for EpiAutoGP outputs |
| pipelines/epiautogp/prep_epiautogp_data.py | Enhanced data conversion with context/paths pattern and ed_visit_type support |
| pipelines/epiautogp/plot_epiautogp_forecast.R | New R plotting script for EpiAutoGP-specific visualizations |
| pipelines/epiautogp/forecast_epiautogp.py | Main pipeline entry point orchestrating all steps |
| pipelines/epiautogp/epiautogp_forecast_utils.py | Shared utilities and dataclasses for pipeline stages |
| pipelines/epiautogp/__init__.py | Updated exports for new utilities |
| pipelines/epiautogp/README.md | Comprehensive documentation of pipeline architecture |
| EpiAutoGP/test/test_parse_arguments.jl | Updated test for renamed parameter |
| EpiAutoGP/test/test_output.jl | Added new required fields to test data |
| EpiAutoGP/test/test_modelling.jl | Updated tests for renamed parameter and added daily frequency test |
| EpiAutoGP/test/test_input.jl | Updated all test inputs with new required fields |
| EpiAutoGP/src/parse_arguments.jl | Renamed n-forecast-weeks to n-ahead for flexibility |
| EpiAutoGP/src/output.jl | Added PipelineOutput type and refactored output creation |
| EpiAutoGP/src/modelling.jl | Updated to support daily/weekly frequencies with time_step calculation |
| EpiAutoGP/src/input.jl | Added frequency, use_percentage, and ed_visit_type fields |
| EpiAutoGP/src/EpiAutoGP.jl | Added Parquet dependency and new constants |
| EpiAutoGP/run.jl | Switched default output type to PipelineOutput |
| EpiAutoGP/Project.toml | Added Parquet dependency and reordered authors field |
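
The move from n-forecast-weeks to a generic n-ahead, combined with a frequency-dependent time step, can be illustrated as below. The actual implementation lives in the Julia package; this Python sketch only shows the idea, and the function name, the step table and the dates are assumptions made for the example.

```python
from datetime import date, timedelta

# Assumed mapping from frequency to the length of one forecast step in days.
TIME_STEP_DAYS = {"daily": 1, "epiweekly": 7}


def forecast_dates(last_observed: date, frequency: str, n_ahead: int) -> list:
    """Return the n_ahead dates that follow last_observed at the given frequency."""
    if frequency not in TIME_STEP_DAYS:
        raise ValueError(f"unknown frequency: {frequency!r}")
    step = timedelta(days=TIME_STEP_DAYS[frequency])
    return [last_observed + step * k for k in range(1, n_ahead + 1)]


# The same n_ahead means "2 weeks" for epiweekly data and "2 days" for daily data.
weekly = forecast_dates(date(2025, 12, 6), "epiweekly", 2)
daily = forecast_dates(date(2025, 12, 6), "daily", 2)
print(weekly, daily)
```

Counting the horizon in time steps rather than weeks is what lets one parameter serve both frequencies.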

Collaborator

@damonbayer damonbayer left a comment

A few open ended questions. I feel somewhat strongly that pipelines/epiautogp/process_epiautogp_forecast.py should be implemented in hewr/R/process_loc_forecast.R. Maintaining both seems risky.

@SamuelBrand1
Collaborator Author

SamuelBrand1 commented Dec 15, 2025

> A few open ended questions. I feel somewhat strongly that pipelines/epiautogp/process_epiautogp_forecast.py should be implemented in hewr/R/process_loc_forecast.R. Maintaining both seems risky.

Can we make an issue to make process_loc_forecast.R more flexible? And a sub-issue to modify to use this?

At the moment, process_loc_forecast.R takes pyrenew_model_name and timeseries_model_name, with the logic depending on which argument is NA. I'd need to hack this quite a lot to make it flexible. I think the cleanest way to do this would be an S3 generic with a method specific to each model, so that a new model only needs a new method.
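
As an illustration of the "one method per model" idea, here is a Python analogue of that kind of dispatch (the proposal itself is for R S3). It is a sketch only: the registry, the handler names and the returned strings are all made up for the example.

```python
from typing import Callable, Dict

# Registry mapping a model type to the function that processes its samples.
_PROCESSORS: Dict[str, Callable[[str], str]] = {}


def register(model_type: str):
    """Decorator that registers a processing function for one model type."""

    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        _PROCESSORS[model_type] = fn
        return fn

    return decorator


@register("pyrenew")
def _process_pyrenew(model_dir: str) -> str:
    return f"pyrenew samples from {model_dir}"


@register("epiautogp")
def _process_epiautogp(model_dir: str) -> str:
    return f"epiautogp samples from {model_dir}"


def process_model_samples(model_type: str, model_dir: str) -> str:
    """Dispatch to the handler registered for model_type."""
    try:
        handler = _PROCESSORS[model_type]
    except KeyError:
        raise ValueError(f"no processing method for model type {model_type!r}") from None
    return handler(model_dir)


print(process_model_samples("epiautogp", "output/CA"))
```

Supporting a new model then means registering one more handler, without touching the arguments of the shared entry point.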

@damonbayer
Collaborator

@SamuelBrand1 I'll make an issue and offer my thoughts.

@SamuelBrand1
Collaborator Author

> @SamuelBrand1 I'll make an issue and offer my thoughts.

Actually I think #781 already covers this?

@SamuelBrand1
Collaborator Author

> A few open ended questions. I feel somewhat strongly that pipelines/epiautogp/process_epiautogp_forecast.py should be implemented in hewr/R/process_loc_forecast.R. Maintaining both seems risky.

Using the S3 generic introduced in #790 I've added a method for handling the epiautogp output and use the hewr postprocessing.

@SamuelBrand1
Collaborator Author

#790 broke the full pipeline at some point in plotting... so this is reblocked.

@SamuelBrand1 SamuelBrand1 force-pushed the 780-add-forecast_epiautogp-function branch from d89d3e4 to 60befce Compare December 17, 2025 21:33
@SamuelBrand1 SamuelBrand1 force-pushed the 780-add-forecast_epiautogp-function branch from 60befce to abe63e8 Compare December 17, 2025 22:46
@SamuelBrand1 SamuelBrand1 force-pushed the 780-add-forecast_epiautogp-function branch from abe63e8 to 26bba0b Compare December 17, 2025 23:00
@SamuelBrand1
Collaborator Author

> A few open ended questions. I feel somewhat strongly that pipelines/epiautogp/process_epiautogp_forecast.py should be implemented in hewr/R/process_loc_forecast.R. Maintaining both seems risky.
>
> Using the S3 generic introduced in #790 I've added a method for handling the epiautogp output and use the hewr postprocessing.

I've rebased the epiautogp branch on the fixed main branch and added a new model_name arg to the interface from pipelines to hewr.

The way this works is that if no model_name is given then it falls back on the current patterns. If a model_name is supplied then hewr detects the model_type by pattern matching, and passes along to the appropriate post-processing method.
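
The fallback-then-detect behaviour described above could look something like this. It is written in Python purely for illustration (the detection itself happens in hewr), and the patterns, model names and function name are assumptions, not the ones hewr uses.

```python
import re
from typing import Optional

# Hypothetical name patterns; the real matching rules live in hewr.
_MODEL_PATTERNS = [
    ("epiautogp", re.compile(r"^epiautogp")),
    ("pyrenew", re.compile(r"^pyrenew")),
    ("timeseries", re.compile(r"^ts_")),
]


def detect_model_type(model_name: Optional[str]) -> Optional[str]:
    """Return the model type implied by model_name, or None to signal fallback."""
    if model_name is None:
        # No model_name supplied: the caller keeps using the existing
        # pyrenew_model_name / timeseries_model_name logic.
        return None
    for model_type, pattern in _MODEL_PATTERNS:
        if pattern.search(model_name):
            return model_type
    raise ValueError(f"cannot detect model type from {model_name!r}")


print(detect_model_type("epiautogp_nssp_daily"))
```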

The idea is to stop endless creep of new model specific parameters, although I acknowledge that the auto-detect pattern is pretty bad for this. That can be fixed in a new issue.

This is a PR into epiautogp, so it doesn't trigger full CI, but I've run the end-to-end integration tests locally for the current models.

SamuelBrand1 and others added 22 commits December 30, 2025 09:41
Replaces the use of process_epiautogp_forecast and a custom R plotting script with plot_and_save_loc_forecast, which handles both sample processing and plotting via hewr. Updates the post_process_forecast method to streamline steps and improve maintainability.
Introduces the process_model_samples.epiautogp S3 method and interface from pipelines
Replaces the epiautogp_model_name parameter with a generic model_name in plot_and_save_loc_forecast and related calls. Updates documentation and tests to reflect the new parameter, enabling auto-detection and dispatching for different model types.
Refactor logic to determine the correct samples file based on both frequency (epiweekly or daily) and target type (NHSN or NSSP) from the model name. This makes the file selection more robust and explicit, and adds error handling for unknown target types.
Introduces a new --n-threads argument to specify the number of threads used for EpiAutoGP computations, defaulting to 1. This allows users to control parallelism directly from the command line.
Introduces forecast_utils.py with dataclasses and functions for setting up, preparing, and postprocessing forecast pipeline runs. Includes comprehensive unit tests for all major utilities, using mocking to isolate dependencies and verify correct logic and file structure handling.
Introduces DEFAULT_TARGET_LETTER mapping for target abbreviations and updates the Parquet output filename in create_forecast_output to use the appropriate target letter for hubverse compatibility. Also adds geo_value and disease columns to the forecast output for improved metadata.
Renamed forecast_utils.py to epiautogp_forecast_utils.py and updated all imports accordingly. Refactored the EpiAutoGP pipeline to use a context object for configuration, streamlined argument passing, and improved modularity. Added a new R plotting script (plot_epiautogp_forecast.R) for EpiAutoGP outputs. Introduced end-to-end and fit test shell scripts for automated testing. Removed obsolete prep test scripts. Updated process_epiautogp_forecast.py to simplify output processing and match R plotting expectations.
Introduces the 'ed_visit_type' parameter to allow selection between 'observed' and 'other' ED visits for NSSP targets throughout the EpiAutoGP pipeline. Updates parameter validation, data extraction, and model naming to support this distinction, and adjusts CLI and function signatures accordingly. Also ensures correct forecast sample file selection based on frequency.
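
The sample file selection described in these commits (frequency plus target type read from the model name, with an error for unknown targets) can be sketched as follows. The real logic is in process_loc_forecast.R; this Python version is illustrative only, and the file naming scheme, the error on an unknown frequency and the model names are assumptions.

```python
def select_samples_file(model_name: str) -> str:
    """Pick a samples file name from the frequency and target in model_name."""
    name = model_name.lower()

    # Frequency: this sketch requires it to be spelled out in the model name.
    if "daily" in name:
        frequency = "daily"
    elif "epiweekly" in name:
        frequency = "epiweekly"
    else:
        raise ValueError(f"unknown frequency in model name {model_name!r}")

    # Target type: NHSN hospital admissions or NSSP ED visits.
    if "nhsn" in name:
        target = "nhsn"
    elif "nssp" in name:
        target = "nssp"
    else:
        raise ValueError(f"unknown target type in model name {model_name!r}")

    # Hypothetical naming scheme, not the one used by the pipeline.
    return f"{frequency}_{target}_samples.parquet"


print(select_samples_file("epiautogp_nssp_daily"))
```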
Collaborator

@damonbayer damonbayer left a comment

Thanks @SamuelBrand1!

@SamuelBrand1 SamuelBrand1 merged commit c52d745 into epiautogp Jan 5, 2026
8 checks passed
@SamuelBrand1 SamuelBrand1 deleted the 780-add-forecast_epiautogp-function branch January 5, 2026 21:55
SamuelBrand1 added a commit that referenced this pull request Jan 5, 2026
* Add shared forecast pipeline utilities and tests

Introduces forecast_utils.py with dataclasses and functions for setting up, preparing, and postprocessing forecast pipeline runs. Includes comprehensive unit tests for all major utilities, using mocking to isolate dependencies and verify correct logic and file structure handling.

* add Parquet dep

* reduce docstring bloat

* Add PipelineOutput support for pipeline forecasts

Outputs to expected parquet format

* Add DEFAULT_TARGET_LETTER and update output filenames

Introduces DEFAULT_TARGET_LETTER mapping for target abbreviations and updates the Parquet output filename in create_forecast_output to use the appropriate target letter for hubverse compatibility. Also adds geo_value and disease columns to the forecast output for improved metadata.

* move utils and rename paths dataclass

* Add use_percentage flag to EpiAutoGPInput and output logic

Introduces a use_percentage boolean field to EpiAutoGPInput to distinguish between raw counts and percentage-based input for ED visits. Updates output logic to set the variable name and convert values to proportions when use_percentage is true for nssp targets. Test cases and input construction are updated accordingly.

* Refactor EpiAutoGP pipeline and add end-to-end tests

Renamed forecast_utils.py to epiautogp_forecast_utils.py and updated all imports accordingly. Refactored the EpiAutoGP pipeline to use a context object for configuration, streamlined argument passing, and improved modularity. Added a new R plotting script (plot_epiautogp_forecast.R) for EpiAutoGP outputs. Introduced end-to-end and fit test shell scripts for automated testing. Removed obsolete prep test scripts. Updated process_epiautogp_forecast.py to simplify output processing and match R plotting expectations.

* Update .gitignore

* Refactor EpiAutoGP post-processing into utility function

Consolidated forecast post-processing steps (processing outputs, creating hubverse table, and plotting) into a single post_process_forecast utility in epiautogp_forecast_utils.py. Updated imports and usage in __init__.py and forecast_epiautogp.py for improved modularity and code reuse. Added param_data_dir to ForecastPipelineContext and setup_forecast_pipeline.

* Refactor forecast utils to use context methods

Moved prepare_model_data and post_process_forecast functions into ForecastPipelineContext as methods. Updated imports and usage in forecast_epiautogp.py and __init__.py to use the new class methods, improving encapsulation and code organization.

* Update README.md

* Add frequency to input and generalize forecast horizon

Introduces a 'frequency' field to EpiAutoGPInput to support both daily and epiweekly data. Refactors modelling and argument parsing to use a generic 'n_ahead' parameter (number of time steps) instead of 'n_forecast_weeks', and updates all related documentation, tests, and function signatures for consistency and flexibility.

* Add ed_visit_type to input and output handling

Introduces the ed_visit_type field to EpiAutoGPInput for specifying the type of ED visits, updates output logic to use this field for column selection, and adjusts tests and documentation accordingly. Also updates output file naming to use the frequency prefix.

* Add ed_visit_type param for NSSP/ED visit modeling

Introduces the 'ed_visit_type' parameter to allow selection between 'observed' and 'other' ED visits for NSSP targets throughout the EpiAutoGP pipeline. Updates parameter validation, data extraction, and model naming to support this distinction, and adjusts CLI and function signatures accordingly. Also ensures correct forecast sample file selection based on frequency.

* Add daily NSSP forecast tests and support for ED visit type

Expanded end-to-end and fit test scripts to include daily NSSP count and 'other ED visits' forecasts. Updated argument handling in test_epiautogp_fit.sh to support an optional ed_visit_type parameter and adjusted expected model counts accordingly.

* Refactor forecast utils tests and remove prep_epiautogp tests

Updated test_forecast_utils.py to use new ForecastPipelineContext interface, updated patch paths, and migrated to context methods for prepare_model_data and post_process_forecast. Removed test_prep_epiautogp_data.py as part of test suite cleanup.

* update epiautogp docstrings

* Update prep_epiautogp_data.py

* Update output.jl

* add nhsn test coverage

* reorg unit tests

* caught anti-pattern

* Update pipelines/epiautogp/process_epiautogp_forecast.py

Co-authored-by: Copilot <[email protected]>

* explain use of percentage

* Refactor forecast post-processing to use hewr plotting

Replaces the use of process_epiautogp_forecast and a custom R plotting script with plot_and_save_loc_forecast, which handles both sample processing and plotting via hewr. Updates the post_process_forecast method to streamline steps and improve maintainability.

* remove redundant files

* update tests due to removed funcs

* Add EpiAutoGP model support to process_loc_forecast

Introduces the process_model_samples.epiautogp S3 method and interface from pipelines

* Refactor to use generic model_name in forecast utilities

Replaces the epiautogp_model_name parameter with a generic model_name in plot_and_save_loc_forecast and related calls. Updates documentation and tests to reflect the new parameter, enabling auto-detection and dispatching for different model types.

* Improve sample file selection in process_loc_forecast.R

Refactor logic to determine the correct samples file based on both frequency (epiweekly or daily) and target type (NHSN or NSSP) from the model name. This makes the file selection more robust and explicit, and adds error handling for unknown target types.

* Add n-threads argument to CLI for EpiAutoGP

Introduces a new --n-threads argument to specify the number of threads used for EpiAutoGP computations, defaulting to 1. This allows users to control parallelism directly from the command line.

* Add shared forecast pipeline utilities and tests

Introduces forecast_utils.py with dataclasses and functions for setting up, preparing, and postprocessing forecast pipeline runs. Includes comprehensive unit tests for all major utilities, using mocking to isolate dependencies and verify correct logic and file structure handling.

* change to relative imports

* Add DEFAULT_TARGET_LETTER and update output filenames

Introduces DEFAULT_TARGET_LETTER mapping for target abbreviations and updates the Parquet output filename in create_forecast_output to use the appropriate target letter for hubverse compatibility. Also adds geo_value and disease columns to the forecast output for improved metadata.

* move utils and rename paths dataclass

* Refactor EpiAutoGP pipeline and add end-to-end tests

Renamed forecast_utils.py to epiautogp_forecast_utils.py and updated all imports accordingly. Refactored the EpiAutoGP pipeline to use a context object for configuration, streamlined argument passing, and improved modularity. Added a new R plotting script (plot_epiautogp_forecast.R) for EpiAutoGP outputs. Introduced end-to-end and fit test shell scripts for automated testing. Removed obsolete prep test scripts. Updated process_epiautogp_forecast.py to simplify output processing and match R plotting expectations.

* Add ed_visit_type param for NSSP/ED visit modeling

Introduces the 'ed_visit_type' parameter to allow selection between 'observed' and 'other' ED visits for NSSP targets throughout the EpiAutoGP pipeline. Updates parameter validation, data extraction, and model naming to support this distinction, and adjusts CLI and function signatures accordingly. Also ensures correct forecast sample file selection based on frequency.

* update epiautogp docstrings

* caught anti-pattern

* Update pipelines/epiautogp/process_epiautogp_forecast.py

Co-authored-by: Copilot <[email protected]>

* remove redundant files

---------

Co-authored-by: Copilot <[email protected]>
SamuelBrand1 added a commit that referenced this pull request Jan 8, 2026