Skip to content

Conversation

@cdc-mitzimorris
Copy link
Contributor

This PR is the companion to CDCgov/PyRenew#600 which adds general utilities for handing time indexing and time slicing to PyRenew by identifying code which was implemented in pyrenew-hew and is a good candidate for reuse.

@codecov
Copy link

codecov bot commented Sep 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 14.82%. Comparing base (043fae3) to head (f6a992a).

❗ There is a different number of reports uploaded between BASE (043fae3) and HEAD (f6a992a). Click for more details.

HEAD has 1 upload less than BASE
Flag BASE (043fae3) HEAD (f6a992a)
pyrenew_hew 1 0
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #709       +/-   ##
===========================================
- Coverage   26.08%   14.82%   -11.26%     
===========================================
  Files          26       22        -4     
  Lines        2427     1862      -565     
===========================================
- Hits          633      276      -357     
+ Misses       1794     1586      -208     
Flag Coverage Δ
hewr 29.93% <ø> (ø)
pipelines 3.05% <ø> (ø)
pyrenew_hew ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

cdc-mitzimorris and others added 10 commits September 30, 2025 17:26
…yrenew-hew into mem_refactor_datetime_indexing
- Replace date-to-model-time list comprehensions with align_observation_times()
- Fix Saturday edge case in to_forecast_data() to correctly add 7 days when start date is already a Saturday
- Replace complex ceiling division with get_first_week_on_or_after_t0() utility
- Add test for Saturday edge case
- Update test assertion to allow up to 7 days offset for MMWR week alignment

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
- Rename date properties (e.g., dates_observed_ed_visits → ed_dates)
- Rename first/last date properties (e.g., first_ed_visits_date → ed_first_date)
- Rename observation data (e.g., data_observed_disease_ed_visits → ed_visit_count)
- Rename time indices (e.g., model_t_obs_ed_visits → ed_obs_time)
- Remove model_ prefix from time variables (e.g., model_t_first_latent_infection → t_first_infection)
- Rename array indices (e.g., which_obs_ed_visits → ed_obs_idx)
- Change to singular form (e.g., ww_observed_subpops → ww_obs_subpop)
- Update all tests accordingly
- All 49 tests passing
@SamuelBrand1
Copy link
Collaborator

SamuelBrand1 commented Oct 8, 2025

Hey @cdc-mitzimorris ,

The uv.lock file is committed to the repo and that specifies [email protected] rather than just the latest version. You need to refresh that with

% uv sync --upgrade-package pyrenew  

commit it and I think the CI will work (at least the local python tests did on my machine).

@cdc-mitzimorris cdc-mitzimorris marked this pull request as ready for review October 8, 2025 17:51
@SamuelBrand1 SamuelBrand1 requested a review from Copilot October 8, 2025 17:55
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR moves general datetime indexing and time handling utilities from the pyrenew-hew package to the PyRenew core library to enable code reuse across projects. It replaces custom datetime manipulation logic with standardized functions from PyRenew's time module.

Key changes:

  • Replaces custom date validation and time indexing functions with imports from pyrenew.time
  • Updates hospital admissions date calculation logic to use centralized utilities
  • Adds comprehensive test coverage for edge cases in datetime handling

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
tests/test_pyrenew_hew_data.py Adds extensive test coverage for datetime edge cases and refactors existing tests to use datetime.date objects
pyrenew_hew/pyrenew_hew_model.py Updates hospital admissions calculation to use get_first_week_on_or_after_t0 utility function
pyrenew_hew/pyrenew_hew_data.py Removes custom datetime functions and replaces with imports from pyrenew.time module

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

(forecast_data.first_hospital_admissions_date - data.first_data_date_overall)
/ np.timedelta64(1, "D")
).item() <= 6
).item() <= 7
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no changes to the parametrize setup, so why does this need to change?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly related to test_to_forecast_data_saturday_edge_case? Which maybe indicates there was a bug fixed when migrating from pyrenew-hew functions to base pyrenew functions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, there was a bug.

In the old code (pyrenew_hew/pyrenew_hew_data.py:435):

  first_dow = self.first_data_date_overall.astype(dt.datetime).weekday()
  to_first_sat = (5 - first_dow) % 7  # BUG: Returns 0 when already on Saturday!
  first_mmwr_ending_date = self.first_data_date_overall + np.timedelta64(to_first_sat, "D")

In the new code:

  start_date = convert_date(self.first_data_date_overall)
  first_dow = start_date.weekday()
  days_to_first_saturday = (5 - first_dow) % 7
  if days_to_first_saturday == 0:  # FIX: If already Saturday, skip to next Saturday
      days_to_first_saturday = 7
  first_mmwr_ending_date = self.first_data_date_overall + np.timedelta64(days_to_first_saturday, "D")

The Problem:

  • When data starts on a Saturday (weekday 5), the old calculation gave: (5 - 5) % 7 = 0
  • This meant the first hospital admissions date was the same day (adding 0 days)
  • But hospital admissions are weekly data - you need at least one full week!

The Fix:

  • The new code explicitly checks: if already on Saturday, use the next Saturday (7 days later)
  • This is demonstrated by the new test test_to_forecast_data_saturday_edge_case (lines 132-159)

Why the tolerance changed:

  • Before: Worst case was Sunday → Saturday = 6 days
  • After: Worst case is Saturday → next Saturday = 7 days

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But hospital admissions are weekly data - you need at least one full week!

By this logic, we should always be jumping to the second Saturday and requiring 7 <= forecast_data.first_hospital_admissions_date - data.first_data_date_overall < 14 , no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right - this doesn't fix the bug. I updated the checks in the code as well as the unit test - there's more checking. the test is what you suggest here, except that since days are indexed from zero, the range is between 6 and 12.

Comment on lines +123 to +129
# First complete MMWR week (Sunday-Saturday) ends 6-12 days from start
# 6 days if starting on Sunday, 12 days if starting on Monday
days_diff = (
(forecast_data.first_hospital_admissions_date - data.first_data_date_overall)
/ np.timedelta64(1, "D")
).item() <= 6
).item()
assert 6 <= days_diff <= 12
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the changed code under discussion.

@damonbayer
Copy link
Collaborator

We have resolved to add the new tests first and then refactor. This will guarantee that the existing behavior is replicated.

@cdc-mitzimorris
Copy link
Contributor Author

closing this PR - will do over.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants