Move general datetime indexing functions from pyrenew-hew to PyRenew #709
for more information, see https://pre-commit.ci
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #709 +/- ##
===========================================
- Coverage 26.08% 14.82% -11.26%
===========================================
Files 26 22 -4
Lines 2427 1862 -565
===========================================
- Hits 633 276 -357
+ Misses 1794 1586 -208
…yrenew-hew into mem_refactor_datetime_indexing
- Replace date-to-model-time list comprehensions with align_observation_times()
- Fix Saturday edge case in to_forecast_data() to correctly add 7 days when start date is already a Saturday
- Replace complex ceiling division with get_first_week_on_or_after_t0() utility
- Add test for Saturday edge case
- Update test assertion to allow up to 7 days offset for MMWR week alignment
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
- Rename date properties (e.g., dates_observed_ed_visits → ed_dates)
- Rename first/last date properties (e.g., first_ed_visits_date → ed_first_date)
- Rename observation data (e.g., data_observed_disease_ed_visits → ed_visit_count)
- Rename time indices (e.g., model_t_obs_ed_visits → ed_obs_time)
- Remove model_ prefix from time variables (e.g., model_t_first_latent_infection → t_first_infection)
- Rename array indices (e.g., which_obs_ed_visits → ed_obs_idx)
- Change to singular form (e.g., ww_observed_subpops → ww_obs_subpop)
- Update all tests accordingly
- All 49 tests passing
…or_datetime_indexing
…or_datetime_indexing
…yrenew-hew into mem_refactor_datetime_indexing
Hey @cdc-mitzimorris, run uv sync --upgrade-package pyrenew, commit the result, and I think the CI will work (at least the local Python tests passed on my machine).
Pull Request Overview
This PR moves general datetime indexing and time handling utilities from the pyrenew-hew package to the PyRenew core library to enable code reuse across projects. It replaces custom datetime manipulation logic with standardized functions from PyRenew's time module.
Key changes:
- Replaces custom date validation and time indexing functions with imports from pyrenew.time
- Updates hospital admissions date calculation logic to use centralized utilities
- Adds comprehensive test coverage for edge cases in datetime handling
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/test_pyrenew_hew_data.py | Adds extensive test coverage for datetime edge cases and refactors existing tests to use datetime.date objects |
| pyrenew_hew/pyrenew_hew_model.py | Updates hospital admissions calculation to use get_first_week_on_or_after_t0 utility function |
| pyrenew_hew/pyrenew_hew_data.py | Removes custom datetime functions and replaces with imports from pyrenew.time module |
…or_datetime_indexing
tests/test_pyrenew_hew_data.py
Outdated
  (forecast_data.first_hospital_admissions_date - data.first_data_date_overall)
  / np.timedelta64(1, "D")
- ).item() <= 6
+ ).item() <= 7
There are no changes to the parametrize setup, so why does this need to change?
Possibly related to test_to_forecast_data_saturday_edge_case? Which maybe indicates there was a bug fixed when migrating from pyrenew-hew functions to base pyrenew functions?
yes, there was a bug.
In the old code (pyrenew_hew/pyrenew_hew_data.py:435):
first_dow = self.first_data_date_overall.astype(dt.datetime).weekday()
to_first_sat = (5 - first_dow) % 7 # BUG: Returns 0 when already on Saturday!
first_mmwr_ending_date = self.first_data_date_overall + np.timedelta64(to_first_sat, "D")
In the new code:
start_date = convert_date(self.first_data_date_overall)
first_dow = start_date.weekday()
days_to_first_saturday = (5 - first_dow) % 7
if days_to_first_saturday == 0: # FIX: If already Saturday, skip to next Saturday
days_to_first_saturday = 7
first_mmwr_ending_date = self.first_data_date_overall + np.timedelta64(days_to_first_saturday, "D")
The Problem:
- When data starts on a Saturday (weekday 5), the old calculation gave: (5 - 5) % 7 = 0
- This meant the first hospital admissions date was the same day (adding 0 days)
- But hospital admissions are weekly data - you need at least one full week!
The Fix:
- The new code explicitly checks: if already on Saturday, use the next Saturday (7 days later)
- This is demonstrated by the new test test_to_forecast_data_saturday_edge_case (lines 132-159)
Why the tolerance changed:
- Before: Worst case was Sunday → Saturday = 6 days
- After: Worst case is Saturday → next Saturday = 7 days
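The arithmetic in the two quoted calculations can be checked in isolation. The following is a minimal standard-library sketch; the function names `days_to_saturday_old` and `days_to_saturday_new` are hypothetical stand-ins for the two snippets above, not functions from pyrenew-hew or PyRenew.

```python
import datetime as dt


def days_to_saturday_old(start: dt.date) -> int:
    """Offset to the first Saturday on or after `start` (old calculation)."""
    # date.weekday(): Monday == 0, ..., Saturday == 5, Sunday == 6
    return (5 - start.weekday()) % 7


def days_to_saturday_new(start: dt.date) -> int:
    """Same offset, except a Saturday start skips to the following Saturday."""
    offset = (5 - start.weekday()) % 7
    return 7 if offset == 0 else offset


saturday = dt.date(2024, 1, 6)  # a Saturday
sunday = dt.date(2024, 1, 7)  # the following Sunday

# Saturday start: old gives 0, new gives 7.
print(days_to_saturday_old(saturday), days_to_saturday_new(saturday))
# Sunday start: both give 6.
print(days_to_saturday_old(sunday), days_to_saturday_new(sunday))
```

The two versions differ only when the start date is a Saturday; for every other weekday they return the same offset.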
> But hospital admissions are weekly data - you need at least one full week!
By this logic, we should always be jumping to the second Saturday and requiring 7 <= forecast_data.first_hospital_admissions_date - data.first_data_date_overall < 14, no?
You're right, this doesn't fix the bug. I updated the checks in the code as well as the unit test, and there is more checking now. The test is what you suggest here, except that since days are indexed from zero, the range is 6 to 12 inclusive.
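One way to see where the 6 to 12 range comes from is to compute the end of the first complete Sunday-Saturday week for each possible starting weekday. This is a minimal sketch under that reading of the comment; `first_complete_mmwr_week_end` is a hypothetical helper written for illustration, not the function used in this PR or in pyrenew.time.

```python
import datetime as dt


def first_complete_mmwr_week_end(start: dt.date) -> dt.date:
    """Saturday ending the first full Sunday-Saturday week that begins on or after `start`."""
    # date.weekday(): Monday == 0, ..., Saturday == 5, Sunday == 6
    days_to_sunday = (6 - start.weekday()) % 7
    # A full week runs from that Sunday through the Saturday 6 days later.
    return start + dt.timedelta(days=days_to_sunday + 6)


# Exercise every possible starting weekday over one calendar week.
monday = dt.date(2024, 1, 1)  # a Monday
for i in range(7):
    start = monday + dt.timedelta(days=i)
    days_diff = (first_complete_mmwr_week_end(start) - start).days
    assert 6 <= days_diff <= 12
    print(start.isoformat(), days_diff)
```

Under this definition a Sunday start gives 6 days, a Saturday start gives 7, and a Monday start gives 12, which matches the bounds asserted in the updated test.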
This reverts commit ed259b0.
+ # First complete MMWR week (Sunday-Saturday) ends 6-12 days from start
+ # 6 days if starting on Sunday, 12 days if starting on Monday
+ days_diff = (
  (forecast_data.first_hospital_admissions_date - data.first_data_date_overall)
  / np.timedelta64(1, "D")
- ).item() <= 6
+ ).item()
+ assert 6 <= days_diff <= 12
this is the changed code under discussion.
We have resolved to add the new tests first and then refactor. This will guarantee that the existing behavior is replicated.
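A tests-first refactor of this kind is often done by pinning the current behavior with a characterization test and then requiring the refactored code to match it on every input. The sketch below illustrates the idea only; `legacy_offset` and `refactored_offset` are hypothetical stand-ins, not the actual pyrenew-hew or PyRenew functions.

```python
import datetime as dt


def legacy_offset(start: dt.date) -> int:
    """Hypothetical stand-in for the existing, pre-refactor logic."""
    return (5 - start.weekday()) % 7


def refactored_offset(start: dt.date) -> int:
    """Hypothetical stand-in for the refactored logic under test."""
    days_since_sunday = (start.weekday() + 1) % 7  # Sunday -> 0, Saturday -> 6
    return 6 - days_since_sunday


def test_refactor_preserves_behavior() -> None:
    # Two full weeks, so every starting weekday is exercised twice.
    first = dt.date(2024, 1, 1)
    for i in range(14):
        start = first + dt.timedelta(days=i)
        assert refactored_offset(start) == legacy_offset(start), start


test_refactor_preserves_behavior()
```

Any intended behavior change, such as a different treatment of a Saturday start, would then show up as an explicit, reviewed change to the test rather than as a side effect of the refactor.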
Closing this PR - will do over.
This PR is the companion to CDCgov/PyRenew#600, which adds general utilities for handling time indexing and time slicing to PyRenew, drawn from code that was implemented in pyrenew-hew and is a good candidate for reuse.