@dylanhmorris
Contributor

Closes #810 and adds a test that would have caught it.

@codecov
codecov bot commented Dec 30, 2025

Codecov Report

❌ Patch coverage is 90.32258% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 42.37%. Comparing base (e84032d) to head (7ac2807).
⚠️ Report is 1 commit behind head on main.

Files with missing lines       Patch %   Lines
pipelines/prep_eval_data.py    0.00%     2 Missing ⚠️
pipelines/prep_data.py         83.33%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #811      +/-   ##
==========================================
+ Coverage   41.82%   42.37%   +0.54%     
==========================================
  Files          30       30              
  Lines        2809     2834      +25     
==========================================
+ Hits         1175     1201      +26     
+ Misses       1634     1633       -1     
Flag          Coverage Δ
hewr          37.97% <ø> (ø)
pipelines     37.33% <90.32%> (+1.24%) ⬆️
pyrenew_hew   62.29% <ø> (ø)

@dylanhmorris requested a review from Copilot December 30, 2025 00:39
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes a bug in the data preparation pipeline by refactoring duplicated pivot/filter logic into a reusable clean_nssp_data function. The fix ensures that the last_data_date filter is applied before pivoting the data, which resolves issue #810. A comprehensive parametrized test is added to validate the fix and prevent regression.

Key changes:

  • Introduces clean_nssp_data function to centralize NSSP data cleaning logic
  • Refactors prep_eval_data.py and prep_data.py to use the new function
  • Adds parametrized test coverage for clean_nssp_data with various edge cases
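The ordering described above (filter on last_data_date first, then pivot, then derive the non-target count) can be illustrated with a minimal, standard-library-only sketch. This is not the code merged in this PR: the function name clean_nssp_data_sketch, the row keys "date", "disease", and "ed_visits", the "Total" label, and the output field names are all assumptions chosen for illustration, and the real clean_nssp_data in pipelines/prep_data.py may use a dataframe library and a different signature.

```python
from collections import defaultdict
from datetime import date


def clean_nssp_data_sketch(rows, disease, last_data_date=None):
    """Hypothetical sketch: filter long-format rows by date, then pivot.

    `rows` is a list of dicts with assumed keys "date", "disease", and
    "ed_visits"; one of the disease labels is assumed to be "Total".
    """
    # Step 1: apply the date cutoff BEFORE pivoting, so rows past the
    # cutoff cannot leak into the wide table.
    if last_data_date is not None:
        rows = [r for r in rows if r["date"] <= last_data_date]

    # Step 2: pivot from long to wide, giving one dict of counts per date.
    wide = defaultdict(dict)
    for r in rows:
        wide[r["date"]][r["disease"]] = r["ed_visits"]

    # Step 3: derive the non-target count by subtracting the target
    # disease's visits from the total.
    result = []
    for day in sorted(wide):
        counts = wide[day]
        if disease not in counts or "Total" not in counts:
            continue  # skip dates that lack either required series
        observed = counts[disease]
        result.append(
            {
                "date": day,
                "observed_ed_visits": observed,
                "other_ed_visits": counts["Total"] - observed,
            }
        )
    return result


# Example input: three days of made-up counts in long format.
example_rows = [
    {"date": date(2025, 12, 1), "disease": "COVID-19", "ed_visits": 5},
    {"date": date(2025, 12, 1), "disease": "Total", "ed_visits": 100},
    {"date": date(2025, 12, 2), "disease": "COVID-19", "ed_visits": 7},
    {"date": date(2025, 12, 2), "disease": "Total", "ed_visits": 120},
    {"date": date(2025, 12, 3), "disease": "COVID-19", "ed_visits": 9},
    {"date": date(2025, 12, 3), "disease": "Total", "ed_visits": 130},
]

cleaned = clean_nssp_data_sketch(
    example_rows, "COVID-19", last_data_date=date(2025, 12, 2)
)
```

With the cutoff set to 2025-12-02, the sketch keeps only the first two dates, and each kept row's other_ed_visits equals the total minus the target disease's visits. A parametrized test in the spirit of the one this PR adds would vary the cutoff and the target disease and check both properties.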

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File                                Description
pipelines/prep_data.py              Adds the new clean_nssp_data function and refactors process_and_save_loc_data to use it
pipelines/prep_eval_data.py         Refactors save_eval_data to use the new clean_nssp_data function
pipelines/tests/test_prep_data.py   Adds a comprehensive parametrized test for the clean_nssp_data function

Collaborator

@sbidari sbidari left a comment

LGTM!
(accepted one Copilot suggestion for a typo)

@dylanhmorris enabled auto-merge (squash) December 30, 2025 00:50
@dylanhmorris merged commit 46529b9 into main Dec 30, 2025
21 checks passed
@dylanhmorris deleted the dhm-hotfix-prep-data-bug branch December 30, 2025 00:51
Development

Successfully merging this pull request may close these issues.

other_ed_visit computation in prep_data.py fails to subtract target ed visits

3 participants