Fix prep data bug #811
Codecov Report
❌ Patch coverage is
Additional details and impacted files

Coverage Diff

| Metric | main | #811 | +/- |
|---|---|---|---|
| Coverage | 41.82% | 42.37% | +0.54% |
| Files | 30 | 30 | |
| Lines | 2809 | 2834 | +25 |
| Hits | 1175 | 1201 | +26 |
| Misses | 1634 | 1633 | -1 |
Pull request overview
This PR fixes a bug in the data preparation pipeline by refactoring duplicated pivot/filter logic into a reusable clean_nssp_data function. The fix ensures that the last_data_date filter is applied before pivoting the data, which resolves issue #810. A comprehensive parametrized test is added to validate the fix and prevent regression.
Key changes:
- Introduces clean_nssp_data function to centralize NSSP data cleaning logic
- Refactors prep_eval_data.py and prep_data.py to use the new function
- Adds parametrized test coverage for clean_nssp_data with various edge cases
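
To illustrate the ordering described above (filter on last_data_date first, pivot second), here is a minimal sketch in plain Python. It is not the code from this PR: the signature, the row layout (dicts with "date", "disease", and "ed_visits" keys), and the sample values are all assumptions made for illustration, and the real clean_nssp_data presumably operates on a data frame rather than a list of dicts.

```python
from collections import defaultdict
from datetime import date


def clean_nssp_data(rows, last_data_date=None):
    """Filter long-format rows by date, then pivot to one record per date.

    Hypothetical sketch: the key names "date", "disease", and "ed_visits"
    and this signature are assumptions, not the project's actual API.
    """
    # Step 1: apply the cutoff first, so later rows never reach the pivot.
    if last_data_date is not None:
        rows = [r for r in rows if r["date"] <= last_data_date]

    # Step 2: pivot long -> wide, with one key per disease seen after filtering.
    diseases = sorted({r["disease"] for r in rows})
    wide = defaultdict(dict)
    for r in rows:
        wide[r["date"]][r["disease"]] = r["ed_visits"]

    # Missing (date, disease) combinations are filled with None.
    return [
        {"date": d, **{name: wide[d].get(name) for name in diseases}}
        for d in sorted(wide)
    ]


# Made-up sample data: the last row falls after the cutoff used below.
rows = [
    {"date": date(2024, 1, 1), "disease": "COVID-19", "ed_visits": 5},
    {"date": date(2024, 1, 1), "disease": "Total", "ed_visits": 100},
    {"date": date(2024, 1, 2), "disease": "COVID-19", "ed_visits": 7},
    {"date": date(2024, 1, 2), "disease": "Total", "ed_visits": 110},
    {"date": date(2024, 1, 3), "disease": "Influenza", "ed_visits": 2},
]

cleaned = clean_nssp_data(rows, last_data_date=date(2024, 1, 2))
```

In this sketch, filtering first means the pivot only ever sees rows on or before the cutoff, so a value that appears only in later rows (here, the made-up "Influenza" row) cannot add a column to the wide output. Whether that is the exact failure mode behind #810 is not stated in this conversation; the sketch only shows why the order of the two steps can matter.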
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pipelines/prep_data.py | Adds the new clean_nssp_data function and refactors process_and_save_loc_data to use it |
| pipelines/prep_eval_data.py | Refactors save_eval_data to use the new clean_nssp_data function |
| pipelines/tests/test_prep_data.py | Adds comprehensive parametrized test for clean_nssp_data function |
LGTM!
(accepted one copilot suggestion for typo)
Co-authored-by: Copilot <[email protected]>
Closes #810 and adds a test that would have caught it.