Hc dol trim gender update #105
@s-french to review
s-french
left a comment
I'm getting an error on line 88 saying unit 'M' is not supported. Having a google, 'm' is minutes. Not sure how this worked before, but ChatGPT is suggesting the following:
✅ Solution: Use pd.DateOffset or divide by approximate month length
🔹 Option 1: Use .dt.to_period() (recommended if you're comparing calendar months)
If you're trying to get the number of calendar months between two dates:
dol_table['ordspan_months'] = (
    dol_table['ordfindate'].dt.to_period('M') - dol_table['orderdate'].dt.to_period('M')
).apply(lambda x: x.n)
Explanation:
.dt.to_period('M') converts dates to monthly periods (e.g., 2024-01).
Subtracting periods gives a MonthEnd offset, whose .n attribute is the integer number of months.
This gives the calendar-month difference, ignoring the day of the month (e.g. Jan 31 to Feb 1 = 1 month).
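As a minimal sketch of the suggestion above: the snippet below builds a small made-up dol_table (only the column names orderdate, ordfindate and ordspan_months come from the thread; the dates are invented for illustration) and computes the month span by period subtraction. It also shows a plain year/month arithmetic version as a cross-check. The original line 88 is not shown in this thread, so this is not a reconstruction of it.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the thread.
dol_table = pd.DataFrame({
    "orderdate": pd.to_datetime(["2024-01-31", "2024-01-15", "2024-03-01"]),
    "ordfindate": pd.to_datetime(["2024-02-01", "2024-04-10", "2024-03-31"]),
})

# Period subtraction: each element is a MonthEnd offset, and .n is its
# integer multiple (the number of calendar months between the periods).
period_diff = (
    dol_table["ordfindate"].dt.to_period("M")
    - dol_table["orderdate"].dt.to_period("M")
)
dol_table["ordspan_months"] = period_diff.apply(lambda x: x.n)

# Cross-check using year/month arithmetic, which avoids the .apply call.
by_arithmetic = (
    (dol_table["ordfindate"].dt.year - dol_table["orderdate"].dt.year) * 12
    + (dol_table["ordfindate"].dt.month - dol_table["orderdate"].dt.month)
)

print(dol_table["ordspan_months"].tolist())  # [1, 3, 0]
```

The first row shows the day of the month being ignored: 31 Jan to 1 Feb counts as 1 month even though only one day has elapsed. Rows with missing dates would need handling separately before either approach is applied.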
@s-french Line 88 error corrected, and added code to redact any genders that are not male/female and group them under "other"
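The grouping step described above could look something like the sketch below. The column name sex and the category labels are assumptions for illustration; the actual code in the PR is not shown in this thread.

```python
import pandas as pd

# Hypothetical data; the column name and labels are assumed, not taken from the PR.
df = pd.DataFrame({"sex": ["Male", "Female", "Non-binary", "Unknown", None]})

KEEP = ["Male", "Female"]  # categories reported as-is

# Keep values in KEEP; replace everything else (including missing values) with "Other".
df["sex"] = df["sex"].where(df["sex"].isin(KEEP), "Other")

print(df["sex"].tolist())  # ['Male', 'Female', 'Other', 'Other', 'Other']
```

Note that missing values also land in "Other" here; if they should stay missing, they would need excluding before the replacement.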
Checked new CSV without region column against old output, same results in QA file, left a copy in the HC DOL update folder - "DoL Table 2025 Q3 - region removed"
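A check along these lines can also be scripted. The sketch below is only an illustration under assumptions: the column names (quarter, sex, region, count) and the helper matches_after_dropping are hypothetical, and it assumes the old output was broken down by region so that it has to be summed over region before comparing.

```python
import pandas as pd

def matches_after_dropping(old, new, dropped="region", value="count"):
    """Return True if `new` equals `old` summed over the `dropped` column."""
    if dropped in new.columns:
        return False  # the column should be gone from the new output
    keys = [c for c in old.columns if c not in (dropped, value)]
    # Collapse the old output over the dropped column, then align row order.
    collapsed = old.groupby(keys, as_index=False)[value].sum()
    collapsed = collapsed.sort_values(keys).reset_index(drop=True)
    new_sorted = new.sort_values(keys).reset_index(drop=True)
    return collapsed[keys + [value]].equals(new_sorted[keys + [value]])

# Hypothetical old output (with region) and new output (without).
old = pd.DataFrame({
    "quarter": ["2025Q3"] * 4,
    "sex": ["Male", "Male", "Female", "Female"],
    "region": ["A", "B", "A", "B"],
    "count": [1, 2, 3, 4],
})
new = pd.DataFrame({
    "quarter": ["2025Q3", "2025Q3"],
    "sex": ["Female", "Male"],
    "count": [7, 3],
})

print(matches_after_dropping(old, new))  # True
```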
s-french
left a comment
Hi @alee5646, have looked and it runs fine, so good work fixing that M issue. There are a few things below that need addressing, as really the code should be clean re region/gender from the start rather than having fixes added towards the end. We've been the victim of code where patches are applied on top of each other and it quickly becomes very unclear, so would appreciate this file being cleaned up as outlined below:
Code chunk 7 (importing region lookup) is redundant so should be deleted
Code chunk 8 doesn't need to join to the region lookup so needs amending
Ditto chunk 9
Chunks 10-12 and 14-17 refer to gender rather than sex - can now see this is dealt with in chunk 20. Can't say the shortcut is preferred, better to fix from the start (a ctrl+f should find all references easily)
Chunk 12 still includes 'child region of DoL' - suspect this isn't needed (subsequent queries may need adapting accordingly)
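To illustrate the "fix from the start" point in the list above, here is a minimal sketch in which the column is renamed once, straight after the data is loaded, so every later chunk can refer to sex directly. The frame and column names are hypothetical; the notebook's actual chunks are not shown in this thread and may use queries rather than pandas.

```python
import pandas as pd

# Hypothetical raw extract that still uses the old column name.
raw = pd.DataFrame({"gender": ["Male", "Female", "Male"], "orders": [10, 12, 5]})

# Rename once, at the point of import, rather than patching the output at the end.
dol = raw.rename(columns={"gender": "sex"})

# Downstream chunks then only ever mention "sex".
summary = dol.groupby("sex", as_index=False)["orders"].sum()
print(list(summary.columns))  # ['sex', 'orders']
```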
Also, just noted that the Q3 version of the CSV still has region in it so needs fixing
I was hanging on swapping the final CSV file, just in case the new one needed changes; have swapped the final CSV over now
Removed party_type from output, commented out region lookup join for now, will come back and tidy up areas
s-french
left a comment
all looks good for the DoLs workbook thanks
There are a number of divorce checkpoint files in this PR as well - I don't know the impact of merging these in tbh so can you remove them please? I'll then merge once clear (possibly a result of using 'git add .', which stages every changed file under the current folder rather than just the specific files you name?)
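The difference between staging everything and staging a named folder can be seen in a throwaway repository. The sketch below drives git from Python so it is self-contained; it assumes git is installed, and the folder and file names (DoL, divorce, table.csv, x-checkpoint.ipynb) are made up for illustration.

```python
import subprocess
import tempfile
from pathlib import Path

def git(repo, *args):
    """Run a git command inside `repo` and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout

with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    git(repo, "init", "-q")

    # One intended change and one stray notebook checkpoint file.
    (repo / "DoL").mkdir()
    (repo / "DoL" / "table.csv").write_text("a,b\n")
    (repo / "divorce" / ".ipynb_checkpoints").mkdir(parents=True)
    (repo / "divorce" / ".ipynb_checkpoints" / "x-checkpoint.ipynb").write_text("{}")

    # Stage only the named folder instead of running `git add .`.
    git(repo, "add", "DoL")
    staged = git(repo, "diff", "--cached", "--name-only").splitlines()

print(staged)  # ['DoL/table.csv']
```

Running `git status` before committing gives the same visibility in day-to-day use, since it lists exactly which files are staged.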
f266c6a to 3155ee4
3155ee4 to ab80904
Yeah, I think I accidentally did git add data and pushed the whole folder instead of just the DoL folder. Little bit of a pain to get rid of the checkpoint files but figured out a method in the end: stash the DoL changes, merge development into the branch (make sure local is up to date!), then bring the stashed DoL changes back in. Everything looks OK now, with only changes in the DoL folder.
Summary
Files updated:
Reason for change: What was the problem?
Gender to sex update in column name
Changes made:
Checks for creator
Delete all non-relevant lines below, and then tick off as completed:
Closing issues
Replace the XXX with the issue numbers for any issues that can now be closed:
Close #XXX
Checks for reviewer
Creator - delete any not relevant. Reviewer - tick as completed: