-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RDRP- 1143_function_to_sum_emp_sample #415
base: develop
Are you sure you want to change the base?
Conversation
src/estimation/calculate_weights.py
Outdated
# Get unique references | ||
unique_refs = df["reference"].unique() | ||
|
||
# Group by cellnumber |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it would work to groupby here, as df["reference"].unique() is a list
src/estimation/calculate_weights.py
Outdated
unique_refs = unique_refs.groupby("cellnumber") | ||
|
||
# Sum employment for each cell | ||
e = unique_refs["cellnumber"]["employment"].sum() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can assume that we've already done the groupby. What you need to do is assume you've been passed the filtered dataframe for just one group, and simply return a number which is the sum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good start, I've put in a couple of pointers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@woodac , overall the function makes sense to me, but suggest a couple of small improvements
src/estimation/calculate_weights.py
Outdated
""" | ||
# Check if any of the key cols are missing | ||
cols = set(df.columns) | ||
if not ("employment" in cols) & (emp_col in cols): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just out of clarity, can I ask you to add some parentheses arond this logic? is it to be read "(not 1) and 2" "not ( 1 & 2 )", i.e.
(not ("employment" in cols) ) & (emp_col in cols)
or
not (("employment" in cols) & (emp_col in cols))
I think it's the first, but better to be unnecessarily explicit to avoid confusion
src/estimation/calculate_weights.py
Outdated
@@ -40,6 +40,28 @@ def calc_lower_n(df: pd.DataFrame, exp_col: str = "709") -> dict: | |||
return n | |||
|
|||
|
|||
def calc_lower_e(df: pd.DataFrame, emp_col: str = "711") -> dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this definition seems to say that the function returns a dictionary whilst it actually returns an integer?
"711", | ||
] | ||
data = [ | ||
[1, 14], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason why in column 711 you chose to put 14, which is also the sum of the column, i.e. the result of the function.. it is not wrong but it's a bit confusing, since one might also think what the function does is returning the "unique non-null entry of column 711", whilst instead it returns the sum of employment".
This function calculates the sum of column "employment" for the filtered dataframe to create lower_e.
Code
Documentation
Any new code includes all the following forms of documentation:
Args
andreturns
for all major functionsData
Testing
Peer Review Section
requirements.txt
Final approval (post-review)
The author has responded to my review and made changes to my satisfaction.
Review comments
Insert detailed comments here!
These might include, but not exclusively:
that it is likely to interact with?)
works correctly?)
Your suggestions should be tailored to the code that you are reviewing.
Be critical and clear, but not mean. Ask questions and set actions.