-
Notifications
You must be signed in to change notification settings - Fork 568
Remove user from all classrooms tied to a GH organization that has revoked their access #2480
base: master
Are you sure you want to change the base?
Conversation
expect(assignment.reload.creator_id).not_to eq(@user.id) | ||
end | ||
|
||
it "deletes user from multiple organizations (classrooms) mapped to the same GitHub organization" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oof this is another one of those times when I wish we didn't have 2 different concepts of "organization" in the backend :C
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! I can't decide if it's going to get better or worse if we move into the monolith.
Should we run a migration/transition to clean up existing user/classroom associations that should have been removed already? |
@spinecone probably...since it's triggered by a webhook when someone's removed from the GitHub organization, I don't think there's a simple way of doing it. We'll likely need to iterate through every user/github_organization combo and make sure the user still has access (though I'm definitely open to a cleaner way of doing it). Should that be part of this PR so we know when/why it was added? Or should that come in a follow on PR? |
I don't know of a cleaner way than iterating through every user (or organization, whichever has fewer records), I would just make sure to use as few github.com API requests as possible so we don't hit any rate limits. It seems cleaner to me to have everything in this one PR, but I defer to what makes the most sense to you. |
@spinecone I prefer having things together, but I end up getting a lot of feedback that my PRs are too big, so, I'm probably overly cautious now. I'll get it added in and re-ping for review. |
We attepted to start with orgs as there were fewer orgs than users and this seemed like a way to limit the amount of api calls. However, in order to check if someone is an admin on an org, we have to make an api call to that org authenticated as someone who is already an admin. That would require going through users in that org to see if one is an admin before running through the remaining users to see if they should be removed or not. It seems simpler now to start with a user and iterate through their orgs (removing them from any they're not an admin for).
…em if they aren't.
a3865bd
to
36bae66
Compare
|
||
github_org_ids.each do |gh_id| | ||
github_org = GitHubOrganization.new(user.github_client, gh_id) | ||
next if github_org.admin?(user.github_login) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would love a double check on this line. If this returns true, I want to move on to the next iteration. I only want to move forward to the payload line if github_org.admin?(user.github_login)
returns false. I'm 99% sure I've written it right, but, you can never be too sure!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it true that admins cannot have their access to an organization revoked? I would think we would want to respect the right of an organization to restrict access to their repositories from anyone they want.
@spinecone this should be ready to review again. I've added a rake task that will go through and check for any records that may need to be cleaned up. It relies on the client for the user that is being checked, so, we shouldn't end up hitting any API limits. I went with a rake task instead of a migration as this may be something we need to run again in the future. Also, no need to rush the check on this. I won't be deploying it this late on a holiday weekend and I won't be back until Tuesday, so there's plenty of time. |
def perform(payload_body) | ||
return true unless payload_body.dig("action") == "member_removed" | ||
|
||
github_user_id = payload_body.dig("membership", "user", "id") | ||
github_organization_id = payload_body.dig("organization", "id") | ||
@organization = Organization.find_by(github_id: github_organization_id) | ||
organizations = Organization.where(github_id: github_organization_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need an .all?
It feels like this will return a scope and the subsequent check for empty?
will always be false. Unless that needs to be an any?
or .count == 0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definitely want the collection here, so, we shouldn't tack the .all?
on, otherwise, we'll be iterating over a boolean on line 18.
I think empty?
does work here though. That query returns an ActiveRecord::Relation. If you run .empty?
on an AR relation, it behaves as expected (if there are no records in it, it returns true, else false).
But, it seems like .empty?
might be a non-standard thing to use here. I ended up using it because I had a vague recollection that one of the checks was less effecient than the others. Based on this post it seemed .empty?
was the way to go. That said, it's unlikely we'll be in the situation where this type of optimization will truly matter, so if .any?
would be the more common thing to use here, I'm happy to update the check on line 16.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, if it works then it's good! Either way. I realized I had a formatting error above: .all?
was supposed to be .all
? (i.e., the question mark wasn't meant to be part of the method, but was definitely formatted that way 😭)
|
||
@user = @organization.users.find_by(uid: github_user_id) | ||
failed_removals = organizations.reject do |org| | ||
user = org.users.find_by(uid: github_user_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how many orgs we will generally have or if there is a way to avoid this N+1. It seems like it will be negligible and in a job... so... not a big deal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried fixing this by using an includes where I grab the organizations above:
organizations = Organization.where(github_id: github_organization_id).includes(:users)
However, I end up with an error from the bullet gem:
Bullet::Notification::UnoptimizedQueryError:
user: jessrudder
AVOID eager loading detected
Organization => [:users]
Remove from your finder: :includes => [:users]
Call stack
/Users/jessrudder/github/classroom/spec/support/bullet.rb:16:in `block (2 levels) in <top (required)>'
bin/rspec:16:in `load'
bin/rspec:16:in `<main>'
If an includes
would be the right thing to do here to avoid the N+1, I can whitelist this for the bullet gem so it doesn't unfairly fail the test. If not, is there another way to avoid the N+1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm I think this is probably okay as an N+1, I don't think we will have too many of these. And I realized it is just run by the rake task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this!
Addresses to https://github.com/github/education/issues/988
Our previous code assumed a 1-to-1 mapping of GitHub organizations to classrooms. However, many of our teachers are using a single organization on GitHub as a source for multiple classrooms.
This update ensures that when access is revoked for a GitHub organization for a user, the user is removed from all of the associated classrooms as well.
Previously we would not remove classroom access if a user was the only user with admin access to the classroom. With this update, we will be removing access regardless. In a small number of cases, this may result in admin-less classrooms.