Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streamline dataset access checks #479

Open
stijn-uva opened this issue Feb 6, 2025 · 1 comment
Open

Streamline dataset access checks #479

stijn-uva opened this issue Feb 6, 2025 · 1 comment

Comments

@stijn-uva
Copy link
Member

There are two methods to determine if a given user is allowed to access a dataset: User.can_access_dataset() and DataSet.is_accessible_by(). The former is a wrapper for the latter with some extra checks. They are now used more or less interchangeably but since they operate slightly differently, this means it's not always clear exactly what check is being done and whether it is the right one. Consolidating both into a single method or making one purely a wrapper for the other without extra checks is preferable.

@dale-wahl
Copy link
Member

Issue #481 is related.

This seems less trivial after looking into it. We have two types of access "viewer" and "owner" with the difference being, as I understand it, that a viewer cannot manipulate the dataset.
Dataset.is_accessible_by() checks the user's role related to the dataset (either checks for owner or viewer though maybe we can to be able to also check for either/multiple).
User.can_access_dataset() I think it is more can a user view a dataset, since it returns True immediately if a dataset is not private (and presumably that does not mean you can also manipulate it? but maybe you can do some things like run a processor but not delete it?).

There are also some related settings:

  • privileges.can_view_private_datasets implies that a user will have viewer status of all datasets (including private)
  • privileges.admin.can_manipulate_all_datasets implies that a user essentially acts as owner of all datasets (including private)
  • (maybe) privileges.can_view_all_datasets which states it is specifically to view the Global datasets index/list (not necessarily the datasets themselves)

Long story short: I wanted to fix the referenced issue (and could just modify User.can_access_dataset() to check for privileges.can_view_private_datasets and, potentially, privileges.admin.can_manipulate_all_datasets). But I think for this issue, we may want some greater clarity.

Maybe we want two functions DataSet.can_be_viewed_by(user) and DataSet.can_be_manipulated_by(user) and remove the User function? Maybe processors could be run by "viewers" if they are additive (i.e., not something like anonymize which effects the original dataset).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants