Streamline dataset access checks #479

stijn-uva · 2025-02-06T16:41:07Z

There are two methods to determine if a given user is allowed to access a dataset: User.can_access_dataset() and DataSet.is_accessible_by(). The former is a wrapper for the latter with some extra checks. They are now used more or less interchangeably but since they operate slightly differently, this means it's not always clear exactly what check is being done and whether it is the right one. Consolidating both into a single method or making one purely a wrapper for the other without extra checks is preferable.

The text was updated successfully, but these errors were encountered:

dale-wahl · 2025-02-13T10:33:22Z

Issue #481 is related.

This seems less trivial after looking into it. We have two types of access "viewer" and "owner" with the difference being, as I understand it, that a viewer cannot manipulate the dataset.
Dataset.is_accessible_by() checks the user's role related to the dataset (either checks for owner or viewer though maybe we can to be able to also check for either/multiple).
User.can_access_dataset() I think it is more can a user view a dataset, since it returns True immediately if a dataset is not private (and presumably that does not mean you can also manipulate it? but maybe you can do some things like run a processor but not delete it?).

There are also some related settings:

privileges.can_view_private_datasets implies that a user will have viewer status of all datasets (including private)
privileges.admin.can_manipulate_all_datasets implies that a user essentially acts as owner of all datasets (including private)
(maybe) privileges.can_view_all_datasets which states it is specifically to view the Global datasets index/list (not necessarily the datasets themselves)

Long story short: I wanted to fix the referenced issue (and could just modify User.can_access_dataset() to check for privileges.can_view_private_datasets and, potentially, privileges.admin.can_manipulate_all_datasets). But I think for this issue, we may want some greater clarity.

Maybe we want two functions DataSet.can_be_viewed_by(user) and DataSet.can_be_manipulated_by(user) and remove the User function? Maybe processors could be run by "viewers" if they are additive (i.e., not something like anonymize which effects the original dataset).

stijn-uva added the low priority label Feb 6, 2025

dale-wahl mentioned this issue Feb 17, 2025

"Can manipulate dataset" privilege does not seem to work on the user level (user cannot manipulate other datasets) #481

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Streamline dataset access checks #479

Streamline dataset access checks #479

stijn-uva commented Feb 6, 2025

dale-wahl commented Feb 13, 2025

Streamline dataset access checks #479

Streamline dataset access checks #479

Comments

stijn-uva commented Feb 6, 2025

dale-wahl commented Feb 13, 2025