Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature DAPS 1215 foxx UserGetAccessTokenRequest mapped collection support #1284

Open
wants to merge 32 commits into
base: devel
Choose a base branch
from

Conversation

t-ramz
Copy link
Collaborator

@t-ramz t-ramz commented Feb 4, 2025

Closes #1215

PR Description

Add support for mapped collections when using UserGetAccessTokenRequest message to fetch user tokens.
Additionally, stub support for HA collections via comments and query parameters.

Tasks

  • - A description of the PR has been provided, and a diagram included if it is a new feature.
  • - Formatter has been run
  • - CHANGELOG comment has been added
  • - Labels have been assigned to the pr
  • - A reviwer has been added
  • - A user has been assigned to work on the pr
  • - If new feature a unit test has been added

Summary by Sourcery

Add support for mapped collections to the UserGetAccessTokenRequest message.

New Features:

  • Added support for mapped collections when fetching user tokens using the UserGetAccessTokenRequest message.

Tests:

  • Added unit tests for the new mapped collection support in the user_router Foxx microservice.

Anthony Ramirez added 23 commits January 24, 2025 15:41
…UserAccessTokenReply to support finding correct tokens
…& DatabaseAPI.hpp Use updated names and types. ClientWorker.cpp Update comments.
…lientWorker.cpp Add refresh token try/catch to send through flow if refresh fails
…duced for handling logic for validating params when getting tokens and building GetAccessToken responses. user_router.js Replace logic with calls to new lib
…n. user_router.js Remove unnecessary comment.
…ken document required fields. user_token.test.js Add preliminary unit tests for static methods.
… Add simple tests for get/token endpoint. user_token.js Fix naming for token document scopes field. user_token.test.js Changes for scope field bugfix.
…API.cpp Remove TODOs. SDMS_Auth.proto Remove unused field.
…DatabaseAPI.cpp Add note on possibly missing field. user_token.test.js Update tests to check for new expected values.
@t-ramz t-ramz added Component: Database Relates to database microservice / data model Component: Core Relates to core service labels Feb 4, 2025
@t-ramz t-ramz self-assigned this Feb 4, 2025
Copy link

sourcery-ai bot commented Feb 4, 2025

Reviewer's Guide by Sourcery

This pull request adds support for mapped collections when fetching user tokens using the UserGetAccessTokenRequest message. It introduces a new UserToken class to handle token validation and formatting. The changes include modifications to the Foxx microservice, server-side logic, and web client to accommodate collection-specific tokens.

Sequence diagram for user token retrieval with collection support

sequenceDiagram
    participant Client
    participant Router
    participant UserToken
    participant Database
    participant GlobusAPI

    Client->>Router: GET /token/get
    Note over Router: Check collection params
    Router->>UserToken: validateRequestParams()
    UserToken-->>Router: collection_token status

    alt has collection token
        Router->>Database: Query token edges
        Database-->>Router: token matches
        alt token needs refresh
            Router->>GlobusAPI: refreshAccessToken()
            GlobusAPI-->>Router: new token
            Router->>Database: userSetAccessToken()
        end
    end

    Router->>UserToken: formatUserToken()
    UserToken-->>Router: formatted response
    Router-->>Client: token response
Loading

Entity relationship diagram for user tokens and collections

erDiagram
    User ||--o{ GlobusToken : has
    GlobusToken ||--o| Collection : associated_with

    User {
        string _id
        string access
        string refresh
        number expiration
    }

    GlobusToken {
        string access
        string refresh
        number expiration
        number type
        string dependent_scopes
    }

    Collection {
        string _id
        string collection_type
    }
Loading

Class diagram for UserToken implementation

classDiagram
    class UserToken {
        +validateRequestParams(query_params: object) boolean
        +formatUserToken(is_collection_token: boolean, token_document: object, needs_consent: boolean) userTokenResponse
    }

    class userTokenResponse {
        +needs_consent: boolean
        +access: string
        +refresh: string
        +expires_in: number
        +token_type: AccessTokenType
        +scopes: string
    }

    class AccessTokenType {
        <<enumeration>>
        GLOBUS_AUTH
        GLOBUS_TRANSFER
        GLOBUS_DEFAULT
        ACCESS_SENTINEL
    }

    UserToken ..> userTokenResponse : creates
    UserToken ..> AccessTokenType : uses
Loading

File-Level Changes

Change Details Files
Implements collection-specific token retrieval in the Foxx microservice.
  • Adds query parameter validation for collection ID and type.
  • Retrieves tokens based on collection ID and type.
  • Returns a needs_consent flag if a token is not found for the specified collection.
core/database/foxx/api/user_router.js
Adds a new UserToken class for handling token validation and formatting.
  • Introduces validateRequestParams to validate collection ID and type parameters.
  • Introduces formatUserToken to format the user token response, including handling the needs_consent flag.
core/database/foxx/api/lib/user_token.js
Updates the client worker to handle collection-specific token requests.
  • Adds logic to pass collection ID and type to the database API.
  • Handles token refresh for different token types.
  • Sets the needs_consent flag in the response.
core/server/ClientWorker.cpp
Modifies the DatabaseAPI to retrieve tokens based on collection ID and type.
  • Adds parameters for collection ID and type to the userGetAccessToken method.
  • Updates the database query to include collection ID and type.
core/server/DatabaseAPI.cpp
core/server/DatabaseAPI.hpp
Adds unit tests for the user router and the new UserToken class.
  • Adds tests to validate the collection ID and type parameters.
  • Adds tests to verify the retrieval of collection-specific tokens.
  • Adds tests for the UserToken class methods.
core/database/foxx/tests/user_router.test.js
core/database/foxx/tests/user_token.test.js
Updates the web client to include collection ID and type in the UserGetAccessTokenRequest message.
  • Adds logic to include collection ID and type in the message data.
  • Handles the needsConsent flag in the response.
web/datafed-ws.js

Assessment against linked issues

Issue Objective Addressed Explanation
#1215 Check if a token exists for a mapped collection and determine whether to trigger a consent flow
#1215 Only check token expiration without making unnecessary calls to Globus
#1215 Implement four specific branches for token handling with mapped collections

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!
  • Generate a plan of action for an issue: Comment @sourcery-ai plan on
    an issue to generate a plan of action for it.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @t-ramz - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider adding error logging in ClientWorker.cpp when token refresh fails, to help with debugging consent flow issues.
Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟡 Testing: 1 issue found
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

const std::string collection_id,
const std::string collection_type,
bool &needs_consent,
int &token_type, // TODO: use underlying type?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Use a proper enum type for token_type instead of int for better type safety

Suggested implementation:

                          bool &needs_consent,
                          enum class TokenType {
                              Bearer,
                              Basic,
                              Digest,
                              OAuth
                          } &token_type,
                          std::string &scopes, LogContext log_context);

This change will require:

  1. Updating all calls to userGetAccessToken() to use the new TokenType enum
  2. Updating the implementation of userGetAccessToken() to work with the enum
  3. If the token_type values need to match specific integer values (e.g., for protocol compatibility), you can explicitly set them in the enum like:
    enum class TokenType {
        Bearer = 1,
        Basic = 2,
        // etc.
    };

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had trouble converting from the numeric type I get as a return from JSON deserialization to the AccessTokenType that is represented here as an int. I chose to go the path of simply representing this as an int for now, which is the direct type that AccessTokenType (which is unfortunately not an enum class) compiles to in order to make the single comparison I absolutely need.

core/database/CMakeLists.txt Outdated Show resolved Hide resolved
@@ -697,6 +698,10 @@ router
router
.get("/token/get", function (req, res) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider using a direct AQL query to simplify edge filtering logic and improve performance.

The edge filtering logic can be simplified by replacing the manual filtering with a direct AQL query. This would reduce nesting and improve readability while maintaining the same functionality:

let token_document = user;
let needs_consent = false;
if (collection_token) {
    // Direct query instead of filtering edges
    const query = aql`
        FOR token IN globus_token
        FILTER token._from == ${user._id} 
        AND token._to == ${globus_collection._id}
        LIMIT 1
        RETURN token
    `;
    const token_matches = g_db._query(query).toArray();

    if (token_matches.length === 1) {
        token_document = token_matches[0];
        needs_consent = false;
    } else {
        token_document = {};
        needs_consent = true;
    }
}

This approach:

  • Eliminates edge filtering and multiple nested conditions
  • Performs filtering at the database level for better performance
  • Maintains the same functionality with clearer intent
  • Reduces cognitive complexity by flattening the logic

let needs_consent = false;
if (collection_token) {
const globus_collection = g_db.globus_coll.exists({ _key: collection_id });
// TODO: should this be a query?
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another alternative is to use

const token_matches = g_db.globus_token.byExample({
    _from: user._id
    _to: globus_collection?._id,
}).toArray();

The change would require a pre-check of globus_collection's existence, otherwise Arango will throw an error.

But this may be both more readable and more performant.

let needs_consent = false;
if (collection_token) {
const globus_collection = g_db.globus_coll.exists({ _key: collection_id });
// TODO: should this be a query?
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possibility is the query:

const token_matches = g_db._query(
    "FOR token_match " +
    "IN globus_token " +
    "FILTER token_match._from == @user_id " +
    "   AND token_match._to == @collection_id " +
    "RETURN token_match",
    {
        user_id: user._id,
        collection_id: globus_collection._id,
    }
).toArray();

This would require no other changes to logic. I'm not sure how much more performant or readable it is, however.

Copy link
Collaborator

@AronPerez AronPerez Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m a fan of using built-in methods to accomplish tasks. Raw queries nowadays are only necessary if we can’t accomplish what we need natively with our dependencies or it’s blocking other queries (some kind of table lock etc)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Do you think that the byExample method or outEdges method and then filter on the resulting array is more readable?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like byExample. Seems straightforward for what we need

const user = g_db.u.byExample({ _id: req.queryParams.subject }).toArray();  
``

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also like that, and I found other uses of byExample in the Foxx API, so I think it fits in better as well. Will make that change 🫡

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been implemented, along with an additional test to look for the missing collection case!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On further thought, should this return a needs_consent message when the collection does not exist?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that was the old behavior. Brought that back.

} else {
try {
m_globus_api.refreshAccessToken(ref_tok, acc_tok, expires_in);
m_db_client.userSetAccessToken(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attempts to cover condition 2, "If token is in database for mapped collection, check expiration, if invalid use refresh token to create new access token, save in database, return needs_consent bool false"

if (expires_in < 300) {
if (needs_consent) {
// short circuit to reply
} else if (expires_in < 300) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attempts to cover condition 1, "If token is in database for mapped collection, check expiration if valid return token, needs_consent bool is false"

} catch (TraceException &e) { // NOTE: assumes refresh failed (invalid or
// failure). new token fetched will upsert
// and overwrite old values on database
needs_consent = true;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attempts to cover condition 3, "If token is in database for mapped collection, check expiration, if invalid refresh token fails, return needs_consent true"

} else {
// Cases covered: no matches, no tokens, no collection - all require same consent flow
token_document = {};
needs_consent = true;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attempts to cover condition 4, "If token is not in database, return needs_consent: true"

}
// TODO: account for AccessTokenType; currently only GLOBUS_DEFAULT and GLOBUS_TRANSFER are supported
token_document = token_matches[0];
needs_consent = false;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attempts to cover condition 1, "If token is in database for mapped collection, check expiration if valid return token, needs_consent bool is false"

Comment on lines +127 to +128
optional string collection_id = 1;
optional string collection_type = 2; // TODO: use enum
Copy link
Collaborator

@JoshuaSBrown JoshuaSBrown Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, do you have that in an issue?

GUEST
MAPPED
PERSONAL

We might need to explore what the options should be for this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in this pr though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

core/database/foxx/api/lib/user_token.js Show resolved Hide resolved
result_token.access = token_document.access;
result_token.refresh = token_document.refresh;
if (token_document.expiration) {
const expires_in = token_document.expiration - Math.floor(Date.now() / 1000);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does Math.floor(Date.now() / 1000) corrospond to

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a copy/paste from https://github.com/ORNL/DataFed/blob/devel/core/database/foxx/api/user_router.js#L720

This is normalizing the data stored on the database from the expiry date/time to a time until expiry.

let needs_consent = false;
if (collection_token) {
const globus_collection = g_db.globus_coll.exists({ _key: collection_id });
// TODO: should this be a query?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like byExample. Seems straightforward for what we need

const user = g_db.u.byExample({ _id: req.queryParams.subject }).toArray();  
``

core/database/foxx/api/user_router.js Show resolved Hide resolved
required uint32 expires_in = 2; // Access token expiration in seconds
required string access = 1; // Globus access token
required uint32 expires_in = 2; // Access token expiration in seconds
required bool needs_consent = 3; // Indicate requirement of consent flow
Copy link
Collaborator

@JoshuaSBrown JoshuaSBrown Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you make this required you will break the API, can we get away with keeping this optional? And assume if it was not specified then it does not need consent.

When I say break the API, I mean that it won't be backwards compatible with how it is implemented currently.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In proto3 they have actually moved to using optional by default and avoided required for this reason.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. With that in mind, all code that can point to a mapped collection will need to validate both the field's existence and the field's value

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made it optional, do you have an idea on how we can enforce some diligence on checks?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the message comes into the core service, whether from the web server or the python client, it will go through the DatabaseAPI class. That would be a good place.

@@ -22,6 +22,7 @@ if( ENABLE_FOXX_TESTS )
add_test(NAME foxx_support COMMAND "${CMAKE_CURRENT_SOURCE_DIR}/tests/test_foxx.sh" -t "unit_support")
add_test(NAME foxx_user_router COMMAND "${CMAKE_CURRENT_SOURCE_DIR}/tests/test_foxx.sh" -t "unit_user_router")
add_test(NAME foxx_authz_router COMMAND "${CMAKE_CURRENT_SOURCE_DIR}/tests/test_foxx.sh" -t "unit_authz_router")
add_test(NAME foxx_unit_user_token COMMAND "${CMAKE_CURRENT_SOURCE_DIR}/tests/test_foxx.sh" -t "unit_user_token")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Anthony Ramirez added 4 commits February 7, 2025 12:11
…ng Globus collection. Add globus_coll fixture.
… collection tokens are present. user_router.test.js Update test from error to needs_consent.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Core Relates to core service Component: Database Relates to database microservice / data model
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[DAPS, Foxx, Core] - Feature, adjust UserGetAccessRequest to work with Mapped Collection UUID
3 participants