feat: add source permission metadata for google drive #415

Merged: 14 commits, Apr 1, 2025

@shreyanid shreyanid commented Mar 6, 2025

This PR reads the permissions available in the file data of each Google Drive document and normalizes them into lists of users and groups for each role type (read, update, delete, etc.).
The connector was already emitting permission data, so this PR only adds the type-normalization step in the indexer.
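
The normalization described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: the function name `normalize_permissions`, the `ROLE_TO_OPERATIONS` mapping, and the output shape are assumptions. The input fields (`id`, `type`, `role`) follow the Google Drive API permissions resource.

```python
# Hypothetical mapping from Drive roles to operations.
# The mapping actually used in the PR is not shown in this conversation.
ROLE_TO_OPERATIONS = {
    "owner": ["read", "update", "delete"],
    "organizer": ["read", "update", "delete"],
    "fileOrganizer": ["read", "update", "delete"],
    "writer": ["read", "update"],
    "commenter": ["read"],
    "reader": ["read"],
}

# Drive permission "type" values that this sketch sorts into buckets.
TYPE_TO_BUCKET = {"user": "users", "group": "groups"}


def normalize_permissions(raw_permissions):
    """Group raw Drive permission entries into users/groups per operation."""
    normalized = {
        op: {"users": [], "groups": []} for op in ("read", "update", "delete")
    }
    for perm in raw_permissions:
        bucket = TYPE_TO_BUCKET.get(perm.get("type"))
        if bucket is None:
            # "domain" and "anyone" permissions are skipped in this sketch.
            continue
        identifier = perm.get("id")
        if not identifier:
            continue
        for op in ROLE_TO_OPERATIONS.get(perm.get("role"), []):
            if identifier not in normalized[op][bucket]:
                normalized[op][bucket].append(identifier)
    return normalized


raw = [
    {"id": "u1", "type": "user", "role": "owner"},
    {"id": "g1", "type": "group", "role": "reader"},
    {"id": "anyoneWithLink", "type": "anyone", "role": "reader"},
]
result = normalize_permissions(raw)
```

Keying the result by operation rather than by Drive role means a downstream consumer can check access without knowing Drive's role names, which is presumably the point of normalizing in the indexer.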

@shreyanid shreyanid force-pushed the add_source_permission_metadata branch from 0826ce3 to d4ad155 Compare March 27, 2025 15:52
@shreyanid shreyanid force-pushed the add_source_permission_metadata branch from d4ad155 to 5d4a96f Compare April 1, 2025 02:03
@vangheem vangheem left a comment

lg

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces functionality to normalize permission metadata for Google Drive documents by transforming raw permission data into structured dictionaries keyed by operation type.

  • Added permission normalization in both directory and file processing.
  • Updated version number and changelog to reflect new functionality.

Reviewed Changes

Copilot reviewed 4 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| unstructured_ingest/processes/connectors/google_drive.py | Implements extraction and normalization of permission data |
| unstructured_ingest/version.py | Increments version number |
| CHANGELOG.md | Documents the new feature |
Files not reviewed (4)
  • test/integration/connectors/expected_results/google_drive_source/file_data/1r-RDeDtKprFQWST4PCIPV618y_sBL7N7EEWg7q4kZrU.json: Language not supported
  • test_e2e/expected-structured-output/google-drive/fake.docx.json: Language not supported
  • test_e2e/expected-structured-output/google-drive/nested/fake.docx.json: Language not supported
  • test_e2e/expected-structured-output/google-drive/test-drive-doc.docx.json: Language not supported
Comments suppressed due to low confidence (1)

unstructured_ingest/processes/connectors/google_drive.py:409

  • This line uses ':' instead of '='. Python parses it as an annotated expression rather than an assignment, so it raises no syntax error, but "drive_id" is never set. Consider replacing ':' with '='.
d.metadata.record_locator["drive_id"]: object_id
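
A minimal sketch of what the flagged line does at runtime. A plain dict stands in for `d.metadata.record_locator`, and the `object_id` value is made up for illustration. Python accepts `target: value` on a subscript as an annotated expression statement, so the line runs without error and stores nothing.

```python
# Stand-ins for the real objects; both values are hypothetical.
record_locator = {}
object_id = "example-drive-id"

# Annotated expression statement: valid Python, but no assignment happens.
record_locator["drive_id"]: object_id
set_by_annotation = "drive_id" in record_locator

# Actual assignment: the key is stored.
record_locator["drive_id"] = object_id
set_by_assignment = "drive_id" in record_locator
```

Because the annotated form fails silently, a linter or test is more likely to catch this kind of typo than the interpreter.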

2 participants