feat: add source permission metadata for google drive #415
lg
Pull Request Overview
This PR introduces functionality to normalize permission metadata for Google Drive documents by transforming raw permission data into structured dictionaries keyed by operation type.
- Added permission normalization in both directory and file processing.
- Updated version number and changelog to reflect new functionality.
Reviewed Changes
Copilot reviewed 4 out of 8 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| unstructured_ingest/processes/connectors/google_drive.py | Implements extraction and normalization of permission data |
| unstructured_ingest/version.py | Increments version number |
| CHANGELOG.md | Documents the new feature |
Files not reviewed (4)
- test/integration/connectors/expected_results/google_drive_source/file_data/1r-RDeDtKprFQWST4PCIPV618y_sBL7N7EEWg7q4kZrU.json: Language not supported
- test_e2e/expected-structured-output/google-drive/fake.docx.json: Language not supported
- test_e2e/expected-structured-output/google-drive/nested/fake.docx.json: Language not supported
- test_e2e/expected-structured-output/google-drive/test-drive-doc.docx.json: Language not supported
Comments suppressed due to low confidence (1)
unstructured_ingest/processes/connectors/google_drive.py:409
- This line uses ':' instead of '='. Python parses that as a bare annotation rather than an assignment, so it raises no error but never sets "drive_id". Consider replacing ':' with '=' for proper assignment.
d.metadata.record_locator["drive_id"]: object_id
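A minimal sketch of why the flagged line matters, using a plain dict as a stand-in for `record_locator` and a made-up `object_id` value (both are assumptions for illustration, not the connector's real objects). Python accepts `target: annotation` with no value as an annotation statement, so the line runs without error but stores nothing:

```python
# Stand-ins for illustration only; the connector's real objects differ.
record_locator = {}
object_id = "abc123"  # hypothetical Drive object id

# Annotation form (':'): valid syntax, but no item is stored.
record_locator["drive_id"]: object_id
after_annotation = dict(record_locator)

# Assignment form ('='): stores the value under the key.
record_locator["drive_id"] = object_id
after_assignment = dict(record_locator)

print(after_annotation)
print(after_assignment)
```

Because the annotation form fails silently, a check that `"drive_id"` is present in the record locator is the kind of test that would catch it.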
This PR retrieves the permissions available in the file data of each Google Drive document and normalizes them into lists of users and groups for each type of role (read, update, delete, etc.).
The connector was already emitting permission data, so this PR only adds the type normalization step in the indexer.
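The normalization described above could look roughly like the sketch below. It assumes raw entries shaped like Google Drive API permission resources (`id`, `type`, `role`). The function name `normalize_permissions`, the `ROLE_TO_OPERATIONS` mapping, and the output layout are hypothetical and may not match what this PR actually implements.

```python
from collections import defaultdict

# Hypothetical role-to-operation mapping; the PR's actual mapping may differ.
ROLE_TO_OPERATIONS = {
    "owner": ("read", "update", "delete"),
    "organizer": ("read", "update", "delete"),
    "fileOrganizer": ("read", "update", "delete"),
    "writer": ("read", "update"),
    "commenter": ("read",),
    "reader": ("read",),
}

# Only user and group grants are kept in this sketch.
TYPE_TO_BUCKET = {"user": "users", "group": "groups"}


def normalize_permissions(raw_permissions):
    """Group permission ids by operation, split into users and groups."""
    grouped = defaultdict(lambda: {"users": set(), "groups": set()})
    for perm in raw_permissions:
        bucket = TYPE_TO_BUCKET.get(perm.get("type"))
        if bucket is None:
            continue  # skip "domain" and "anyone" grants
        for operation in ROLE_TO_OPERATIONS.get(perm.get("role"), ()):
            grouped[operation][bucket].add(perm["id"])
    # Convert sets to sorted lists so the result is deterministic.
    return {
        operation: {kind: sorted(ids) for kind, ids in members.items()}
        for operation, members in grouped.items()
    }


raw = [
    {"id": "u1", "type": "user", "role": "owner"},
    {"id": "g1", "type": "group", "role": "reader"},
    {"id": "anyoneWithLink", "type": "anyone", "role": "reader"},
]
normalized = normalize_permissions(raw)
print(normalized)
```

Keying the result by operation rather than by principal lets a downstream consumer answer "who can read this document" with a single lookup.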