-
Notifications
You must be signed in to change notification settings - Fork 17
fix: (CDK) (Declarative) - Add Manifest Migration
module
#485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
📝 WalkthroughWalkthroughThis change introduces a new manifest migration framework for Airbyte's CDK, enabling systematic upgrades and transformations of declarative source manifests. It adds migration base classes, migration implementations, a migration registry, and a migration handler for orchestrating and recording applied migrations. The system is integrated into the manifest resolution flow, with new tests and fixtures verifying the correctness of migrations such as URL field consolidation and request body key unification. Documentation is added to guide contributors on creating and registering new migrations. The connector builder handler is updated to support conditional manifest migration based on configuration. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant ConnectorBuilderHandler
participant ManifestDeclarativeSource
participant ManifestMigrationHandler
participant MigrationRegistry
User->>ConnectorBuilderHandler: create_source(config)
ConnectorBuilderHandler->>ManifestDeclarativeSource: instantiate (migrate_manifest flag)
ManifestDeclarativeSource->>ManifestMigrationHandler: _migrate_manifest() (if migrate_manifest)
ManifestMigrationHandler->>MigrationRegistry: Get registered migrations
loop For each migration in order
ManifestMigrationHandler->>ManifestMigrationHandler: Apply migration if applicable
ManifestMigrationHandler->>ManifestMigrationHandler: Record migration trace
end
ManifestMigrationHandler-->>ManifestDeclarativeSource: Return migrated manifest
ManifestDeclarativeSource-->>ConnectorBuilderHandler: Source ready
Suggested labels
Suggested reviewers
Would you like to add more migration examples to the documentation or expand test coverage for additional manifest scenarios, wdyt? 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
⏰ Context from checks skipped due to timeout of 90000ms (9)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (17)
airbyte_cdk/manifest_migrations/migrations_registry.py (3)
13-17
: Nice dynamic importing logic!
This approach ensures that adding new migration files is painless. Would you consider adding a debug log to confirm which modules are loaded, wdyt?
19-25
: Ordering key function looks sensible!
If someone forgets to add the double underscore with an integer suffix, it defaults to 0. Would you like to raise a warning in that case, wdyt?
27-51
: Discovery function is thorough!
It ensures all migration classes are collected and avoids duplicates. Maybe you'd like to unify the loops to reduce repetition or log each discovered class for clarity, wdyt?airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url_v6_45_2__1.py (1)
31-43
: URL construction logic is concise!
Perhaps consider logging or warning the user when theurl
is changed, wdyt?airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url_v6_45_2__0.py (3)
15-17
: Well-defined class attributes!The component type and key names are clearly defined as class attributes, making the code more maintainable. Would using constants for these values be more consistent with the rest of the codebase, wdyt?
19-22
: Effective implementation of should_migrate!The method correctly checks if the component type matches and if the original key exists in the manifest. One suggestion: would it be clearer to check
self.original_key in manifest
directly rather than converting keys to a list first, wdyt?- def should_migrate(self, manifest: ManifestType) -> bool: - return manifest[TYPE_TAG] == self.component_type and self.original_key in list( - manifest.keys() - ) + def should_migrate(self, manifest: ManifestType) -> bool: + return manifest[TYPE_TAG] == self.component_type and self.original_key in manifest
24-26
: Simple and effective migration implementation!The migration logic is clean and straightforward - copy the value from the original key to the replacement key and remove the original. Consider adding a null check before accessing
manifest[self.original_key]
for extra safety, wdyt?airbyte_cdk/manifest_migrations/README.md (3)
5-20
: Clear instructions for adding new migrations!The step-by-step instructions for adding new migrations are comprehensive and well-structured. There appears to be an extra space after the period in item 2, line 19 - might want to fix that for consistency, wdyt?
🧰 Tools
🪛 LanguageTool
[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...e(self, manifest: ManifestType) -> None`: Perform the migration in-place. 3. **M...(UNLIKELY_OPENING_PUNCTUATION)
36-40
: Concise testing guidelines!The testing guidelines are clear and to the point. Would it be helpful to include a link or example of test file structure to guide developers further, wdyt?
41-57
: Excellent example migration skeleton!The example skeleton is thorough and helps developers understand how to implement their own migrations. Minor note: the import path in the example seems to be from
airbyte_cdk.sources.declarative.migrations.manifest.manifest_migration
but in the actual code it's fromairbyte_cdk.manifest_migrations.manifest_migration
. Should we update the example to match the actual path, wdyt?airbyte_cdk/manifest_migrations/migration_handler.py (1)
49-62
: Good encapsulation of single migration handling!The
_handle_migration
method cleanly handles a single migration and wraps any exceptions in aManifestMigrationException
. One suggestion: would it make sense to log the exception before re-raising it for better debugging, wdyt?unit_tests/manifest_migrations/test_manifest_migration.py (1)
17-30
: Nice test coverage fortest_manifest_resolve_migrate
.
Would you consider adding a negative test scenario with missingpath
orurl_base
to ensure graceful handling of partial migrations if the manifest is incomplete? wdyt?airbyte_cdk/sources/declarative/manifest_declarative_source.py (2)
74-84
: Great addition ofmigrate_manifest
parameter.
Would you consider clarifying in the docstring that the migrations might update or remove certain manifest fields? wdyt?
100-104
: Consider capturing migration failures.
Right now, if.apply_migrations()
raises an exception, do we want to provide a user-friendly error message or fallback? wdyt?unit_tests/manifest_migrations/conftest.py (1)
10-195
: Comprehensive fixture formanifest_with_url_base_to_migrate_to_url
.
Would it be helpful to break this fixture into multiple smaller fixtures for better test clarity? wdyt?airbyte_cdk/manifest_migrations/manifest_migration.py (2)
17-37
: Nicely structured abstract class for manifest migrations.
Would you consider providing a defaultshould_migrate
logic that checks the class’s version vs the manifest version, so that each migration doesn't need to re-implement if the logic is standard? wdyt?
69-108
: Recursive manifest processing looks robust.
Are you concerned about performance with deeply nested manifests? Possibly we could short-circuit or use a queue-based approach. wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
airbyte_cdk/manifest_migrations/README.md
(1 hunks)airbyte_cdk/manifest_migrations/exceptions.py
(1 hunks)airbyte_cdk/manifest_migrations/manifest_migration.py
(1 hunks)airbyte_cdk/manifest_migrations/migration_handler.py
(1 hunks)airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url_v6_45_2__1.py
(1 hunks)airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url_v6_45_2__0.py
(1 hunks)airbyte_cdk/manifest_migrations/migrations_registry.py
(1 hunks)airbyte_cdk/sources/declarative/manifest_declarative_source.py
(3 hunks)airbyte_cdk/sources/declarative/parsers/manifest_component_transformer.py
(2 hunks)unit_tests/manifest_migrations/conftest.py
(1 hunks)unit_tests/manifest_migrations/test_manifest_migration.py
(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
airbyte_cdk/manifest_migrations/migration_handler.py (2)
airbyte_cdk/manifest_migrations/exceptions.py (1)
ManifestMigrationException
(6-12)airbyte_cdk/manifest_migrations/manifest_migration.py (2)
ManifestMigration
(17-137)_process_manifest
(69-107)
airbyte_cdk/manifest_migrations/migrations_registry.py (1)
airbyte_cdk/manifest_migrations/manifest_migration.py (1)
ManifestMigration
(17-137)
🪛 LanguageTool
airbyte_cdk/manifest_migrations/README.md
[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...e(self, manifest: ManifestType) -> None`: Perform the migration in-place. 3. **M...
(UNLIKELY_OPENING_PUNCTUATION)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (19)
airbyte_cdk/manifest_migrations/exceptions.py (1)
6-12
: Exception class looks good!
This custom exception provides a clear message for manifest migration errors. Maybe you could document themessage
parameter in the docstring for completeness, wdyt?airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url_v6_45_2__1.py (1)
22-25
: Logic for detecting migratable components looks clean!
No issues here.airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url_v6_45_2__0.py (2)
1-6
: Clean imports and good organization!The imports are well-organized and focused, bringing in only the necessary components from the manifest migration framework. Nice job keeping this focused.
8-13
: Clear docstring explaining the migration purpose!The docstring clearly explains what this migration does: converting
url_base
tourl
in HttpRequester components. This will be helpful for other developers understanding the code's purpose.airbyte_cdk/sources/declarative/parsers/manifest_component_transformer.py (2)
7-7
: Updated import for more specific typing!Good addition of the Dict type to the imports, making the code more explicit.
98-98
: Improved return type specificity!Nice improvement changing the return type from
Mapping[str, Any]
toDict[str, Any]
. This more accurately represents the actual return value (a dictionary), which helps with static type checking.airbyte_cdk/manifest_migrations/README.md (4)
1-4
: Great introduction to manifest migrations!The introduction clearly explains the purpose of manifest migrations in the context of the Airbyte CDK.
21-24
: Good explanation of versioning!Clear explanation of how migration versioning works and when migrations will be applied.
31-35
: Helpful information about the migration registry!The explanation of how migrations are automatically discovered and registered is valuable for developers. It ensures they don't manually modify the registry.
59-66
: Helpful additional notes section!Good inclusion of additional notes and reference to documentation in other files. This helps guide developers to find more information when needed.
airbyte_cdk/manifest_migrations/migration_handler.py (4)
1-4
: Standard header with copyright information.The file follows the standard header format with appropriate copyright information.
5-17
: Well-organized imports!The imports are properly structured, importing specific classes and organizing them by module.
20-28
: Good handler initialization with defensive copying!I like that you're creating a deep copy of the manifest in the constructor. This preserves the original state in case of errors during migration.
29-47
: Well-implemented migration application with error handling!The method handles migrations sequentially and properly catches exceptions to return the original manifest on failure. The docstring is thorough and explains the behavior well.
unit_tests/manifest_migrations/test_manifest_migration.py (1)
32-47
: Idempotency test is clear and concise.
Everything looks good. No issues found here.airbyte_cdk/sources/declarative/manifest_declarative_source.py (1)
18-20
: Migration handler import recognized.
No concerns here.unit_tests/manifest_migrations/conftest.py (2)
197-494
: Fixtureexpected_manifest_with_url_base_migrated_to_url
is well-structured.
Everything is consistent and thorough.
496-601
:manifest_with_migrated_url_base_and_path_is_joined_to_url
fixture is effectively verifying idempotent migrations.
No issues found.airbyte_cdk/manifest_migrations/manifest_migration.py (1)
118-138
: Regex-based version extraction is straightforward.
No issues here.
airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url_v6_45_2__1.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
airbyte_cdk/manifest_migrations/manifest_migration.py (2)
22-29
: Remove or update the "kwargs" references in the docstrings?In both
should_migrate
(lines 22-29) andmigrate
(lines 33-39), the docstrings mentionkwargs
, but the methods themselves do not accept such parameters. Would you consider removing or clarifying these references to match the actual function signatures, wdyt?Also applies to: 33-39
68-108
: Consider protecting against cyclic references in manifests?Recursively processing a nested dictionary or list can be risky if there's ever a possibility of cyclical references. This could theoretically cause an infinite loop. Would you like to add a safeguard (e.g., tracking visited nodes) or confirm that the manifests never contain cycles, wdyt?
airbyte_cdk/manifest_migrations/migration_handler.py (3)
43-53
: Improve user feedback on migration failure?Right now, if any migration fails, you return the original manifest without explicit logging or messaging to the user about which migration failed. Would you consider adding a more direct log statement or returning additional diagnostic info so the user knows a particular migration was reverted, wdyt?
74-75
: Consider catching only known migration exceptions?Here, the code catches all exceptions before re-raising them as
ManifestMigrationException
. Do we anticipate unexpected exceptions (likeKeyError
or network errors)? Would you like to narrow the exception scope to known issues or at least log the error class name, wdyt?
84-84
: Log a warning when defaulting to "0.0.0"?When there's no
"version"
key, the code defaults to"0.0.0"
. Would you consider logging a warning or debug statement here, so readers of the logs are aware we're falling back to this version, wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
airbyte_cdk/manifest_migrations/manifest_migration.py
(1 hunks)airbyte_cdk/manifest_migrations/migration_handler.py
(1 hunks)airbyte_cdk/manifest_migrations/migrations_registry.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte_cdk/manifest_migrations/migrations_registry.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
airbyte_cdk/manifest_migrations/manifest_migration.py (2)
airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url_v6_45_2__0.py (2)
should_migrate
(19-22)migrate
(24-26)airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url_v6_45_2__1.py (2)
should_migrate
(22-25)migrate
(27-42)
airbyte_cdk/manifest_migrations/migration_handler.py (2)
airbyte_cdk/manifest_migrations/exceptions.py (1)
ManifestMigrationException
(6-12)airbyte_cdk/manifest_migrations/manifest_migration.py (3)
ManifestMigration
(19-127)migration_version
(41-47)_process_manifest
(68-106)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Analyze (python)
🔇 Additional comments (1)
airbyte_cdk/manifest_migrations/migration_handler.py (1)
68-69
: Double-check the comparison for applying migrations?Currently, migrations apply when
manifest_version <= migration_instance.migration_version
. Is that your intended logic when the manifest version is equal to the migration's version? In some scenarios, you might want to apply changes only if the manifest is strictly less than the migration version. Could you confirm if equality is desirable, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
airbyte_cdk/connector_builder/connector_builder_handler.py (1)
75-75
: LGTM! Clean integration of the migration flag into the source creation.The modification to pass the migration flag to the
ManifestDeclarativeSource
constructor is straightforward and aligns with the PR objectives. This enables the optional migration functionality while keeping it disabled by default.A tiny suggestion: would adding a brief inline comment here be helpful to explain the purpose of this parameter for future developers? Something like
# Enable manifest migration if requested by UI
? wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte_cdk/connector_builder/connector_builder_handler.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Analyze (python)
…migrations-module
@bazarnov I'm getting a schema validation error on a simple manifest like this:
Here is the error:
|
@lmossman This PR doesn't hold the FYI: you can effectively test all the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (8)
airbyte_cdk/manifest_migrations/README.md (8)
1-4
: Clarify module directory path in introduction
Could we specify the exact path to this directory (e.g.,airbyte_cdk/manifest_migrations
) in the opening lines to orient new contributors? This will make it clearer where to find the code. wdyt?
5-12
: Enhance file naming convention clarity
Maybe we could add a brief note on how the<order>
integer interacts with semantic versioning (e.g., a higher<order>
runs later within the same version group)? A quick example mapping filename → execution order might help. wdyt?
13-20
: Link to base class and type definitions
Should we include a hyperlink or file path reference to whereManifestMigration
andManifestType
are defined (e.g.,manifest_migration.py
)? That way readers can jump straight to the base class docs. wdyt?🧰 Tools
🪛 LanguageTool
[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...e(self, manifest: ManifestType) -> None`: Perform the migration in-place. 3. **M...(UNLIKELY_OPENING_PUNCTUATION)
28-30
: Link to example migration files
Would it be useful to turn the file names in this section into clickable links (or at least show their relative paths, e.g.,manifest_migrations/migrations/http_requester_url_base_to_url_v6_45_2__0.py
)? That could speed up onboarding. wdyt?
31-35
: Reference the registry implementation file
Could we mention the exact file name and path for the registry logic (e.g.,manifest_migrations/migrations_registry.py
)? A pointer here would help maintainers find the auto-discovery code more quickly. wdyt?
36-40
: Include specific test file path in testing instructions
Might we call out the actual unit test location (for example:unit_tests/sources/declarative/migrations/test_manifest_migration.py
)? That makes it straightforward for folks to see a working test. wdyt?
46-57
: Ensure consistent snippet formatting
Could we standardize the code block to use four‑space indentation throughout and add a blank line before the class definition? That keeps the example aligned with our style guide. wdyt?
59-66
: Add links to in-code docstrings or examples
What do you think about adding direct links (or file path references) to the docstrings inmanifest_migration.py
and the two example migrations? It may save a few clicks for new contributors exploring the code. wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte_cdk/manifest_migrations/README.md
(1 hunks)
🧰 Additional context used
🪛 LanguageTool
airbyte_cdk/manifest_migrations/README.md
[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...e(self, manifest: ManifestType) -> None`: Perform the migration in-place. 3. **M...
(UNLIKELY_OPENING_PUNCTUATION)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (Fast)
- GitHub Check: Analyze (python)
🔇 Additional comments (1)
airbyte_cdk/manifest_migrations/README.md (1)
21-27
: Clarify version comparison semantics
Can we indicate whether the version extracted from the class name is compared numerically (e.g., using a SemVer parser) or lexically? It’d be helpful to know exactly how6.45.2
is parsed and compared. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (7)
airbyte_cdk/manifest_migrations/README.md (7)
5-12
: Clarify<order>
intent in file naming
The pattern<description>_v<major>_<minor>_<patch>__<order>.py
is clear, but should we call out that<order>
starts at0
and increments for multiple migrations in the same version? Adding that note could prevent confusion. wdyt?
13-20
: Add import guidance for the base class
You explain how to define the migration class really well, but would linking to the actual import path ofManifestMigration
in the text help readers avoid a “file not found” surprise? For example, mentioning the module pathairbyte_cdk.manifest_migrations.manifest_migration
inline. wdyt?🧰 Tools
🪛 LanguageTool
[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...e(self, manifest: ManifestType) -> None`: Perform the migration in-place. 3. **M...(UNLIKELY_OPENING_PUNCTUATION)
18-19
: Tiny punctuation tweak
There's a double space before “3.” after the period in themigrate
description, which tripped a style check. Could we collapse to a single space for consistency? wdyt?🧰 Tools
🪛 LanguageTool
[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...e(self, manifest: ManifestType) -> None`: Perform the migration in-place. 3. **M...(UNLIKELY_OPENING_PUNCTUATION)
21-24
: Expand on version comparison logic
The note “Only manifests with a version less than or equal to the migration version…” is useful—should we mention whether this uses SemVer parsing or simple string comparison? Clarifying the comparison mechanism could help future contributors understand edge cases. wdyt?
25-30
: Link out to concrete examples
Pointing to the two reference scripts is great—do you think we should hyperlink the filenames to those migration files in the repo so readers can click straight to the implementations? wdyt?
31-35
: Highlight automatic discovery behavior
It's clear migrations are auto-registered—would it be helpful to note that developers shouldn’t add/remove entries inmigrations_registry.py
manually, or mention how conflicts (e.g., duplicate order) are handled? wdyt?
36-40
: Include a sample test snippet?
The testing section sets expectations—should we add a minimal code snippet showing how to instantiate a migration and assert its behavior, similar to the unit tests? That could speed up onboarding. wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte_cdk/manifest_migrations/README.md
(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
airbyte_cdk/manifest_migrations/README.md (1)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
🪛 LanguageTool
airbyte_cdk/manifest_migrations/README.md
[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...e(self, manifest: ManifestType) -> None`: Perform the migration in-place. 3. **M...
(UNLIKELY_OPENING_PUNCTUATION)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Analyze (python)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (3)
airbyte_cdk/manifest_migrations/README.md (3)
1-4
: Great clear intro!
The overview neatly explains what lives in this directory. Would it help to link directly to theManifestMigrationHandler
or the base class in code so readers can jump straight into the implementation? wdyt?
41-57
: Example skeleton looks spot‑on
The code block correctly uses the updated import path (airbyte_cdk.manifest_migrations.manifest_migration
), matching our module layout. Nice!
61-63
: Additional notes are clear
The reminders about not editing the registry and usingNON_MIGRATABLE_TYPES
are on point. Looks good to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body_v6_45_2__2.py (2)
19-22
: Consider adding method-level documentationFor better maintainability, it might be helpful to add docstrings to the
should_migrate
andmigrate
methods, similar to how they're documented in the base class. This would make it clearer what each method does specifically for this migration. wdyt?Also applies to: 24-28
24-28
: Add logging for better traceabilitySince this is changing the structure of manifests, adding some debug logging could help with troubleshooting if issues arise in production. Maybe log which keys were found and migrated? wdyt?
def migrate(self, manifest: ManifestType) -> None: + # Import statements should be at the top of the file in practice + import logging + logger = logging.getLogger("airbyte.manifest_migrations") + for key in self.original_keys: if key in manifest: + logger.debug(f"Migrating {key} to {self.replacement_key} in {manifest.get('$id', 'unknown component')}") manifest[self.replacement_key] = manifest[key] manifest.pop(key, None)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body_v6_45_2__2.py
(1 hunks)unit_tests/manifest_migrations/conftest.py
(1 hunks)unit_tests/manifest_migrations/test_manifest_migration.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- unit_tests/manifest_migrations/test_manifest_migration.py
- unit_tests/manifest_migrations/conftest.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body_v6_45_2__2.py (1)
airbyte_cdk/manifest_migrations/manifest_migration.py (1)
ManifestMigration
(19-127)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (Fast)
- GitHub Check: Analyze (python)
🔇 Additional comments (1)
airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body_v6_45_2__2.py (1)
24-28
:✅ Verification successful
Consider handling multiple original keys more explicitly
The current implementation will overwrite
request_body
if both original keys exist in the manifest. Is this the intended behavior? If not, maybe we should add some logic to handle this case explicitly or at least document the precedence? wdyt?
🏁 Script executed:
#!/bin/bash # Let's check if there are any manifests with both keys to understand how common this might be echo "Searching for manifests with both request_body_json and request_body_data..." rg -l "request_body_json.*request_body_data|request_body_data.*request_body_json" --type yaml --type jsonLength of output: 270
No conflicts detected for request_body_json and request_body_data
Our search across manifests didn’t find any instances where both
request_body_json
andrequest_body_data
coexist, and the migration already enforces a clear precedence (the latter key wins). No further changes are needed here.
...st_migrations/migrations/http_requester_request_body_json_data_to_request_body_v6_45_2__2.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main concerns are for dynamic streams and dev UX but I don't want to inflate the scope. Please give me more context so that I can challenge more or accept the changes
…migrations-module
…migrations registered, add the metadata.applied_migrations object to be able to trace the applied MigrationTrace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
🧹 Nitpick comments (9)
airbyte_cdk/manifest_migrations/manifest_migration.py (1)
120-126
:self.is_migrated
may be overwritten on every nested componentInside
_process_manifest
the flag is reassigned for every component that passesvalidate
.
If the first component migrates successfully and a later component does not, the final value will beFalse
even though at least one migration succeeded (or vice‑versa).Would you prefer to OR‑accumulate the flag so the caller can easily check whether any change occurred? e.g.:
- self.is_migrated = self.validate(obj) + self.is_migrated = self.is_migrated or self.validate(obj)wdyt?
airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body.py (1)
24-27
: Key lookup may raise iftype
is missing
manifest[TYPE_TAG] == self.component_type
will raise aKeyError
if an object without atype
field somehow reaches this method.Would wrapping the condition in
manifest.get(TYPE_TAG) == …
be safer, or do you guarantee upstream filtering?
Just checking. 🙂airbyte_cdk/manifest_migrations/migrations_registry.py (1)
39-47
: Type annotations & base‑class filtering
- mypy complains about an untyped parameter – annotating
module: types.ModuleType
will silence it.- The current
issubclass(obj, ManifestMigration)
test also matches the base class itself; perhaps exclude it?-from typing import Dict, List, Type +from types import ModuleType +from typing import Dict, List, Type @@ -def _get_migration_class(module) -> type: +def _get_migration_class(module: ModuleType) -> Type[ManifestMigration]: @@ - if issubclass(obj, ManifestMigration): + if obj is not ManifestMigration and issubclass(obj, ManifestMigration):wdyt?
🧰 Tools
🪛 GitHub Actions: Linters
[error] 39-39: mypy: Function is missing a type annotation for one or more arguments [no-untyped-def]
airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url.py (1)
29-31
: Minor: safe pop unnecessary?After copying the value we know
url_base
exists, sopop
without default is sufficient (and clearer).
Feel like simplifying?- manifest[self.replacement_key] = manifest[self.original_key] - manifest.pop(self.original_key, None) + manifest[self.replacement_key] = manifest.pop(self.original_key)wdyt?
airbyte_cdk/manifest_migrations/migration_handler.py (1)
55-56
: Dictionary iteration order is undefined – migrations may run out of order
MANIFEST_MIGRATIONS.items()
does not guarantee ordering by semantic version.
Would wrapping it in asorted(MANIFEST_MIGRATIONS.items(), key=lambda x: Version(x[0]))
call ensure deterministic execution, especially when multiple versions are present?airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url.py (1)
37-45
: Trailing‑slash handling could be simplified & avoid double slashesAppending a
/
unconditionally may create//
whenpath
starts with/
.
Could usingurljoin
alone be clearer?base = replacement_key_value.rstrip("/") + "/" manifest[self.replacement_key] = urljoin(base, original_key_value.lstrip("/"))This trims/ensures exactly one slash between the pieces.
Thoughts?airbyte_cdk/manifest_migrations/migrations/registry.yaml (1)
7-7
: Tiny YAML nit – trailing space breaks some lintersLine 7 has a trailing space after
6.45.2
.
Would removing it save us from the YAML‑lint warning?- - version: 6.45.2 + - version: 6.45.2🧰 Tools
🪛 YAMLlint (1.35.1)
[error] 7-7: trailing spaces
(trailing-spaces)
airbyte_cdk/manifest_migrations/README.md (2)
21-28
: Refine description wording?
Would you consider replacing "A short description of the migration" with "A brief description of the migration" to make it more concise? wdyt?- - `description`: A short description of the migration + - `description`: A brief description of the migration🧰 Tools
🪛 LanguageTool
[style] ~27-~27: Consider using the synonym “brief” (= concise, using a few words, not lasting long) to strengthen your wording.
Context: ...for the version -description
: A short description of the migration Exampl...(QUICK_BRIEF)
40-43
: Optional: clarify test location.
Would you add the path to the test directory (e.g.,unit_tests/sources/declarative/migrations
) to guide new contributors on where to place their tests? wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
airbyte_cdk/manifest_migrations/README.md
(1 hunks)airbyte_cdk/manifest_migrations/__init__.py
(1 hunks)airbyte_cdk/manifest_migrations/exceptions.py
(1 hunks)airbyte_cdk/manifest_migrations/manifest_migration.py
(1 hunks)airbyte_cdk/manifest_migrations/migration_handler.py
(1 hunks)airbyte_cdk/manifest_migrations/migrations/__init__.py
(1 hunks)airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url.py
(1 hunks)airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body.py
(1 hunks)airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url.py
(1 hunks)airbyte_cdk/manifest_migrations/migrations/registry.yaml
(1 hunks)airbyte_cdk/manifest_migrations/migrations_registry.py
(1 hunks)unit_tests/manifest_migrations/conftest.py
(1 hunks)unit_tests/manifest_migrations/test_manifest_migration.py
(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- airbyte_cdk/manifest_migrations/migrations/init.py
- airbyte_cdk/manifest_migrations/init.py
🚧 Files skipped from review as they are similar to previous changes (3)
- airbyte_cdk/manifest_migrations/exceptions.py
- unit_tests/manifest_migrations/test_manifest_migration.py
- unit_tests/manifest_migrations/conftest.py
🧰 Additional context used
🧠 Learnings (1)
airbyte_cdk/manifest_migrations/README.md (1)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
🧬 Code Graph Analysis (4)
airbyte_cdk/manifest_migrations/manifest_migration.py (3)
airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url.py (3)
should_migrate
(24-27)migrate
(29-31)validate
(33-41)airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url.py (3)
should_migrate
(27-30)migrate
(32-47)validate
(49-57)airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body.py (3)
should_migrate
(24-27)migrate
(29-33)validate
(35-41)
airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url.py (3)
airbyte_cdk/manifest_migrations/manifest_migration.py (4)
ManifestMigration
(38-134)should_migrate
(50-57)migrate
(60-65)validate
(68-73)airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url.py (3)
should_migrate
(24-27)migrate
(29-31)validate
(33-41)airbyte_cdk/sources/types.py (1)
keys
(137-138)
airbyte_cdk/manifest_migrations/migrations_registry.py (1)
airbyte_cdk/manifest_migrations/manifest_migration.py (1)
ManifestMigration
(38-134)
airbyte_cdk/manifest_migrations/migration_handler.py (2)
airbyte_cdk/manifest_migrations/exceptions.py (1)
ManifestMigrationException
(6-12)airbyte_cdk/manifest_migrations/manifest_migration.py (4)
ManifestMigration
(38-134)MigrationTrace
(22-35)_process_manifest
(94-134)as_dict
(34-35)
🪛 GitHub Actions: Linters
airbyte_cdk/manifest_migrations/manifest_migration.py
[error] 34-34: mypy: Missing type parameters for generic type "dict" [type-arg]
airbyte_cdk/manifest_migrations/migrations_registry.py
[error] 24-24: mypy: Missing return statement [return]
[error] 39-39: mypy: Function is missing a type annotation for one or more arguments [no-untyped-def]
[error] 56-56: mypy: Need type annotation for "migrations" (hint: "migrations: dict[, ] = ...") [var-annotated]
airbyte_cdk/manifest_migrations/migration_handler.py
[error] 89-89: mypy: "ManifestMigration" has no attribute "name" [attr-defined]
🪛 LanguageTool
airbyte_cdk/manifest_migrations/README.md
[style] ~27-~27: Consider using the synonym “brief” (= concise, using a few words, not lasting long) to strengthen your wording.
Context: ...for the version - description
: A short description of the migration Exampl...
(QUICK_BRIEF)
🪛 YAMLlint (1.35.1)
airbyte_cdk/manifest_migrations/migrations/registry.yaml
[error] 7-7: trailing spaces
(trailing-spaces)
⏰ Context from checks skipped due to timeout of 90000ms (8)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (8)
airbyte_cdk/manifest_migrations/manifest_migration.py (1)
75-93
: Guard against missingtype
key to avoidKeyError
?
_is_migratable_type
indexesobj[TYPE_TAG]
, assuming the key is present.
Although callers currently check_is_component
first, a direct external call could bypass that and raise.Would adding a defensive
TYPE_TAG in obj
check make the helper safer, or do you consider the current contract sufficient?
Just raising the question for clarity. 🙂airbyte_cdk/manifest_migrations/README.md (7)
1-4
: Clear introduction.
This section concisely explains the purpose of the directory. Nice!
5-12
: Creating a new migration.
Instructions here are clear and straightforward. Looks good!
13-20
: Defining the migration class.
Good description of the required methods to implement.
29-38
: Registry YAML example.
Example is well-structured and follows expected formatting.
44-49
: Migration discovery.
This section succinctly explains how migrations are discovered and when to avoid manual registry edits.
50-69
: Example migration skeleton.
The skeleton is clear, and the import path matches the actual module location. Great!
71-73
: Further references.
Linking to docstrings and examples is helpful for readers.
..._cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body.py
Show resolved
Hide resolved
airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with this one. Just a nit comment regarding updating the version.
Also nit: I guess I'm not too worried about this because we shouldn't maintain the migrations but maybe for the next ones we should have tests to provide documentation about the cases is supports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Maxime is happy Im happy :)
What
Resolves:
How
Manifest Migrations:
Introduced a framework to handle manifest migrations in the
Airbyte CDK
to apply transformations on given manifest.Added migration logic to convert
url_base
andpath
tourl
.http_requester_url_base_to_url.py
http_requester_path_to_url.py
Created migration files with clear
versioning
andorder
handling.Registered migration classes dynamically and applied them in order.
Unit Tests and Documentation:
unit_tests/manifest_migrations/test_manifest_migration.py
README.md
in themanifest/migrations
directory to document the migration framework.User Impact
migrate_manifest: bool = False
by default, to not to have any regressions, before we're ready to use it within the UI (/resolve
should be havingmigrate: bool
flag to set the migration toTrue
)Summary by CodeRabbit
New Features
url_base
andpath
intourl
, and unifyrequest_body_json
/request_body_data
intorequest_body
.Documentation
Bug Fixes
Tests