fix: (CDK) (Declarative) - Add Manifest Migration module #485

Status: Open. Wants to merge 18 commits into main.

@bazarnov bazarnov commented Apr 16, 2025

What

Resolves:

How

Manifest Migrations:

  • Introduced a framework in the Airbyte CDK that handles manifest migrations by applying transformations to a given manifest.

  • Added migration logic to convert url_base and path to url.

    • http_requester_url_base_to_url.py
    • http_requester_path_to_url.py
  • Created migration files with clear versioning and order handling.

  • Registered migration classes dynamically and applied them in order.
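
As a rough illustration of the url_base/path consolidation listed above, here is a minimal sketch. It assumes a manifest component is a plain dict with a type key; the function names migrate_url_base_to_url and migrate_path_to_url are hypothetical and are not the classes added in this PR. It also ignores interpolated values, which a real migration would need to handle.

```python
from typing import Any, Dict

Component = Dict[str, Any]


def migrate_url_base_to_url(component: Component) -> None:
    """Rename `url_base` to `url` in place on an HttpRequester component."""
    if component.get("type") == "HttpRequester" and "url_base" in component:
        component["url"] = component.pop("url_base")


def migrate_path_to_url(component: Component) -> None:
    """Fold `path` into `url` in place, joining the two with a single slash."""
    if component.get("type") != "HttpRequester" or "path" not in component:
        return
    path = str(component.pop("path")).lstrip("/")
    base = str(component.get("url", "")).rstrip("/")
    # Join only when both parts are present; otherwise keep whichever exists.
    component["url"] = f"{base}/{path}" if base and path else base or path


# Hypothetical input: the two migrations run in order, url_base first.
requester = {
    "type": "HttpRequester",
    "url_base": "https://pokeapi.co/api/v2/",
    "path": "pokemon",
}
migrate_url_base_to_url(requester)
migrate_path_to_url(requester)
```

Running both functions a second time leaves the component unchanged, which is the idempotency property the unit tests in this PR are described as checking.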

Unit Tests and Documentation:

  • Added unit tests for manifest migrations.
    • unit_tests/manifest_migrations/test_manifest_migration.py
  • Created the README.md in the manifest/migrations directory to document the migration framework.
    • The added documentation and examples make the new changes easier to adopt.

User Impact

  • No impact is expected; this is not a breaking change.
  • Migration execution is gated behind migrate_manifest: bool = False (off by default) to avoid regressions until we are ready to use it in the UI (/resolve should accept a migrate: bool flag that sets the migration to True).

Summary by CodeRabbit

  • New Features

    • Introduced automated manifest migration support for declarative sources, enabling seamless updates to newer manifest formats.
    • Added manifest migration handler to apply and track migrations, with detailed migration metadata.
    • Implemented migration logic to consolidate url_base and path into url, and unify request_body_json/request_body_data into request_body.
  • Documentation

    • Added comprehensive README explaining manifest migrations, usage, and guidelines for adding new migrations.
  • Bug Fixes

    • Ensured deprecated manifest fields are automatically updated to supported formats, improving compatibility and reliability.
  • Tests

    • Added unit tests and fixtures to validate migration behavior and ensure correctness of manifest transformations.

@bazarnov bazarnov marked this pull request as ready for review April 16, 2025 11:14
@Copilot Copilot AI review requested due to automatic review settings April 16, 2025 11:14

@Copilot Copilot AI left a comment

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.


coderabbitai bot commented Apr 16, 2025

📝 Walkthrough

Walkthrough

This change introduces a new manifest migration framework for Airbyte's CDK, enabling systematic upgrades and transformations of declarative source manifests. It adds migration base classes, migration implementations, a migration registry, and a migration handler for orchestrating and recording applied migrations. The system is integrated into the manifest resolution flow, with new tests and fixtures verifying the correctness of migrations such as URL field consolidation and request body key unification. Documentation is added to guide contributors on creating and registering new migrations. The connector builder handler is updated to support conditional manifest migration based on configuration.

Changes

| File(s) | Change Summary |
| --- | --- |
| airbyte_cdk/manifest_migrations/README.md | Added documentation explaining manifest migrations, how to create and register them, and testing guidelines. |
| airbyte_cdk/manifest_migrations/__init__.py, airbyte_cdk/manifest_migrations/migrations/__init__.py | Added copyright header files. |
| airbyte_cdk/manifest_migrations/exceptions.py | Introduced ManifestMigrationException for migration-specific error handling. |
| airbyte_cdk/manifest_migrations/manifest_migration.py | Added ManifestMigration abstract base class and MigrationTrace dataclass for migration metadata and recursive manifest processing. |
| airbyte_cdk/manifest_migrations/migration_handler.py | Added ManifestMigrationHandler class to apply and track manifest migrations, updating versions and recording migration traces. |
| airbyte_cdk/manifest_migrations/migrations_registry.py | Added dynamic migration discovery and registry loading from YAML, mapping versions to ordered migration classes. |
| airbyte_cdk/manifest_migrations/migrations/registry.yaml | Introduced migration registry YAML defining three migrations for version 6.45.2: url_base to url, path to url, and request body key unification. |
| airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url.py | Added migration to rename url_base to url in HttpRequester components. |
| airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url.py | Added migration to consolidate path into url for HttpRequester components. |
| airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body.py | Added migration to unify request_body_json and request_body_data into request_body in HttpRequester components. |
| airbyte_cdk/sources/declarative/manifest_declarative_source.py | Updated to support manifest migration via a new migrate_manifest parameter and migration logic in the post-processing step. |
| airbyte_cdk/connector_builder/connector_builder_handler.py | Added should_migrate_manifest function and passed migration flag to ManifestDeclarativeSource. |
| unit_tests/manifest_migrations/conftest.py | Added test fixtures for various manifest states (pre- and post-migration) for URL and request body fields. |
| unit_tests/manifest_migrations/test_manifest_migration.py | Added tests to verify migration of URL fields and request body keys, and to ensure idempotency of migrations. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant ConnectorBuilderHandler
    participant ManifestDeclarativeSource
    participant ManifestMigrationHandler
    participant MigrationRegistry

    User->>ConnectorBuilderHandler: create_source(config)
    ConnectorBuilderHandler->>ManifestDeclarativeSource: instantiate (migrate_manifest flag)
    ManifestDeclarativeSource->>ManifestMigrationHandler: _migrate_manifest() (if migrate_manifest)
    ManifestMigrationHandler->>MigrationRegistry: Get registered migrations
    loop For each migration in order
        ManifestMigrationHandler->>ManifestMigrationHandler: Apply migration if applicable
        ManifestMigrationHandler->>ManifestMigrationHandler: Record migration trace
    end
    ManifestMigrationHandler-->>ManifestDeclarativeSource: Return migrated manifest
    ManifestDeclarativeSource-->>ConnectorBuilderHandler: Source ready

Suggested labels

enhancement

Suggested reviewers

  • lmossman
  • maxi297

Would you like to add more migration examples to the documentation or expand test coverage for additional manifest scenarios, wdyt?


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c3ee514 and b202be8.

📒 Files selected for processing (2)
  • airbyte_cdk/connector_builder/connector_builder_handler.py (2 hunks)
  • airbyte_cdk/sources/declarative/manifest_declarative_source.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte_cdk/connector_builder/connector_builder_handler.py
  • airbyte_cdk/sources/declarative/manifest_declarative_source.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Analyze (python)
  • GitHub Check: Pytest (Fast)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (17)
airbyte_cdk/manifest_migrations/migrations_registry.py (3)

13-17: Nice dynamic importing logic!
This approach ensures that adding new migration files is painless. Would you consider adding a debug log to confirm which modules are loaded, wdyt?


19-25: Ordering key function looks sensible!
If someone forgets to add the double underscore with an integer suffix, it defaults to 0. Would you like to raise a warning in that case, wdyt?
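
The ordering behaviour described in this comment could look roughly like the sketch below. It assumes migration module names end in a double underscore followed by an integer (as in http_requester_url_base_to_url_v6_45_2__0); the helper name migration_order is hypothetical, and the warning is the reviewer's suggestion rather than confirmed behaviour of the PR.

```python
import logging
import re

logger = logging.getLogger("migration_order_sketch")

# Matches a trailing "__<integer>" suffix, e.g. "..._v6_45_2__1".
_ORDER_SUFFIX = re.compile(r"__(\d+)$")


def migration_order(module_name: str) -> int:
    """Return the integer order suffix of a migration module, or 0 if absent."""
    match = _ORDER_SUFFIX.search(module_name)
    if match is None:
        # Suggested in review: make the silent fallback visible.
        logger.warning("No order suffix in %r; defaulting to 0", module_name)
        return 0
    return int(match.group(1))


names = [
    "http_requester_path_to_url_v6_45_2__1",
    "http_requester_url_base_to_url_v6_45_2__0",
]
ordered = sorted(names, key=migration_order)
```

Sorting by this key runs the url_base migration before the path migration within the same version group.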


27-51: Discovery function is thorough!
It ensures all migration classes are collected and avoids duplicates. Maybe you'd like to unify the loops to reduce repetition or log each discovered class for clarity, wdyt?

airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url_v6_45_2__1.py (1)

31-43: URL construction logic is concise!
Perhaps consider logging or warning the user when the url is changed, wdyt?

airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url_v6_45_2__0.py (3)

15-17: Well-defined class attributes!

The component type and key names are clearly defined as class attributes, making the code more maintainable. Would using constants for these values be more consistent with the rest of the codebase, wdyt?


19-22: Effective implementation of should_migrate!

The method correctly checks if the component type matches and if the original key exists in the manifest. One suggestion: would it be clearer to check self.original_key in manifest directly rather than converting keys to a list first, wdyt?

-    def should_migrate(self, manifest: ManifestType) -> bool:
-        return manifest[TYPE_TAG] == self.component_type and self.original_key in list(
-            manifest.keys()
-        )
+    def should_migrate(self, manifest: ManifestType) -> bool:
+        return manifest[TYPE_TAG] == self.component_type and self.original_key in manifest

24-26: Simple and effective migration implementation!

The migration logic is clean and straightforward - copy the value from the original key to the replacement key and remove the original. Consider adding a null check before accessing manifest[self.original_key] for extra safety, wdyt?

airbyte_cdk/manifest_migrations/README.md (3)

5-20: Clear instructions for adding new migrations!

The step-by-step instructions for adding new migrations are comprehensive and well-structured. There appears to be an extra space after the period in item 2, line 19 - might want to fix that for consistency, wdyt?

🧰 Tools
🪛 LanguageTool

[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...e(self, manifest: ManifestType) -> None`: Perform the migration in-place. 3. **M...

(UNLIKELY_OPENING_PUNCTUATION)


36-40: Concise testing guidelines!

The testing guidelines are clear and to the point. Would it be helpful to include a link or example of test file structure to guide developers further, wdyt?


41-57: Excellent example migration skeleton!

The example skeleton is thorough and helps developers understand how to implement their own migrations. Minor note: the import path in the example seems to be from airbyte_cdk.sources.declarative.migrations.manifest.manifest_migration but in the actual code it's from airbyte_cdk.manifest_migrations.manifest_migration. Should we update the example to match the actual path, wdyt?

airbyte_cdk/manifest_migrations/migration_handler.py (1)

49-62: Good encapsulation of single migration handling!

The _handle_migration method cleanly handles a single migration and wraps any exceptions in a ManifestMigrationException. One suggestion: would it make sense to log the exception before re-raising it for better debugging, wdyt?

unit_tests/manifest_migrations/test_manifest_migration.py (1)

17-30: Nice test coverage for test_manifest_resolve_migrate.
Would you consider adding a negative test scenario with missing path or url_base to ensure graceful handling of partial migrations if the manifest is incomplete? wdyt?

airbyte_cdk/sources/declarative/manifest_declarative_source.py (2)

74-84: Great addition of migrate_manifest parameter.
Would you consider clarifying in the docstring that the migrations might update or remove certain manifest fields? wdyt?


100-104: Consider capturing migration failures.
Right now, if .apply_migrations() raises an exception, do we want to provide a user-friendly error message or fallback? wdyt?

unit_tests/manifest_migrations/conftest.py (1)

10-195: Comprehensive fixture for manifest_with_url_base_to_migrate_to_url.
Would it be helpful to break this fixture into multiple smaller fixtures for better test clarity? wdyt?

airbyte_cdk/manifest_migrations/manifest_migration.py (2)

17-37: Nicely structured abstract class for manifest migrations.
Would you consider providing a default should_migrate logic that checks the class’s version vs the manifest version, so that each migration doesn't need to re-implement if the logic is standard? wdyt?


69-108: Recursive manifest processing looks robust.
Are you concerned about performance with deeply nested manifests? Possibly we could short-circuit or use a queue-based approach. wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bf998bd and 88f9e30.

📒 Files selected for processing (11)
  • airbyte_cdk/manifest_migrations/README.md (1 hunks)
  • airbyte_cdk/manifest_migrations/exceptions.py (1 hunks)
  • airbyte_cdk/manifest_migrations/manifest_migration.py (1 hunks)
  • airbyte_cdk/manifest_migrations/migration_handler.py (1 hunks)
  • airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url_v6_45_2__1.py (1 hunks)
  • airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url_v6_45_2__0.py (1 hunks)
  • airbyte_cdk/manifest_migrations/migrations_registry.py (1 hunks)
  • airbyte_cdk/sources/declarative/manifest_declarative_source.py (3 hunks)
  • airbyte_cdk/sources/declarative/parsers/manifest_component_transformer.py (2 hunks)
  • unit_tests/manifest_migrations/conftest.py (1 hunks)
  • unit_tests/manifest_migrations/test_manifest_migration.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
airbyte_cdk/manifest_migrations/migration_handler.py (2)
airbyte_cdk/manifest_migrations/exceptions.py (1)
  • ManifestMigrationException (6-12)
airbyte_cdk/manifest_migrations/manifest_migration.py (2)
  • ManifestMigration (17-137)
  • _process_manifest (69-107)
airbyte_cdk/manifest_migrations/migrations_registry.py (1)
airbyte_cdk/manifest_migrations/manifest_migration.py (1)
  • ManifestMigration (17-137)
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (19)
airbyte_cdk/manifest_migrations/exceptions.py (1)

6-12: Exception class looks good!
This custom exception provides a clear message for manifest migration errors. Maybe you could document the message parameter in the docstring for completeness, wdyt?

airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url_v6_45_2__1.py (1)

22-25: Logic for detecting migratable components looks clean!
No issues here.

airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url_v6_45_2__0.py (2)

1-6: Clean imports and good organization!

The imports are well-organized and focused, bringing in only the necessary components from the manifest migration framework. Nice job keeping this focused.


8-13: Clear docstring explaining the migration purpose!

The docstring clearly explains what this migration does: converting url_base to url in HttpRequester components. This will be helpful for other developers understanding the code's purpose.

airbyte_cdk/sources/declarative/parsers/manifest_component_transformer.py (2)

7-7: Updated import for more specific typing!

Good addition of the Dict type to the imports, making the code more explicit.


98-98: Improved return type specificity!

Nice improvement changing the return type from Mapping[str, Any] to Dict[str, Any]. This more accurately represents the actual return value (a dictionary), which helps with static type checking.

airbyte_cdk/manifest_migrations/README.md (4)

1-4: Great introduction to manifest migrations!

The introduction clearly explains the purpose of manifest migrations in the context of the Airbyte CDK.


21-24: Good explanation of versioning!

Clear explanation of how migration versioning works and when migrations will be applied.


31-35: Helpful information about the migration registry!

The explanation of how migrations are automatically discovered and registered is valuable for developers. It ensures they don't manually modify the registry.


59-66: Helpful additional notes section!

Good inclusion of additional notes and reference to documentation in other files. This helps guide developers to find more information when needed.

airbyte_cdk/manifest_migrations/migration_handler.py (4)

1-4: Standard header with copyright information.

The file follows the standard header format with appropriate copyright information.


5-17: Well-organized imports!

The imports are properly structured, importing specific classes and organizing them by module.


20-28: Good handler initialization with defensive copying!

I like that you're creating a deep copy of the manifest in the constructor. This preserves the original state in case of errors during migration.


29-47: Well-implemented migration application with error handling!

The method handles migrations sequentially and properly catches exceptions to return the original manifest on failure. The docstring is thorough and explains the behavior well.

unit_tests/manifest_migrations/test_manifest_migration.py (1)

32-47: Idempotency test is clear and concise.
Everything looks good. No issues found here.

airbyte_cdk/sources/declarative/manifest_declarative_source.py (1)

18-20: Migration handler import recognized.
No concerns here.

unit_tests/manifest_migrations/conftest.py (2)

197-494: Fixture expected_manifest_with_url_base_migrated_to_url is well-structured.
Everything is consistent and thorough.


496-601: manifest_with_migrated_url_base_and_path_is_joined_to_url fixture is effectively verifying idempotent migrations.
No issues found.

airbyte_cdk/manifest_migrations/manifest_migration.py (1)

118-138: Regex-based version extraction is straightforward.
No issues here.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (5)
airbyte_cdk/manifest_migrations/manifest_migration.py (2)

22-29: Remove or update the "kwargs" references in the docstrings?

In both should_migrate (lines 22-29) and migrate (lines 33-39), the docstrings mention kwargs, but the methods themselves do not accept such parameters. Would you consider removing or clarifying these references to match the actual function signatures, wdyt?

Also applies to: 33-39


68-108: Consider protecting against cyclic references in manifests?

Recursively processing a nested dictionary or list can be risky if there's ever a possibility of cyclical references. This could theoretically cause an infinite loop. Would you like to add a safeguard (e.g., tracking visited nodes) or confirm that the manifests never contain cycles, wdyt?
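
One way to add the suggested safeguard is to track the identity of every container already visited. The sketch below is an assumption about how that could be done, not the PR's _process_manifest implementation; walk_components and its parameters are hypothetical names.

```python
from typing import Any, Callable, Dict, List, Optional, Set


def walk_components(
    node: Any,
    visit: Callable[[Dict[str, Any]], None],
    _seen: Optional[Set[int]] = None,
) -> None:
    """Depth-first walk that calls `visit` on every dict carrying a `type` key.

    Containers are tracked by id() so a cyclic reference is visited only once.
    """
    seen = _seen if _seen is not None else set()
    if isinstance(node, (dict, list)):
        if id(node) in seen:
            return  # already processed: stop here instead of looping forever
        seen.add(id(node))
    if isinstance(node, dict):
        if "type" in node:
            visit(node)
        # Copy the values so `visit` may add or remove keys safely.
        for value in list(node.values()):
            walk_components(value, visit, seen)
    elif isinstance(node, list):
        for item in node:
            walk_components(item, visit, seen)


# Hypothetical manifest with a deliberate cycle back to the root.
manifest: Dict[str, Any] = {
    "type": "DeclarativeSource",
    "streams": [{"type": "DeclarativeStream"}],
}
manifest["self"] = manifest

visited: List[str] = []
walk_components(manifest, lambda component: visited.append(component["type"]))
```

Manifests loaded from YAML or JSON are normally trees, so the guard mostly matters for manifests built programmatically.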

airbyte_cdk/manifest_migrations/migration_handler.py (3)

43-53: Improve user feedback on migration failure?

Right now, if any migration fails, you return the original manifest without explicit logging or messaging to the user about which migration failed. Would you consider adding a more direct log statement or returning additional diagnostic info so the user knows a particular migration was reverted, wdyt?


74-75: Consider catching only known migration exceptions?

Here, the code catches all exceptions before re-raising them as ManifestMigrationException. Do we anticipate unexpected exceptions (like KeyError or network errors)? Would you like to narrow the exception scope to known issues or at least log the error class name, wdyt?
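
The two suggestions above (naming the migration that failed and surfacing the error class) can be sketched together. This is a minimal illustration under assumptions: migrations are modelled as plain callables, apply_migrations is a hypothetical helper, and the set of caught exception types is only an example of narrowing the scope.

```python
import copy
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger("manifest_migrations_sketch")

Manifest = Dict[str, Any]
Migration = Callable[[Manifest], None]


def apply_migrations(manifest: Manifest, migrations: List[Migration]) -> Manifest:
    """Apply migrations to a deep copy; return the original manifest on failure."""
    working_copy = copy.deepcopy(manifest)
    for migration in migrations:
        try:
            migration(working_copy)
        except (KeyError, TypeError, ValueError) as exc:
            # Report which migration failed and the concrete error class.
            logger.warning(
                "Migration %s failed with %s: %s; keeping the original manifest",
                getattr(migration, "__name__", repr(migration)),
                type(exc).__name__,
                exc,
            )
            return manifest
    return working_copy


def rename_url_base(manifest: Manifest) -> None:
    manifest["url"] = manifest.pop("url_base")


def broken_migration(manifest: Manifest) -> None:
    manifest["x"] = manifest["missing"]  # raises KeyError


original = {"url_base": "https://example.com"}
migrated = apply_migrations(original, [rename_url_base])
fallback = apply_migrations(original, [rename_url_base, broken_migration])
```

Because the work happens on a copy, a failure partway through never leaves the caller with a half-migrated manifest.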


84-84: Log a warning when defaulting to "0.0.0"?

When there's no "version" key, the code defaults to "0.0.0". Would you consider logging a warning or debug statement here, so readers of the logs are aware we're falling back to this version, wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 88f9e30 and ebc854d.

📒 Files selected for processing (3)
  • airbyte_cdk/manifest_migrations/manifest_migration.py (1 hunks)
  • airbyte_cdk/manifest_migrations/migration_handler.py (1 hunks)
  • airbyte_cdk/manifest_migrations/migrations_registry.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/manifest_migrations/migrations_registry.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
airbyte_cdk/manifest_migrations/manifest_migration.py (2)
airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url_v6_45_2__0.py (2)
  • should_migrate (19-22)
  • migrate (24-26)
airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url_v6_45_2__1.py (2)
  • should_migrate (22-25)
  • migrate (27-42)
airbyte_cdk/manifest_migrations/migration_handler.py (2)
airbyte_cdk/manifest_migrations/exceptions.py (1)
  • ManifestMigrationException (6-12)
airbyte_cdk/manifest_migrations/manifest_migration.py (3)
  • ManifestMigration (19-127)
  • migration_version (41-47)
  • _process_manifest (68-106)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Analyze (python)
🔇 Additional comments (1)
airbyte_cdk/manifest_migrations/migration_handler.py (1)

68-69: Double-check the comparison for applying migrations?

Currently, migrations apply when manifest_version <= migration_instance.migration_version. Is that your intended logic when the manifest version is equal to the migration's version? In some scenarios, you might want to apply changes only if the manifest is strictly less than the migration version. Could you confirm if equality is desirable, wdyt?
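
To make the difference between the two comparisons concrete, here is a small sketch. It assumes versions are plain dotted integers such as 6.45.2 and compares them as integer tuples; parse_version and should_apply are hypothetical helpers, not the handler's actual code.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '6.45.2' into (6, 45, 2) so comparison is numeric, not lexical."""
    return tuple(int(part) for part in version.split("."))


def should_apply(manifest_version: str, migration_version: str, inclusive: bool = True) -> bool:
    """Decide whether a migration applies to a manifest at the given version.

    inclusive=True mirrors the `<=` comparison discussed above;
    inclusive=False is the stricter `<` alternative.
    """
    current = parse_version(manifest_version)
    target = parse_version(migration_version)
    return current <= target if inclusive else current < target


older = should_apply("6.44.0", "6.45.2")
equal_inclusive = should_apply("6.45.2", "6.45.2")
equal_strict = should_apply("6.45.2", "6.45.2", inclusive=False)
newer = should_apply("6.46.0", "6.45.2")
```

With the inclusive comparison, a manifest already at the migration's version is processed again, so each migration's own should_migrate check has to make the second pass a no-op.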


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
airbyte_cdk/connector_builder/connector_builder_handler.py (1)

75-75: LGTM! Clean integration of the migration flag into the source creation.

The modification to pass the migration flag to the ManifestDeclarativeSource constructor is straightforward and aligns with the PR objectives. This enables the optional migration functionality while keeping it disabled by default.

A tiny suggestion: would adding a brief inline comment here be helpful to explain the purpose of this parameter for future developers? Something like # Enable manifest migration if requested by UI? wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d30ae26 and c494934.

📒 Files selected for processing (1)
  • airbyte_cdk/connector_builder/connector_builder_handler.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Analyze (python)


lmossman commented Apr 22, 2025

@bazarnov I'm getting a schema validation error on a simple manifest like this:

version: 6.44.0

type: DeclarativeSource

check:
  type: CheckStream
  stream_names:
    - pokemon

streams:
  - type: DeclarativeStream
    name: pokemon
    retriever:
      type: SimpleRetriever
      requester:
        type: HttpRequester
        url_base: https://pokeapi.co/api/v2/
        path: pokemon
        http_method: GET
      record_selector:
        type: RecordSelector
        extractor:
          type: DpathExtractor
          field_path: []
      decoder:
        type: JsonDecoder
    schema_loader:
      type: InlineSchemaLoader
      schema:
        $ref: "#/schemas/pokemon"

spec:
  type: Spec
  connection_specification:
    type: object
    $schema: http://json-schema.org/draft-07/schema#
    required: []
    properties: {}
    additionalProperties: true

metadata:
  autoImportSchema:
    pokemon: true
  testedStreams:
    pokemon:
      streamHash: null
  assist: {}

schemas:
  pokemon:
    type: object
    $schema: http://json-schema.org/draft-07/schema#
    additionalProperties: true
    properties: {}

Here is the error:

{
    "exceptionStack": "Traceback (most recent call last):\n  File \"/Users/lakemossman/code/airbyte-python-cdk/airbyte_cdk/sources/declarative/manifest_declarative_source.py\", line 306, in _validate_source\n    validate(self._source_config, declarative_component_schema)\n  File \"/Users/lakemossman/code/airbyte-python-cdk/.venv/lib/python3.10/site-packages/jsonschema/validators.py\", line 1121, in validate\n    raise error\njsonschema.exceptions.ValidationError: 'HttpRequester' is not one of ['CustomRequester']\n\nFailed validating 'enum' in schema[0]['properties']['type']:\n    {'enum': ['CustomRequester'], 'type': 'string'}\n\nOn instance['type']:\n    'HttpRequester'\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n  File \"/Users/lakemossman/code/airbyte-python-cdk/airbyte_cdk/connector_builder/main.py\", line 104, in <module>\n    print(handle_request(sys.argv[1:]))\n  File \"/Users/lakemossman/code/airbyte-python-cdk/airbyte_cdk/connector_builder/main.py\", line 94, in handle_request\n    source = create_source(config, limits)\n  File \"/Users/lakemossman/code/airbyte-python-cdk/airbyte_cdk/connector_builder/connector_builder_handler.py\", line 71, in create_source\n    return ManifestDeclarativeSource(\n  File \"/Users/lakemossman/code/airbyte-python-cdk/airbyte_cdk/sources/declarative/manifest_declarative_source.py\", line 122, in __init__\n    self._validate_source()\n  File \"/Users/lakemossman/code/airbyte-python-cdk/airbyte_cdk/sources/declarative/manifest_declarative_source.py\", line 308, in _validate_source\n    raise ValidationError(\njsonschema.exceptions.ValidationError: Validation against json schema defined in declarative_component_schema.yaml schema failed\n",
    "exceptionClassName": "io.airbyte.protocol.models.v0.AirbyteTraceMessage",
    "message": "Error handling request: Validation against json schema defined in declarative_component_schema.yaml schema failed"
}

@bazarnov (Contributor, Author) commented

@lmossman This PR doesn't hold the HttpRequester deprecations/modifications you want to test with the given input manifest you mentioned. Refer to this PR instead: #463

FYI: you can test all of the migration and deprecation logic using the PoC PR, because the other PRs each hold isolated changes to keep review simple.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (8)
airbyte_cdk/manifest_migrations/README.md (8)

1-4: Clarify module directory path in introduction
Could we specify the exact path to this directory (e.g., airbyte_cdk/manifest_migrations) in the opening lines to orient new contributors? This will make it clearer where to find the code. wdyt?


5-12: Enhance file naming convention clarity
Maybe we could add a brief note on how the <order> integer interacts with semantic versioning (e.g., a higher <order> runs later within the same version group)? A quick example mapping filename → execution order might help. wdyt?


13-20: Link to base class and type definitions
Should we include a hyperlink or file path reference to where ManifestMigration and ManifestType are defined (e.g., manifest_migration.py)? That way readers can jump straight to the base class docs. wdyt?

🧰 Tools
🪛 LanguageTool

[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...e(self, manifest: ManifestType) -> None`: Perform the migration in-place. 3. **M...

(UNLIKELY_OPENING_PUNCTUATION)


28-30: Link to example migration files
Would it be useful to turn the file names in this section into clickable links (or at least show their relative paths, e.g., manifest_migrations/migrations/http_requester_url_base_to_url_v6_45_2__0.py)? That could speed up onboarding. wdyt?


31-35: Reference the registry implementation file
Could we mention the exact file name and path for the registry logic (e.g., manifest_migrations/migrations_registry.py)? A pointer here would help maintainers find the auto-discovery code more quickly. wdyt?


36-40: Include specific test file path in testing instructions
Might we call out the actual unit test location (for example:

unit_tests/sources/declarative/migrations/test_manifest_migration.py

)? That makes it straightforward for folks to see a working test. wdyt?


46-57: Ensure consistent snippet formatting
Could we standardize the code block to use four‑space indentation throughout and add a blank line before the class definition? That keeps the example aligned with our style guide. wdyt?


59-66: Add links to in-code docstrings or examples
What do you think about adding direct links (or file path references) to the docstrings in manifest_migration.py and the two example migrations? It may save a few clicks for new contributors exploring the code. wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c494934 and 9e58f5b.

📒 Files selected for processing (1)
  • airbyte_cdk/manifest_migrations/README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
airbyte_cdk/manifest_migrations/README.md

[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...e(self, manifest: ManifestType) -> None`: Perform the migration in-place. 3. **M...

(UNLIKELY_OPENING_PUNCTUATION)

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Analyze (python)
🔇 Additional comments (1)
airbyte_cdk/manifest_migrations/README.md (1)

21-27: Clarify version comparison semantics
Can we indicate whether the version extracted from the class name is compared numerically (e.g., using a SemVer parser) or lexically? It’d be helpful to know exactly how 6.45.2 is parsed and compared. wdyt?
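
To make the question concrete, here is a minimal sketch of why the comparison method matters. It does not reproduce the CDK's actual comparison code; `parse_version` is a hypothetical helper, and the version `6.9.0` is an invented example.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Split 'major.minor.patch' into a tuple of ints (hypothetical helper)."""
    return tuple(int(part) for part in version.split("."))


# Lexical comparison works character by character: "9" sorts after "4",
# so the string "6.9.0" is considered greater than "6.45.2".
lexical = "6.9.0" <= "6.45.2"

# Numeric comparison compares 9 with 45, so 6.9.0 comes before 6.45.2.
numeric = parse_version("6.9.0") <= parse_version("6.45.2")

print(lexical, numeric)
```

With a lexical comparison, a manifest at `6.9.0` would be treated as newer than the `6.45.2` migration and skipped, which is the edge case worth documenting.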

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (7)
airbyte_cdk/manifest_migrations/README.md (7)

5-12: Clarify <order> intent in file naming
The pattern <description>_v<major>_<minor>_<patch>__<order>.py is clear, but should we call out that <order> starts at 0 and increments for multiple migrations in the same version? Adding that note could prevent confusion. wdyt?
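
As an illustration of how `<order>` could interact with the version, here is a sketch that derives an execution order from the file names. The regex and `sort_key` are assumptions made for this example, not the registry's actual logic, and `some_later_change_v6_46_0__0.py` is a made-up file name.

```python
import re

# Pattern from the README: <description>_v<major>_<minor>_<patch>__<order>.py
FILENAME_PATTERN = re.compile(
    r"^(?P<description>.+)_v(?P<major>\d+)_(?P<minor>\d+)_(?P<patch>\d+)__(?P<order>\d+)\.py$"
)


def sort_key(filename: str) -> tuple[int, int, int, int]:
    """Return (major, minor, patch, order) so files sort by version, then order."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unexpected migration file name: {filename}")
    return (
        int(match["major"]),
        int(match["minor"]),
        int(match["patch"]),
        int(match["order"]),
    )


filenames = [
    "some_later_change_v6_46_0__0.py",  # hypothetical file
    "http_requester_request_body_json_data_to_request_body_v6_45_2__2.py",
    "http_requester_url_base_to_url_v6_45_2__0.py",
]

ordered = sorted(filenames, key=sort_key)
for name in ordered:
    print(name)
```

Within `6.45.2`, the file with `__0` runs before the one with `__2`, and everything in `6.45.2` runs before `6.46.0`.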


13-20: Add import guidance for the base class
You explain how to define the migration class really well, but would linking to the actual import path of ManifestMigration in the text help readers avoid a “file not found” surprise? For example, mentioning the module path airbyte_cdk.manifest_migrations.manifest_migration inline. wdyt?

🧰 Tools
🪛 LanguageTool

[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...e(self, manifest: ManifestType) -> None`: Perform the migration in-place. 3. **M...

(UNLIKELY_OPENING_PUNCTUATION)


18-19: Tiny punctuation tweak
There's a double space before “3.” after the period in the migrate description, which tripped a style check. Could we collapse to a single space for consistency? wdyt?

🧰 Tools
🪛 LanguageTool

[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...e(self, manifest: ManifestType) -> None`: Perform the migration in-place. 3. **M...

(UNLIKELY_OPENING_PUNCTUATION)


21-24: Expand on version comparison logic
The note “Only manifests with a version less than or equal to the migration version…” is useful—should we mention whether this uses SemVer parsing or simple string comparison? Clarifying the comparison mechanism could help future contributors understand edge cases. wdyt?


25-30: Link out to concrete examples
Pointing to the two reference scripts is great—do you think we should hyperlink the filenames to those migration files in the repo so readers can click straight to the implementations? wdyt?


31-35: Highlight automatic discovery behavior
It's clear migrations are auto-registered—would it be helpful to note that developers shouldn’t add/remove entries in migrations_registry.py manually, or mention how conflicts (e.g., duplicate order) are handled? wdyt?


36-40: Include a sample test snippet?
The testing section sets expectations—should we add a minimal code snippet showing how to instantiate a migration and assert its behavior, similar to the unit tests? That could speed up onboarding. wdyt?
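
A snippet along these lines might work for the README. It is only a sketch: `UrlBaseToUrl` is a stand-in that mirrors the `should_migrate` / `migrate` / `validate` interface, not the real class from the PR, and the `"type"` key and example URL are assumptions.

```python
class UrlBaseToUrl:
    """Stand-in migration (hypothetical), renaming `url_base` to `url`."""

    component_type = "HttpRequester"
    original_key = "url_base"
    replacement_key = "url"

    def should_migrate(self, manifest: dict) -> bool:
        # Only touch HttpRequester components that still carry the old key.
        return (
            manifest.get("type") == self.component_type
            and self.original_key in manifest
        )

    def migrate(self, manifest: dict) -> None:
        # Move the value to the new key, in place.
        manifest[self.replacement_key] = manifest.pop(self.original_key)

    def validate(self, manifest: dict) -> bool:
        # The migration succeeded if only the new key remains.
        return (
            self.replacement_key in manifest
            and self.original_key not in manifest
        )


def test_url_base_is_renamed_to_url() -> None:
    migration = UrlBaseToUrl()
    requester = {"type": "HttpRequester", "url_base": "https://api.example.com"}

    assert migration.should_migrate(requester)
    migration.migrate(requester)

    assert migration.validate(requester)
    assert requester == {"type": "HttpRequester", "url": "https://api.example.com"}
    # A second pass should be a no-op, because the old key is gone.
    assert not migration.should_migrate(requester)
```

Running this with pytest would exercise the three methods in the same order the framework is described as calling them.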

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9e58f5b and c6b31cc.

📒 Files selected for processing (1)
  • airbyte_cdk/manifest_migrations/README.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
airbyte_cdk/manifest_migrations/README.md (1)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
🪛 LanguageTool
airbyte_cdk/manifest_migrations/README.md

[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...e(self, manifest: ManifestType) -> None`: Perform the migration in-place. 3. **M...

(UNLIKELY_OPENING_PUNCTUATION)

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Analyze (python)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (3)
airbyte_cdk/manifest_migrations/README.md (3)

1-4: Great clear intro!
The overview neatly explains what lives in this directory. Would it help to link directly to the ManifestMigrationHandler or the base class in code so readers can jump straight into the implementation? wdyt?


41-57: Example skeleton looks spot‑on
The code block correctly uses the updated import path (airbyte_cdk.manifest_migrations.manifest_migration), matching our module layout. Nice!


61-63: Additional notes are clear
The reminders about not editing the registry and using NON_MIGRATABLE_TYPES are on point. Looks good to me!

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body_v6_45_2__2.py (2)

19-22: Consider adding method-level documentation

For better maintainability, it might be helpful to add docstrings to the should_migrate and migrate methods, similar to how they're documented in the base class. This would make it clearer what each method does specifically for this migration. wdyt?

Also applies to: 24-28


24-28: Add logging for better traceability

Since this is changing the structure of manifests, adding some debug logging could help with troubleshooting if issues arise in production. Maybe log which keys were found and migrated? wdyt?

def migrate(self, manifest: ManifestType) -> None:
+    # Import statements should be at the top of the file in practice
+    import logging
+    logger = logging.getLogger("airbyte.manifest_migrations")
+    
     for key in self.original_keys:
         if key in manifest:
+            logger.debug(f"Migrating {key} to {self.replacement_key} in {manifest.get('$id', 'unknown component')}")
             manifest[self.replacement_key] = manifest[key]
             manifest.pop(key, None)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c6b31cc and ab08a07.

📒 Files selected for processing (3)
  • airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body_v6_45_2__2.py (1 hunks)
  • unit_tests/manifest_migrations/conftest.py (1 hunks)
  • unit_tests/manifest_migrations/test_manifest_migration.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • unit_tests/manifest_migrations/test_manifest_migration.py
  • unit_tests/manifest_migrations/conftest.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body_v6_45_2__2.py (1)
airbyte_cdk/manifest_migrations/manifest_migration.py (1)
  • ManifestMigration (19-127)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Analyze (python)
🔇 Additional comments (1)
airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body_v6_45_2__2.py (1)

24-28:

✅ Verification successful

Consider handling multiple original keys more explicitly

The current implementation will overwrite request_body if both original keys exist in the manifest. Is this the intended behavior? If not, maybe we should add some logic to handle this case explicitly or at least document the precedence? wdyt?


🏁 Script executed:

#!/bin/bash
# Let's check if there are any manifests with both keys to understand how common this might be
echo "Searching for manifests with both request_body_json and request_body_data..."
rg -l "request_body_json.*request_body_data|request_body_data.*request_body_json" --type yaml --type json

Length of output: 270


No conflicts detected for request_body_json and request_body_data

Our search across manifests didn’t find any instances where both request_body_json and request_body_data coexist, and the migration already enforces a clear precedence (the latter key wins). No further changes are needed here.
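
For reference, the precedence can be sketched as follows. The loop mirrors the `migrate` body quoted above; the order of `original_keys` and the example payloads are assumptions made for illustration.

```python
# Assumed key order; the actual order in the migration class may differ.
original_keys = ["request_body_json", "request_body_data"]
replacement_key = "request_body"


def migrate(manifest: dict) -> None:
    """Copy each original key to the replacement key, then drop the original."""
    for key in original_keys:
        if key in manifest:
            manifest[replacement_key] = manifest[key]
            manifest.pop(key, None)


requester = {
    "type": "HttpRequester",
    "request_body_json": {"a": 1},
    "request_body_data": {"b": 2},
}
migrate(requester)
print(requester)
```

When both keys are present, the value of the key that appears later in `original_keys` overwrites the earlier one, so only `{"b": 2}` survives here.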

Contributor

@maxi297 maxi297 left a comment

My main concerns are dynamic streams and dev UX, but I don't want to inflate the scope. Please give me more context so that I can either challenge further or accept the changes.

…migrations registered, add the metadata.applied_migrations object to be able to trace the applied MigrationTrace
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 8

🧹 Nitpick comments (9)
airbyte_cdk/manifest_migrations/manifest_migration.py (1)

120-126: self.is_migrated may be overwritten on every nested component

Inside _process_manifest the flag is reassigned for every component that passes validate.
If the first component migrates successfully and a later component does not, the final value will be False even though at least one migration succeeded (or vice‑versa).

Would you prefer to OR‑accumulate the flag so the caller can easily check whether any change occurred? e.g.:

-                    self.is_migrated = self.validate(obj)
+                    self.is_migrated = self.is_migrated or self.validate(obj)

wdyt?
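
A small sketch of the difference, using a plain list of booleans in place of the per-component `validate` results (the values are invented for illustration):

```python
# Pretend the first component migrated successfully and the second did not.
validation_results = [True, False]

# Current behaviour: each component overwrites the flag.
overwritten = False
for result in validation_results:
    overwritten = result

# Suggested behaviour: the flag stays True once any component has migrated.
accumulated = False
for result in validation_results:
    accumulated = accumulated or result

print(overwritten, accumulated)
```

One caveat: because `or` short-circuits, `self.is_migrated or self.validate(obj)` stops calling `validate` after the first success. If `validate` should still run for every component, evaluating it first (`self.validate(obj) or self.is_migrated`) would keep that behaviour.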

airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body.py (1)

24-27: Key lookup may raise if type is missing

manifest[TYPE_TAG] == self.component_type will raise a KeyError if an object without a type field somehow reaches this method.

Would wrapping the condition in manifest.get(TYPE_TAG) == … be safer, or do you guarantee upstream filtering?
Just checking. 🙂
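
For illustration, assuming `TYPE_TAG` is the string `"type"` and using a made-up component without that field:

```python
TYPE_TAG = "type"  # assumed value of the constant
component_without_type = {"url_base": "https://api.example.com"}

# Indexing raises when the key is missing.
try:
    component_without_type[TYPE_TAG] == "HttpRequester"
    raised = False
except KeyError:
    raised = True

# .get() returns None instead, so the comparison is simply False.
safe_result = component_without_type.get(TYPE_TAG) == "HttpRequester"

print(raised, safe_result)
```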

airbyte_cdk/manifest_migrations/migrations_registry.py (1)

39-47: Type annotations & base‑class filtering

  1. mypy complains about an untyped parameter – annotating module: types.ModuleType will silence it.
  2. The current issubclass(obj, ManifestMigration) test also matches the base class itself; perhaps exclude it?
-from typing import Dict, List, Type
+from types import ModuleType
+from typing import Dict, List, Type
@@
-def _get_migration_class(module) -> type:
+def _get_migration_class(module: ModuleType) -> Type[ManifestMigration]:
@@
-        if issubclass(obj, ManifestMigration):
+        if obj is not ManifestMigration and issubclass(obj, ManifestMigration):

wdyt?
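
To show why the extra check matters, here is a sketch with stand-in classes. It assumes discovery walks a module's classes with `inspect.getmembers`, which may not match the actual implementation of `_get_migration_class`.

```python
import inspect
import types


class ManifestMigration:
    """Stand-in for the real base class."""


class ExampleMigration(ManifestMigration):
    """Hypothetical concrete migration."""


# Simulate a migration module that imports the base class and defines a subclass.
module = types.ModuleType("fake_migration_module")
module.ManifestMigration = ManifestMigration
module.ExampleMigration = ExampleMigration

classes = [obj for _, obj in inspect.getmembers(module, inspect.isclass)]

# issubclass(X, X) is True, so the base class itself passes this filter.
naive = [obj for obj in classes if issubclass(obj, ManifestMigration)]

# Excluding the base class leaves only the concrete migration.
filtered = [
    obj
    for obj in classes
    if obj is not ManifestMigration and issubclass(obj, ManifestMigration)
]

print([cls.__name__ for cls in naive])
print([cls.__name__ for cls in filtered])
```

Since every migration module imports `ManifestMigration`, the naive filter can return the base class depending on member ordering.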

🧰 Tools
🪛 GitHub Actions: Linters

[error] 39-39: mypy: Function is missing a type annotation for one or more arguments [no-untyped-def]

airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url.py (1)

29-31: Minor: safe pop unnecessary?

After copying the value we know url_base exists, so pop without default is sufficient (and clearer).
Feel like simplifying?

-        manifest[self.replacement_key] = manifest[self.original_key]
-        manifest.pop(self.original_key, None)
+        manifest[self.replacement_key] = manifest.pop(self.original_key)

wdyt?

airbyte_cdk/manifest_migrations/migration_handler.py (1)

55-56: Dictionary iteration follows insertion order, not version order: migrations may run out of order

MANIFEST_MIGRATIONS.items() does not guarantee ordering by semantic version.
Would wrapping it in a sorted(MANIFEST_MIGRATIONS.items(), key=lambda x: Version(x[0])) call ensure deterministic execution, especially when multiple versions are present?
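
A sketch of the idea, with an invented registry. Python dicts (3.7+) iterate in insertion order, so the risk is that entries are inserted in a non-version order. The `version_key` helper below is a dependency-free stand-in for a real version parser such as `packaging.version.Version`, which is presumably what `Version` refers to above.

```python
# Hypothetical registry, deliberately inserted out of version order.
MANIFEST_MIGRATIONS = {
    "6.46.0": ["LaterMigration"],
    "6.9.0": ["EarlierMigration"],
    "6.45.2": ["UrlBaseToUrl", "PathToUrl"],
}


def version_key(item: tuple[str, list[str]]) -> tuple[int, ...]:
    """Sort key: the version string parsed into a tuple of ints."""
    version, _migrations = item
    return tuple(int(part) for part in version.split("."))


insertion_order = list(MANIFEST_MIGRATIONS)
version_order = [
    version for version, _ in sorted(MANIFEST_MIGRATIONS.items(), key=version_key)
]

print(insertion_order)
print(version_order)
```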

airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url.py (1)

37-45: Trailing‑slash handling could be simplified & avoid double slashes

Appending a / unconditionally may create // when path starts with /.
Could using urljoin alone be clearer?

base = replacement_key_value.rstrip("/") + "/"
manifest[self.replacement_key] = urljoin(base, original_key_value.lstrip("/"))

This trims/ensures exactly one slash between the pieces.
Thoughts?
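
For reference, this is how the suggestion behaves with the standard library's `urljoin`. The URLs are made up, and the sketch assumes plain string values with no template interpolation; `join_url` is a hypothetical helper name.

```python
from urllib.parse import urljoin


def join_url(url_base: str, path: str) -> str:
    """Join base and path with exactly one slash between them."""
    base = url_base.rstrip("/") + "/"
    return urljoin(base, path.lstrip("/"))


with_leading_slash = join_url("https://api.example.com/v1", "/users")
with_trailing_slash = join_url("https://api.example.com/v1/", "users")

# Without the trailing slash on the base, urljoin replaces the last path segment.
naive = urljoin("https://api.example.com/v1", "users")

print(with_leading_slash)
print(with_trailing_slash)
print(naive)
```

The last call shows why the base still needs a trailing slash: `urljoin` alone would drop `v1` from the result.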

airbyte_cdk/manifest_migrations/migrations/registry.yaml (1)

7-7: Tiny YAML nit – trailing space breaks some linters

Line 7 has a trailing space after 6.45.2.
Would removing it save us from the YAML‑lint warning?

-  - version: 6.45.2 
+  - version: 6.45.2
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 7-7: trailing spaces

(trailing-spaces)

airbyte_cdk/manifest_migrations/README.md (2)

21-28: Refine description wording?
Would you consider replacing "A short description of the migration" with "A brief description of the migration" to make it more concise? wdyt?

- - `description`: A short description of the migration
+ - `description`: A brief description of the migration
🧰 Tools
🪛 LanguageTool

[style] ~27-~27: Consider using the synonym “brief” (= concise, using a few words, not lasting long) to strengthen your wording.
Context: ...for the version - description: A short description of the migration Exampl...

(QUICK_BRIEF)


40-43: Optional: clarify test location.
Would you add the path to the test directory (e.g., unit_tests/sources/declarative/migrations) to guide new contributors on where to place their tests? wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0308817 and 38b7362.

📒 Files selected for processing (13)
  • airbyte_cdk/manifest_migrations/README.md (1 hunks)
  • airbyte_cdk/manifest_migrations/__init__.py (1 hunks)
  • airbyte_cdk/manifest_migrations/exceptions.py (1 hunks)
  • airbyte_cdk/manifest_migrations/manifest_migration.py (1 hunks)
  • airbyte_cdk/manifest_migrations/migration_handler.py (1 hunks)
  • airbyte_cdk/manifest_migrations/migrations/__init__.py (1 hunks)
  • airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url.py (1 hunks)
  • airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body.py (1 hunks)
  • airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url.py (1 hunks)
  • airbyte_cdk/manifest_migrations/migrations/registry.yaml (1 hunks)
  • airbyte_cdk/manifest_migrations/migrations_registry.py (1 hunks)
  • unit_tests/manifest_migrations/conftest.py (1 hunks)
  • unit_tests/manifest_migrations/test_manifest_migration.py (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • airbyte_cdk/manifest_migrations/migrations/__init__.py
  • airbyte_cdk/manifest_migrations/__init__.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • airbyte_cdk/manifest_migrations/exceptions.py
  • unit_tests/manifest_migrations/test_manifest_migration.py
  • unit_tests/manifest_migrations/conftest.py
🧰 Additional context used
🧠 Learnings (1)
airbyte_cdk/manifest_migrations/README.md (1)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
🧬 Code Graph Analysis (4)
airbyte_cdk/manifest_migrations/manifest_migration.py (3)
airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url.py (3)
  • should_migrate (24-27)
  • migrate (29-31)
  • validate (33-41)
airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url.py (3)
  • should_migrate (27-30)
  • migrate (32-47)
  • validate (49-57)
airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body.py (3)
  • should_migrate (24-27)
  • migrate (29-33)
  • validate (35-41)
airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url.py (3)
airbyte_cdk/manifest_migrations/manifest_migration.py (4)
  • ManifestMigration (38-134)
  • should_migrate (50-57)
  • migrate (60-65)
  • validate (68-73)
airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url.py (3)
  • should_migrate (24-27)
  • migrate (29-31)
  • validate (33-41)
airbyte_cdk/sources/types.py (1)
  • keys (137-138)
airbyte_cdk/manifest_migrations/migrations_registry.py (1)
airbyte_cdk/manifest_migrations/manifest_migration.py (1)
  • ManifestMigration (38-134)
airbyte_cdk/manifest_migrations/migration_handler.py (2)
airbyte_cdk/manifest_migrations/exceptions.py (1)
  • ManifestMigrationException (6-12)
airbyte_cdk/manifest_migrations/manifest_migration.py (4)
  • ManifestMigration (38-134)
  • MigrationTrace (22-35)
  • _process_manifest (94-134)
  • as_dict (34-35)
🪛 GitHub Actions: Linters
airbyte_cdk/manifest_migrations/manifest_migration.py

[error] 34-34: mypy: Missing type parameters for generic type "dict" [type-arg]

airbyte_cdk/manifest_migrations/migrations_registry.py

[error] 24-24: mypy: Missing return statement [return]


[error] 39-39: mypy: Function is missing a type annotation for one or more arguments [no-untyped-def]


[error] 56-56: mypy: Need type annotation for "migrations" (hint: "migrations: dict[<type>, <type>] = ...") [var-annotated]

airbyte_cdk/manifest_migrations/migration_handler.py

[error] 89-89: mypy: "ManifestMigration" has no attribute "name" [attr-defined]

🪛 LanguageTool
airbyte_cdk/manifest_migrations/README.md

[style] ~27-~27: Consider using the synonym “brief” (= concise, using a few words, not lasting long) to strengthen your wording.
Context: ...for the version - description: A short description of the migration Exampl...

(QUICK_BRIEF)

🪛 YAMLlint (1.35.1)
airbyte_cdk/manifest_migrations/migrations/registry.yaml

[error] 7-7: trailing spaces

(trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (8)
airbyte_cdk/manifest_migrations/manifest_migration.py (1)

75-93: Guard against missing type key to avoid KeyError?

_is_migratable_type indexes obj[TYPE_TAG], assuming the key is present.
Although callers currently check _is_component first, a direct external call could bypass that and raise.

Would adding a defensive TYPE_TAG in obj check make the helper safer, or do you consider the current contract sufficient?
Just raising the question for clarity. 🙂

airbyte_cdk/manifest_migrations/README.md (7)

1-4: Clear introduction.
This section concisely explains the purpose of the directory. Nice!


5-12: Creating a new migration.
Instructions here are clear and straightforward. Looks good!


13-20: Defining the migration class.
Good description of the required methods to implement.


29-38: Registry YAML example.
Example is well-structured and follows expected formatting.


44-49: Migration discovery.
This section succinctly explains how migrations are discovered and when to avoid manual registry edits.


50-69: Example migration skeleton.
The skeleton is clear, and the import path matches the actual module location. Great!


71-73: Further references.
Linking to docstrings and examples is helpful for readers.

@bazarnov bazarnov requested a review from maxi297 April 23, 2025 16:28
Contributor

@maxi297 maxi297 left a comment

I'm good with this one. Just a nit comment regarding updating the version.

Also nit: I guess I'm not too worried about this because we shouldn't maintain the migrations, but maybe for the next ones we should have tests to document the cases it supports.

Contributor

@bnchrch bnchrch left a comment

If Maxime is happy, I'm happy :)
