-
Notifications
You must be signed in to change notification settings - Fork 17
feat(cdk): connector builder support for file uploader #503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughThis update introduces a modular file uploading architecture by defining new abstract base classes ( Changes
Sequence Diagram(s)sequenceDiagram
participant ModelToComponentFactory
participant DefaultFileUploader
participant ConnectorBuilderFileUploader
participant FileWriter
participant Record
ModelToComponentFactory->>DefaultFileUploader: Instantiate with FileWriter
ModelToComponentFactory->>ConnectorBuilderFileUploader: (if emit_connector_builder_messages) Wrap DefaultFileUploader
Record->>DefaultFileUploader: upload(record)
DefaultFileUploader->>FileWriter: write(file_path, content)
FileWriter-->>DefaultFileUploader: file_size
DefaultFileUploader-->>Record: Update file_reference
ConnectorBuilderFileUploader->>DefaultFileUploader: upload(record)
ConnectorBuilderFileUploader->>Record: Copy file_reference fields into data
Would you like to see a more detailed breakdown of the new test logic or perhaps a diagram focusing on the decorator pattern used for the connector builder uploader? Wdyt? 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms (9)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (6)
airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py (1)
10-15
: Implementation looks good!The
NoopFileWriter
correctly implements the abstract method fromBaseFileWriter
and returns 0 to indicate no bytes were written, which aligns with its purpose to avoid actual file operations.Consider enhancing the method docstring to explain behavior rather than class purpose:
def write(self, file_path: Path, content: bytes) -> int: """ - Noop file writer + No-operation implementation that returns 0 bytes written without + writing any content to the file system. """ return 0airbyte_cdk/sources/declarative/retrievers/file_uploader/file_writer.py (1)
10-18
: Implementation looks good but lacks error handlingThe
FileWriter
correctly implements writing bytes to a file and returns the size. However, it doesn't handle potential file I/O exceptions that might occur (disk full, permission issues, etc).Would adding some basic error handling provide better diagnostics, wdyt?
def write(self, file_path: Path, content: bytes) -> int: """ Writes the file to the specified location """ - with open(str(file_path), "wb") as f: - f.write(content) - - return file_path.stat().st_size + try: + with open(str(file_path), "wb") as f: + f.write(content) + + return file_path.stat().st_size + except (IOError, OSError) as e: + logger.error(f"Failed to write file to {file_path}: {str(e)}") + raiseairbyte_cdk/sources/declarative/retrievers/file_uploader/__init__.py (1)
1-15
: Good module exportsThe package initialization file correctly exports all relevant classes. One minor suggestion - would it make sense to organize the
__all__
list in the same order as the imports for consistency and readability, wdyt?__all__ = [ - "FileUploader", - "FileWriter", - "NoopFileWriter", - "ConnectorBuilderFileUploader", - "BaseFileUploader", - "BaseFileWriter", + "BaseFileUploader", + "BaseFileWriter", + "ConnectorBuilderFileUploader", + "FileUploader", + "FileWriter", + "NoopFileWriter", ]airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_writer.py (1)
14-19
: Could we clarify the return value in the docstring?The
write
method returns an integer but doesn't specify what this integer represents. Could we update the docstring to clarify that this is the number of bytes written? This would make the contract more explicit for implementers, wdyt?def write(self, file_path: Path, content: bytes) -> int: """ Writes the file to the specified location + :return: The number of bytes written """ ...
airbyte_cdk/sources/declarative/retrievers/file_uploader/file_uploader.py (1)
90-90
: Should we add error handling around file writing?The file writing operation could fail for various reasons (disk full, permissions, etc.). Would it make sense to add some error handling here? Maybe catch and log specific exceptions or wrap them in a more descriptive custom exception, wdyt?
- file_size_bytes = self.file_writer.write(full_path, content=response.content) + try: + file_size_bytes = self.file_writer.write(full_path, content=response.content) + except Exception as e: + logger.error(f"Failed to write file to {full_path}: {e}") + raise ValueError(f"Failed to upload file: {str(e)}") from eairbyte_cdk/sources/declarative/retrievers/file_uploader/connector_builder_file_uploader.py (1)
22-26
: Could we add a docstring to the upload method and handle potential edge cases?The implementation looks good, but would you consider adding a docstring to explain the behavior more explicitly? Also, what happens if
file_reference
is None? Could we add a quick check to handle that case, wdyt?def upload(self, record: Record) -> None: + """ + Uploads a file using the wrapped file_uploader and copies file_reference attributes to record.data. + + This enables connector builder to communicate about files without writing them. + """ self.file_uploader.upload(record=record) + if record.file_reference is None: + return for file_reference_key, file_reference_value in record.file_reference.__dict__.items(): if not file_reference_key.startswith("_"): record.data[file_reference_key] = file_reference_value # type: ignore
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(3 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/__init__.py
(1 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_uploader.py
(1 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_writer.py
(1 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/connector_builder_file_uploader.py
(1 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/file_uploader.py
(2 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/file_writer.py
(1 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py
(1 hunks)airbyte_cdk/sources/declarative/yaml_declarative_source.py
(2 hunks)unit_tests/sources/declarative/file/test_file_stream.py
(5 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (7)
airbyte_cdk/sources/declarative/retrievers/file_uploader/__init__.py (6)
airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_uploader.py (1)
BaseFileUploader
(12-22)airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_writer.py (1)
BaseFileWriter
(9-19)airbyte_cdk/sources/declarative/retrievers/file_uploader/connector_builder_file_uploader.py (1)
ConnectorBuilderFileUploader
(14-26)airbyte_cdk/sources/declarative/retrievers/file_uploader/file_uploader.py (1)
FileUploader
(32-101)airbyte_cdk/sources/declarative/retrievers/file_uploader/file_writer.py (1)
FileWriter
(10-18)airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py (1)
NoopFileWriter
(10-15)
airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py (2)
airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_writer.py (2)
BaseFileWriter
(9-19)write
(15-19)airbyte_cdk/sources/declarative/retrievers/file_uploader/file_writer.py (1)
write
(11-18)
airbyte_cdk/sources/declarative/retrievers/file_uploader/file_writer.py (2)
airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_writer.py (2)
BaseFileWriter
(9-19)write
(15-19)airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py (1)
write
(11-15)
airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_uploader.py (3)
airbyte_cdk/sources/types.py (1)
Record
(21-72)airbyte_cdk/sources/declarative/retrievers/file_uploader/connector_builder_file_uploader.py (1)
upload
(22-26)airbyte_cdk/sources/declarative/retrievers/file_uploader/file_uploader.py (1)
upload
(55-101)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6)
airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_uploader.py (1)
BaseFileUploader
(12-22)airbyte_cdk/sources/declarative/retrievers/file_uploader/connector_builder_file_uploader.py (1)
ConnectorBuilderFileUploader
(14-26)airbyte_cdk/sources/declarative/retrievers/file_uploader/file_uploader.py (1)
FileUploader
(32-101)airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
FileUploader
(2069-2091)airbyte_cdk/sources/declarative/retrievers/file_uploader/file_writer.py (1)
FileWriter
(10-18)airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py (1)
NoopFileWriter
(10-15)
airbyte_cdk/sources/declarative/retrievers/file_uploader/connector_builder_file_uploader.py (3)
airbyte_cdk/sources/types.py (2)
Record
(21-72)data
(35-36)airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_uploader.py (2)
BaseFileUploader
(12-22)upload
(18-22)airbyte_cdk/sources/declarative/retrievers/file_uploader/file_uploader.py (2)
FileUploader
(32-101)upload
(55-101)
airbyte_cdk/sources/declarative/retrievers/file_uploader/file_uploader.py (5)
airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_uploader.py (1)
BaseFileUploader
(12-22)airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_writer.py (2)
BaseFileWriter
(9-19)write
(15-19)airbyte_cdk/sources/declarative/requesters/requester.py (1)
Requester
(29-144)airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py (1)
write
(11-15)airbyte_cdk/sources/declarative/retrievers/file_uploader/file_writer.py (1)
write
(11-18)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Analyze (python)
🔇 Additional comments (10)
airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_uploader.py (1)
11-22
: Solid abstract base class definitionThe
BaseFileUploader
abstract base class is well-defined with a clear contract for subclasses. The@dataclass
annotation together with the abstract method provides a good foundation for file uploader implementations.airbyte_cdk/sources/declarative/yaml_declarative_source.py (1)
27-27
: LGTM! Clean parameter passing implementationThe parameter addition and passing to the superclass looks good. The
or False
fallback ensures we always have a boolean value even ifNone
is provided.Also applies to: 40-40
airbyte_cdk/sources/declarative/retrievers/file_uploader/file_uploader.py (2)
25-37
: LGTM! Clean abstraction implementationThe inheritance from
BaseFileUploader
and the comprehensive docstring explaining the class's purpose and behavior look great.
42-42
: LGTM! Proper dependency injectionUsing the abstract
BaseFileWriter
type for dependency injection follows good design principles, allowing for different file writing strategies to be injected.airbyte_cdk/sources/declarative/retrievers/file_uploader/connector_builder_file_uploader.py (1)
13-20
: LGTM! Well-designed decorator pattern implementationThe class follows the decorator pattern well, wrapping a
FileUploader
instance and extending its behavior without modifying it. The docstring clearly explains its purpose.airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)
484-490
: Import statement expanded to support file upload abstractionThese import changes are part of the new file upload abstraction layer. Nice refactoring to support the connector builder mode with specialized classes for different responsibilities.
3601-3601
: Return type updated to use abstraction instead of concrete implementationChanging the return type from
FileUploader
toBaseFileUploader
follows good OOP principles by programming to an interface. This allows different implementations to be returned based on the context, which is exactly what you're doing here. Nice job!
3615-3629
: Conditional file upload behavior based on connector builder modeThe implementation properly branches based on whether connector builder messages are enabled:
- Uses
NoopFileWriter
to prevent actual file writing in connector builder mode- Wraps the
FileUploader
withConnectorBuilderFileUploader
to copy file reference metadata into record dataThis looks like a clean implementation of the feature described in the PR objectives. The connector builder can now communicate about files without writing them.
unit_tests/sources/declarative/file/test_file_stream.py (2)
37-37
: Added emit_connector_builder_messages parameter to helper functionsThe test helper functions have been updated to propagate the new
emit_connector_builder_messages
parameter through to theYamlDeclarativeSource
constructor. This is essential for testing both modes of operation.Also applies to: 46-46, 56-57, 61-65
200-243
: Added test for connector builder file upload functionalityExcellent test coverage for the new connector builder file upload functionality. The test:
- Verifies file references are created without actual file writing (
file_size_bytes == 0
)- Confirms file reference attributes are properly copied into the record data
- Tests the complete flow with mocked HTTP requests
This test aligns perfectly with the PR objectives and provides good verification of the feature.
airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py
Outdated
Show resolved
Hide resolved
…file_writer.py Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
/autofix
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a couple of comments but overall it looks like we are on track
@@ -24,6 +24,7 @@ def __init__( | |||
catalog: Optional[ConfiguredAirbyteCatalog] = None, | |||
config: Optional[Mapping[str, Any]] = None, | |||
state: Optional[List[AirbyteStateMessage]] = None, | |||
emit_connector_builder_messages: Optional[bool] = False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting. I guess this isn't needed yet (until we do #460) but we wanted to preemptively set this up, is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just needed a better patching, so there's no need to make any changes for now in this class. Removed this.
from pathlib import Path | ||
|
||
|
||
class BaseFileWriter(ABC): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Base
does not seem in our vocabulary for interfaces. Could we have something like this?
- Parent: FileWriter
- LocalFileSystemFileWriter
- NoopFileWriter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, i have fixed names to match our vocabulary.
airbyte_cdk/sources/declarative/retrievers/file_uploader/file_uploader.py
Outdated
Show resolved
Hide resolved
airbyte_cdk/sources/declarative/retrievers/file_uploader/base_file_uploader.py
Outdated
Show resolved
Hide resolved
airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py
Outdated
Show resolved
Hide resolved
/autofix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
airbyte_cdk/sources/declarative/retrievers/file_uploader/default_file_uploader.py (1)
1-1
: 💡 Verification agent🧩 Analysis chain
Formatting issue flagged by linter
The pipeline shows a Ruff formatting error. Would you mind running the formatter to fix this?
🏁 Script executed:
#!/bin/bash # Description: Show what formatting issues Ruff is detecting ruff format --diff airbyte_cdk/sources/declarative/retrievers/file_uploader/default_file_uploader.pyLength of output: 858
Apply Ruff formatting to default_file_uploader.py
It looks like Ruff wants to split that long
NotImplementedError
call into multiple lines. Could you apply the formatter to fix this? wdyt?
- File:
airbyte_cdk/sources/declarative/retrievers/file_uploader/default_file_uploader.py
at line 72Suggested diff:
@@ -72,7 +72,9 @@ if self.content_extractor: - raise NotImplementedError("Content extraction is not yet implemented. The content_extractor component is currently not supported.") + raise NotImplementedError( + "Content extraction is not yet implemented. The content_extractor component is currently not supported." + )Maybe run:
ruff format --fix airbyte_cdk/sources/declarative/retrievers/file_uploader/default_file_uploader.pyto apply these changes?
🧰 Tools
🪛 GitHub Actions: Linters
[error] 1-1: Ruff formatting check failed. File would be reformatted.
🧹 Nitpick comments (3)
airbyte_cdk/sources/declarative/retrievers/file_uploader/local_file_system_file_writer.py (1)
10-18
: Implementation looks good, but should we add error handling?The file writing and size calculation logic is correct, but I'm wondering if we should add some error handling for common file operations issues? If a file can't be written due to permissions or disk space issues, this will propagate exceptions directly to the caller - is that intentional behavior?
The
stat()
call could also fail if something happens between writing and checking the size. What do you think about adding a try/except block, wdyt?def write(self, file_path: Path, content: bytes) -> int: """ Writes the file to the specified location """ - with open(str(file_path), "wb") as f: - f.write(content) - - return file_path.stat().st_size + try: + with open(str(file_path), "wb") as f: + f.write(content) + + return file_path.stat().st_size + except (IOError, OSError) as e: + # Log the error with more context + logging.error(f"Failed to write file {file_path}: {str(e)}") + raiseairbyte_cdk/sources/declarative/retrievers/file_uploader/default_file_uploader.py (2)
32-37
: Great job on splitting file uploading from file writing!I like the refactoring that introduces the abstract base class hierarchy. The docstring explains the basic responsibility well, but could we maybe expand it to clarify how this fits into the connector builder pattern? Perhaps mention when you'd use different writer implementations? This would help future devs understand the design choices better, wdyt?
75-75
: Error message is clear, but what about alternatives?The error message is improved, but could we perhaps add a hint about what approaches users should take instead of using content extraction? Something like "Please use X approach instead" would be super helpful for developers hitting this error, wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
airbyte_cdk/sources/declarative/extractors/record_selector.py
(2 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(4 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/__init__.py
(1 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/connector_builder_file_uploader.py
(1 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/default_file_uploader.py
(3 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/file_uploader.py
(1 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/file_writer.py
(1 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/local_file_system_file_writer.py
(1 hunks)airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py
(1 hunks)unit_tests/sources/declarative/file/test_file_stream.py
(5 hunks)
✅ Files skipped from review due to trivial changes (2)
- airbyte_cdk/sources/declarative/extractors/record_selector.py
- airbyte_cdk/sources/declarative/retrievers/file_uploader/init.py
🚧 Files skipped from review as they are similar to previous changes (3)
- airbyte_cdk/sources/declarative/retrievers/file_uploader/connector_builder_file_uploader.py
- unit_tests/sources/declarative/file/test_file_stream.py
- airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
🧰 Additional context used
🧬 Code Graph Analysis (4)
airbyte_cdk/sources/declarative/retrievers/file_uploader/file_uploader.py (3)
airbyte_cdk/sources/types.py (1)
Record
(21-72)airbyte_cdk/sources/declarative/retrievers/file_uploader/connector_builder_file_uploader.py (1)
upload
(22-26)airbyte_cdk/sources/declarative/retrievers/file_uploader/default_file_uploader.py (1)
upload
(55-101)
airbyte_cdk/sources/declarative/retrievers/file_uploader/local_file_system_file_writer.py (2)
airbyte_cdk/sources/declarative/retrievers/file_uploader/file_writer.py (2)
FileWriter
(9-19)write
(15-19)airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py (1)
write
(12-16)
airbyte_cdk/sources/declarative/retrievers/file_uploader/file_writer.py (2)
airbyte_cdk/sources/declarative/retrievers/file_uploader/local_file_system_file_writer.py (1)
write
(11-18)airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py (1)
write
(12-16)
airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py (2)
airbyte_cdk/sources/declarative/retrievers/file_uploader/file_writer.py (2)
FileWriter
(9-19)write
(15-19)airbyte_cdk/sources/declarative/retrievers/file_uploader/local_file_system_file_writer.py (1)
write
(11-18)
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py
[error] 1-1: Ruff formatting check failed. File would be reformatted.
airbyte_cdk/sources/declarative/retrievers/file_uploader/default_file_uploader.py
[error] 1-1: Ruff formatting check failed. File would be reformatted.
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Analyze (python)
🔇 Additional comments (6)
airbyte_cdk/sources/declarative/retrievers/file_uploader/noop_file_writer.py (2)
5-7
: Looks good, but should we be more explicit with the import?The import from
.file_writer
is correct, but a previous reviewer suggested importing directly from.base_file_writer
instead. Since we appear to have moved on with this approach, I'm fine with it, but want to ensure this was intentional.
10-12
: Good call on setting NOOP_FILE_SIZE to -1Setting this to -1 makes it very clear in the ConnectorBuilder UI that this is a test/placeholder value rather than a real file size of 0 bytes. This addresses a previous reviewer's question about making it obvious this is a Connector Builder test.
airbyte_cdk/sources/declarative/retrievers/file_uploader/file_writer.py (1)
9-19
: Clean abstract interface definitionThe abstract base class is well-defined with an appropriate method signature and clear docstring. The return type of
int
to represent the file size in bytes makes sense and aligns with both implementations.airbyte_cdk/sources/declarative/retrievers/file_uploader/file_uploader.py (1)
11-22
: Well-designed abstract interface with @DataClassThe abstract class definition with the dataclass decorator looks good. This will allow concrete implementations to easily define and initialize their properties.
Your method signature for upload() correctly handles a Record object and updates its file_reference property (based on the implementation in the concrete classes I can see in the snippets).
airbyte_cdk/sources/declarative/retrievers/file_uploader/default_file_uploader.py (2)
42-42
: Looks good - type-safe dependency injectionNice implementation of dependency injection for the file writer. This abstraction will make testing much easier!
90-90
: Clean delegation to file writer - nice!The implementation correctly delegates to the injected file writer. This makes the code more flexible and testable.
What
We want to implement support for the connector builder to communicate about the files that will be uploaded for the streams.
Resolves https://github.com/airbytehq/airbyte-internal-issues/issues/12424
How
Current Implementation:
emit_connector_builder_messages=False
):FileUploader
is instantiated with a standardFileWriter
, which performs the actual file writing.emit_connector_builder_messages=True
):FileUploader
is instantiated with aNoopFileWriter
, preventing actual file writes.FileUploader
instance is then wrapped byConnectorBuilderFileUploader
.ConnectorBuilderFileUploader.upload()
first calls the wrappedFileUploader.upload()
(which populatesrecord.file_reference
but doesn't write the file) and then performs the Connector Builder-specific action of copyingfile_reference
attributes intorecord.data
.Summary by CodeRabbit
New Features
Bug Fixes