Skip to content

Support for S3 catalog to work with S3 Tables #1404

Open
@nicor88

Description

@nicor88

Feature Request / Improvement

Amazon S3 tables have being launched, see this, and looks like that S3 tables have a managed iceberg catalog.

Based on https://github.com/awslabs/s3-tables-catalog it looks like that AWS build an S3 catalog wrapper using java, that can be used by query engines like Spark/Trino.
It will be relevant to write to S3 tables via pyiceberg.

More context

Based on my understanding, once an S3 table is created, iceberg metadata are not initialized.
For a freshly created table, it's possible to retrieve the warehouseLocation -> see get_table.
The warehouseLocation looks like a unique S3 bucket, where you can put S3 objects in it.
After putting the S3 objects of an iceberg commit operation: data+metadata, it's possible to use update_table_metadata_location to point the S3 table to the right location.

Note: I'm not 100% sure on the above - and I need to validate it via some tests.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions