Iceberg compaction doesn't parallelize well across partitions #26571

@itamarwe

Description

I'm running compaction on my Iceberg table, and Trino doesn't parallelize it well: I see short spikes of high parallelism, but most of the time only a single worker is doing the work.
My table is rather skewed: a few partitions have many small files that require compaction, while most partitions are already well compacted.
I would have expected Trino to parallelize the compaction. Is it possible that it doesn't parallelize across partitions?
(I'm using a mostly default configuration, and my table is sorted.)

EDIT: I'm not sure it parallelizes across partitions at all: even though several partitions require significant compaction, I never see more than a single worker working on it.
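
One way to check whether the limit applies per statement is to issue a separate `optimize` statement for each skewed partition and submit them concurrently from separate client sessions. The sketch below only builds the statement strings. The table name, the partition column `region` (assumed to be a `varchar` column), the partition values, and the `128MB` threshold are all hypothetical, since the issue does not state the exact command or schema. It shows a diagnostic approach, not a confirmed fix.

```python
def build_optimize_statements(table, partition_column, partition_values,
                              file_size_threshold="128MB"):
    """Build one Trino `ALTER TABLE ... EXECUTE optimize` statement per partition.

    All arguments are illustrative assumptions. The partition column is
    assumed to be varchar, so each value is emitted as a quoted string literal.
    """
    statements = []
    for value in partition_values:
        # Escape single quotes so the value is a valid SQL string literal.
        escaped = str(value).replace("'", "''")
        statements.append(
            f"ALTER TABLE {table} "
            f"EXECUTE optimize(file_size_threshold => '{file_size_threshold}') "
            f"WHERE {partition_column} = '{escaped}'"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical table and partition values, for illustration only.
    for stmt in build_optimize_statements(
        "iceberg.analytics.events", "region", ["eu", "us"]
    ):
        print(stmt)
```

If the per-partition statements run in parallel on different workers while a single whole-table statement does not, that would suggest the bottleneck is in how one statement distributes work across partitions.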
