[Bug]: Table output - bad performance in transaction-based pipelines #4680

dave-csc · 2024-12-09T10:15:12Z

Apache Hop version?

2.10.0

Java version?

17.0.2

Operating system

Linux

What happened?

Following discussion #4678, here's a use case:

Add a transform that loads a few thousand rows (e.g. a Table input reading from a database, or a Text file input reading from a text file).
Link this transform to a Table output, and set it to truncate the table unconditionally (not only when a row comes in).

Start the pipeline in the default local run configuration, without transactions enabled: the execution time is reasonable.

Start the pipeline in a run configuration that has transactions enabled: the pipeline works as expected and is correctly rolled back in case of errors, but the execution time gets much worse (up to 60 times slower than in the previous case).

For a data sample, you can download the GeoLite2 City CSV database from MaxMind (you need to sign up with MaxMind; the GeoLite2 database is free of charge).
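To put a number on the slowdown, the two runs described above could be timed from the command line. The following is only a sketch under stated assumptions: it assumes `hop-run.sh` is in the current directory and is called with its `--file` and `--runconfig` options, and that the pipeline file `table-output-test.hpl` and the run configurations `local` and `local-transactional` exist. Those three names are hypothetical placeholders, not taken from this report. Project and environment options are left out and may be needed in a real setup.

```python
import subprocess
import time


def hop_run_command(pipeline_file, run_config, hop_run="./hop-run.sh"):
    """Build the hop-run command line for one pipeline and run configuration."""
    return [hop_run, "--file", pipeline_file, "--runconfig", run_config]


def timed_run(command):
    """Run a command to completion and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


def slowdown(baseline_seconds, candidate_seconds):
    """Return how many times slower the candidate run is than the baseline."""
    if baseline_seconds <= 0:
        raise ValueError("baseline duration must be positive")
    return candidate_seconds / baseline_seconds


if __name__ == "__main__":
    # Hypothetical names: adjust to the actual pipeline and run configurations.
    pipeline = "table-output-test.hpl"
    plain = timed_run(hop_run_command(pipeline, "local"))
    transactional = timed_run(hop_run_command(pipeline, "local-transactional"))
    print(f"without transactions: {plain:.1f} s")
    print(f"with transactions:    {transactional:.1f} s")
    print(f"slowdown factor:      {slowdown(plain, transactional):.1f}x")
```

Each run includes JVM startup time, so with very small inputs the ratio will understate the difference seen inside the pipeline itself.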

Issue Priority

Priority: 2

Issue Component

Component: Transforms

dave-csc added the "awaiting triage" and "bug" labels on Dec 9, 2024

github-actions bot added the "P2 Default Priority" and "Transforms" labels on Dec 9, 2024
