IBM watsonx.data destination connector: improve performance with partitioning, metadata cleanup, and increased retries #606

Paul-Cornell · 2025-04-23T15:34:26Z

See the updated CREATE TABLE statement, the new Python script, and the additions to Max Connection Retries and Max Retries at https://unstructured-53-ibm-watsonxdata-2025-04-22.mintlify.app/ui/destinations/ibm-watsonxdata

mpolomdeepsense · 2025-04-24T12:02:04Z

snippets/general-shared-text/ibm-watsonxdata.mdx

         "id" varchar,
         "record_id" varchar,
         "parent_id" varchar
      )
      WITH (
         delete_mode = 'copy-on-write',
         format = 'PARQUET',
-         format_version = '2'
+         format_version = '2',
+         partitioning = ARRAY['record_id']


@potter-potter We decided not to use partitioning, right? If so, we can probably remove this line.

Still, I think we should include a short section about partitioning. It could simply say what partitioning is used for and that it can slightly improve performance (3-4 sentences). It should also link to the docs about partitioning and to the Presto docs.

@potter-potter bump
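
For reference while deciding, here is a minimal sketch of the two variants of the `WITH` clause under discussion, one with and one without the `partitioning` property. The property names and values are copied from the diff above; the helper `iceberg_with_clause` and its `partition_columns` parameter are hypothetical and are not part of the connector or of the Python script in this PR.

```python
def iceberg_with_clause(partition_columns=None):
    """Render the WITH (...) table properties from the diff as Presto SQL text."""
    # Property names and values come from the diff; this helper is illustrative only.
    properties = [
        "delete_mode = 'copy-on-write'",
        "format = 'PARQUET'",
        "format_version = '2'",
    ]
    # Append the partitioning property only when partition columns are given.
    if partition_columns:
        quoted = ", ".join(f"'{col}'" for col in partition_columns)
        properties.append(f"partitioning = ARRAY[{quoted}]")
    body = ",\n".join(f"   {prop}" for prop in properties)
    return f"WITH (\n{body}\n)"


# Variant without partitioning (the line would be removed from the docs).
print(iceberg_with_clause())
# Variant with partitioning on record_id (as currently proposed in the diff).
print(iceberg_with_clause(["record_id"]))
```

The only difference between the two outputs is the trailing comma after `format_version = '2'` and the added `partitioning = ARRAY['record_id']` line, which matches the change shown in the diff.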

mpolomdeepsense · 2025-04-24T12:04:29Z

snippets/general-shared-text/ibm-watsonxdata.mdx

         format = 'PARQUET',
-         format_version = '2'
+         format_version = '2',
+         partitioning = ARRAY['record_id']


We should probably state somewhere that this SQL command uses Presto SQL syntax.
https://prestodb.io/docs/current/connector/iceberg.html

IBM watsonx.data destination connector: improve performance with tabl…

3a473ed

…e partitioning, periodic metadata cleanup, and increased connection retries

mintlify bot deployed to staging April 23, 2025 15:37

Added Python script and retry settings

c1778a2

Paul-Cornell requested review from MKhalusova, mpolomdeepsense, tarunnv-uio and ajay23-uns April 23, 2025 20:16

Paul-Cornell marked this pull request as ready for review April 23, 2025 20:16

mintlify bot deployed to staging April 23, 2025 20:16

mpolomdeepsense reviewed Apr 24, 2025

Paul-Cornell requested a review from potter-potter April 24, 2025 15:46
