Feature description
As far as I can tell, the replace write_disposition only supports replacing an entire table. I would like to be able to replace a single partition within a table.
I can do this with a merge disposition and a merge key, but it does a lot more than I want or need: it creates temp tables, stages data, and so on, when I simply want to delete and re-insert data. I am aware this means there is a non-zero period of time during which the data is not available.
Are you a dlt user?
Yes, I'm already a dlt user.
Use case
I can use the merge write_disposition, but it does a bunch of stuff I don't really need and can be quite slow and expensive (when running on Snowflake, anyway). I looked at using upsert, but as far as I can tell it doesn't delete existing entries that no longer exist in the source.
Proposed solution
Example:
Event

| id | type | match_id |
|----|------|----------|
| 1  | pass | 123      |
| 2  | shot | 123      |
| 3  | pass | 456      |
Replacing everything in Event with match_id 123 would delete events 1 and 2 and re-add them, but the event with id 3 would remain untouched.
@jenkoian It depends a bit on what kind of data is incoming. Let's say you are running the pipeline and you know the incoming data is only for one specific partition: you can simply drop that partition in the destination before the load and then do an append load to that table. You have access to the sql_client of the destination via the pipeline. Which destination are you using? In most cases, dropping a partition is as simple as running "DELETE FROM table_name WHERE partition=value".
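To make the delete-then-append idea concrete, here is a minimal sketch that uses Python's built-in sqlite3 as a stand-in for the destination. It is not dlt code: `replace_partition` is a hypothetical helper name, and the replacement rows (including the event with id 4) are made up for illustration. With dlt, the DELETE would presumably go through the pipeline's sql_client and the re-insert would be a normal append load, as described above.

```python
import sqlite3

# In-memory database standing in for the real destination (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (id INTEGER, type TEXT, match_id INTEGER)")
conn.executemany(
    "INSERT INTO event (id, type, match_id) VALUES (?, ?, ?)",
    [(1, "pass", 123), (2, "shot", 123), (3, "pass", 456)],
)
conn.commit()


def replace_partition(conn, match_id, rows):
    """Hypothetical helper: delete one match_id partition, then append its fresh rows."""
    # "with conn" commits if both statements succeed and rolls back on error,
    # so readers of this database never see the partition half-replaced.
    with conn:
        conn.execute("DELETE FROM event WHERE match_id = ?", (match_id,))
        conn.executemany(
            "INSERT INTO event (id, type, match_id) VALUES (?, ?, ?)", rows
        )


# Fresh (made-up) extract for match 123: event 2 no longer exists in the
# source and event 4 is new. Event 3 belongs to match 456 and is not touched.
replace_partition(conn, 123, [(1, "pass", 123), (4, "goal", 123)])

result = conn.execute(
    "SELECT id, type, match_id FROM event ORDER BY id"
).fetchall()
print(result)  # [(1, 'pass', 123), (3, 'pass', 456), (4, 'goal', 123)]
```

Event 2 disappears because the whole partition is deleted before the new rows are appended, which is the behaviour upsert does not give. Whether the delete and the append can share one transaction on a real destination depends on that destination and on how the load is run.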
Related issues
I think this is similar? #1094