I still don't quite understand why partition_by is applied to __dbt_tmp (the source table) instead of the model I am actually creating (i.e. the destination table).
I stumbled on the same issue, and it seems that partition_by is not applied to the destination table at all.
@kenho811, did you get partitioning on the destination at all or only on __dbt_tmp?
If @kenho811 can confirm that partitioning is applied only to __dbt_tmp and not to the destination table at all, I'd call it a bug rather than an enhancement: the documentation clearly states that partitioning should be possible on an incremental model, and as things stand it would not be possible to create partitioned incremental models at all.
Describe the enhancement requested
I observe that when running an incremental model,
1.) __dbt_tmp is created;
2.) data in __dbt_tmp is inserted into destination table.
This all works well until I add a partition_by to the model.
What I want is partitioning on the destination table. However, during an incremental load, __dbt_tmp is also partitioned by the same column.
This has the undesirable effect of splitting my incremental data in __dbt_tmp into many partitions (because of partition_by), so loading the data from __dbt_tmp into the destination table takes a long time.
For the incremental load, I only load the last day's data, and I want the destination table to be partitioned on another key.
Is this possible?
=========
Example
I have the below model's config
dbt-dremio created the below __dbt_tmp
I believe this is unnecessary?
Should partition_by be applied only to the destination table (and not to __dbt_tmp)?
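For illustration only, here is a rough sketch of the two steps described above. It is not the SQL that dbt-dremio actually generates, and every name in it (my_space.my_model, my_space.source_events, customer_id, event_date, the date literal) is hypothetical. It assumes a model configured roughly as materialized='incremental' with partition_by=['customer_id'], and uses Dremio's CREATE TABLE ... PARTITION BY ... AS SELECT form.

```sql
-- Assumed (hypothetical) model config:
--   {{ config(materialized='incremental', partition_by=['customer_id']) }}

-- Step 1, as observed: the temporary table is created with the same
-- partitioning as the model, so one day of data is split across many
-- customer_id partitions.
CREATE TABLE my_space.my_model__dbt_tmp
PARTITION BY (customer_id)
AS
SELECT *
FROM my_space.source_events
WHERE event_date = DATE '2024-01-01';  -- placeholder for "last day's data"

-- Step 2: the incremental rows are copied into the destination table.
INSERT INTO my_space.my_model
SELECT *
FROM my_space.my_model__dbt_tmp;

-- Behavior requested in this issue: step 1 without PARTITION BY, leaving
-- partitioning to the destination table only.
-- CREATE TABLE my_space.my_model__dbt_tmp AS
-- SELECT * FROM my_space.source_events
-- WHERE event_date = DATE '2024-01-01';
```

If the sketch matches what the adapter does, dropping the PARTITION BY clause from the temporary table would keep the small incremental batch unpartitioned, while the destination table keeps whatever partitioning it was created with.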