-
Notifications
You must be signed in to change notification settings - Fork 4.8k
HIVE-28956 : Use msdb.alterPartitions() API and implement batching for alter table add column cascade command #5814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
eb1c3b1
to
db2dd7b
Compare
db2dd7b
to
c754743
Compare
Table finalOldt = oldt; | ||
int partitionBatchSize = MetastoreConf.getIntVar(handler.getConf(), | ||
MetastoreConf.ConfVars.BATCH_RETRIEVE_MAX); | ||
Batchable.runBatched(partitionBatchSize, parts, new Batchable<Partition, Void>() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR naming is confusing. Where is the direct sql here? This PR introduces batching for partition metadata update, or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previously used msdb.alterPartition was only using JDO. The newly added msdb.alterPartitions API is based on DirectSql.
So this PR is introducing Batching and also using direct sql based msdb.alterPartitions API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the direct sql here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@deniskuzZ , PR naming is confusing, i will update the same from my end. Thanks for pointing it out in your earlier messages.
tableName, part.getValues(), oldCols, finalOldt, part, null, null); | ||
assert (colStats.isEmpty()); | ||
oldPart.setParameters(part.getParameters()); | ||
oldParts.add(oldPart); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you constructing oldParts
here if it's only used when cascade
is false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed it
926a4c7
to
2a5cc01
Compare
|
@deniskuzZ , i have made the necessary changes. Could you please check the PR once again? |
What changes were proposed in this pull request?
Currently Alter table add column cascade uses msdb.alterPartition API. This API only JDO based implementation. Instead of this use msdb.alterPartitions which supports both DirectSql and JDO Implementation and use batching as well.
Why are the changes needed?
DirectSql bein more optimized than JDO can improve performance for tables with very high number of partitions and very high number of columns.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Tested this on a 3 node cluster on a table with 5000+ partitions and 900+ columns.