17 changes: 12 additions & 5 deletions velox/dwio/parquet/writer/Writer.cpp
@@ -229,7 +229,18 @@ std::shared_ptr<::arrow::Field> updateFieldNameRecursive(
         arrowMapType->item_field(), *mapType.valueType());
     return newField->WithType(
         ::arrow::map(newKeyField->type(), newValueField->type()));
-  } else if (name != "") {
+  } else if (type.isDecimal()) {
+    // Parquet type is set from the column type rather than inferred from the
+    // field data.
+    auto precisionScale = getDecimalPrecisionScale(type);
+    if (!name.empty()) {
+      auto newField = field->WithName(name);
+      return newField->WithType(
+          ::arrow::decimal(precisionScale.first, precisionScale.second));
+    }
+    return field->WithType(
+        ::arrow::decimal(precisionScale.first, precisionScale.second));
+  } else if (!name.empty()) {
     return field->WithName(name);
   } else {
     return field;
@@ -437,10 +448,6 @@ dwio::common::StripeProgress getStripeProgress(
  * This method assumes each input `ColumnarBatch` have same schema.
  */
 void Writer::write(const VectorPtr& data) {
-  VELOX_USER_CHECK(
Collaborator

Is it too strong to remove this check here? As I understand it, the check fires because the data type is DECIMAL(8,2) while the schema type is DECIMAL(6,2)? Other types may not have this problem. Perhaps DECIMAL needs a separate check that verifies we allow a larger precision?

Collaborator Author

The check is already done in the Java code before the data reaches this point, so I believe it is safe to remove it here.

Collaborator

We should validate in Velox as well, since it is a general-purpose library.
For decimals, it should be enough to check that each decimal value in the data fits in the table's decimal type.
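
To make the suggestion above concrete, here is a minimal standalone sketch of such a check. It is not Velox code: `powerOfTen` and `decimalValuesFit` are hypothetical helpers, the values are assumed to be unscaled short decimals stored as `int64_t` (precision up to 18), and the sketch assumes the data scale must match the table scale exactly. The real check would need to read the values from the input vector and also cover long decimals.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Velox API): returns 10^precision.
// Only valid for 0 <= precision <= 18 so the result fits in int64_t.
inline int64_t powerOfTen(int precision) {
  int64_t result = 1;
  for (int i = 0; i < precision; ++i) {
    result *= 10;
  }
  return result;
}

// Hypothetical helper (not a Velox API): returns true if every unscaled value
// can be stored as DECIMAL(targetPrecision, targetScale).
// Assumption: scales must match; only the precision is allowed to differ.
inline bool decimalValuesFit(
    const std::vector<int64_t>& unscaledValues,
    int dataScale,
    int targetPrecision,
    int targetScale) {
  if (dataScale != targetScale) {
    return false;
  }
  if (targetPrecision < 1 || targetPrecision > 18) {
    return false; // Outside the short-decimal range this sketch handles.
  }
  // A DECIMAL(p, s) holds unscaled values strictly between -10^p and 10^p.
  const int64_t bound = powerOfTen(targetPrecision);
  for (int64_t value : unscaledValues) {
    if (value <= -bound || value >= bound) {
      return false;
    }
  }
  return true;
}
```

With this shape, data typed DECIMAL(8,2) would be accepted for a DECIMAL(6,2) column as long as no unscaled value reaches 1000000 in magnitude, which is the case discussed in this thread.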

-      data->type()->equivalent(*schema_),
-      "The file schema type should be equal with the input rowvector type.");
-
   VectorPtr exportData = data;
   if (needFlatten(exportData)) {
     BaseVector::flattenVector(exportData);