GH-45722: [C++] Add UnsafeAppend methods to StructBuilder #47522
Thanks for opening a pull request! If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format? or See also:
Could you use our PR template instead of removing it entirely? Could you fix lint failures? You can enable GitHub Actions on your fork to check CI results on your fork.
cpp/src/arrow/array/builder_nested.h
Suggested change:
- /// This method is "unsafe" because it does not update the null bitmap
+ /// This method is "unsafe" because it does not update the null bitmap.
cpp/src/arrow/array/builder_nested.h
Suggested change:
- void UnsafeAppend(){
+ void UnsafeAppend(bool is_valid = true) {
cpp/src/arrow/array/builder_nested.h
Suggested change:
- ++length_;
+ UnsafeAppendToBitmap(is_valid);
Could you use UnsafeAppend(is_valid) in Append() instead of UnsafeAppendToBitmap(is_valid)?
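The delegation requested above can be pictured with a toy builder. This is only a minimal sketch, not Arrow code: `ToyBuilder`, its `std::vector<bool>` validity storage, and the `bool` return value standing in for `Status` are all assumptions made for illustration. It shows a checked `Append()` that reserves space and then calls `UnsafeAppend(is_valid)`, so the length and null count are updated in exactly one place.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a builder; this is NOT the Arrow StructBuilder.
class ToyBuilder {
 public:
  // Make room for `additional` more slots; returns false on a bad request.
  bool Reserve(int64_t additional) {
    if (additional < 0) return false;
    validity_.reserve(validity_.size() + static_cast<std::size_t>(additional));
    return true;
  }

  // Unchecked append: the caller is expected to have called Reserve() first.
  // (In this toy, std::vector would still grow on its own; a real unsafe
  // append would rely on the earlier reservation instead.)
  void UnsafeAppend(bool is_valid = true) {
    validity_.push_back(is_valid);
    if (!is_valid) ++null_count_;
  }

  // Checked append: reserve, then delegate to the unchecked variant.
  bool Append(bool is_valid = true) {
    if (!Reserve(1)) return false;
    UnsafeAppend(is_valid);
    return true;
  }

  int64_t length() const { return static_cast<int64_t>(validity_.size()); }
  int64_t null_count() const { return null_count_; }

 private:
  std::vector<bool> validity_;
  int64_t null_count_ = 0;
};
```

With this shape, any later change to how a slot is recorded only has to be made in `UnsafeAppend()`.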
cpp/src/arrow/array/builder_nested.h
We need to append an empty value, not a null.
Ah, we may need UnsafeAppendEmptyValue() before we work on this. @pitrou What do you think about this?
cpp/src/arrow/array/builder_nested.h
Suggested change:
- length_++;
+ UnsafeAppend(false);
I'm currently working on lint issues. I will commit the changes once these issues are resolved.
Force-pushed from c629dca to a8b8faf.
@kou, the entire project and all the tests now compile successfully, and the final build log confirms that arrow-array-test.exe is created correctly. The core of the issue is that the build system doesn't seem to discover the tests in array_struct_test.cc automatically. While my specific test passes, my change to the CMakeLists.txt file seems to break the main test suite: when I run a full ctest, the main arrow-array-test target (which I believe is supposed to contain all the array tests) now fails. Here is the log from the full run:
Does
No, it's not working. All it says is
Force-pushed from 1020ebc to 5bc5f08.
OK. #47434 may be related. Could you rebase on main?
Force-pushed from f3873a6 to 9d3abe5.
Please check out my recent changes. I have resolved the lint errors, and all workflows pass in my fork.
Could you use our PR template instead of removing it entirely?
I've updated the PR according to the template and committed the suggested changes.
+1
@pitrou I think that we should add UnsafeAppendEmptyValue() before this. What do you think about it?
cpp/src/arrow/array/builder_nested.h
Why "does not update the null bitmap"? Below it does.
Initially, my approach was different. But that didn't work out. I will remove this comment in my next commit.
@kou is right, we should add UnsafeAppendEmptyValue. This can be done in this PR, or in a separate one.
@pitrou I'll update this PR with updated comments and
Co-authored-by: Sutou Kouhei <[email protected]>
@kou, @pitrou
Or implement a basic version for now and leave the full concrete implementations for a separate PR?
Force-pushed from f56cffb to bf73f86.
I think Note you can recycle the existing implementations of
Status ArrayBuilder::AppendEmptyValue() {
  RETURN_NOT_OK(Reserve(1));
  UnsafeAppendEmptyValue();
  return Status::OK();
}
Status ArrayBuilder::AppendEmptyValues(int64_t length) {
  RETURN_NOT_OK(Reserve(length));
  UnsafeAppendEmptyValues(length);
  return Status::OK();
}
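To make the distinction between an empty value and a null concrete, here is a self-contained sketch of the Reserve-then-unsafe pattern from the comment above. It is a toy model under stated assumptions, not the Arrow implementation: `ToyStructBuilder`, its single `int32_t` child, the zero placeholder written to the child for both empty and null slots, and `bool` in place of `Status` are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy struct builder with one int32 child; NOT Arrow code.
class ToyStructBuilder {
 public:
  // Reserve room in the validity storage and in the child.
  bool Reserve(int64_t additional) {
    if (additional < 0) return false;
    const std::size_t n =
        validity_.size() + static_cast<std::size_t>(additional);
    validity_.reserve(n);
    child_.reserve(n);
    return true;
  }

  // Empty value: the slot is VALID and the child holds a default (zero).
  void UnsafeAppendEmptyValue() {
    child_.push_back(0);
    validity_.push_back(true);
  }

  void UnsafeAppendEmptyValues(int64_t length) {
    for (int64_t i = 0; i < length; ++i) UnsafeAppendEmptyValue();
  }

  // Null: the slot is INVALID; the child still gets a placeholder so that
  // parent and child lengths stay equal (an assumption of this toy).
  void UnsafeAppendNull() {
    child_.push_back(0);
    validity_.push_back(false);
    ++null_count_;
  }

  // Checked variants reuse the unsafe ones after reserving.
  bool AppendEmptyValue() {
    if (!Reserve(1)) return false;
    UnsafeAppendEmptyValue();
    return true;
  }

  bool AppendEmptyValues(int64_t length) {
    if (!Reserve(length)) return false;
    UnsafeAppendEmptyValues(length);
    return true;
  }

  int64_t length() const { return static_cast<int64_t>(validity_.size()); }
  int64_t child_length() const { return static_cast<int64_t>(child_.size()); }
  int64_t null_count() const { return null_count_; }

 private:
  std::vector<bool> validity_;
  std::vector<int32_t> child_;
  int64_t null_count_ = 0;
};
```

In this model an empty value and a null store the same child placeholder; only the validity bit and the null count differ.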
Rationale for this change
The StructBuilder class was missing the UnsafeAppend(), UnsafeAppendNull(), and UnsafeAppendNulls() methods that are standard on other builders.
What changes are included in this PR?
Added three new public methods to the StructBuilder class in cpp/src/arrow/array/builder_nested.h:
void UnsafeAppend()
Status UnsafeAppendNull()
Status UnsafeAppendNulls(int64_t length)
Refactored the existing Append() methods to use a centralized private UnsafeAppend(bool is_valid) function, improving maintainability and reducing code duplication, based on reviewer feedback.
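The motivation for unsafe methods is the calling pattern they allow: reserve once for a known number of rows, then append each row without a per-row capacity check. The sketch below illustrates that pattern with plain vectors. `Row`, `ToyStructColumn`, and `BuildColumn` are hypothetical names invented for this example, and the comments only mark which toy step is analogous to which builder call; none of this is the Arrow API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical input row: `present` is false for a null struct.
struct Row {
  bool present;
  int32_t x;
};

// Hypothetical output of a toy struct builder with one int32 field.
struct ToyStructColumn {
  std::vector<bool> validity;
  std::vector<int32_t> x;
  int64_t null_count = 0;
};

// Reserve once up front, then append every row without per-row checks.
ToyStructColumn BuildColumn(const std::vector<Row>& rows) {
  ToyStructColumn out;
  out.validity.reserve(rows.size());  // analogous to Reserve(n) on the struct
  out.x.reserve(rows.size());         // analogous to Reserve(n) on the child
  for (const Row& row : rows) {
    if (row.present) {
      out.x.push_back(row.x);          // analogous to the child's UnsafeAppend
      out.validity.push_back(true);    // analogous to UnsafeAppend()
    } else {
      out.x.push_back(0);              // placeholder keeps lengths aligned
      out.validity.push_back(false);   // analogous to UnsafeAppendNull()
      ++out.null_count;
    }
  }
  return out;
}
```

The single reservation is what makes the unchecked appends in the loop acceptable; skipping it is the caller's responsibility, which is why the methods carry the "unsafe" name.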
Are these changes tested?
Yes. A new test suite, TestStructBuilderUnsafe, has been added to cpp/src/arrow/array/array_struct_test.cc. This suite contains three new unit tests, one for each of the new unsafe methods, which validate their correctness and handling of nulls. All new and existing tests are passing locally.
Are there any user-facing changes?
Yes, this PR introduces new public methods to the StructBuilder API. The changes are purely additive and do not alter or remove any existing functionality, so they are not breaking.