-
Notifications
You must be signed in to change notification settings - Fork 976
[Variant] Add VariantBuilder::new_with_buffers
to write to existing buffers
#7912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -1916,6 +1989,80 @@ mod tests { | |||
assert_eq!(metadata.num_field_names(), 3); | |||
} | |||
|
|||
/// Test reusing buffers with nested objects | |||
#[test] | |||
fn test_with_existing_buffers_nested() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a really cool test case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have loved to append the variant but I hit this bug:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you apply this diff, it should give you what you need: #7914.
I don't love that we need to use a builder to reencode the object/list. I'm going to spend some time thinking about a more elegant solution.
I wonder if this PR with its existing buffers can help 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you apply this diff, it should give you what you need: #7914.
Thank you @friendlymatthew -- that looks great ❤️ -- I left a few comments
I don't love that we need to use a builder to reencode the object/list. I'm going to spend some time thinking about a more elegant solution.
I think potentially a more performant solution would be to optimize #7914 to copy the Variant bytes directly and then update the field_ids.
However, that will get tricky I think if the field_ids in the variant under construction are different sizes than the old one.
I suggest we start with the simple thing (what you have) and then we can optimize if/when we have benchmarks that show it would help
Which issue does this PR close?
Rationale for this change
I would like to be able to write Variants directly into the target buffer when writing multiple variants However, the current VariantBuilder allocates a new bufffer for each variant
What changes are included in this PR?
VariantBuilder::new_with_buffers
and docs and testsYou can see how this API can be used to write directly into a buffer in VariantArrayBuilder in this PR:
Are these changes tested?
Yes new tests
Are there any user-facing changes?
New API