Investigate how to reduce size of `DataType` or arrays without making the API too awkward #439
Comments
Do we need to consider or measure the overhead of accessing stack vs. heap data? As for the expensive clone, I wonder if it would work better to let users put a reference-counted wrapper around the whole DataType enum value wherever they need to avoid the full clone. |
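To make the "wrap the whole enum" suggestion concrete, here is a minimal sketch. The `DataType` below is a simplified, hypothetical stand-in rather than the real definition, and `Column` and `share` are made-up names for illustration only. The point is that a caller who holds an `Arc<DataType>` pays for a reference-count increment instead of a deep clone of the inner `String`.

```rust
use std::sync::Arc;

// Hypothetical, simplified stand-in for the real DataType (assumption).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int8,
    Utf8,
    // Owns heap data, so a plain `clone()` of this arm allocates.
    Timestamp(Option<String>),
}

// Hypothetical consumer that keeps a shared handle to the type.
struct Column {
    data_type: Arc<DataType>,
}

// Sharing only bumps the reference count; the String is not copied.
fn share(dt: &Arc<DataType>) -> Column {
    Column {
        data_type: Arc::clone(dt),
    }
}
```

The trade-off is that the wrapper is opt-in: code that keeps cloning a bare `DataType` still pays the full cost, and every access through the `Arc` goes through one extra pointer.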
That is a good point. We could make our arrays have an |
One small other optimization is s
|
I like that idea better. Then the rest of the arms are still copy. |
Good call, I agree adding an Arc to individual arms would be more efficient than |
What do you both mean by "individual arm"? |
enum DataType {
    Int8,
    Int16,
    // ...
    Timestamp(Arc<(TimeUnit, Option<Box<String>>)>),
    Extension(Arc<(Box<String>, Box<DataType>, Option<Box<String>>)>),
} |
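A compilable sketch of the per-arm idea, under some assumptions: the enum is cut down to three variants, `TimeUnit` is a hypothetical two-variant stand-in, and the inner `Box` around the timezone is dropped for brevity. Note that the enum as a whole cannot be `Copy` once any arm holds an `Arc`, but cloning a unit arm is trivial and cloning the `Timestamp` arm only bumps a reference count.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the real TimeUnit (assumption).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum TimeUnit {
    Second,
    Millisecond,
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int8,
    Int16,
    // Only the arm that owns heap data carries an Arc.
    Timestamp(Arc<(TimeUnit, Option<String>)>),
}

// Returns true when both values are Timestamp arms that share one allocation.
fn shares_payload(a: &DataType, b: &DataType) -> bool {
    match (a, b) {
        (DataType::Timestamp(x), DataType::Timestamp(y)) => Arc::ptr_eq(x, y),
        _ => false,
    }
}
```

With this layout the whole enum is at most a tag plus one pointer wide, whatever the payload of the `Arc`ed arms looks like.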
FYI I've started digging into this. I'm still not entirely sure which route I'm going to take (wrapping the entire type, or individual arms, or defining a new wrapper type, or something else entirely...). Anyways, just a heads up just in case 🙃 |
yields
My initial attempt for `DataType` that boxes all `String` and `Vec` below (50% reduction to 32 bytes).

It makes it less friendly to use, but maybe the solution is to offer
pub fn DataType::timestamp(TimeUnit, Option<String>) -> DataType
that boxes the timezone, to make it easier to use (and equivalent to the other types)?

Another thing to consider is that cloning a `DataType` is currently expensive due to the `String`, `Vec` and `Box<Field>`. An alternative is to `Arc` them, like we do for arrays.
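A rough sketch of the boxing idea and the proposed constructor, under stated assumptions: `InlineType` and `BoxedType` are hypothetical two-variant stand-ins for the real enum, `TimeUnit` is a guess at its shape, and `sizes` is a made-up helper. It is only meant to show why boxing the timezone shrinks the enum and how a `timestamp` constructor can hide the `Box` from callers.

```rust
use std::mem::size_of;

// Hypothetical stand-in for the real TimeUnit (assumption).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TimeUnit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

// Timezone stored inline: the String's three words live inside the enum.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
pub enum InlineType {
    Int8,
    Timestamp(TimeUnit, Option<String>),
}

// Timezone boxed: the enum only stores one nullable pointer for it.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
pub enum BoxedType {
    Int8,
    Timestamp(TimeUnit, Option<Box<String>>),
}

impl BoxedType {
    // Constructor in the spirit of the proposal: callers pass a plain
    // Option<String> and the boxing happens here.
    pub fn timestamp(unit: TimeUnit, tz: Option<String>) -> Self {
        BoxedType::Timestamp(unit, tz.map(Box::new))
    }
}

// Returns (inline size, boxed size) in bytes for the current target.
pub fn sizes() -> (usize, usize) {
    (size_of::<InlineType>(), size_of::<BoxedType>())
}
```

The exact byte counts depend on the target and the compiler's layout choices, so it is worth measuring with `size_of` on the real enum rather than relying on this toy. Pattern matching still exposes the `Box`, which is the part that stays less friendly.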