Sorry for being nosy - I am curious about all the columnar technologies in an actual analysis context like Hgg :)
I saw that pickle is mentioned/used in several places. You might try parquet instead (df.to_parquet(), pd.read_parquet()). It's essentially the industry counterpart of ROOT, a columnar on-disk format, so it should be much faster at serializing/deserializing than compressed pickle. Uncompressed pickle is probably faster, but if you write compressed files with df.to_pickle("blah.pkl.gz"), parquet will likely be faster. Of course this all depends on how big your files are.