Open
Description
Currently, time filtering for loading to dask is implemented inefficiently. It works by pruning the data manifest, which means that loading to dask/polars requires passing a list of individual parquet files. This is slower than loading from the root prefix of the parquet store and then using dask/polars tooling to subset the data. We should refactor to use the second pattern, which requires storing the time-filtering bounds.
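A rough sketch of what the second pattern could look like, to make the proposal concrete. Everything named here is an assumption for illustration: `TimeBounds`, `load_dask`, the `time` column name, and the half-open interval convention are hypothetical and not taken from the current codebase. The only library call used is `dask.dataframe.read_parquet` with its `filters` argument.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


@dataclass(frozen=True)
class TimeBounds:
    """Hypothetical container for the stored time-filtering bounds."""

    start: Optional[datetime] = None  # inclusive lower bound; None means unbounded
    end: Optional[datetime] = None    # exclusive upper bound; None means unbounded

    def to_filters(self, column: str = "time") -> list[tuple[str, str, Any]]:
        """Express the bounds as (column, op, value) tuples for parquet readers."""
        filters: list[tuple[str, str, Any]] = []
        if self.start is not None:
            filters.append((column, ">=", self.start))
        if self.end is not None:
            filters.append((column, "<", self.end))
        return filters


def load_dask(root: str, bounds: TimeBounds, column: str = "time"):
    """Load from the root prefix of the store and let dask apply the bounds.

    `column` is an assumed name for the timestamp column.
    """
    import dask.dataframe as dd  # imported lazily so the bounds logic has no dask dependency

    filters = bounds.to_filters(column)
    # Pass None rather than an empty list when no bounds are set.
    return dd.read_parquet(root, filters=filters or None)
```

With this shape, the manifest stays untouched and only the bounds are stored alongside it. Depending on the dask version and engine, `filters` may prune only at the file or row-group level, so an explicit row-wise filter on the resulting dataframe may still be needed. A polars equivalent would presumably scan the same root lazily and apply the same bounds as a filter expression.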