Thanks for creating this awesome project! I am excited to use it as a plugin for Kedro pipeline-parameter sweeps (e.g. via Hydra or Optuna).
I was interested in this point in the README:
you can run the same session multiple times with many speed optimisations (including dataset caching)
but I couldn't find any information about it in the code-base. Is it implemented? If so, is the dataset cached to disk across session runs, or is it just kedro.io.CachedDataSet under the hood?
Yes, the cached datasets are persisted across runs. kedro-boot caches/preloads some datasets as MemoryDataset in order to speed up the runs and achieve low latency. The process of preparing the catalog for multiple runs is called catalog compilation. You can dry-run the compilation with `kedro boot compile --pipeline your_pipeline`; the artifact datasets that would be cached are listed in the compilation report.
In your use case, you would have a thin application that injects some parameters into your pipelines; kedro-boot would preload all the other datasets as MemoryDataset, since they don't change between runs.
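To make the idea concrete, here is a minimal sketch of what "catalog compilation" could look like conceptually: datasets the application injects each run are left alone, while everything else is loaded once and swapped for an in-memory dataset. This is only an illustration of the caching idea, not kedro-boot's actual code; the `compile_catalog` function and the stripped-down `MemoryDataset` stand-in (mimicking `kedro.io.MemoryDataset`) are hypothetical.

```python
class MemoryDataset:
    """Minimal stand-in for kedro.io.MemoryDataset (illustrative only)."""

    def __init__(self, data=None):
        self._data = data

    def load(self):
        return self._data

    def save(self, data):
        self._data = data


def compile_catalog(catalog, injected_inputs):
    """Split a catalog into per-run inputs and preloaded artifacts.

    Datasets named in ``injected_inputs`` are replaced by the application
    on every run, so they stay as-is. Every other dataset is loaded once
    and cached in memory, so repeated session runs skip the original I/O.
    Returns the compiled catalog and the list of cached artifact names
    (roughly what a compilation report would show).
    """
    compiled = {}
    artifacts = []
    for name, dataset in catalog.items():
        if name in injected_inputs:
            compiled[name] = dataset                        # injected per run
        else:
            compiled[name] = MemoryDataset(dataset.load())  # cached once
            artifacts.append(name)
    return compiled, artifacts


# Example sweep setup: "params:alpha" is injected on each trial
# (e.g. by Hydra/Optuna); "training_data" is preloaded once and reused.
catalog = {
    "training_data": MemoryDataset([1, 2, 3]),  # pretend this reads from disk
    "params:alpha": MemoryDataset(0.1),
}
compiled, artifacts = compile_catalog(catalog, injected_inputs={"params:alpha"})
print(artifacts)  # ['training_data']
```

In a sweep, each trial would then save its new parameter value into `compiled["params:alpha"]` and rerun the pipeline against the compiled catalog, paying the load cost for `training_data` only once.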