Experiment management features #75

Open
eduble opened this issue Feb 14, 2022 · 0 comments

eduble commented Feb 14, 2022

WalT should provide easier and more reproducible means to manage a whole experiment.
This could rely on the following features:

  • An API should be provided (e.g. a Python API) that makes experiment scripts easier to write than the current practice (bash scripts calling walt client commands).
  • An experiment could be defined by a walt image deployed on a virtual node. This walt image would be equipped with the API described above and with tooling that starts the experiment automatically. This virtual node would be considered the experiment manager (just called "manager" below for simplicity) and could create other virtual nodes, make them boot walt images, retrieve logs, etc. Note: this implies that the API should be made available to manager nodes too (not only to walt users). We (the WALT project) would provide a base image for manager nodes, which would automate a basic experiment. If needed, we may add a docker label to these images to recognize them.
  • This could be partly hidden behind a new category of sub-commands of the walt command: walt experiment <sub-command>. For instance, walt experiment create <exp-name> could clone our base image for manager nodes, name it <exp-name>-manager, create the manager node and let it boot this image. walt experiment shell <exp-name> would be a shortcut to walt image shell <exp-name>-manager. Unless the modification is cancelled, the manager node would be rebooted to reflect these changes (this is already the effect of walt image shell, but the printed messages might need adaptation). walt experiment run <exp-name> <num-runs> would request the manager node to run the experiment <num-runs> times. And so on. We may need to add checks in other commands to ensure everything is OK (for instance, disallow using walt node boot with a manager image, since WALT manages the manager node itself and must ensure its uniqueness). We may also redirect the user to walt experiment <sub-command> in some cases (e.g. when the user tries walt image shell directly on a manager image), or we may hide these special images from other commands. However, we would probably need the manager nodes to be displayed and usable by most walt node sub-commands (except boot).
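
To make the last point more concrete, here is a minimal Python sketch of the naming convention, the walt experiment shell shortcut and the proposed checks. Nothing in it exists in WalT today: the function names, the messages and the idea of recognizing a manager image by its "-manager" suffix are assumptions made for illustration (a docker label, as suggested above, would probably be more robust than a name test).

```python
from typing import List, Optional

# Hypothetical convention taken from the proposal: the manager image of
# experiment <exp-name> is called <exp-name>-manager.
MANAGER_SUFFIX = "-manager"


def manager_image(exp_name: str) -> str:
    """Return the manager image name for an experiment."""
    return exp_name + MANAGER_SUFFIX


def experiment_of(image_name: str) -> Optional[str]:
    """Return the experiment name if image_name looks like a manager image.

    Assumption: manager images are recognized by their name only.
    """
    if image_name.endswith(MANAGER_SUFFIX) and len(image_name) > len(MANAGER_SUFFIX):
        return image_name[: -len(MANAGER_SUFFIX)]
    return None


def expand_experiment_shell(exp_name: str) -> List[str]:
    """'walt experiment shell <exp-name>' as a shortcut to
    'walt image shell <exp-name>-manager'."""
    return ["walt", "image", "shell", manager_image(exp_name)]


def check_image_use(category: str, sub_command: str, image_name: str) -> Optional[str]:
    """Return a message if the command should be refused or redirected,
    or None if it can proceed unchanged."""
    exp_name = experiment_of(image_name)
    if exp_name is None:
        return None  # regular image: nothing to check
    if (category, sub_command) == ("node", "boot"):
        # The manager node must stay unique, so users may not boot this image.
        return f"'{image_name}' is a manager image; it is booted by the manager node only."
    if (category, sub_command) == ("image", "shell"):
        # Redirect to the dedicated sub-command.
        return f"Use 'walt experiment shell {exp_name}' instead."
    return None
```

The checks are kept separate from the command expansion so that the same test could be reused by any command that takes an image argument.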