1. Start with a very small dataset, e.g. 2 servers and 10 arriving containers.
2. Run the check_env script and try to scale the rewards to within some fixed range.
3. Train the model using bash scripts with grid search (it is RL, and it is a huge pain to train), and monitor the training process in TensorBoard.
4. If a training run looks promising, add it to the set of promising results to be tested.
5. For testing, use the real Arabesque workloads.
6. Go to step 1 and make the dataset bigger (the size should be bounded by the Arabesque dataset size and the GKE cluster node sizes, since we will ultimately deploy on them).
7. Repeat steps 1-6 until you have a good dataset for which all the tests are in good shape.
8. If all is good, run the tests on K8s.
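
The first three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the dataset schema, the function names (`make_tiny_dataset`, `scale_reward`, `grid`), the capacity and request values, the target range of [-1, 1], and the hyperparameter names are all assumptions. The grid is also enumerated in Python here only for readability, whereas the workflow above drives it from bash scripts.

```python
import itertools
import random


def make_tiny_dataset(n_servers=2, n_containers=10, seed=0):
    """Build a toy problem: server capacities and container resource requests."""
    rng = random.Random(seed)
    # Hypothetical schema: every server has the same CPU/memory capacity,
    # and every container requests a random slice of it.
    servers = [{"cpu": 8.0, "mem": 32.0} for _ in range(n_servers)]
    containers = [
        {"cpu": round(rng.uniform(0.5, 2.0), 2),
         "mem": round(rng.uniform(1.0, 8.0), 2)}
        for _ in range(n_containers)
    ]
    return {"servers": servers, "containers": containers}


def scale_reward(raw, raw_min, raw_max, low=-1.0, high=1.0):
    """Linearly map a raw reward from [raw_min, raw_max] into [low, high].

    Values outside [raw_min, raw_max] are clipped first.
    """
    if raw_max <= raw_min:
        raise ValueError("raw_max must be greater than raw_min")
    clipped = min(max(raw, raw_min), raw_max)
    fraction = (clipped - raw_min) / (raw_max - raw_min)
    return low + fraction * (high - low)


def grid(search_space):
    """Yield one config dict per combination of the listed values."""
    keys = list(search_space)
    for values in itertools.product(*(search_space[k] for k in keys)):
        yield dict(zip(keys, values))


# Step 1: the smallest dataset (2 servers, 10 arriving containers).
dataset = make_tiny_dataset()

# Step 3: every hyperparameter combination to hand to a training run.
configs = list(grid({"lr": [1e-4, 3e-4, 1e-3], "gamma": [0.95, 0.99]}))
```

When the dataset is grown in step 6, only `n_servers` and `n_containers` would need to change, so the same reward scaling and grid can be reused across iterations.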
