This repository contains our implementation of PPO and TRPO, with manual toggles for the code-level optimizations described in our paper. We assume that the user has a machine with MuJoCo and mujoco_py properly set up and installed, i.e. you should be able to run the following command on your system without errors:
```python
import gym
gym.make("Humanoid-v2")
```

The code itself is quite simple to use. To run the ablation case study discussed in our paper, run the following list of commands:
- `cd configs/`
- `mkdir PATH_TO_OUT_DIR` and change `out_dir` to this in the relevant config file. By default, agents will be written to `results/{env}_{algorithm}/agents/`.
- `python {config_name}.py`
- `cd ..`
- Edit the `NUM_THREADS` variables in the `run_agents.py` file according to your local machine.
- Train the agents: `python run_agents.py PATH_TO_OUT_DIR/agent_configs`
- The outputs will be in the `agents` subdirectory of `OUT_DIR`, readable with the `cox` Python library.
See the `MuJoCo.json` file for a full list of adjustable parameters.