In this study, we address the challenge of enabling an artificial intelligence agent to execute complex language instructions in virtual environments. We assume that these instructions involve intricate linguistic structures and multiple interdependent tasks, all of which must be completed to achieve the desired outcome. To manage this complexity, we propose a hierarchical framework that combines the deep language comprehension of large language models (LLMs) with the adaptive action-execution capabilities of reinforcement learning (RL) agents. The language module (based on an LLM) translates the language instruction into a high-level action plan, which is then executed by a pre-trained RL agent. We demonstrate the effectiveness of our approach in two environments: IGLU, where agents are instructed to build structures, and Crafter, where agents perform tasks and interact with objects in the surrounding environment according to language commands.
Paper: Instruction Following with Goal-Conditioned Reinforcement Learning in Virtual Environments
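The two-level control flow described above can be illustrated with a minimal sketch. Every name here (`follow_instruction`, `toy_planner`, `toy_executor`) is a hypothetical placeholder rather than part of this repository's API, and stopping at the first failed subtask is an assumption made for illustration only.

```python
from typing import Callable, List

# Hypothetical type aliases, not this repository's API.
Planner = Callable[[str], List[str]]   # instruction -> ordered subtasks
Executor = Callable[[str], bool]       # subtask -> success flag


def follow_instruction(instruction: str, plan: Planner, execute: Executor) -> List[str]:
    """Run each planned subtask in order and return those that completed."""
    completed = []
    for subtask in plan(instruction):
        if not execute(subtask):
            break  # assumption: stop at the first subtask the agent fails
        completed.append(subtask)
    return completed


def toy_planner(instruction: str) -> List[str]:
    # Stand-in for the fine-tuned LLM: split on commas.
    return [part.strip() for part in instruction.split(",") if part.strip()]


def toy_executor(subtask: str) -> bool:
    # Stand-in for the pre-trained RL agent: always succeeds.
    return True


done = follow_instruction("Defeat Zombie, Collect Iron", toy_planner, toy_executor)
print(done)
```

In the actual framework the planner would be the tuned language module and the executor the goal-conditioned RL policy; the sketch only shows how a plan is handed from one level to the other.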
- Create the environment:
conda create --name Igor python=3.9
- Activate the environment:
conda activate Igor
- Install the required packages:
pip install -r docker/requirements.txt
- Create the container:
cd ./docker
sh build.sh
cd ../
- Run container:
docker run --shm-size 20G --env WANDB_API_KEY=$WANDB_API_KEY --rm -it -v $(pwd):/code -w /code --gpus all igor bash
- Input:
<Architect> Make 3 red blocks in the middle of the grid
- Output:
[0, 5, 5], [1, 1, 3], skyeast, red
- Input:
Vanquish the undead foe, gather a single unit of metallic mineral, and forge an iron weapon
- Output:
['Defeat Zombie', 'Collect Iron with count 1']
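For downstream use, the example outputs above can be turned into structured data. The following is a minimal sketch that assumes the formats are exactly as printed in these two examples: a Python list literal of strings for Crafter, with an optional "with count N" suffix, and two bracketed integer triples followed by two labels for IGLU. The helper names are hypothetical, and the sketch does not interpret what each IGLU field means.

```python
import ast
import re
from typing import List, Optional, Tuple

# Hypothetical helpers; formats are assumed from the two examples above.
COUNT_RE = re.compile(r"^(?P<name>.+?)\s+with count\s+(?P<count>\d+)$")


def parse_crafter_plan(text: str) -> List[Tuple[str, Optional[int]]]:
    """Parse a list literal of subtask strings into (name, count) pairs."""
    items = ast.literal_eval(text)
    if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
        raise ValueError("expected a list of strings")
    plan = []
    for item in items:
        match = COUNT_RE.match(item.strip())
        if match:
            plan.append((match.group("name"), int(match.group("count"))))
        else:
            plan.append((item.strip(), None))  # no explicit count given
    return plan


def parse_iglu_subtask(text: str) -> Tuple[List[List[int]], List[str]]:
    """Split an IGLU-style line into its integer vectors and trailing labels."""
    vectors = [
        [int(value) for value in group.split(",")]
        for group in re.findall(r"\[([^\]]*)\]", text)
    ]
    remainder = re.sub(r"\[[^\]]*\]", "", text)
    labels = [token.strip() for token in remainder.split(",") if token.strip()]
    return vectors, labels


crafter = parse_crafter_plan("['Defeat Zombie', 'Collect Iron with count 1']")
iglu = parse_iglu_subtask("[0, 5, 5], [1, 1, 3], skyeast, red")
print(crafter)  # [('Defeat Zombie', None), ('Collect Iron', 1)]
print(iglu)     # ([[0, 5, 5], [1, 1, 3]], ['skyeast', 'red'])
```

`ast.literal_eval` is used instead of `eval` so that model output is only ever parsed as a literal and never executed.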
To generate the Crafter dataset, run:
python3 scripts/crafter_dataset_generator.py
- The datasets for the Crafter environment can be found at ./datasets/crafter.
- The datasets for the IGLU environment, including its augmented and primitive versions, can be found at ./datasets/iglu.
To run LLM tuning on the Crafter dataset, execute:
sh scripts/crafter/train_llm.sh
To run LLM tuning on the original IGLU dataset, execute:
sh scripts/iglu/train_llm.sh
To run LLM tuning on the IGLU dataset with subtasks as primitives, execute:
sh scripts/iglu/train_llm_prim.sh
To run RL tuning on the Crafter dataset, execute:
sh scripts/crafter/train_rl.sh
Citation:
@article{volovikova2024instruction,
title={Instruction Following with Goal-Conditioned Reinforcement Learning in Virtual Environments},
author={Zoya Volovikova and Alexey Skrynnik and Petr Kuderov and Aleksandr I. Panov},
year={2024},
eprint={2407.09287},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2407.09287},
}