`experimental-code`: Draft code; not actually used.
`rl-starter-files`: Cloned from the original repository and modified to include a GPT interface. I also wrote a GPT-based reward-shaping function that asks GPT for a shaped reward.
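As a rough illustration only (not the exact interface in this repo), a GPT reward-shaping query can look like the sketch below. It assumes the official `openai` Python client and an `OPENAI_API_KEY` in the environment; the model name, prompt wording, and parsing are placeholders.

```python
# Hypothetical sketch of a GPT-based reward-shaping query; the actual
# interface in rl-starter-files may differ (model, prompt, parsing).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def gpt_shaped_reward(mission: str, observation_text: str, action: str) -> float:
    """Ask GPT to score how helpful `action` is for completing `mission`."""
    prompt = (
        f"Mission: {mission}\n"
        f"Observation: {observation_text}\n"
        f"Proposed action: {action}\n"
        "On a scale from -1 to 1, how helpful is this action? "
        "Reply with a single number."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model; swap for whatever the repo uses
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,
    )
    try:
        return float(response.choices[0].message.content.strip())
    except ValueError:
        return 0.0  # fall back to no shaping if the reply is not a number
```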
Setting up the repository: After creating your conda environment:
cd rl-starter-files
pip install -r requirements.txt
Right now, my training command is:
cd rl-starter-files/
python -m scripts.train --algo ppo --env BabyAI-GoToImpUnlock-v0 --model GoToImpUnlock0.0005Ask --text --save-interval 10 --frames 250000 --gpt
The problem is that even an ask probability of 0.0005 is still very slow; training takes a really long time.
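For context, here is a minimal sketch of how an ask probability can gate the GPT calls so that only a small fraction of steps pay the cost of a query. This is purely illustrative; the names are not the repo's actual API, and the real logic lives in the modified `train.py` and `utils` code described below.

```python
# Hypothetical sketch: with probability ASK_PROB, add a GPT-suggested bonus to
# the environment reward; otherwise pass the plain reward through unchanged.
import random
from typing import Callable

ASK_PROB = 0.0005  # matches the 0.0005 ask probability used above

def shaped_step_reward(env_reward: float, prompt: str,
                       ask_gpt: Callable[[str], float]) -> float:
    """Occasionally query GPT for a shaping bonus; most steps skip the call."""
    if random.random() < ASK_PROB:
        return env_reward + ask_gpt(prompt)
    return env_reward
```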
Basic:
- PPO, A2C only

Exploration(?):
- RND: https://opendilab.github.io/DI-engine/12_policies/rnd.html
- BeBold, NovelD: https://github.com/tianjunz/NovelD
- DEIR
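For reference, a minimal PyTorch sketch of the RND bonus linked above: the intrinsic reward is the prediction error of a trained network against a fixed random target network. The observation dimension and network sizes are placeholders; the DI-engine, NovelD, and DEIR implementations are more involved.

```python
# Minimal RND (Random Network Distillation) sketch: novel states produce large
# predictor error against a frozen random target, which is used as a bonus.
import torch
import torch.nn as nn

def make_net(obs_dim: int, out_dim: int = 64) -> nn.Module:
    return nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, out_dim))

obs_dim = 147                  # e.g. a flattened 7x7x3 MiniGrid observation
target = make_net(obs_dim)     # fixed, randomly initialised network
predictor = make_net(obs_dim)  # trained to imitate the target
for p in target.parameters():
    p.requires_grad_(False)
optimizer = torch.optim.Adam(predictor.parameters(), lr=1e-4)

def intrinsic_reward(obs: torch.Tensor) -> torch.Tensor:
    """Per-state novelty bonus; also takes one gradient step on the predictor."""
    error = (predictor(obs) - target(obs)).pow(2).mean(dim=-1)
    optimizer.zero_grad()
    error.mean().backward()
    optimizer.step()
    return error.detach()  # large error = novel state = bigger exploration bonus
```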
- Bash scripts for experiments on different BabyAI and MiniGrid environments can be found in `babyai.sh` and `minigrid.sh`.
- The reshaped reward, with GPT predicting for a single action and for the next few actions (currently hardcoded as 10), is implemented and merged into `train.py` and the `utils` folder.
- Added `eval2excel.py` to run evaluation and convert the results to Excel files.
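A minimal sketch of what such an eval-to-Excel conversion can look like with pandas; the actual `eval2excel.py` may structure its results differently, and the rows below are placeholders, not real results.

```python
# Hypothetical sketch of dumping evaluation results to an Excel file.
import pandas as pd

results = [
    {"env": "BabyAI-GoToImpUnlock-v0", "return_mean": 0.0, "return_std": 0.0, "frames": 250000},
    {"env": "MiniGrid-DoorKey-8x8-v0", "return_mean": 0.0, "return_std": 0.0, "frames": 250000},
]  # placeholder rows, not real results

df = pd.DataFrame(results)
df.to_excel("evaluation_results.xlsx", index=False)  # requires openpyxl
```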
To run `/data1/lzengaf/cs285/proj/minigrid/experimental-code/llm-interface/llama2_interface.py`, first run:
pip install langchain cmake
export CMAKE_ARGS="-DLLAMA_METAL=on"
FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir
curl https://ollama.ai/install.sh | sh
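Once those dependencies are installed, a minimal `llama-cpp-python` usage sketch looks like the following; the model path is a placeholder, and `llama2_interface.py` may wrap the model differently.

```python
# Hypothetical local-LLM query via llama-cpp-python, analogous to the GPT query above.
from llama_cpp import Llama

llm = Llama(model_path="/path/to/llama-2-7b-chat.Q4_K_M.gguf",  # placeholder weights
            n_ctx=2048)
out = llm("On a scale from -1 to 1, how helpful is the action 'go forward'? "
          "Reply with a single number.",
          max_tokens=16, temperature=0.0)
print(out["choices"][0]["text"].strip())
```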