Add LLM-based policy for MettaGrid agents #4090
base: main
    for col in range(obs_width):
        if row == agent_x and col == agent_y:
Coordinate system confusion in grid visualization: the loop iterates over row in range(obs_height) and then col in range(obs_width), but the check is if row == agent_x and col == agent_y.
Since agent_x = obs_width // 2 is a column index and agent_y = obs_height // 2 is a row index, comparing row with agent_x is incorrect.
Should be:
    if row == agent_y and col == agent_x:
Or restructure to use consistent variable names, with row_idx and col_idx in place of row/col.
Suggested change:
-     for col in range(obs_width):
-         if row == agent_x and col == agent_y:
+     for col in range(obs_width):
+         if row == agent_y and col == agent_x:
Spotted by Graphite Agent
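To make the review comment concrete, here is a minimal sketch of the corrected check. The function name render_grid and the marker characters are hypothetical and not taken from the PR; only obs_height, obs_width, agent_x, and agent_y come from the comment above. Note that a square observation window hides the bug, because agent_x and agent_y are then equal, so the sketch uses a non-square grid.

```python
def render_grid(obs_height: int, obs_width: int,
                agent_marker: str = "A", empty: str = ".") -> str:
    """Render an empty observation window with the agent at its center.

    Hypothetical helper for illustration; not the PR's actual code.
    """
    agent_x = obs_width // 2   # column index of the agent
    agent_y = obs_height // 2  # row index of the agent
    lines = []
    for row in range(obs_height):
        cells = []
        for col in range(obs_width):
            # Rows are compared with agent_y, columns with agent_x.
            if row == agent_y and col == agent_x:
                cells.append(agent_marker)
            else:
                cells.append(empty)
        lines.append("".join(cells))
    return "\n".join(lines)


# A 3x5 window puts the agent in the middle row, middle column.
print(render_grid(3, 5))
```

With the original check (row == agent_x and col == agent_y), the same 3x5 window would place the marker at row 2, column 1 instead of the center.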
Summary
Adds LLM policy support allowing GPT, Claude, or local Ollama models to control MettaGrid agents.
The LLM observes the game state as structured JSON and outputs action decisions.
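The observe-then-act loop described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the JSON field names, the action names, and the helpers build_observation_payload and parse_action are all hypothetical, and the call to the model itself is left out.

```python
import json

# Hypothetical action names; the PR's real action set is not shown here.
ALLOWED_ACTIONS = ("noop", "move_north", "move_south", "move_east", "move_west")


def build_observation_payload(agent_id: int, visible_objects: list) -> str:
    """Serialize one agent's observation as a JSON string for the prompt."""
    payload = {
        "agent_id": agent_id,
        "visible_objects": visible_objects,
        "allowed_actions": list(ALLOWED_ACTIONS),
    }
    return json.dumps(payload, sort_keys=True)


def parse_action(reply: str, fallback: str = "noop") -> str:
    """Extract an action from the model's reply, falling back on bad output."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return fallback
    if not isinstance(data, dict):
        return fallback
    action = data.get("action")
    # Reject anything outside the allowed set, including missing values.
    return action if action in ALLOWED_ACTIONS else fallback


prompt_body = build_observation_payload(0, [{"type": "wall", "x": 1, "y": 0}])
print(prompt_body)
print(parse_action('{"action": "move_north"}'))
```

Falling back to a no-op on malformed or unknown output is one possible design choice; it keeps the environment stepping even when the model's reply is not valid JSON.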
Features
New Policy Classes:
Dynamic Prompt System:
Observation Debugging:
kw.debug_mode=true
Cost Tracking:
Usage
Main Files Changed
Asana Task