Commit c531219: Updated Documentation.
1 parent e14987f

README.md: 354 additions, 3 deletions
pip install kaggle-environments
```

# TLDR;

```python
from kaggle_environments import make

...

env.run([my_agent, "random"])
# Render an html ipython replay of the tictactoe game.
env.render(mode="ipython")
```

# Overview

Kaggle Environments was created to evaluate episodes. While other libraries have set interface precedents (such as OpenAI Gym), the emphasis of this library is on:

1. Episode evaluation (as opposed to training agents).
2. Configurable environment/agent lifecycles.
3. Simplified agent and environment creation.
4. Cross-language compatible/transpilable syntax/interfaces.

## Help Documentation

```python
# Additional documentation (especially interfaces) can be found on all public functions:
from kaggle_environments import make
help(make)
env = make("tictactoe")
dir(env)
help(env.reset)
```

# Agents

> A function which, given an observation, generates an action.

## Writing

Agent functions can have observation and configuration parameters and must return a valid action. Details about the observation, configuration, and actions can be seen by viewing the specification.

```python
from kaggle_environments import make
env = make("connectx", {"rows": 10, "columns": 8, "inarow": 5})

def agent(observation, configuration):
    print(observation)  # {board: [...], mark: 1}
    print(configuration)  # {rows: 10, columns: 8, inarow: 5}
    return 3  # Action: always place a mark in column 3.

# Run an episode using the agent above vs the default random agent.
env.run([agent, "random"])

# Print schemas from the specification.
print(env.specification.observation)
print(env.specification.configuration)
print(env.specification.action)
```

## Loading Agents

Agents are always functions; however, there are some shorthand syntax options to make generating/using them easier.

```python
# Agent def accepting an observation and returning an action.
def agent1(obs):
    return [c for c in range(len(obs.board)) if obs.board[c] == 0][0]

# Load a default agent called "random".
agent2 = "random"

# Load an agent from source.
agent3 = """
def act(obs):
    return [c for c in range(len(obs.board)) if obs.board[c] == 0][0]
"""

# Load an agent from a file (raw string so backslashes are not treated as escapes).
agent4 = r"C:\path\file.py"

# Return a fixed action.
agent5 = 3

# Return an action from a url.
agent6 = "http://localhost:8000/run/agent"
```
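
The first-empty-cell logic used by `agent1` and `agent3` above can be exercised on its own. A minimal sketch with a hand-built board list (the 9-cell layout and 0-for-empty convention follow the tictactoe examples in this README; the bare `act(board)` signature is a simplification of the `obs.board` access above):

```python
# First-empty-cell agent logic from agent1/agent3 above,
# applied directly to a board list (0 = empty cell).
def act(board):
    return [c for c in range(len(board)) if board[c] == 0][0]

board = [1, 2, 1, 0, 0, 2, 0, 0, 0]
print(act(board))  # 3: the first empty cell
```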

## Default Agents

Most environments contain default agents to play against. To see the list of available agents for a specific environment, run:

```python
from kaggle_environments import make
env = make("tictactoe")

# The list of available default agents.
print(*env.agents)

# Run the random agent vs the reaction agent.
env.run(["random", "reaction"])
```

## Training

The OpenAI Gym interface is used to assist with training agents. The `None` keyword is used below to denote which agent to train (i.e. train as the first or second player of connectx).

```python
from kaggle_environments import make

env = make("connectx", debug=True)

# Training agent in first position (player 1) against the default random agent.
trainer = env.train([None, "random"])

obs = trainer.reset()
for _ in range(100):
    env.render()
    action = 0  # Action for the agent being trained.
    obs, reward, done, info = trainer.step(action)
    if done:
        obs = trainer.reset()
```

## Debugging

There are 3 types of errors which can occur from agent execution:

1. **Timeout** - the agent runtime exceeded the allowed limit. There are 2 timeouts:
   1. `agentTimeout` - Used for initialization of an agent on the first "act".
   2. `actTimeout` - Used for obtaining an action.
2. **Error** - the agent raised an error during execution.
3. **Invalid** - the agent action response didn't match the action specification, or the environment deemed it invalid (i.e. playing twice in the same cell in tictactoe).

To help debug your agent and why it threw the errors above, add the `debug` flag when setting up the environment.

```python
from kaggle_environments import make

def agent():
    return "Something Bad"

env = make("tictactoe", debug=True)

env.run([agent, "random"])
# Prints: "Invalid Action: Something Bad"
```

# Environments

> A function which, given a state and agent actions, generates a new state.

| Name      | Description                          | Make                      |
| --------- | ------------------------------------ | ------------------------- |
| connectx  | Connect 4 in a row but configurable. | `env = make("connectx")`  |
| tictactoe | Classic Tic Tac Toe                  | `env = make("tictactoe")` |
| identity  | For debugging, action is the reward. | `env = make("identity")`  |

## Making

An environment instance can be made from an existing specification (such as those listed above).

```python
from kaggle_environments import make

# Create an environment instance.
env = make(
    # Specification or name of a registered specification.
    "connectx",

    # Override default and environment configuration.
    configuration={"rows": 9, "columns": 10},

    # Initialize the environment from a prior state (episode resume).
    steps=[],

    # Enable verbose logging.
    debug=True
)
```

## Configuration

There are two types of configuration: defaults applying to every environment, and those specific to the environment. The following is a list of the default configuration:

| Name         | Description                                                     |
| ------------ | --------------------------------------------------------------- |
| episodeSteps | Maximum number of steps in the episode.                         |
| agentExec    | How the agent is executed alongside the environment.            |
| agentTimeout | Maximum runtime (seconds) to initialize an agent.               |
| actTimeout   | Maximum runtime (seconds) to obtain an action from an agent.    |
| runTimeout   | Maximum runtime (seconds) of an episode (not necessarily DONE). |

```python
env = make("connectx", configuration={
    "columns": 19,  # Specific to ConnectX.
    "actTimeout": 10,
    "agentExec": "LOCAL"
})
```

## Resetting

Environments are reset by default after "make" (unless starting steps are passed in) as well as when calling "run". Reset can be called at any time to clear the environment.

```python
num_agents = 2
reset_state = env.reset(num_agents)
```

## Running

Execute an episode against the environment using the passed-in agents until they are no longer running (i.e. status != ACTIVE).

```python
steps = env.run([agent1, agent2])
print(steps)
```

## Evaluating

Evaluation is used to run an episode (environment + agents) multiple times and just return the rewards.

```python
from kaggle_environments import evaluate

# Same definitions as "make" above.
environment = "connectx"
configuration = {"rows": 10, "columns": 8, "inarow": 5}
steps = []
debug = False

# Which agents to run repeatedly. Same as env.run(agents).
agents = ["random", agent1]

# How many times to run them.
num_episodes = 10

rewards = evaluate(environment, agents, configuration, steps, num_episodes, debug)
```
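
With one reward pair returned per episode, a per-agent mean is easy to compute by hand. A minimal sketch with a hypothetical rewards list (the exact values, and the list-of-pairs shape, are assumptions based on the two-agent setup above):

```python
# Hypothetical per-episode rewards for two agents (values are made up).
rewards = [[1, -1], [0, 0], [-1, 1], [1, -1]]

# Mean reward per agent across episodes.
mean_rewards = [sum(r) / len(rewards) for r in zip(*rewards)]
print(mean_rewards)  # [0.25, -0.25]
```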

## Stepping

Running above essentially just steps until no agent is still active. To execute a singular game loop, pass in actions directly for each agent. Note that this is normally used for training agents (most useful in a single-agent setup such as using the gym interface).

```python
agent1_action = agent1(env.state[0].observation)
agent2_action = agent2(env.state[1].observation)
state = env.step([agent1_action, agent2_action])
```

## Playing

A few environments offer interactive play against agents within Jupyter notebooks. An example of this using connectx:

```python
from kaggle_environments import make

env = make("connectx")
# None indicates which agent will be manually played.
env.play([None, "random"])
```

## Rendering

The following rendering modes are supported:

- json - Same as doing a json dump of `env.toJSON()`.
- ansi - Ascii character representation of the environment.
- human - ansi just printed to stdout.
- html - HTML player representation of the environment.
- ipython - html just printed to the output of an ipython notebook.

```python
out = env.render(mode="ansi")
print(out)
```

# Command Line

```sh
> python main.py -h
```

## List Registered Environments

```sh
> python main.py list
```

## Evaluate Episode Rewards

```sh
> python main.py evaluate --environment tictactoe --agents random random --episodes 10
```

## Run an Episode

```sh
> python main.py run --environment tictactoe --agents random /pathtomy/agent.py --debug True
```

## Load an Episode

This is useful when converting an episode json output into html.

```sh
> python main.py load --environment tictactoe --steps [...] --render '{"mode": "html"}'
```

# HTTP Server

The HTTP server exposes the same interface/actions as the CLI above, merging both the POST body and GET params.

## Setup

```bash
python main.py http-server --port=8012 --host=0.0.0.0
```

## Adding Middleware

```python
# middleware.py
import time

def request(req):
    time.sleep(30)
    req.agents = ["random", "random"]
    return req

def response(req, resp):
    time.sleep(10)
    return resp
```

```bash
python3 main.py http-server --middleware=/path/to/middleware.py
```
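
The `request`/`response` hooks above wrap the server's handler: `request` can rewrite the incoming request before it is processed, and `response` can rewrite the result on the way out. A minimal sketch of that flow (the `handle` function and the `SimpleNamespace` request object here are hypothetical stand-ins, not the library's internals):

```python
# Hypothetical sketch of how request/response middleware hooks wrap a handler.
from types import SimpleNamespace

def request(req):
    # Rewrite the incoming request, e.g. force a fixed pair of agents.
    req.agents = ["random", "random"]
    return req

def response(req, resp):
    # Rewrite the outgoing response, e.g. annotate it.
    resp["middleware"] = True
    return resp

def handle(req):
    # Stand-in for the server's real handler.
    return {"agents": req.agents}

req = request(SimpleNamespace(agents=[]))
resp = response(req, handle(req))
print(resp)  # {'agents': ['random', 'random'], 'middleware': True}
```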

### Running Agents on Separate Servers

```python
# How to run agents on separate servers.
import requests
import json

path_to_agent1 = "/home/ajeffries/git/playground/agent1.py"
path_to_agent2 = "/home/ajeffries/git/playground/agent2.py"

agent1_url = f"http://localhost:5001?agents[]={path_to_agent1}"
agent2_url = f"http://localhost:5002?agents[]={path_to_agent2}"

body = {
    "action": "run",
    "environment": "tictactoe",
    "agents": [agent1_url, agent2_url]
}
resp = requests.post(url="http://localhost:5000", data=json.dumps(body)).json()

# Inflate the response replay to visualize.
from kaggle_environments import make
env = make("tictactoe", steps=resp["steps"], debug=True)
env.render(mode="ipython")
print(resp)
```
