```
pip install kaggle-environments
```

**BETA RELEASE** - Breaking changes may be introduced!

# TLDR;

```python
from kaggle_environments import make

# Set up a tictactoe environment.
env = make("tictactoe")

# Basic agent which marks the first available cell.
def my_agent(obs):
    return [c for c in range(len(obs.board)) if obs.board[c] == 0][0]

env.run([my_agent, "random"])

# Render an html ipython replay of the tictactoe game.
env.render(mode="ipython")
```

# Overview

Kaggle Environments was created to evaluate episodes. While other libraries have set interface precedents (such as OpenAI Gym), this library focuses on:

1. Episode evaluation (as opposed to training agents).
2. Configurable environment/agent lifecycles.
3. Simplified agent and environment creation.
4. Cross-language compatible/transpilable syntax/interfaces.

## Help Documentation

```python
# Additional documentation (especially interfaces) can be found on all public functions:
from kaggle_environments import make
help(make)
env = make("tictactoe")
dir(env)
help(env.reset)
```

# Agents

> A function which, given an observation, generates an action.

## Writing

Agent functions can have observation and configuration parameters and must return a valid action. Details about the observation, configuration, and actions can be seen by viewing the specification.

```python
from kaggle_environments import make

env = make("tictactoe")

# Agent accepting an observation and configuration and returning an action.
def agent(observation, configuration):
    return 3  # Action: always place a mark in the 3rd column.

# Run an episode using the agent above vs the default random agent.
env.run([agent, "random"])

# Print schemas from the specification.
print(env.specification.observation)
print(env.specification.configuration)
print(env.specification.action)
```

## Loading Agents

Agents are always functions; however, there are some shorthand syntax options to make generating/using them easier.

```python
# Agent def accepting an observation and returning an action.
def agent1(obs):
    return [c for c in range(len(obs.board)) if obs.board[c] == 0][0]

# Load a default agent called "random".
agent2 = "random"

# Load an agent from source.
agent3 = """
def act(obs):
    return [c for c in range(len(obs.board)) if obs.board[c] == 0][0]
"""

# Load an agent from a file (raw string avoids invalid escapes in Windows paths).
agent4 = r"C:\path\file.py"

# Return a fixed action.
agent5 = 3

# Return an action from a url.
agent6 = "http://localhost:8000/run/agent"
```

## Default Agents

Most environments contain default agents to play against. To see the list of available agents for a specific environment, run:

```python
from kaggle_environments import make

env = make("tictactoe")

# The list of available default agents.
print(*env.agents)

# Run random agent vs reaction agent.
env.run(["random", "reaction"])
```

## Training

The OpenAI Gym interface is used to assist with training agents. The `None` keyword is used below to denote which agent to train (i.e. train as the first or second player of connectx).

```python
from kaggle_environments import make

env = make("connectx", debug=True)

# Training agent in first position (player 1) against the default random agent.
trainer = env.train([None, "random"])

obs = trainer.reset()
for _ in range(100):
    env.render()
    action = 0  # Action for the agent being trained.
    obs, reward, done, info = trainer.step(action)
    if done:
        obs = trainer.reset()
```

## Debugging

There are 3 types of errors which can occur from agent execution:

1. **Timeout** - the agent runtime exceeded the allowed limit. There are 2 timeouts:
   1. `agentTimeout` - used for initialization of an agent on the first "act".
   2. `actTimeout` - used for obtaining an action.
2. **Error** - the agent raised an error during execution.
3. **Invalid** - the agent action response didn't match the action specification, or the environment deemed it invalid (i.e. playing twice in the same cell in tictactoe).

To help debug your agent and see why it threw one of the errors above, add the `debug` flag when setting up the environment.

```python
from kaggle_environments import make

def agent():
    return "Something Bad"

env = make("tictactoe", debug=True)

env.run([agent, "random"])
# Prints: "Invalid Action: Something Bad"
```
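
A **Timeout** can be provoked the same way. The sketch below assumes `actTimeout` can be overridden via `configuration` (as documented in the Configuration section) and that a sleeping in-process agent exceeds it; the exact status string reported may vary by version:

```python
# A minimal sketch of triggering an act timeout (assumes actTimeout override works as below).
import time

from kaggle_environments import make

def slow_agent(obs):
    time.sleep(30)  # Deliberately exceed the 1 second actTimeout configured below.
    return 0

env = make("tictactoe", configuration={"actTimeout": 1}, debug=True)
env.run([slow_agent, "random"])

# The first agent's status should now reflect the timeout.
print(env.state[0].status)
```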

# Environments

> A function which, given a state and agent actions, generates a new state.

| Name | Description | Make |
| --- | --- | --- |
| connectx | Connect 4 in a row, but configurable. | `env = make("connectx")` |
| tictactoe | Classic Tic Tac Toe. | `env = make("tictactoe")` |
| identity | For debugging; the action is the reward. | `env = make("identity")` |

## Making

An environment instance can be made from an existing specification (such as those listed above).

```python
from kaggle_environments import make

# Create an environment instance.
env = make(
    # Specification, or name of a registered specification.
    "connectx",

    # Override default and environment configuration.
    configuration={"rows": 9, "columns": 10},

    # Initialize the environment from a prior state (episode resume).
    steps=[],

    # Enable verbose logging.
    debug=True
)
```

## Configuration

There are two types of configuration: defaults applying to every environment, and those specific to the environment. The following is a list of the default configuration:

| Name | Description |
| --- | --- |
| episodeSteps | Maximum number of steps in the episode. |
| agentExec | How the agent is executed alongside the environment. |
| agentTimeout | Maximum runtime (seconds) to initialize an agent. |
| actTimeout | Maximum runtime (seconds) to obtain an action from an agent. |
| runTimeout | Maximum runtime (seconds) of an episode (not necessarily DONE). |

```python
env = make("connectx", configuration={
    "columns": 19,  # Specific to ConnectX.
    "actTimeout": 10,
    "agentExec": "LOCAL"
})
```

## Resetting

Environments are reset by default after "make" (unless starting steps are passed in) as well as when calling "run". Reset can be called at any time to clear the environment.

```python
num_agents = 2
reset_state = env.reset(num_agents)
```

## Running

Execute an episode against the environment using the passed-in agents until they are no longer running (i.e. status != ACTIVE).

```python
steps = env.run([agent1, agent2])
print(steps)
```
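
Each element of `steps` is the list of per-agent states after that step. As a quick sketch (assuming the run above, and the attribute-style state access used in the Stepping section below), the final rewards can be read off the last step:

```python
# The last step holds each agent's terminal state.
last_step = steps[-1]
print([agent_state.reward for agent_state in last_step])
```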

## Evaluating

Evaluation is used to run an episode (environment + agents) multiple times and just return the rewards.

```python
from kaggle_environments import evaluate

# Same definitions as "make" above.
environment = "connectx"
configuration = {"rows": 10, "columns": 8, "inarow": 5}
steps = []
debug = False

# Which agents to run repeatedly. Same as env.run(agents).
agents = ["random", "random"]

# How many episodes to run.
num_episodes = 10

rewards = evaluate(environment, agents, configuration, steps, num_episodes, debug)
```

## Stepping

Running (above) essentially just steps until no agent is still active. To execute a single game loop, pass in actions directly for each agent. Note that this is normally used for training agents (most useful in a single-agent setup such as using the gym interface).

```python
agent1_action = agent1(env.state[0].observation)
agent2_action = agent2(env.state[1].observation)

# step takes the list of actions, one per agent.
state = env.step([agent1_action, agent2_action])
```

## Playing

A few environments offer interactive play against agents within jupyter notebooks. An example of this is using connectx:

```python
from kaggle_environments import make

env = make("connectx")

# None indicates which agent will be manually played.
env.play([None, "random"])
```

## Rendering

The following rendering modes are supported:

- json - Same as doing a json dump of `env.toJSON()`.
- ansi - ASCII character representation of the environment.
- human - ansi, just printed to stdout.
- html - HTML player representation of the environment.
- ipython - html, just printed to the output of an ipython notebook.

```python
out = env.render(mode="ansi")
print(out)
```
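
The json mode works the same way; a small sketch (assuming the `env` above) that round-trips the output:

```python
import json

# "json" mode returns the same data as json.dumps(env.toJSON()).
out = env.render(mode="json")
episode = json.loads(out)

# Inspect the top-level keys of the serialized episode.
print(sorted(episode.keys()))
```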

# Command Line

```sh
> python main.py -h
```
295
+
296
+
## List Registered Environments
297
+
298
+
```sh
299
+
> python main.py list
300
+
```

## Evaluate Episode Rewards

```sh
> python main.py evaluate --environment tictactoe --agents random random --episodes 10
```

## Run an Episode

```sh
> python main.py run --environment tictactoe --agents random /pathtomy/agent.py --debug True
```

## Load an Episode

This is useful when converting an episode json output into html.
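
The same conversion can be done in Python using only calls documented above; below is a hedged sketch: dump a finished episode with `env.toJSON()`, then rebuild the environment from the saved steps and render it as html. The file names are illustrative only.

```python
import json

from kaggle_environments import make

# Save a finished episode (e.g. after env.run above) to json.
with open("episode.json", "w") as f:
    json.dump(env.toJSON(), f)

# Later: reload it and convert it to an html replay.
with open("episode.json") as f:
    saved = json.load(f)

# Rebuild the environment from the saved name, configuration, and steps.
replay = make(saved["name"], configuration=saved["configuration"], steps=saved["steps"])
with open("episode.html", "w") as f:
    f.write(replay.render(mode="html"))
```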