4 changes: 2 additions & 2 deletions .secrets.baseline
@@ -133,7 +133,7 @@
"filename": "README.md",
"hashed_secret": "a8253456364f1bfc7da7ae4a1db5b45d106317a5",
"is_verified": false,
-"line_number": 454
"line_number": 514
}
],
"SLURM.md": [
@@ -561,5 +561,5 @@
}
]
},
-"generated_at": "2026-03-02T22:46:56Z"
"generated_at": "2026-03-11T21:29:07Z"
}
70 changes: 65 additions & 5 deletions README.md
@@ -376,17 +376,27 @@ python gsm8k_server.py evaluate \
Run the following commands in **separate terminals**, in this order:

**Terminal 1** — Start the API server first (must be running before environments connect):
-```sh
```bash
run-api
```

**Terminal 2** — Start an environment:
-```sh
-python gsm8k_server.py serve --slurm False # or an env of your choice
```bash
python environments/gsm8k_server.py serve --slurm False # or an env of your choice
```

**Terminal 3** — (Optional) Dry-run your configuration:

```bash
atropos-sft-gen path/to/output.jsonl \
--tokenizer Qwen/Qwen2.5-1.5B-Instruct \
--dry-run
```

If this succeeds, your tokenizer and rollout server connectivity are correctly configured.

**Terminal 3** — Generate data:
-```sh
```bash
atropos-sft-gen path/to/output.jsonl --tokenizer Qwen/Qwen2.5-1.5B-Instruct # or whichever tokenizer you have in your env config
```
Rejection sampling can be controlled via `--save-top-n-per-group`, `--allow-negative-scores`, and `--minimum-score-diff-max-min`. See `atropos-sft-gen -h` for more detailed usage info.
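
As a rough illustration of what these flags might do, the sketch below filters one group of scored rollouts. The flag semantics shown here are assumptions inferred from the flag names, and `select_rollouts` and its parameters are hypothetical, not part of the Atropos API; `atropos-sft-gen -h` remains the authoritative reference.

```python
from typing import Sequence


def select_rollouts(
    scores: Sequence[float],
    top_n: int = 1,
    allow_negative: bool = False,
    min_score_spread: float = 0.0,
) -> list[int]:
    """Return the indices of the rollouts to keep from one group.

    Hypothetical helper; each parameter mirrors an assumed flag meaning:
      top_n            ~ --save-top-n-per-group
      allow_negative   ~ --allow-negative-scores
      min_score_spread ~ --minimum-score-diff-max-min
    """
    if not scores:
        return []
    # Assumption: a group whose best and worst scores are too close is skipped.
    if max(scores) - min(scores) < min_score_spread:
        return []
    # Rank indices by score, highest first (stable for ties).
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = ranked[:top_n]
    # Assumption: negative-scored rollouts are dropped unless explicitly allowed.
    if not allow_negative:
        kept = [i for i in kept if scores[i] >= 0]
    return kept


print(select_rollouts([0.2, 1.0, -0.5, 0.7], top_n=2))
```

Under these assumptions, a group with identical scores yields nothing once a non-zero spread is required, which is one plausible reason an output file can end up smaller than expected.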
@@ -442,10 +452,60 @@ Ensure you're using a clean virtual environment with the correct Python version:

```bash
python -m venv .venv
-source .venv/bin/activate # On Windows: .venv\Scripts\activate
source .venv/bin/activate # On Windows (PowerShell): .venv\Scripts\Activate.ps1
pip install -e ".[dev]"
```

### Windows Quickstart

While Atropos is primarily documented with Unix-like shells in mind, it works well on Windows too.
Below is a minimal end-to-end example using **PowerShell**.

1. Create and activate a virtual environment:

```powershell
cd C:\path\to\atropos
python -m venv .venv
.venv\Scripts\Activate.ps1
pip install -e .[dev]
```

2. Start the API server:

```powershell
run-api
```

3. In a new PowerShell window, start an environment (for example GSM8K):

```powershell
cd C:\path\to\atropos
.venv\Scripts\Activate.ps1
python .\environments\gsm8k_server.py serve --slurm False
```

4. In a third PowerShell window, dry-run your offline data generation setup, then generate data:

```powershell
cd C:\path\to\atropos
.venv\Scripts\Activate.ps1

# Optional: configuration check
atropos-sft-gen .\gsm8k_rollouts.jsonl `
--tokenizer Qwen/Qwen2.5-1.5B-Instruct `
--dry-run

# Actual data generation
atropos-sft-gen .\gsm8k_rollouts.jsonl `
--tokenizer Qwen/Qwen2.5-1.5B-Instruct
```

If you see connectivity errors during the dry run, double-check that:

- `run-api` is running and listening on the expected port (default `http://localhost:8000`)
- Your environment script (e.g. `gsm8k_server.py`) is running without errors
- No firewall or VPN software is blocking local HTTP requests
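
The first check can also be scripted. This is a minimal sketch, assuming the API server listens on `localhost:8000` as stated in the list; `is_port_open` is a hypothetical helper, not part of Atropos, and it only tests that a TCP connection can be opened, not that the server is healthy.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    status = "reachable" if is_port_open() else "not reachable"
    print(f"localhost:8000 is {status}")
```

If the port is not reachable while `run-api` appears to be running, the server may be bound to a different port or blocked by local security software.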

**`OPENAI_API_KEY` not set errors**

Set your API key as an environment variable, or configure it in the environment's `config_init`: