Director–Generator MAS for image prompt iteration. Uses LiteLLM for chat models and an SDXL stub for image generation. Defaults to a local vLLM server running Qwen2.5-VL-7B-Instruct, but a ChatGPT preset is available via config/system_config_openai.yaml.
- Python 3.13 (conda env recommended)
- Install dependencies: `pip install uv; uv pip install -r requirements.txt`
- Local OpenAI-compatible endpoint (vLLM) at `http://127.0.0.1:8000/v1`
- Export env vars (a dummy key is fine for local vLLM), or create `.env` by copying and modifying the given `.env.example`:

  ```
  export OPENAI_API_KEY=EMPTY
  export VLLM_API_BASE=http://127.0.0.1:8000/v1
  ```
- Run vLLM:

  ```
  python -m vllm.entrypoints.openai.api_server \
    --model qwen/Qwen2.5-VL-7B-Instruct \
    --host 0.0.0.0 --port 8000
  ```

  Wait for the server to print the model ID (`qwen/Qwen2.5-VL-7B-Instruct`).
- Execute a run: `python -m src.main "sample prompt"`
- Use a different config with `--config config/system_config_openai.yaml` to hit the ChatGPT API (requires `OPENAI_API_KEY`).
- Progress logs print each phase and LLM call (system/user previews and replies).
- Output JSON and mock images land in `output/`.
- `config/system_config.yaml` controls model aliases, vLLM endpoint, SDXL mode, loop limits, and recursion limit (defaults: `max_loops=10`, `recursion_limit=1000`, SDXL `mode=mock`).
- `config/directors.json` and `config/generators.json` define agent personas; prompts live under `prompts/`.
- For SDXL: `sdxl.mode` can be `mock` (placeholder PNGs), `api` (Stability API; requires `STABILITY_API_KEY`), or `local` (loads the SDXL base pipeline onto `sdxl.device`).
- SDXL failover: set `sdxl.failover_mode: mock` (or env `SDXL_FAILOVER_MODE=mock`) to fall back to mock images if the SDXL server returns an error.
- The CLI accepts `--config` to point at an alternate system config and `--max-iteration`/`--max-iterations` to override the loop count.
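Pulling the knobs above together, a `config/system_config.yaml` might look like the sketch below. The key nesting is an assumption for illustration — only the option names called out above come from this repo — so treat the shipped file as the authoritative schema:

```yaml
# Hypothetical layout; key nesting is assumed, option names come from the notes above.
models:
  director: qwen/Qwen2.5-VL-7B-Instruct   # alias must match the id vLLM advertises (case-sensitive)
max_loops: 10
recursion_limit: 1000
evaluation_threshold: 0.8                  # illustrative value
sdxl:
  mode: mock            # mock | api | local
  device: cuda:0        # used when mode: local
  failover_mode: mock   # fall back to mock images on SDXL errors
```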
- Set `OPENAI_API_KEY` in your environment (no `VLLM_API_BASE` needed).
- Use `config/system_config_openai.yaml` to map roles to `gpt-4o`/`gpt-4o-mini`.
- Example with a prompt file:

  ```
  python -m src.main -f audio2txt/processed/0000_45-55.json \
    --max-iterations 5 \
    --output-dir output_chatgpt/test_0000_45-55 \
    --config config/system_config_openai.yaml
  ```
- Start one server per GPU; the pipeline loads once and stays hot:

  ```
  CUDA_VISIBLE_DEVICES=0 python -m scripts.sdxl_server --port 7000 --device cuda:0
  CUDA_VISIBLE_DEVICES=1 python -m scripts.sdxl_server --port 7001 --device cuda:0
  ```

  (`--device cuda:0` is correct for both commands: `CUDA_VISIBLE_DEVICES` renumbers the single visible GPU to `cuda:0` inside each process.)
- Point clients at a server with `SDXL_API_BASE=http://127.0.0.1:7000` (the env var overrides config and routes SDXL calls to the server). Set `sdxl.mode: api` in your config if you prefer an explicit toggle.
- To keep jobs alive even if the server hiccups, add `sdxl.failover_mode: mock` in the config or export `SDXL_FAILOVER_MODE=mock` so failed requests fall back to mock images instead of crashing the run.
- Parallel example that shards across two SDXL servers:
  ```
  export GPUS="0 1"
  find audio2txt/processed -maxdepth 1 -name '*.json' -print0 | \
    parallel -0 --jobs 4 --lb --env GPUS '
      file="{}"
      gpus=($GPUS); n=${#gpus[@]}
      slot={#}; idx=$(( (slot-1) % n ))
      sdxl_port=$((7000 + idx))
      base=$(basename "$file" .json)
      out_dir="output_chatgpt/output_$(date +%Y%m%d_%H%M%S_%N)_${base}"
      mkdir -p "$out_dir"
      SDXL_API_BASE=http://127.0.0.1:${sdxl_port} \
        python -m src.main -f "$file" --max-iterations 5 --output-dir "$out_dir" \
        --config config/system_config_openai.yaml \
        > "$out_dir/run.log" 2>&1
    '
  ```
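As a sanity check on the sharding above, the slot-to-port arithmetic can be run on its own (the GPU list and base port mirror the example; nothing here touches the servers):

```shell
# Same round-robin math as the parallel recipe:
# job slot N -> index (N-1) mod n -> port 7000+idx.
gpus=(0 1); n=${#gpus[@]}
for slot in 1 2 3 4; do
  idx=$(( (slot - 1) % n ))
  echo "slot $slot -> http://127.0.0.1:$((7000 + idx))"
done
# slots 1,3 hit port 7000; slots 2,4 hit port 7001
```

Note that `{#}` is GNU parallel's 1-based job slot number, which is why the recipe subtracts 1 before taking the modulus. The loop requires bash (arrays are not POSIX sh).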
- Health check: `curl http://127.0.0.1:7000/docs` to confirm the server is up.
- Local vLLM must advertise a model id matching the alias in `config/system_config.yaml` (case-sensitive).
- Recursion errors: increase `recursion_limit` in `config/system_config.yaml`, or adjust `evaluation_threshold`/`max_loops`.
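To see exactly which model id your vLLM server advertises, query the OpenAI-compatible `/v1/models` endpoint (`curl -s http://127.0.0.1:8000/v1/models`). The snippet below parses a hand-written sample of that endpoint's response shape rather than a live server, so it runs anywhere:

```shell
# Extract the advertised model id from a /v1/models response.
# The JSON here is an illustrative sample, not captured from a real server.
response='{"object": "list", "data": [{"id": "qwen/Qwen2.5-VL-7B-Instruct", "object": "model"}]}'
model_id=$(printf '%s' "$response" | python3 -c 'import json,sys; print(json.load(sys.stdin)["data"][0]["id"])')
echo "$model_id"   # must match the alias in config/system_config.yaml, case-sensitively
```

Against a live server, replace the `printf` with `curl -s http://127.0.0.1:8000/v1/models`.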