List models from an OpenAI-compatible endpoint (e.g. GET …/v1/models), choose five models and a task difficulty, then compare runs. Only the chat model name changes between episodes; prompts and environment settings are identical.
The default API root matches Ollama’s OpenAI-compatible surface (ollama.com/v1/models). For a local daemon, use http://127.0.0.1:11434/v1.
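As a rough illustration of the listing step, the sketch below fetches model ids from an OpenAI-compatible models endpoint using only the Python standard library. It assumes the usual OpenAI-style response shape (`{"data": [{"id": ...}, ...]}`). The function names `models_url`, `parse_model_ids`, and `list_models` are hypothetical helpers, not part of this project, and the optional bearer token is an assumption for hosted endpoints.

```python
import json
import urllib.request
from typing import List, Optional


def models_url(api_root: str) -> str:
    """Join an API root such as http://127.0.0.1:11434/v1 with the models path."""
    return api_root.rstrip("/") + "/models"


def parse_model_ids(payload: dict) -> List[str]:
    """Pull model ids out of an OpenAI-style list response.

    Assumed shape: {"object": "list", "data": [{"id": "..."}, ...]}.
    Entries without an "id" key are skipped.
    """
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_models(api_root: str, api_key: Optional[str] = None,
                timeout: float = 10.0) -> List[str]:
    """GET <api_root>/models and return the model ids (hypothetical helper)."""
    headers = {"Accept": "application/json"}
    if api_key:
        # Assumption: hosted endpoints accept a bearer token; a local daemon may not need one.
        headers["Authorization"] = f"Bearer {api_key}"
    request = urllib.request.Request(models_url(api_root), headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_model_ids(json.load(response))
```

For a local daemon, `list_models("http://127.0.0.1:11434/v1")` would return whatever ids that server reports; the five models to compare can then be picked from that list.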
| Model | Total reward | Steps | Error |
|---|---|---|---|
Per-episode reward sequence (same task + seed per model).
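One way a per-episode reward sequence could be folded into a row of the comparison table is sketched below. It assumes one reward per step, so the step count is the length of the sequence and the total reward is its sum; the `EpisodeResult` type and both helper functions are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EpisodeResult:
    """Hypothetical container for one model's episode."""
    model: str
    rewards: List[float]          # assumed: one reward per environment step
    error: Optional[str] = None   # error message if the run failed, else None


def to_markdown_row(result: EpisodeResult) -> str:
    """Format one episode as a table row: model, total reward, steps, error."""
    total = sum(result.rewards)
    steps = len(result.rewards)
    error = result.error or ""
    return f"| {result.model} | {total:.2f} | {steps} | {error} |"


def results_table(results: List[EpisodeResult]) -> str:
    """Build the full comparison table, one row per model."""
    lines = ["| Model | Total reward | Steps | Error |", "|---|---|---|---|"]
    lines += [to_markdown_row(r) for r in results]
    return "\n".join(lines)
```

Because every model runs the same task and seed, rows produced this way differ only through the model's behavior, which is what makes the totals comparable.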