Reuven Cohen put it plainly: the going rate for a single developer running Claude Code in swarm mode is ~$2,500/day. Not because the models are expensive — because the same context keeps getting resent, and because every task, regardless of complexity, gets routed to the heaviest model available.

Fixing a typo in an error message does not require Opus. Renaming a variable does not require Opus. Adding a log line does not require Opus. But if your tooling doesn't know that, it will happily bill you as if it does.

The routing table

Agent Booster's new route_model tool applies three signals to decide which model tier to recommend:

Signal	Model
Complexity keyword: refactor, architect, design, migrate, security…	Opus
Task touches 5+ distinct files	Opus
Task touches 2–4 files	Sonnet
1 file, fewer than 3 matching symbols	Haiku
Everything else	Sonnet (default)

How it works

When you call route_model(task), Booster first checks the task description for complexity keywords. If none match, it runs a vector search across your indexed codebase to count how many distinct files are semantically relevant. A task confined to one file with fewer than three matching symbols goes to Haiku. A task that spans five or more files goes to Opus. Anything in between falls to Sonnet.
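The decision order can be sketched in a few lines of Python. This is an illustrative reading of the routing table, not Agent Booster's actual implementation: the function name sketch_route, the Route type, and the substring keyword match are all assumptions, and the file and symbol counts are passed in directly rather than coming from a vector search.

```python
from dataclasses import dataclass

# Keywords copied from the routing table. The table's list ends in an
# ellipsis, so the real list is presumably longer than this one.
COMPLEXITY_KEYWORDS = ("refactor", "architect", "design", "migrate", "security")


@dataclass
class Route:
    model: str
    reason: str


def sketch_route(task: str, files_touched: int, matching_symbols: int) -> Route:
    """Apply the routing table top to bottom; the first matching row wins.

    Hypothetical helper: the counts are supplied by the caller here,
    whereas the post says Booster derives them from its code index.
    """
    lowered = task.lower()
    # Row 1: a complexity keyword sends the task straight to Opus.
    # Substring matching is an assumption of this sketch.
    for keyword in COMPLEXITY_KEYWORDS:
        if keyword in lowered:
            return Route("opus", f"complexity keyword: {keyword}")
    # Row 2: five or more distinct files.
    if files_touched >= 5:
        return Route("opus", f"broad task: {files_touched} files")
    # Row 3: two to four files.
    if 2 <= files_touched <= 4:
        return Route("sonnet", f"{files_touched} files")
    # Row 4: one file and fewer than three matching symbols.
    if files_touched == 1 and matching_symbols < 3:
        return Route("haiku", f"narrow task: {matching_symbols} symbol(s) in 1 file")
    # Row 5: everything else.
    return Route("sonnet", "default")
```

Checking keywords first means a task like "refactor auth middleware" goes to Opus even when it touches a single file, which matches the second example result shown in this section.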

The result is a JSON object the calling agent can act on immediately:

# narrow task
{ "model": "haiku", "reason": "narrow task — 1 symbol in 1 file" }

# architecture change
{ "model": "opus", "reason": "complexity keyword: refactor" }
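On the calling side, an agent might parse that object and fall back to the default tier when the result is missing or malformed. The sketch below assumes only the two fields shown above ("model" and "reason"). The helper name pick_tier and the fallback behavior are hypothetical, not part of Agent Booster.

```python
import json

# Tier names as they appear in the routing table.
KNOWN_TIERS = {"haiku", "sonnet", "opus"}
# The table's "everything else" row defaults to Sonnet, so this sketch
# reuses that as its fallback.
DEFAULT_TIER = "sonnet"


def pick_tier(raw: str) -> str:
    """Return the recommended tier from a route_model result string.

    Hypothetical helper: falls back to DEFAULT_TIER on invalid JSON,
    a non-object payload, or an unrecognized tier name.
    """
    try:
        result = json.loads(raw)
    except json.JSONDecodeError:
        return DEFAULT_TIER
    if not isinstance(result, dict):
        return DEFAULT_TIER
    model = result.get("model")
    if isinstance(model, str) and model in KNOWN_TIERS:
        return model
    return DEFAULT_TIER


print(pick_tier('{ "model": "opus", "reason": "complexity keyword: refactor" }'))
```

Falling back to the middle tier keeps a routing failure from silently sending every task to the most expensive model.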

Using it from the CLI

You don't need to be inside an MCP session to use it. The CLI command works anywhere in your project:

# returns the recommended tier before you start
booster route "fix typo in error message"
haiku (narrow task — 1 symbol in 1 file)

booster route "refactor auth middleware to support OAuth2"
opus (complexity keyword: refactor)

Where this fits in the stack

Agent Booster operates at three compounding layers:

Layer 3: AST + semantic routing (smart_read, search_context, get_symbols, route_model). Reduces what the model sees per call.
Layer 2: RTK. Strips noise from CLI, git, build, and test output before it enters the context window.
Layer 1: Prompt caching. Stable prefixes are cached at a 90% discount. Native to Claude Code and the Anthropic API.

route_model is the newest addition to Layer 3. It doesn't change what context is sent — it changes which model receives it. Combined with smart_read already cutting token volume 3–15x, routing to Haiku on narrow tasks compounds the savings further.
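To see why the two savings multiply rather than add, a rough back-of-the-envelope model helps: input cost is roughly tokens times price per token, so shrinking both factors multiplies the reductions. The sketch below uses the 3–15x token range from this post. The price ratio is a made-up placeholder, not a real figure, so substitute the ratio from current pricing. It also ignores output tokens and caching.

```python
def combined_cost_factor(token_reduction: float, price_ratio: float) -> float:
    """Return how many times cheaper input cost gets when both savings apply.

    token_reduction: factor by which input tokens shrink (3-15x per the post).
    price_ratio: heavy-tier price per token divided by light-tier price per
        token. The caller supplies this; the sketch does not know real prices.
    """
    if token_reduction <= 0 or price_ratio <= 0:
        raise ValueError("both factors must be positive")
    # cost = tokens * price, so the two reductions multiply.
    return token_reduction * price_ratio


# HYPOTHETICAL placeholder, not an actual price ratio between any two models.
PLACEHOLDER_PRICE_RATIO = 5.0

low = combined_cost_factor(3, PLACEHOLDER_PRICE_RATIO)
high = combined_cost_factor(15, PLACEHOLDER_PRICE_RATIO)
print(f"combined reduction: {low:.0f}x to {high:.0f}x")
```

The multiplication only applies to tasks that actually route to the lighter tier. Tasks that stay on Opus still get the token reduction alone.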

Get it

# install
pip install agent-booster

# index your codebase
booster index

# route a task
booster route "your task here"

The MCP tool is available automatically once you run booster init claude. No config changes needed — it shows up alongside get_symbols, search_context, and smart_read.

Stop paying Opus prices for Haiku work.
