Open Source Developer Tool

Every token you don't send is a token you don't pay for.

Swarm-style AI development costs up to $2,500/day — not because models are expensive, but because stuck context cycles keep resending the same architecture docs, source files, and conversation history on every call. Agent Booster breaks the cycle.

pip install agent-booster[full]

View on GitHub

Python 3.10+ · MIT licensed · v0.2.19 · runs 100% locally · no code leaves your machine

Claude Code

booster init claude → booster gain — less sent, less spent, same result.

OpenAI Codex

booster init codex → booster start → booster gain.

Inspiration

Standing on the shoulders of sharp thinkers.

Reuven Cohen

Agentic Engineer · Founder @ Cognitum.One

“The going rate for a single developer running Claude Code using a swarm style development is around $2.5k/day or $75k/month via Anthropic enterprise API.”
“The biggest cost in agentic development isn't the model. It's the constant replay of context. Most autonomous coding systems keep resending the same architecture documents, ADRs, source files, tool definitions, and conversation history over and over. That's where the money goes.”
“We're not optimizing models. We're optimizing information flow.”

Cut Claude Code Costs by 3x–15x — infographic by Reuven Cohen

Reuven's post articulated the problem precisely. Agent Booster is our open-source implementation of that insight — AST-level symbol routing, semantic vector search, and MCP integration built directly into your coding workflow. The concept of operating at the AST and semantic level rather than treating code as raw text comes directly from this framing.

The problem

The biggest cost isn't the model. It's context replay.

Most autonomous coding systems keep resending the same files over and over. Agent Booster operates at the AST and semantic level — routing only the pieces that actually matter to the task.

The old way

✕Architecture docs
✕ADRs
✕Source files
✕Tool definitions
✕Repo state
✕Conversation history

~$2,500/day. High token replay.

✦

The Booster way

✓AST nodes
✓Symbols
✓Dependencies
✓Structural diffs
✓Semantic context
✓Task-specific info only

3–15x lower cost. Same output quality.

How it works

Five layers of savings, applied automatically.

◈

Step 1

Parse & Understand

tree-sitter extracts symbols, functions, dependencies

→

≋

Step 2

Semantic Diff

Detect what changed and what's relevant

→

◎

Step 3

Symbol Retrieval

Pull only matching symbols from the index

→

⊙

Step 4

Prompt Caching

Stable context cached at 90% discount

→

⟁

Step 5

Smart Model Routing

route_model picks haiku, sonnet, or opus — ~4x savings on routine tasks

How we compare

Agent Booster vs CocoIndex

CocoIndex is a well-designed incremental indexing framework — great if you're building a custom pipeline. Agent Booster is the opposite: every layer ships pre-built, and you never write pipeline code.

◈

Agent Booster

Plug-and-play · MCP-native

⊛

CocoIndex

Framework · BYO pipeline

Purpose

✓

Plug-and-play token savings for AI coding agents

–

General-purpose code indexing ETL framework

MCP integration

✓

Built-in — 4 MCP tools, works in Claude Code / Cursor / Windsurf / Codex out of the box

–

None — you wire your own transport layer

Setup

✓

pip install + booster start — one command, fully bootstrapped

–

Write a pipeline in Python — chunking, embedding, storage are your job

Delta indexing

✓

Built-in — SHA-256 hash + mtime, skips unchanged files automatically

–

Core feature — incremental pipeline execution is CocoIndex's main idea

Asymmetric embeddings

✓

Built-in — passage: / query: E5 prefixes applied automatically

–

BYO — you configure the embedding function

Background daemon

✓

Built-in — Unix socket, keeps model warm, ~50ms search vs 2–3s cold start

–

None — no daemon concept

File watcher

✓

Built-in — watchdog, 2s debounce, auto re-index on every file save

–

Incremental pipelines (manual trigger or CI)

Model routing

✓

Built-in — route_model picks haiku/sonnet/opus by task complexity

–

Not in scope

Token tracking

✓

Built-in — booster gain reports savings per file, per session

–

Not in scope

Target user

✓

Developer who wants savings now, zero custom code

–

Engineer building a custom indexing pipeline or RAG system

CocoIndex is OSS and worth a look if you need a custom pipeline: cocoindex-io/cocoindex-code

Architecture

Three compounding layers of token savings.

What we add

Layer 3

Agent Booster

AST + semantic routing, smart file reads — routes only the relevant symbols and functions to the model instead of full files.

Layer 2

RTK — Rust Token Killer

Token compression on tool output — strips noise from CLI, git, build, and test output before it reaches the context window.

Layer 1

Prompt caching

Stable context reuse — native to Claude Code and the Anthropic API. Repeated stable prefixes are cached at a 90% discount.

Quickstart

Up and running in two commands.

Step 1 — Install

# includes embeddings + file watcher

pip install agent-booster[full]

Step 2 — Start

# detects Claude/Cursor/Codex, wires hooks, indexes, starts daemon

booster start

Detects which AI tools are present (Claude Code, Cursor, Windsurf, Codex), wires each one automatically, indexes the project, and starts a background daemon that keeps the model warm and auto-re-indexes on every file save. Fully reversible with booster remove claude.

That's it — then track savings

booster gain

What's new

v0.2.16 – v0.2.18

Daemon, delta indexing, and asymmetric embeddings.

Three releases shipped together. The result: booster start is the only command you need, search is instant after the first run, and re-indexing costs nothing on unchanged files.

⚡v0.2.17

Background daemon

booster start launches a persistent Unix socket process that keeps the embedding model loaded. search_context drops from 2–3 s cold-start to ~50 ms. Daemon survives editor restarts — it's not tied to any terminal.

◎v0.2.17

File watcher

watchdog monitors the project for writes. Changed files are re-indexed within 2 seconds of a save — no manual booster index during a coding session. Daemon handles this automatically.

≋v0.2.16

Delta indexing

SHA-256 hash and mtime stored per file in the SQLite index. Full re-index skips unchanged files entirely. Large repos that took seconds now finish in milliseconds. Use --force to override.

◈v0.2.16

Asymmetric embeddings

Index-time vectors use a passage: prefix; query-time vectors use query:. Follows the E5 paper's asymmetric retrieval approach. Retrieval accuracy improves meaningfully over symmetric embeddings, especially for short function names.

✦v0.2.18

booster start does everything

One command bootstraps the full stack: detects installed AI tools (Claude Code, Cursor, Windsurf, Codex), wires each one that isn't already wired, indexes the project, builds embeddings, and starts the daemon. On subsequent runs it just wakes the daemon.

Full changelog

Every commit, diff, and release note lives in the GitHub repo. PRs welcome.

View commits →

Claude Code Plugin

One command to install everything.

Agent Booster is available as an official Claude Code plugin. One slash command installs the MCP server, wires the context skill, and you're live.

Install via Claude Code

/plugin marketplace add sseshachala/conductai

◈

MCP server

Wires booster serve as an MCP server — smart_read, search_context, get_symbols, route_model available immediately.

✦

Context skill

Adds the booster-context skill so Claude prefers smart_read and search_context over native Read/Grep.

⊙

Zero config

No manual .mcp.json edits. Works in every Claude Code session opened in this directory.

Then start booster in your project

# indexes, embeds, and starts the daemon automatically

booster start

Detects which AI tools are installed, wires each one, runs booster index and booster embed if the index is missing, then starts the background daemon. Nothing to run manually.

Plugin is pending review at the Anthropic plugin directory. Until then, install directly from GitHub.

Example use cases

See the difference on real tasks.

These are actual patterns from everyday coding sessions — not synthetic benchmarks.

Fixing a bug in a large service file

You need to fix a validation bug in one function inside a 1,800-line API router.

97%

token savings

Without Booster

✕Claude reads the full 1,800-line file
✕~450 tokens just to locate the function
✕Full file re-sent on every follow-up turn

~450 tokens per read

With Booster

✓smart_read(file, "fix validation in create_order") returns 42 lines
✓Only the matching function + its direct dependencies
✓Same slice reused across follow-up turns via prompt cache

~12 tokens per read

MCP Tools

Four tools. Massive context savings.

Agent Booster exposes four MCP tools — targeted symbol lookups, semantic search, smart file reads, and automatic model routing. The model sees exactly what it needs and nothing else.

get_symbols(file)

Returns all functions and classes for a file from the booster index — no file read required.

search_context(task)

RRF-fused search across the full codebase — merges vector similarity and keyword ranks using Reciprocal Rank Fusion so strong keyword matches surface even when embeddings are weak. Falls back to keyword-only if embeddings aren't built.

smart_read(file, task)

Returns only the relevant AST symbol slices for a task using RRF-ranked selection. Applies a 5 KB gate — if matched symbols exceed 5 KB, trims to the top-3 ranked symbols with a truncation notice. Logs token savings to booster gain.

route_model(task, files?)

Recommends haiku, sonnet, or opus based on task complexity — keyword signals, file count, and symbol count. Saves ~4x on routine tasks by skipping unnecessary Opus calls.

CLI Reference

Every command, and when to use it.

Most of the time you only need two: start (does everything) and gain (shows savings).

booster start

Once per project — the only command you need

Bootstraps everything: detects installed AI tools, wires each one, indexes the project, and starts the background daemon. On subsequent runs, just wakes up the daemon.

booster start --foreground to run daemon in terminal

booster stop

When you're done or want to free memory

Sends SIGTERM to the daemon and waits for clean shutdown.

booster status

To check what's running

Shows daemon pid, uptime, model name, and file watcher state.

booster init <platform>

Manual wiring for a specific tool (optional — booster start does this automatically)

Writes MCP config, rules file, and hooks for claude, cursor, windsurf, or codex.

booster init claude --yes to skip prompt

booster remove <platform>

When you want to uninstall

Cleanly removes everything init wrote — MCP entry, rules block, hook script. No residue.

booster index

Manual re-index after a large refactor (daemon handles this automatically on file save)

Parses .py / .ts / .tsx / .js / .jsx files with tree-sitter. Skips unchanged files (delta indexing). Use --force to re-index everything.

booster index --force to bypass delta cache

booster embed

After manual booster index

Rebuilds sentence-transformer vector embeddings for all symbols. The daemon handles this automatically after file-save re-indexes.

booster route "<task>"

Before starting a non-trivial task

Recommends haiku, sonnet, or opus based on task complexity — keyword signals, file count, symbol count.

booster gain

Any time, to see ROI

Reports total smart_read calls, tokens served vs. saved, savings rate, and top files.

booster serve

Automatic — you rarely run this directly

Starts the MCP stdio server. Called automatically by Claude Code / Cursor / Windsurf / Codex.

Compatibility

Works with every major AI coding tool.

◈

Claude Code

booster init claude

⊙

Cursor

booster init cursor

◭

Windsurf

booster init windsurf

◎

OpenAI Codex

booster init codex

Each command shows exactly what files will change and asks for confirmation before writing anything. Run booster remove <platform> to cleanly undo.

Under the hood

How it's actually built.

Also by Conduct

More free tools for AI coding

Free · MIT

Claude Code Team Kit

Claude Code scaffold for any team

⬡

Production-ready Claude Code setup across all 5 layers — CLAUDE.md, skills, hooks, subagents, and plugins — pre-configured for Enterprise, SMB, and Startup personas.

View on GitHubbash install.sh

Free · MIT

RTK — Rust Token Killer

CLI output compression, 60–90% savings

≋

A transparent CLI proxy that strips token noise from git, build, test, and package manager output before it reaches the model. Just prefix any command with rtk.

Read the post →

All tools are free, MIT licensed, and live on GitHub. See the full list at /tools.

We're not optimizing models.
We're optimizing information flow.

A workflow that costs $2,500/day with brute-force context replay can often be reduced by several multiples — while maintaining comparable output quality. Every token you don't send is a token you don't pay for.

pip install agent-booster[full]

View on GitHub

Embeddings, file watcher, and daemon all included in the [full] extra.

Every token you don't send is a token you don't pay for.

Standing on the shoulders of sharp thinkers.

The biggest cost isn't the model. It's context replay.

The old way

The Booster way

Five layers of savings, applied automatically.

Agent Booster vs CocoIndex

Three compounding layers of token savings.

Up and running in two commands.

Daemon, delta indexing, and asymmetric embeddings.

One command to install everything.

See the difference on real tasks.

Four tools. Massive context savings.

Every command, and when to use it.

Works with every major AI coding tool.

How it's actually built.

More free tools for AI coding

Claude Code Team Kit

RTK — Rust Token Killer

We're not optimizing models.We're optimizing information flow.

We're not optimizing models.
We're optimizing information flow.