Python · MIT · Open Source

Stop paying for tokens
your model ignores.

ContextPilot intercepts your LLM calls, compresses redundant context through a quality-gated pipeline, and forwards the optimal payload — without touching your code.

$ pip install contextpilot-ai
View on GitHub · See how it works
30–70%
cost reduction
<50ms
at 100K tokens
4
integration surfaces
0
prompt changes
§01 How it works
01
Intercept
Wraps your existing API client transparently
02
Analyze
Scores each block: staleness, redundancy, relevance, density
03
Compress
History summary · system dedup · RAG pruning · structural strip
04
Gate
Quality score <85 → original payload forwarded, nothing breaks
05
Send
Compressed payload forwarded to your provider
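The Analyze and Compress steps above can be pictured with a minimal sketch. All names, signals, and the scoring rule here are illustrative assumptions, not the actual ContextPilot internals:

```python
from dataclasses import dataclass

# Illustrative per-block scoring on the four signals named above.
# Weights and field names are a sketch, not the shipped algorithm.
@dataclass
class Block:
    staleness: float    # 0-1: how outdated the block is
    redundancy: float   # 0-1: overlap with other blocks
    relevance: float    # 0-1: similarity to the current turn
    density: float      # 0-1: information carried per token

    def score(self) -> float:
        # High relevance and density raise the score;
        # staleness and redundancy lower it.
        return self.relevance + self.density - self.staleness - self.redundancy

def prune(blocks: list[Block]) -> list[Block]:
    """Keep only blocks whose net score is positive."""
    return [b for b in blocks if b.score() > 0]
```

A fresh, relevant block survives pruning; a stale duplicate is dropped before the payload is rebuilt.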
Before · 24,800 tokens
# system prompt — repeated every turn
# conversation history (turns 1–12)
# RAG chunks (12 retrieved, 4 relevant)
After ContextPilot · 9,100 tokens
# system prompt — deduped, sent once
# compressed history summary
# RAG chunks (4 relevant only)
# current turn — unchanged
↓ 63% reduction · quality score 94/100
§02 Integration surfaces
A · Library
Python SDK wrapper
Drop-in for OpenAI, Anthropic, and Google Vertex AI. One call wraps your existing client. Zero prompt changes required.
import contextpilot_ai
import anthropic
 
client = anthropic.Anthropic()
client = contextpilot_ai.wrap(client)
# existing calls unchanged
B · Proxy
Local proxy server
For AI coding tools — Claude Code, Aider, Cursor. Set one env var and ContextPilot intercepts transparently. OpenAI-compatible.
# start proxy
$ contextpilot proxy --port 8432
 
# point your tool at it
$ export ANTHROPIC_BASE_URL=\
http://localhost:8432
C · MCP
MCP server
Native MCP tools inside Claude Desktop and Claude Code. Exposes optimize_context, get_savings, and suggest_config.
# start MCP server
$ contextpilot mcp
 
# claude.json
{
  "contextpilot": { "command": "contextpilot mcp" }
}
D · CLI
AST migration agent
For existing codebases with 50+ LLM calls. Scans, wraps, and patches automatically. Dry-run before applying.
# preview changes
$ contextpilot migrate ./src/ --dry-run
 
# apply when satisfied
$ contextpilot migrate ./src/ --apply
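To illustrate what an AST scan can do, here is a hypothetical sketch using Python's `ast` module. The `find_wrap_sites` helper and the `WRAPPABLE` set are assumptions for illustration; the internals of `contextpilot migrate` are not shown here:

```python
import ast

# Hypothetical: constructor names a migration pass might look for.
WRAPPABLE = {"Anthropic", "OpenAI"}

def find_wrap_sites(source: str) -> list[int]:
    """Return line numbers of client constructions a migration could patch."""
    tree = ast.parse(source)
    sites = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            fn = node.func
            # Handle both `anthropic.Anthropic()` and bare `Anthropic()`.
            name = fn.attr if isinstance(fn, ast.Attribute) else getattr(fn, "id", None)
            if name in WRAPPABLE:
                sites.append(node.lineno)
    return sites

code = "import anthropic\nclient = anthropic.Anthropic()\n"
print(find_wrap_sites(code))  # → [2]
```

Working on the syntax tree rather than raw text is what makes a safe `--dry-run` possible: every candidate edit is located before anything is rewritten.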
§03 Engineering principles
ZERO-TRUST PAYLOAD
Your prompts never leave.
Telemetry sends only numeric metadata — token counts, latency, scores. Never prompt text, response content, or PII.
FAIL-SAFE BY DEFAULT
Nothing breaks. Ever.
If compression fails or quality score drops below 85, the original payload is forwarded unmodified. No exceptions.
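The fail-safe contract can be sketched in a few lines. This is an illustrative shape, not the shipped implementation; `compress_fn` and `score_fn` are hypothetical stand-ins:

```python
def safe_compress(payload: dict, compress_fn, score_fn, floor: float = 85.0) -> dict:
    """Fail-safe wrapper: any exception or a quality score below the
    floor means the original payload goes out unmodified."""
    try:
        compressed = compress_fn(payload)
        if score_fn(compressed) >= floor:
            return compressed
    except Exception:
        pass  # swallow compression errors; never block the request
    return payload
```

The guard covers both failure modes at once: a raised exception and a sub-floor score each fall through to the untouched original.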
PROVIDER-AGNOSTIC
Works with your stack.
OpenAI, Anthropic, Google Vertex AI, Portkey, Helicone. ContextPilot is middleware — not a router, not a lock-in.
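One way to picture the middleware position (an illustrative sketch, not the ContextPilot API): the optimizer wraps any provider's send function without knowing which provider it is, so the same compression applies to every backend.

```python
from typing import Any, Callable

def make_middleware(optimize: Callable[[dict], dict]):
    """Build a wrapper that applies `optimize` before any provider call."""
    def wrap(send: Callable[[dict], Any]) -> Callable[[dict], Any]:
        def wrapped(payload: dict) -> Any:
            return send(optimize(payload))
        return wrapped
    return wrap

# Hypothetical send callables for two providers; the wrapper treats
# them identically — that is the "middleware, not router" property.
wrap = make_middleware(lambda p: {**p, "optimized": True})
openai_send = wrap(lambda p: ("openai", p))
anthropic_send = wrap(lambda p: ("anthropic", p))
```

Because the wrapper only touches the payload on its way through, there is nothing provider-specific to lock into.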
§04 Dashboard · coming soon

Get early access.

The ContextPilot dashboard shows real-time token savings, before/after comparisons, cost projections by model, and A/B shadow test results. One email when it's ready.


No spam. Unsubscribe anytime.