Python · MIT · Open Source

Stop paying for tokens
your model ignores.

ContextPilot intercepts your LLM calls, compresses redundant context through a quality-gated pipeline, and forwards the optimal payload — without touching your code.

$ pip install contextpilot-ai
View on GitHub · See how it works
30–70%
cost reduction
<50ms
at 100K tokens
4
integration surfaces
0
prompt changes
§01 How it works
01
Intercept
Wraps your existing API client transparently
02
Analyze
Scores each block: staleness, redundancy, relevance, density
03
Compress
History summary · system dedup · RAG pruning · structural strip
04
Gate
Quality score <85 → original payload forwarded, nothing breaks
05
Send
Compressed payload forwarded to your provider
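The Analyze and Compress steps above can be pictured with a minimal sketch. All names, signals, and the scoring rule here are illustrative assumptions, not the actual ContextPilot internals:

```python
from dataclasses import dataclass

# Illustrative per-block scoring on the four signals named above.
# Weights and field names are a sketch, not the shipped algorithm.
@dataclass
class Block:
    staleness: float    # 0-1: how outdated the block is
    redundancy: float   # 0-1: overlap with other blocks
    relevance: float    # 0-1: similarity to the current turn
    density: float      # 0-1: information carried per token

    def score(self) -> float:
        # High relevance and density raise the score;
        # staleness and redundancy lower it.
        return self.relevance + self.density - self.staleness - self.redundancy

def prune(blocks: list[Block]) -> list[Block]:
    """Keep only blocks whose net score is positive."""
    return [b for b in blocks if b.score() > 0]
```

A fresh, relevant block survives pruning; a stale duplicate is dropped before the payload is rebuilt.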
Before · 24,800 tokens
# system prompt — repeated every turn
# conversation history (turns 1–12)
# RAG chunks (12 retrieved, 4 relevant)
After ContextPilot · 9,100 tokens
# system prompt — deduped, sent once
# compressed history summary
# RAG chunks (4 relevant only)
# current turn — unchanged
↓ 63% reduction · quality score 94/100
§02 Integration surfaces
A · Library
Python SDK wrapper
Drop-in for OpenAI, Anthropic, and Google Vertex AI. One call wraps your existing client. Zero prompt changes required.
import contextpilot_ai
import anthropic
 
client = anthropic.Anthropic()
client = contextpilot_ai.wrap(client)
# existing calls unchanged
B · Proxy
Local proxy server
For AI coding tools — Claude Code, Aider, Cursor. Set one env var and ContextPilot intercepts transparently. OpenAI-compatible.
# start proxy
$ contextpilot proxy --port 8432
 
# point your tool at it
$ export ANTHROPIC_BASE_URL=\
http://localhost:8432
C · MCP
MCP server
Native MCP tools inside Claude Desktop and Claude Code. Exposes optimize_context, get_savings, and suggest_config.
# start MCP server
$ contextpilot mcp
 
# claude.json
{
  "contextpilot": { "command": "contextpilot mcp" }
}
D · CLI
AST migration agent
For existing codebases with 50+ LLM calls. Scans, wraps, and patches automatically. Dry-run before applying.
# preview changes
$ contextpilot migrate ./src/ --dry-run
 
# apply when satisfied
$ contextpilot migrate ./src/ --apply
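To illustrate what an AST scan can do, here is a hypothetical sketch using Python's `ast` module. The `find_wrap_sites` helper and the `WRAPPABLE` set are assumptions for illustration; the internals of `contextpilot migrate` are not shown here:

```python
import ast

# Hypothetical: constructor names a migration pass might look for.
WRAPPABLE = {"Anthropic", "OpenAI"}

def find_wrap_sites(source: str) -> list[int]:
    """Return line numbers of client constructions a migration could patch."""
    tree = ast.parse(source)
    sites = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            fn = node.func
            # Handle both `anthropic.Anthropic()` and bare `Anthropic()`.
            name = fn.attr if isinstance(fn, ast.Attribute) else getattr(fn, "id", None)
            if name in WRAPPABLE:
                sites.append(node.lineno)
    return sites

code = "import anthropic\nclient = anthropic.Anthropic()\n"
print(find_wrap_sites(code))  # → [2]
```

Working on the syntax tree rather than raw text is what makes a safe `--dry-run` possible: every candidate edit is located before anything is rewritten.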
§03 Engineering principles
ZERO-TRUST PAYLOAD
Your prompts never leave.
Telemetry sends only numeric metadata — token counts, latency, scores. Never prompt text, response content, or PII.
FAIL-SAFE BY DEFAULT
Nothing breaks. Ever.
If compression fails or quality score drops below 85, the original payload is forwarded unmodified. No exceptions.
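The fail-safe contract can be sketched in a few lines. This is an illustrative shape, not the shipped implementation; `compress_fn` and `score_fn` are hypothetical stand-ins:

```python
def safe_compress(payload: dict, compress_fn, score_fn, floor: float = 85.0) -> dict:
    """Fail-safe wrapper: any exception or a quality score below the
    floor means the original payload goes out unmodified."""
    try:
        compressed = compress_fn(payload)
        if score_fn(compressed) >= floor:
            return compressed
    except Exception:
        pass  # swallow compression errors; never block the request
    return payload
```

The guard covers both failure modes at once: a raised exception and a sub-floor score each fall through to the untouched original.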
PROVIDER-AGNOSTIC
Works with your stack.
OpenAI, Anthropic, Google Vertex AI, Portkey, Helicone. ContextPilot is middleware — not a router, not a lock-in.
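One way to picture the middleware position (an illustrative sketch, not the ContextPilot API): the optimizer wraps any provider's send function without knowing which provider it is, so the same compression applies to every backend.

```python
from typing import Any, Callable

def make_middleware(optimize: Callable[[dict], dict]):
    """Build a wrapper that applies `optimize` before any provider call."""
    def wrap(send: Callable[[dict], Any]) -> Callable[[dict], Any]:
        def wrapped(payload: dict) -> Any:
            return send(optimize(payload))
        return wrapped
    return wrap

# Hypothetical send callables for two providers; the wrapper treats
# them identically — that is the "middleware, not router" property.
wrap = make_middleware(lambda p: {**p, "optimized": True})
openai_send = wrap(lambda p: ("openai", p))
anthropic_send = wrap(lambda p: ("anthropic", p))
```

Because the wrapper only touches the payload on its way through, there is nothing provider-specific to lock into.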
§04 Dashboard · coming soon

Get early access.

The ContextPilot dashboard shows real-time token savings, before/after comparisons, cost projections by model, and A/B shadow test results. One email when it's ready.


No spam. Unsubscribe anytime.