Intelligent routing across 70+ models
Six routing lanes pick the right model for each task — premium reasoning for hard specs, code-specialized models for bulk work, Haiku-class models for codebase queries.
A drop-in replacement for the Claude Code and ChatGPT API keys. Built on intelligent routing across Claude, OpenAI, DeepSeek, Kimi, Qwen, Nova, and more — all running on AWS Bedrock inside your account.
What it does
Six routing lanes pick the right model for each task — premium reasoning for hard specs, code-specialized models for bulk work, Haiku-class models for codebase queries.
Prompt caching, idempotency, fallback chains, and right-sized routing compound. Each request shows its lane, model, and dollar cost in the response header.
Issue API keys with monthly budgets, rate limits, and model allowlists. A global $50/month hard cap auto-disables Bedrock at the AWS layer — no surprise bills.
Runs entirely on your AWS account. No third-party SaaS in the data path. AES-256 at rest. Cancel anything, anytime — your conversations, code, and metrics never leave.
Use the openai SDK, LangChain, Cursor, Continue.dev, OpenCode — anything that speaks the OpenAI API. Change one base_url and you're done.
Model used, tokens in & out, cache hits, fallback triggers, dollar cost — returned as response headers and visible in the per-user dashboard. No mystery spend.
Access is granted through your account. No loose keys or passwords lying around. Every request is routed, optimized, and accounted for — under one roof.