Set a dollar limit per key, session, or user. Requests get rejected before they reach the provider.
# wrap any command
$ npx @f3d1/llmkit-cli -- python agent.py
... agent runs normally ...
claude-sonnet-4 $0.0847 1,204 in / 380 out cache saved $0.31
gpt-4.1-mini $0.0182 890 in / 241 out
Session total: $4.12 / $50.00 budget 14 reqs in 38s
11 tools for cost tracking inside your IDE. 5 work locally by reading Claude Code, Cursor, and Cline session data. No account needed.
Learn more ->
Budget enforcement that actually blocks requests. The reservation pattern: estimate cost before the request, reject if it would exceed the limit, settle with the actual cost after. Per-key and per-session limits.
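The reservation flow above can be sketched in a few lines. This is an illustrative Python sketch, not llmkit's actual code; the `Budget`, `reserve`, and `settle` names are assumptions:

```python
from dataclasses import dataclass

class BudgetExceeded(Exception):
    pass

@dataclass
class Budget:
    limit: float          # dollar cap for this key/session/user
    reserved: float = 0.0 # estimates for in-flight requests
    spent: float = 0.0    # settled actual cost

def reserve(budget: Budget, estimate: float) -> None:
    # Reject before the request ever reaches the provider.
    if budget.spent + budget.reserved + estimate > budget.limit:
        raise BudgetExceeded(f"would exceed ${budget.limit:.2f} limit")
    budget.reserved += estimate

def settle(budget: Budget, estimate: float, actual: float) -> None:
    # Release the reservation and record what the request really cost.
    budget.reserved -= estimate
    budget.spent += actual
```

Usage follows the three steps in order: `reserve` before calling the provider, make the call, then `settle` with the actual token cost. Counting in-flight reservations against the limit is what makes "exceeded means rejected" hold even with concurrent requests.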
Get started ->
Spend by model, provider, and session. Request log with full cost breakdown. API key management, budget configuration, anomaly detection.
Try it free ->
11 providers. 730+ models priced. Cache-aware pricing that tracks read and write tokens separately.
Anthropic: 29 models
OpenAI: 145 models
Google Gemini: 50 models
xAI Grok: 39 models
DeepSeek: 6 models
Groq: 37 models
Mistral: 63 models
Together: 105 models
Fireworks: 257 models
Ollama: local
OpenRouter: meta-gateway
Cost is reserved before the request is sent. Exceeding the budget means the request is rejected, not merely logged after the fact.
Prompt caching makes tokens up to 90% cheaper. We track cached and uncached separately.
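Cache-aware pricing means the four token classes bill at different rates. A minimal sketch of the arithmetic, assuming Anthropic's published multipliers (cache reads at 0.1x the input rate, cache writes at 1.25x); the function name and dict keys are illustrative:

```python
def request_cost(tokens: dict, price_in: float, price_out: float) -> float:
    """Dollar cost of one request; prices are per million tokens.

    Assumes cache reads bill at 0.1x the input rate and cache writes
    at 1.25x -- the multipliers Anthropic publishes for prompt caching.
    """
    per_tok_in = price_in / 1_000_000
    per_tok_out = price_out / 1_000_000
    return (
        tokens.get("input", 0) * per_tok_in
        + tokens.get("cache_read", 0) * per_tok_in * 0.10   # 90% cheaper
        + tokens.get("cache_write", 0) * per_tok_in * 1.25  # 25% premium
        + tokens.get("output", 0) * per_tok_out
    )
```

With a 10,000-token cached prompt at $3/M input, the cached read costs $0.003 instead of $0.03, which is why tracking cached and uncached tokens separately matters for the "cache saved" line in the session summary.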
MIT licensed. Self-host on Cloudflare Workers free tier. Your keys stay in your infra.
Free while in beta. No credit card.
MIT licensed. Built with Claude Code. Source on GitHub.