My AI tools were eating 23,000 tokens before doing any actual work.

MCP servers load their full schema on every prompt. 1,000-10,000 tokens per server, and you’re paying before the model even starts thinking.

I refactored my setup from monolithic MCP servers to Progressive Skills — a framework where only metadata loads upfront (50 tokens), and full documentation loads on demand.

Results:

Notion: 9,600 → 600 tokens (94% reduction)
Telegram: 3,500 → 400 tokens (89% reduction)
Total: 23,000 → 6,900 tokens (70% saved)

The approach: each skill is a small SKILL.md entry point (15-25 lines) with scripts/ and references/ loaded only when needed. One skill, all tools — works across Claude Code, Codex, and Cursor via symlinks.

If you’re running multiple MCP servers and wondering why your context window feels cramped, this might be why.

All posts

Share on LinkedIn Share on Telegram