Tool Definition Compression

A reference to Cloudflare's Code Mode approach, which compresses tool definitions on the input side and complements context-mode's output-side compression.

Managing the volume of information in the Model Context Protocol (MCP) requires a two-pronged approach that distinguishes between "input side" tool definitions and "output side" tool results. Cloudflare's Code Mode tackles the initial bloat of loading dozens of complex tool schemas, while context-mode sandboxes the large token dumps that occur when tools return raw payloads such as git logs or browser snapshots. Most users do not hand-curate their server list per task: installing five or six MCP servers can load about 80 tools by default, so compressing those definitions helps avoid burning tokens before a task even begins. Even with a focused toolset, a single Playwright snapshot or git log can consume around 50k tokens, which is why output-side handling matters as well. Combining schema compression with summarized tool outputs lets developers keep a broad toolkit without overwhelming the model's context window.


Small suggestion: Link to the Cloudflare Code Mode post[0] in the blog post where you mention it. It's linked in the README, but when I saw it in the blog post, I had to Google it.

[0] https://blog.cloudflare.com/code-mode-mcp/


Right, context-mode doesn't change how MCP tool definitions get loaded into context. That's the "input side" problem that Cloudflare's Code Mode tackles by compressing tool schemas. Context-mode handles the "output side," the data that comes back from tool calls. That said, if you're writing your own MCPs, you could apply the same pattern directly. Instead of returning raw payloads, have your MCP server return a compact summary and store the full output somewhere queryable. Context-mode just generalizes that so you don't have to rebuild it per server.
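The "compact summary plus queryable store" pattern described in that comment could look roughly like the following. This is only a minimal sketch, not context-mode's actual implementation: the names `summarize_output` and `query_output`, the in-memory `_STORE`, and the five-line preview are all hypothetical choices made for illustration, and a real MCP server would likely persist the data (for example in SQLite) and expose both functions as tools.

```python
import hashlib

# Hypothetical in-memory store keyed by a short handle.
# Assumption: a real server would persist this somewhere durable.
_STORE: dict[str, list[str]] = {}


def summarize_output(raw: str, preview_lines: int = 5) -> dict:
    """Store the full payload and return only a compact summary of it."""
    lines = raw.splitlines()
    # Derive a short, stable handle from the payload contents.
    handle = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]
    _STORE[handle] = lines
    return {
        "handle": handle,
        "total_lines": len(lines),
        "total_chars": len(raw),
        "preview": lines[:preview_lines],
    }


def query_output(handle: str, keyword: str, limit: int = 20) -> list[str]:
    """Return up to `limit` stored lines containing `keyword` (case-insensitive)."""
    needle = keyword.lower()
    matches = [line for line in _STORE.get(handle, []) if needle in line.lower()]
    return matches[:limit]
```

With this shape, a tool that would otherwise return a full git log hands the model only the summary dict, and the model calls `query_output` with the returned handle when it needs specific lines.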


That's a fair point and honestly the ideal approach. But in practice most people don't hand-curate their MCP server list per task. They install 5-6 servers and suddenly have 80 tools loaded by default. Context-mode doesn't solve the tool definition bloat; that's the input-side problem. It handles the output side, when those tools actually run and dump data back. Even with a focused set of tools, a single Playwright snapshot or git log can burn 50k tokens. That's what gets sandboxed.
