MCP doesn't need tools, it needs code

lucumr.pocoo.org

226 points by the_mitsuhiko 3 days ago


sam0x17 - 3 days ago

Yeah I quite agree with this take. I don't understand why editors aren't utilizing language servers more for making changes. Crazy to see agents running grep and sed and awk and stuff, all of that should be provided through a very efficient cursor-based interface by the editor itself.

And for most languages, they shouldn't even be operating on strings, they should be operating on token streams and ASTs.

jumploops - 3 days ago

The promise of MCP is that it “connects your models with the world”[0].

In my experience, it’s actually quite the opposite.

By giving an LLM a set of tools, 30 in the Playwright case from the article, you’re essentially restricting what it can do.

In this sense, MCP is more of a guardrail/sandbox for an LLM, rather than a superpower (you must choose one of these Stripe commands!).

This is good for some cases, where you want your “agent”[1] to have exactly some subset of tools, similar to a line worker or specialist.

However it’s not so great when you’re using the LLM as a companion/pair programmer for some task, where you want its output to be truly unbounded.

[0]https://modelcontextprotocol.io/docs/getting-started/intro

[1]For these cases you probably shouldn’t use MCP, but instead define tools explicitly within one context.

juanviera23 - 3 days ago

I agree MCP has these flaws, idk why we need MCP servers when LLMs can just connect to the existing API endpoint

Started working on an alternative protocol, which lets agents call native endpoints directly (HTTP/CLI/WebSocket) via “manuals” and “providers,” instead of spinning up a bespoke wrapper server: https://github.com/universal-tool-calling-protocol/python-ut...

even connects to MCP servers

if you take a look, would love your thoughts

yxhuvud - 3 days ago

First rule of writing about something that can be abbreviated: first give some explanation so people have an idea of what you are talking about. Either type out what the abbreviation stands for, add an explanation, or at least link to some other page that explains what is going on.

EDIT: This has since been fixed in the link, so this comment is outdated.

xavierx - 3 days ago

Is this just code injection?

It’s talking about passing Python code in that would have a Python interpreter tool.

Even if you had guardrails set up, that seems a little chancy, but hey, this is the time of development evolution where we’re letting AI write code anyway, so why not give other people remote code execution access, because fuck it all.

scosman - 3 days ago

I made an MCP server that tries to address some of these (undocumented, security, discoverability, platform specific). You write a yaml describing your tools (lint/format/test/build), and it exposes them to agents via MCP. Kinda like package.json scripts but for agents. Speeds things up too: fewer incorrect commands, no human approval needed, and parallel execution.

https://github.com/scosman/hooks_mcp

The interactive lldb session here is super cool for deeper debugging. For security, containers seem like the solution - sketch.dev is my fav take on containerizing agents at the moment.

preek - 3 days ago

Re Security: I put my AI assistant in a sandbox. There, it can do whatever it wants, including deleting or mutating anything that would otherwise be harmful.

I wrote about how to do it with Guix: https://200ok.ch/posts/2025-05-23_sandboxing_ai_tools:_how_g...

Since then, I have switched to using Bubblewrap: https://github.com/munen/dotfiles/blob/master/bin/bin/bubble...

xmorse - 3 days ago

This is how tools are implemented in the latest Gemini models like gemini-2.5-flash-preview-native-audio-dialog: the LLM has access to a code execution tool that can run Python code, and all tools are available in a default_api class.

CharlieDigital - 3 days ago

A few weeks back, I actually started working on an MCP server that is designed to let the LLM generate and execute JavaScript in a safe, sandboxed C# runtime with Jint as the interpreter.

https://github.com/CharlieDigital/runjs

Lets the LLM safely generate and execute whatever code it needs. Bounded by statement count, memory limits, and runtime limits.

It has a built-in secrets manager API (so generated code can make use of remote APIs), an HTTP fetch analogue, JSONPath for JSON handling, and Polly for HTTP request resiliency.
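
To illustrate the "bounded by statement count and runtime limits" idea, here is a rough Python sketch. It is not how runjs or Jint implement their limits; `run_bounded`, `BudgetExceeded`, and the default budgets are hypothetical names and values chosen for illustration, and memory limits are not covered. It also is not a sandbox on its own, only a budget check.

```python
import sys
import time


class BudgetExceeded(Exception):
    """Raised when generated code uses up its line or time budget."""


def run_bounded(source: str, max_lines: int = 10_000, max_seconds: float = 1.0):
    """Run source with a cap on executed lines and wall-clock time (illustrative only)."""
    executed = 0
    deadline = time.monotonic() + max_seconds

    def tracer(frame, event, arg):
        nonlocal executed
        if event == "line":
            # Count every traced line and stop once a budget is exhausted.
            executed += 1
            if executed > max_lines:
                raise BudgetExceeded(f"more than {max_lines} lines executed")
            if time.monotonic() > deadline:
                raise BudgetExceeded(f"ran longer than {max_seconds} seconds")
        return tracer

    namespace = {}
    sys.settrace(tracer)
    try:
        exec(compile(source, "<generated>", "exec"), namespace)
    finally:
        # Always remove the trace hook, even if the code failed.
        sys.settrace(None)
    return namespace
```

A runaway loop such as `run_bounded("i = 0\nwhile True:\n    i += 1", max_lines=1000)` raises `BudgetExceeded` instead of hanging. Time spent inside a single long-running C call is not interrupted by this approach.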

larve - 3 days ago

codeact is a really interesting area to explore. I expanded upon the JS platform I started sketching out in https://www.youtube.com/watch?v=J3oJqan2Gv8 . LLMs know a million APIs out of the box and have no trouble picking more up through context, yet struggle once you give them a few tools. In fact just enabling a single tool definition "degrades" the vibes of the model.

Give them an eval() with a couple of useful libraries (say, treesitter), and they are able not only to use it well, but to write their own "tools" (functions) and save massively on tokens.

They also allow you to build "ephemeral" apps, because who wants to wait for tokens to stream and an LLM to interpret the result when you could do most tasks with a normal UI, only jumping into the LLM when fuzziness is required?

Most of my work on this is sadly private right now, but here are a few repos that are the foundation: github.com/go-go-golems/jesus and https://github.com/go-go-golems/go-go-goja

BLanen - 3 days ago

What this is saying, again, is that MCP is not a protocol, even though being one is the point of MCP. That makes it essentially worthless, because it doesn't define actual behavioral rules; it can only describe existing rules informally.

This is because defining a formal system that can do everything MCP promises to enable is a logical impossibility.

pploug - 2 days ago

While I generally agree with the author on code over tools, the article could have benefitted from some concrete ways that this could have potentially been done somewhat securely by sandboxing, enforcing zero trust, network segmentation, and all the other known controls we've developed over the last decade.

I love the optimism of this space, but fear that the "security is a sham" attitude will bite us all in the ass down the line.

kordlessagain - 3 days ago

As one does, I've built an alternative to MCP: https://ahp.nuts.services

Put GPT5 into agent mode then give it that URL and the token 'linkedinPROMO1' and once it loads the tools tell it to use curl in a terminal (it's faster) and then run the random tool.

This is authenticated at the moment with that token, plus bearer tokens, but I've got the new auth system up and it's working. I still have to do the integration with all the other services (the website, auth, AHP and the crawler and OCR engine), so it will be a while before all that's done.

ryukafalz - 2 days ago

As ever, I think the answer to "how do we sandbox arbitrary code while still letting it do useful things?", whether human-written or machine-written, is with object capabilities. Run the generated code in a sandbox, but pass in capabilities to useful resources, whether that be remote servers, local directories, or whatever else. Then you know the bounds of what trouble it can get up to from the start.
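
A minimal Python sketch of that calling convention, under stated assumptions: `ReadOnlyDir`, `run_with_capabilities`, and `demo` are hypothetical names, and Python's `exec()` is not a security boundary, so a real setup would still run this inside an OS-level sandbox. The point is only that the generated code sees nothing except the capability objects handed to it.

```python
import tempfile
from pathlib import Path


class ReadOnlyDir:
    """Hypothetical capability: read access to files under one directory."""

    def __init__(self, root):
        self._root = Path(root).resolve()

    def read_text(self, name: str) -> str:
        target = (self._root / name).resolve()
        # Refuse paths that escape the granted directory (e.g. "../secret").
        if not target.is_relative_to(self._root):
            raise PermissionError(f"{name!r} is outside the granted directory")
        return target.read_text()


def run_with_capabilities(source: str, **capabilities):
    """Execute generated code with only the given capability objects in scope.

    NOTE: exec() alone does not isolate anything; this shows the interface,
    not the isolation mechanism.
    """
    namespace = dict(capabilities)
    exec(source, namespace)
    return namespace.get("result")


def demo() -> str:
    # Grant the generated code one capability: a read-only view of a temp dir.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "notes.txt").write_text("hello")
        generated = "result = docs.read_text('notes.txt').upper()"
        return run_with_capabilities(generated, docs=ReadOnlyDir(tmp))
```

Because the capability is the only handle the code receives, the bounds of what it can touch are visible at the call site, which is the property the comment above describes.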

philipp-gayret - 3 days ago

Agreed that it should be composable. Even better if MCP tooling didn't yield huge amounts of output that pollutes the context, and if the output of one tool could be the input to the next, so indeed that may as well be code.

Would be nice if there was a way for agents to work with MCPs as code, and to preview or debug the data flowing through them. At the moment it doesn't seem like a mature enough solution, and I'd rather mount a Python sandbox with API keys to what it needs than connect an MCP tool on my own machine.

PhilipRoman - 3 days ago

Can't wait until I can buy an H100 with a DisplayPort input and USB keyboard and mouse output and just let it figure everything out.

abtinf - 3 days ago

I’ve posted this before[1], and have searched, but still haven’t found it: I wish someone would write a clear, crisp explanation for why MCP is needed over simply supporting swagger or proto/grpc.

[1] https://news.ycombinator.com/item?id=44848489

skerit - 3 days ago

I don't get it. Tools are a way to let LLMs do something via what is essentially an API. Is it limited? Yes, it is. By design.

Sure in some cases it might be overkill and letting the assistant write & execute plain code might be best. There are plenty of silly MCP servers out there.

s1mplicissimus - 3 days ago

I tried doing the MCP approach with about 100 tools, but the agent picks the wrong tool a lot of the time and it seems to have gotten significantly worse the more tools I added. Any ideas how to deal with this? Is it one of those unsolvable XOR-like problems maybe?

throwmeaway222 - 3 days ago

problem with MCP right now is that LLMs don't natively know what it is

an LLM natively knows bash and how to run things

MCP is forcing a weird set of non-normal rules that most of the writing of the web doesn't support. Most of the web writes a lot about bash and getting things done.

Maybe in a few years LLMs will "natively" understand them, but I see MCP more as a buzzword right now.

giltho - 3 days ago

Imagine 50 years of computer security to have articles come up on hackernews saying “what you need is to allow a black box to run arbitrary python code” :(

turnsout - 3 days ago

> One surprisingly useful way of running an MCP server is to make it an MCP server with a single tool (the ubertool) which is just a Python interpreter that runs eval() with retained state.

Wow, you better be sure you have that Python environment locked down.
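
For readers wondering what "eval() with retained state" could look like, here is a minimal sketch. `StatefulInterpreter` is a hypothetical name, not the article's implementation, and nothing here is locked down: it runs whatever it is given with full access to the host process, which is exactly the concern raised above.

```python
import contextlib
import io


class StatefulInterpreter:
    """Hypothetical single-tool backend: runs code against one persistent namespace."""

    def __init__(self):
        # The namespace survives between calls, so names defined in one call
        # are visible in the next ("retained state").
        self.namespace = {}

    def run(self, source: str) -> str:
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                try:
                    # Try to treat the input as a single expression first.
                    code = compile(source, "<tool>", "eval")
                except SyntaxError:
                    # Statements (assignments, defs, loops) go through exec.
                    exec(compile(source, "<tool>", "exec"), self.namespace)
                else:
                    result = eval(code, self.namespace)
                    if result is not None:
                        print(repr(result))
        except Exception as exc:
            # Report errors as text so the caller (the model) can react to them.
            return f"{buffer.getvalue()}{type(exc).__name__}: {exc}"
        return buffer.getvalue()
```

Calling `run("x = 2")` and then `run("x + 1")` on the same instance returns `"3\n"` for the second call, because `x` is kept in the shared namespace.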

PhilKunz - 3 days ago

How come no one mentioned serena MCP here until now :D

faangguyindia - 3 days ago

Here is why MCP is bad: I am trying to use MCP to build a simple Node CLI tool to fetch documentation from Context7: https://pastebin.com/raw/b4itvBu4 and it doesn't work even after 10 attempts.

It fails and I have no idea why. Meanwhile the Python code works without issues, but I can't use that one as it conflicts with existing dependencies in aider, see: https://pastebin.com/TNpMRsb9 (working code after 5 failed attempts)

I am never going to bother with this again. It can be built as a simple REST API, so why do we even need this ugly protocol?