Show HN: An MCP Gateway to block the lethal trifecta

github.com

44 points by 76SlashDolphin 18 hours ago


Hi there, some friends and I were inspired by Simon Willison's recent post on the "lethal trifecta" (https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/) and started building a gateway to defend against it.

The idea: instead of connecting an LLM directly to multiple MCP servers, you route all of those connections through a Gateway.

The Gateway:

- Connects to each MCP server and inspects their tools + requirements

- Classifies tools along the "trifecta" axes (private data access, untrusted content, external comms)

- When all three conditions are about to align in a single session, the Gateway blocks the last step and tells the LLM to show a warning instead.

That way, before anything dangerous can happen, the user is nudged to review the situation in a web dashboard.
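To make the blocking rule concrete, here is a minimal sketch of per-session trifecta tracking. It is an illustration only, not the actual open-edison code: the `Risk` flags, the `TOOL_RISKS` table, the tool names, and the `Session.allow` method are all hypothetical, and the classification is hard-coded here rather than derived from real MCP server metadata.

```python
from dataclasses import dataclass
from enum import Flag


class Risk(Flag):
    """The three trifecta axes (names are illustrative)."""
    PRIVATE_DATA = 1
    UNTRUSTED_CONTENT = 2
    EXTERNAL_COMMS = 4


TRIFECTA = Risk.PRIVATE_DATA | Risk.UNTRUSTED_CONTENT | Risk.EXTERNAL_COMMS

# Hypothetical, hand-written classification of some made-up tools.
TOOL_RISKS = {
    "read_inbox": Risk.PRIVATE_DATA | Risk.UNTRUSTED_CONTENT,
    "fetch_url": Risk.UNTRUSTED_CONTENT,
    "read_file": Risk.PRIVATE_DATA,
    "send_email": Risk.EXTERNAL_COMMS,
    "add_numbers": Risk(0),  # touches none of the axes
}


@dataclass
class Session:
    # Union of the risk axes already exercised in this session.
    seen: Risk = Risk(0)

    def allow(self, tool: str) -> bool:
        """Return False if calling `tool` would complete the trifecta."""
        # Assumption: unclassified tools are treated as touching all axes.
        risks = TOOL_RISKS.get(tool, TRIFECTA)
        combined = self.seen | risks
        if combined == TRIFECTA:
            return False  # block the call; session state is left unchanged
        self.seen = combined
        return True


session = Session()
print(session.allow("fetch_url"))   # untrusted content only
print(session.allow("read_file"))   # adds private data
print(session.allow("send_email"))  # would add external comms: blocked
```

Because the check is plain code rather than a model call, the decision to block is deterministic once the tools are classified. The classification itself is the hard part, and this sketch says nothing about it.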

We'd love for the HN community to try it out: https://github.com/Edison-Watch/open-edison

Any feedback very welcome - we'll be around in the thread to answer questions.

bradleybuda - 16 hours ago

I think the "lethal trifecta" framing is useful, and I'm glad that attempts are being made at this! But there are two big, hard-to-solve problems here:

1. The "lethal trifecta" is also the "productive trifecta" - people want to be able to use LLMs to operate in this space since that's where much of the value is; using private / proprietary data to interact with (do I/O with) the real world.

2. I worry that there will soon be (if not already) a fourth leg to the stool - latent malicious training within the LLMs themselves. I know the AI labs are working on this, but trying to ferret out Manchurian Candidates embedded within LLMs may very well be the greatest security challenge of the next few decades.

aaronharnly - 17 hours ago

"without risk", "solves", and "Guaranteed" are big words – you might want to temper them.

sebastiennight - 14 hours ago

I'm trying to wrap my head around this:

1. How are you defending against the case of one MCP poisoning your firewall LLM into incorrectly classifying other MCP tools?

2. How would you make sure the LLM shows the warning, given that LLMs are non-deterministic?

3. How clear do you expect MCP specs to be in order for your classification step to be trustworthy? To the best of my knowledge there is no spec that outlines how to "label" a tool for the 3 axes, so you've got another non-deterministic step here. Is "writing to disk" an external comm? It is if that directory is exposed to the web. How would you know?

doctoboggan - 16 hours ago

Wouldn't the LLM running in the gateway also be susceptible to the same jailbreaks?

pamelafox - 14 hours ago

How do you determine if the tools access private data? Is it based solely on their tool description (which can be faked) or by trying them in a sandboxed environment or by analyzing the code?

datadrivenangel - 14 hours ago

So is any combination of MCP servers basically going to require human in the loop approval for everything?

Sounds like it defeats the point.

warthog - 18 hours ago

Saw a hack using the WhatsApp MCP recently, so this seems promising.