Agent Safehouse – macOS-native sandboxing for local agents

387 points by atombender 7 hours ago

Creator here - didn't expect this to go public so soon. A few notes:

1. I built this because I like my agents to be local. Not in a container, not in a remote server, but running on my finely-tuned machine. This helps me run all agents on full-auto, in peace.

2. Yes, it's just a policy-generator for sandbox-exec. IMO, that's the best part about the project - no dependencies, no fancy tech, no virtualization. But I did put in many hours to identify the minimum required permissions for agents to continue working with auto-updates, keychain integration, and pasting images, etc. There are notes about my investigations into what each agent needs https://agent-safehouse.dev/docs/agent-investigations/ (AI-generated)

3. You don't even need the rest of the project and can use just the Policy Builder to generate a single sandbox-exec policy you can put into your dotfiles https://agent-safehouse.dev/policy-builder.html
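For readers who haven't seen one, a sandbox-exec policy is a small text file of allow/deny rules. Below is a minimal sketch of generating one from a shell function. The function name `write_policy` and the rule set are illustrative assumptions, not what the Policy Builder actually emits, and a real agent will almost certainly need more allowances (process execution, system services, and so on) than this shows.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a deny-by-default profile that only opens up
# one working directory for reads and writes. Not the Safehouse output format.
write_policy() {
  local workdir="$1"
  cat <<EOF
(version 1)
(deny default)
(allow file-read* file-write* (subpath "${workdir}"))
EOF
}

# Possible usage on macOS (assumed file and command names):
#   write_policy "$PWD" > agent.sb
#   sandbox-exec -f agent.sb your-agent-command
```

The point of the sketch is the shape: start from `(deny default)` and add back only what the agent needs.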

atombender - 5 hours ago

OP here. Sorry if this was premature. I came across it through your earlier comment on HN, started using it (as did a colleague), and we've been impressed enough with how efficient it is that I decided it deserved a post!
I've seen sandbox policy documents for agents before, but this is the first ready-to-use app I've come across.
I've only had a couple of points of friction so far:
- Files like .gitconfig and .gitignore in the home folder aren't accessible, and can't be made accessible without granting read only access to the home folder, I think?
- Process access is limited, so I can't ask Claude to run lldb or pkill or other commands that can help me debug local processes.
More fine-grained control would be really nice.
- e1g - 5 hours ago
  
  Love the feedback -
  For handling global rules (like ~/.gitconfig and ~/.gitignore), I keep a local policy file that whitelists my "shared globals" paths, and I tell Safehouse to include that policy by default. I just updated the README with an example that might be useful[1]. I also enabled access to ~/.gitignore by default as it's a common enough default.
  For process management, there is a blurry line about how much to allow without undermining the sandboxing concept. I just added new integrations[2] to allow more process control and lldb, but I don't know this area well. You can try cloning the repo, asking your agents to tweak the rules in the repo until your use-case works, and send a PR - I'll merge it!
  Alternatively, using the "custom policy" feature above, you can selectively grant broad access to your tools (you can use log monitoring to see rejections, and then add more permissions into the policy file)
  [1] https://github.com/eugene1g/agent-safehouse?tab=readme-ov-fi...
  [2] https://github.com/eugene1g/agent-safehouse/pull/7
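A hedged sketch of what a local "shared globals" policy fragment could look like: a shell function that turns a list of paths into read-only rules. The function name and output file are hypothetical, and how Safehouse is told to include such a file is covered in the README linked above, not here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit one read-only rule per path given as an argument.
# (literal ...) matches exactly that path; (subpath ...) would match a whole tree.
allow_read_only() {
  local path
  for path in "$@"; do
    printf '(allow file-read* (literal "%s"))\n' "$path"
  done
}

# Possible usage (assumed file name):
#   allow_read_only "$HOME/.gitconfig" "$HOME/.gitignore" > my-globals.sb
#
# While tuning rules, sandbox rejections can be watched in the macOS unified
# log. The predicate below is an assumption; adjust it to what you see:
#   log stream --style compact --predicate 'eventMessage CONTAINS "deny"'
```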
  - atombender - 4 hours ago
    
    That is very useful. I wasn't sure if I could supply my own override list or how I would even format one, but this solves that problem!
    The process control policy, that's kind of niche and should definitely not be something agents are always allowed to do, so having a shorthand flag like you added in that pull request is the right choice.
    I'm sure Anthropic and the other major players will catch up and add better sandboxing eventually, but for now, this tool has been exactly what I needed — many thanks!
    I also wonder if this could have been a plugin or MCP server? I was using this plugin [1] for a bit, and it appears to use a "PreToolUse" hook that modifies every tool invocation. The benefit here would be that you could even change the Safehouse settings inside a session, e.g. turn process control on or off.
    [1] https://mksg.lu/blog/context-mode
TheBengaluruGuy - 5 hours ago

I'm wondering if this could be adapted for openclaw. Running it on a machine that's accessible reduces friction and enables a lot of use-cases, but it's equally hard to control/restrict it
asabla - 6 hours ago

Oh woah!
I've been trying to get microsandbox to play nicely. But this is much closer to what I actually need.
I glimpsed through the site and the script. But couldn't really see any obvious gotchas.
Any you've found so far which hasn't been documented yet?
- e1g - 6 hours ago
  
  Pure TUI is solid - I’ve been running all my pets inside that cage for several weeks with no issues. Auto-updates work, session renewals work, config updates work etc.
  But lately I’ve been using agents to test via browsers, and starting headless browsers from the agent is flakey. I’m working on that but it’s hard to find a secure default to run Chrome.
  In the repo, I have policies for running the Claude desktop app and VSCode inside the same sandbox (so can do yolo mode there too), so there is hope for sandboxing headless Chrome as well.
  - asabla - 5 hours ago
    
    Yee I gotcha.
    Did a migration myself last week from using playwright mcp towards playwright-cli instead. Which has been playing much nicer so far. I guess you would run into the same issues you've already mentioned about running chrome headless in one of these sandboxes.
    I'll for sure keep an eye out for updates.
    Kudos to the project!
    
    e1g - 3 hours ago
    
    playwright-cli works out of the box, and I just merged support for agent-browser. If you end up testing out Safehouse, and have any issues, just create an issue on GitHub, and I'll check it out. Browser usage is definitely among my use cases.
dionian - 29 minutes ago

i toyed around with policy builder for a few seconds, i was really impressed. great UX
quietsegfault - an hour ago

What’s the difference between running natively and in a container, really?
- cortesoft - 4 minutes ago
  
  On Linux, not much. On a Mac, quite a bit.
siwatanejo - 2 hours ago

It's kinda funny that I, being skeptical about coding agents and their potential dangers, was interested to give your project a go because I don't trust AI.
Yet the first thing I find in your README is that to install your tool I need to trust some random server to serve me an .sh file that I will execute on my computer (not sure if with sudo... but still).
Come on man, give me a tarball :)
EDIT: PS: before someone gives me the typical "but you could have malware in that tarball too!!!", well, it's easier to inspect what's inside the tarball and compare it to the sources of the repo, maybe also take a look at the CI of the repo to see if the tarball is really generated automatically from the contents of the repo ;)
- e1g - 2 hours ago
  
  Fair! You don’t actually need to install anything and can just generate a text file with the security profile for sandbox-exec. You can do that online at https://agent-safehouse.dev/policy-builder.html
  Alternatively, you can feed these instructions to your LLM and have it generate you a minimal policy file and a shell wrapper https://agent-safehouse.dev/llm-instructions.txt
  - oneplane - 2 hours ago
    
    That online builder is very cool, well done!
    I've been trying out similar things to help internal teams to use systems and languages like Rego (for Open Policy Agent) to have a visual and more 'a la carte' experience when starting out, so they don't have to jump straight to learning all syntax and patterns for a language they might have never seen before.
- Quiark - 2 hours ago
  
  Usually it takes less than 5 minutes to review the shell script that downloads stuff.

simonw - 3 hours ago

The challenge I'm finding with sandboxes like this is evaluating them in comparison to each other.

This looks like a competent wrapper around sandbox-exec. I've seen a whole lot of similar wrappers emerging over the past few months.

What I really need is help figuring out which ones are trustworthy.

I think this needs to take the form of documentation combined with clearly explained and readable automated tests.

Most sandboxes - including sandbox-exec itself - are massively under-documented.

If I am going to trust them I need both detailed documentation and proof that they work as advertised.

e1g - 3 hours ago

Thank you for your work - I have sent many of your links to my people.
Your point is totally fair for evaluating security tooling. A few notes -
1. I implemented this in Bash to avoid having an opaque binary in the way.
2. All sandbox-exec profiles are split up into individual files by specific agent/integration, and are easily auditable (https://github.com/eugene1g/agent-safehouse/tree/main/profil...)
3. There are E2E tests validating sandboxing behavior under real agents
4. You don't even need the Safehouse Bash wrapper, and can use the Policy Builder to generate a static policy file with minimal permissions that you can feed to sandbox-exec directly (https://agent-safehouse.dev/policy-builder). Or feed the repo to your LLMs and have them write your own policy from the many examples.
5. This whole repo should be a StrongDM-style readme to copy&paste to your clanker. I might just do that "refactor", but for now added LLM instructions to create your own sandbox-exec profiles https://agent-safehouse.dev/llm-instructions.txt
- big_toast - 22 minutes ago
  
  I love this implementation. Do you find the SBPL deficient in any ways?
  Would xcodebuild work in this context? Presumably I'd watch a log (or have an agent) and add permissions until it works?

zmmmmm - 5 hours ago

This is great to see.

I honestly think that sandboxing is currently THE major challenge that needs to be solved for the tech to fully realise its potential. Yes the early adopters will YOLO it and run agents natively. It won't fly at all longer term or in regulated or more conservative corporate environments, let alone production systems where critical operations or data are in play.

The challenge is that we need a much more sophisticated version of sandboxing than anybody has made before. We can start with network, file system and execute permissions - but we need way more than that. For example, if you really need an agent to use a browser to test your application in a live environment, capture screenshots and debug them - you have to give it all kinds of permissions that go beyond what can be constrained with a traditional sandboxing model. If it has to interact with resources that cost money (say, create cloud resources) then you need an agent aware cloud cost / billing constraint.

Somehow all this needs to be pulled together into an actual cohesive approach that people can work with in a practical way.

andybak - 4 hours ago

> solved
Have you considered that it's unsolvable? Or - at least - there is an irreconcilable tension between capability and safety. And people will always choose the former if given the choice.
- skybrian - an hour ago
  
  I don't know about solved, but I've seen some interesting ideas for making it safer, so I think it could be improved.
  One idea is to have the coding agent write a security policy in plan mode before reading any untrusted files:
  https://dystopiabreaker.xyz/fsm-prompt-injection
silverstream - 4 hours ago

File-level sandboxing is table stakes at this point — the harder problem is credentials and network. An agent inside sandbox-exec still has your AWS keys, GitHub token, whatever's in the environment. I've been running a setup where a local daemon issues scoped short-lived JWTs to agent processes instead of passing raw credentials through, so a confused agent can't escalate beyond what you explicitly granted. Works well for API access. But like you said, nothing at the filesystem level stops an agent from spinning up 50 EC2 instances on your account.
- ericlevine - 3 hours ago
  
  Completely agree. As soon as I had OpenClaw working, I realized actually giving it access to anything was a complete nonstarter after all of the stories about going off the rails due to context limitations [1]. I've been building a self-hosted open sourced tool to try to address this by using an LLM to police the activity of the agent. Having the inmates run the asylum (by having an LLM police the other LLM) seemed like an odd idea, but I've been surprised how effective it's been. You can check it out here if you're curious: https://github.com/clawvisor/clawvisor clawvisor.com
  [1] https://www.tomshardware.com/tech-industry/artificial-intell...
- e1g - 4 hours ago
  
  > An agent inside sandbox-exec still has your AWS keys, GitHub token, whatever's in the environment.
  That's not the case with Agent Safehouse - you can give your agent access to select ~/.dotfiles and env, but by default it gets nothing (outside of CWD)
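The environment half of this can be illustrated independently of sandbox-exec. A minimal sketch, assuming a hypothetical `run_clean` wrapper: start the child with an empty environment and pass through only the variables you name. This is the general technique, not a description of how Safehouse implements it.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command with only PATH, HOME and TERM passed through.
# `env -i` starts from an empty environment, so tokens exported in the parent
# shell (AWS_SECRET_ACCESS_KEY, GITHUB_TOKEN, ...) are not inherited.
run_clean() {
  env -i PATH="$PATH" HOME="$HOME" TERM="${TERM:-dumb}" "$@"
}
```

Credentials stored in files (for example under the home directory) are a separate matter and need the filesystem rules to keep them out of reach.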

xyzzy_plugh - 7 hours ago

This is just a wrapper around sandbox-exec. It's nice that there are a ton of presets that have been thought out, since 90% of wielding sandbox-exec is correctly scoping it to whatever the inner environment requires (the other 90% is figuring out how sandbox-exec works).

I like that it's just a shell script.

I do wish that there was a simple way to sandbox programs with an overlay or copy-on-write semantics (or better yet bind mounts). I don't care if, in the process of doing some work, an LLM agent modifies .bashrc -- I only care if it modifies _my_ .bashrc

e1g - 6 hours ago

Thanks, I picked Bash because I’m scared of all Go and Rust binaries out there!
Re “overlay FS” - I too wish this was possible on Macs, but the closest I got was restricting agents to be read-only outside of CWD which, after a few turns, bullies them into working in $TMP. Not the same though.
dbmikus - 5 hours ago

I've been working on an OSS project, Amika[1], to quickly spin up local or remote sandboxes for coding workloads. We support copy-on-write semantics locally (well, "copy-and-then-write" for now... we just copy directories to a temp file-tree).
It's tailored to play nicely with Git: spin up sandboxes from the CLI, expose TCP/UDP ports of apps to check your work, and if running hosted sandboxes, share the sandbox URLs with teammates. I basically want running sandboxed agents to be as easy as `git clone ...`.
Docs are early and edges are rough. This week I'm starting to dogfood all my dev using Amika. Feedback is super appreciated!
FYI: we are also a startup, but local sandbox mgmt will stay OSS.
[1]: https://github.com/gofixpoint/amika
- xyzzy_plugh - 5 hours ago
  
  This is just a thin wrapper over Docker. It still doesn't offer what I want. I can't run macOS apps, and if I'm doing any sort of compilation, now I need a cross-compile toolchain (and need to target two platforms??).
  Just use Docker, or a VM.
  The other issue is that this does not facilitate unpredictable file access -- I have to mount everything up front. Sometimes you don't know what you need. And even then copying in and out is very different from a true overlay.
  - dbmikus - 4 hours ago
    
    Appreciate the deets!
    It sounds like a big part of your use case is to safely give an agent control of your computer? Like, for things besides codegen?
    We're probably not going to directly support that type of use case, since we're focused on code-gen agents and migrating their work between localhost and the cloud.
    We are going to add dynamic filesystem mounting, for after sandbox creation. Haven't figured out the exact implementation yet. Might be a FUSE layer we build ourselves. Mutagen is pretty interesting as well here.
divmain - 6 hours ago

This is what I was going for with Treebeard[0]. It is sandbox-exec, worktrees, and COW/overlay filesystem. The overlay filesystem is nice, in that you have access to git-ignored files in the original directory without having to worry about those files being modified in the original (due to the COW semantics). Though, truthfully, I haven’t found myself using it much since getting it all working.
[0] https://github.com/divmain/treebeard
- xyzzy_plugh - 5 hours ago
  
  This approach is too complex for what is provided. You're better off just making a copy of the tree and simply using sandbox-exec. macFUSE is a shitshow.
  The main issue I want to solve is unexpected writes to arbitrary paths should be allowed but ultimately discarded. macOS simply doesn't offer a way to namespace the filesystem in that way.
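The "copy the tree, then sandbox" approach can be sketched in a few lines. `run_in_copy` is a hypothetical name, and the sandbox-exec step is left as a comment because it is macOS-only. Note this is copy-then-write, not a true overlay: writes outside the copy are not discarded, they have to be blocked by whatever policy you apply.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a project into a scratch directory, run a command
# there, and print the scratch path so the result can be diffed or thrown away.
run_in_copy() {
  local src="$1"; shift
  local scratch
  scratch="$(mktemp -d)" || return 1
  cp -R "$src/." "$scratch/" || return 1
  # On macOS the command could be wrapped here, e.g.:
  #   sandbox-exec -f policy.sb "$@"   (policy.sb is an assumed file name)
  # Command output goes to stderr so stdout carries only the scratch path.
  ( cd "$scratch" && "$@" ) >&2
  printf '%s\n' "$scratch"
}

# Afterwards, something like `diff -ru original-dir "$scratch"` shows what changed.
```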
  - divmain - a minute ago
    
    Completely agree; my approach was not the most practical. I mostly wanted to see how hard it would be and, as I said, haven’t used it much since. Yes, macFUSE is messy to rely upon.
    I feel as though the right abstraction is simply unavailable on macOS. Something akin to chroot jails — I don’t feel like I need a particularly hardened sandbox for agentic coding. I just need something that will prevent the stupid mistakes that are particularly damaging.
tuananh - 2 hours ago

isn't sandbox-exec already deprecated?
- e1g - 2 hours ago
  
  Yes, for about a decade. But it’s available everywhere, and still works - and protects us - like brand new!

alpb - 35 minutes ago

As I understand it, the problem nowadays doesn't seem to be so much that the agent is going to rm -rf / my host, it's more like it's going to connect to a production system that I'm authorized to on my machine or a database tool, and then it's going to run a potentially destructive command. There is a ton of value in running agents against production systems to troubleshoot things, but there are not enough guardrails to prevent destructive actions from the get-go. The solution seems to be specific to each system, and filesystem is just one aspect out of many.

crossroadsguy - 18 minutes ago

As I understand it, the problem is these apps/agents can do all of these and lot more (if not absolutely everything, while I am sure it can go quite close to doing that).
Solution could be two parts:
1. OS bringing better and easier to use OS limitations (more granular permissions; install time options and defaults which will be visible to user right there and user can reject that with choices like:
- “ask later”
- “no”
- “fuck no”
with eli5 level GUIs (and well documented). Hell, a lot of these are already solved for mobile OS. While not taking tools away from the hands of the user who wants to go inside and open things up (with clear intention and effort; without having to notarise some shit or pay someone).
2. Then apps[1] having to, forced to, adhere to use those or never getting installed.
[1] So no treating of agents as some “other” kinds of apps. Just limit it for every app (unless user explicitly decides to open things up).
It will also be a great time to nuke the despicable mess like Electron Helpers and shit and app devs considering it completely fine to install a trillion other “things” when user installed just one app without explaining it in the beginning (and hence forced to keep their apps’ tentacles simple and limited)

pash - 5 hours ago

Sandvault [0] (whose author is around here somewhere), is another approach that combines sandbox-exec with the granddaddy of system sandboxes, the Unix user system.

Basically, give an agent its own unprivileged user account (interacting with it via sudo, SSH, and shared directories), then add sandbox-exec on top for finer-grained control of access to system resources.

0. https://github.com/webcoyote/sandvault

hsaliak - an hour ago

This is a very nice and clean implementation. Related to this - I've been exploring injecting landlock and seccomp profiles directly into the elf binary, so that applications that are backed by some LLM, but want to 'do the right thing' can lock themselves out. This ships a custom process loader (that reads the .sandbox section) and applies the policies (not unlike bubblewrap, which uses namespaces). The loading can be pushed to a kernel module in the future.

https://github.com/hsaliak/sacre_bleu very rough around the edges, but it works. In the past there were apps that either behaved well, or had malicious intent, but with these LLM backed apps, you are going to see apps that want to behave well, but cannot guarantee it. We are going to see a lot of experimentation in this space until the UX settles!

varenc - 3 hours ago

fun fact about `sandbox-exec`, the macOS util this relies on: Apple officially deprecated it in macOS Sierra back in 2016!

Its manpage has been saying it's deprecated for a decade now, yet we're continuing to find great uses for it. And the 'App Sandbox' replacement doesn't work at all for use cases like this where end users define their own sandbox rules. Hope Apple sees this usage and stops any plans to actually deprecate sandbox-exec. I recall a bunch of macOS internal services also rely on it.

jasomill - 2 hours ago

Aside from named profiles, I'm not sure it wasn't born deprecated.
In particular, has the profile language ever been documented by anything other than the examples used by the OS and third parties reverse engineering it?

davidcann - 4 hours ago

I made a native macOS app with a GUI for sandbox-exec, plus a network sandbox with per-domain filtering and secrets detection: https://multitui.com/

mkagenius - 5 hours ago

A way to run claude code inside an Apple container -

  $ container system start

  $ container run -d --name myubuntu ubuntu:latest sleep infinity

  $ container exec myubuntu bash -c "apt-get update -qq && apt-get install -y openssh-server"

  $ container exec myubuntu bash -c "
    apt-get install -y curl &&
    curl -fsSL https://deb.nodesource.com/setup_lts.x | bash - &&
    apt-get install -y nodejs
  "

  $ container exec myubuntu npm install -g @anthropic-ai/claude-code

  $ container exec myubuntu claude --version

emmelaich - 2 hours ago

Thanks, hadn't heard of this! In homebrew, too.
https://github.com/apple/container

cuber_messenger - an hour ago

It's the exact auth control I want. However, it seems it's not a safehouse for local agents, but a safe cage, IMHO. After all, it prevents damage they might cause.

matifali - 3 hours ago

I wonder why you believe that running agents locally is the best approach. For most people, having agents operate remotely is more effective because the agent can stay active without your local machine needing to remain powered on and connected to the internet 24/7.

NegativeLatency - 3 hours ago

It’s nice having control and ownership of your software.
I’m assuming it’s similar to why people run plex, web servers, file sharing, etc
Also personally I’d rather not pay monthly fees for stuff if it can be avoided.
- mikodin - 2 hours ago
  
  Piggybacking on this - I think it well equips us for a future when local models are stronger. I for one am grateful for efforts like these
deevus - 2 hours ago

For this specific problem I built pixels: https://github.com/deevus/pixels
It supports running on a TrueNAS SCALE server, or via Incus (local or remote). I'm still working on tightening the security posture, but for many types of AI workflows it will be more than sufficient.

tl2do - 6 hours ago

Intriguing, but...

Around last summer (July–August 2025), I desperately needed a sandbox like this. I had multiple disasters with Claude Code and other early AI models. The worst was when Claude Code did a hard git revert to restore a single file, which wiped out ~1000 lines of development work across multiple files.

But now, as of March 2026, at least in my experience, agents have become more reliable. With proper guardrails in claude.md and built-in safety measures, I haven't had a major incident in about 3 months.

That said, layering multiple safeguards is always recommended—your software assets are your assets. I'd still recommend using something like this. But things are changing, bit by bit.

e1g - 6 hours ago

No doubt they are getting better, but even a 0.1% chance of “rm -rf” makes it a question of “when” not “if”. And we sure spin that roulette a lot these days. Safehouse makes that 0%, which is categorically different.
Also, I don’t want it to be even theoretically possible for some file in node_modules to inject instructions to send my dotfiles to China.
jeremyjh - 6 hours ago

Prompt injection attacks are very much a thing. It doesn't matter how good the agent is, it's vulnerable, and you don't know what you don't know.
- ramoz - 5 hours ago
  
  Where are we at with SOTA or reliable prompt injection detection mechanisms?
bilalq - 5 hours ago

Look into git reflog. If the changes were committed, it was almost certainly possible to still restore them, even if the commit is no longer in your branch.
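A short sketch of that recovery path, using a throwaway repository (the function name is hypothetical). It only applies to work that was committed at some point; uncommitted edits wiped by a hard reset are not in the reflog.

```shell
#!/usr/bin/env bash
# Demo in a scratch repo: lose a commit with a hard reset, then get it back
# through the reflog. The function body runs in a subshell, so the `cd` and
# the identity variables do not leak into the calling shell.
demo_reflog_recovery() (
  export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
  export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
  repo="$(mktemp -d)" || exit 1
  cd "$repo" || exit 1
  git init -q .
  git commit -q --allow-empty -m "base"
  echo "important work" > work.txt
  git add work.txt
  git commit -q -m "work"
  git reset -q --hard HEAD~1        # work.txt is now gone from the branch
  # HEAD@{1} is where HEAD pointed before the reset; `git reflog` lists all
  # such entries if you need to look further back.
  git reset -q --hard 'HEAD@{1}'
  cat work.txt
)
```

In a real incident you would normally run `git reflog` first and pick the entry by its message rather than assume `HEAD@{1}`.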
- ZYbCRq22HbJ2y7 - 4 hours ago
  
  There are probably other tools like this that keep version history based on filesystem events, independent from the project's git repository
  https://www.jetbrains.com/help/idea/local-history.html

sunir - 2 hours ago

Is clunker some new slang that's different than clanker? I'm asking for a friend of my friend Roku.

p.s. thanks for making this; timely as I am playing whackamole with sandboxing right now.

e1g - 2 hours ago

Testing in prod! Thank you, just fixed that typo.

srid - 4 hours ago

If you are using Nix, there's also https://github.com/srid/sandnix that works on Linux (landrun) and macOS (sandbox-exec).

Finbarr - 3 hours ago

Awesome to see a bash-only method of solving this problem. Also like that it alerts on attempts to read restricted stuff.

I built yolobox to solve this using docker/apple containers: https://github.com/finbarr/yolobox

devonkelley - 3 hours ago

Sandboxing solves "prevent the agent from doing damage." The failure mode it doesn't catch is when the agent operates perfectly within its permissions and still produces garbage because the model degraded or the tool stopped returning useful results.

That's a 200 OK the whole way down. "Prevent bad actions" and "detect wrong-but-permitted actions" are completely different problems.

synparb - 6 hours ago

I’ve been playing around with https://nono.sh/ , which adds a proxy to the sandbox piece to keep credentials out of the agent’s scope. It’s a little worrisome that everyone is playing catch up on this front and many of the builtin solutions aren’t good.

boxedemp - an hour ago

Fantastic! I had been using dockers but this might be better!

garganzol - 7 hours ago

While we have `sandbox-exec` in macOS, we still don't have a proper Docker for macOS. Instead, the current Docker runs on macOS as a Linux VM which is useful but only as a Linux machine goes.

Having real macOS Docker would solve the problem this project solves, and 1001 other problems.

mkagenius - 7 hours ago

Apple containers were released a few months back. Been using it to sandbox claude/gemini-cli generated code[1].
You can use it to completely sandbox claude code too.
1. Coderunner - https://github.com/instavm/coderunner
- arianvanp - 7 hours ago
  
  That is also a Linux VM on macOS. They're not macOS containers, so it's completely pointless / useless for macOS or iOS development
  - mkagenius - 6 hours ago
    
    Oh, yes. I thought GP was mostly worried about shared VM problem.
dpe82 - 7 hours ago

Nitpick, which probably doesn't matter too much in this context but is always good to remember: Docker containers are not security boundaries.
- PlasmaPower - 6 hours ago
  
  Why not? They're definitely not perfect security boundaries, but neither are VMs. I think containers provide a reasonable security/usability tradeoff for a lot of use cases including agents. The primary concern is kernel vulnerabilities, but if you're keeping your kernel up-to-date it's still imo a good security layer. I definitely wouldn't intentionally run malware in it, but it requires an exploit in software with a lot of eyes on it to break out of.
- fredoliveira - 6 hours ago
  
  counter-intuitively, the fact that docker on the mac requires a linux-based VM makes it safer than it otherwise would be. But your point stands in general, of course.
PufPufPuf - 7 hours ago

What would native containers bring over Linux ones? The performance of VZ emulation is good, existing tools have great UX, and using a virtualized kernel is a bit safer anyways. I regularly use a Lima VM as a VSCode remote workspace to run yolo agents in.
- garganzol - 6 hours ago
  
  Sometimes you just have to run native software. In my case, that means macOS build agents using Xcode and Apple toolchains which are only available on macOS.
  It's not a pleasure to run them in a mutable environment where everything has a floating state as I do now. Native Docker for macOS would totally solve that.
- hirvi74 - 6 hours ago
  
  VZ has been exceptional for me. I have been running headless VMs with Lima and VZ for a while now with absolutely zero problems. I just mount a directory I want Claude Code to be able to see and nothing more.

inoki - 2 hours ago

I'm also working on a cross-platform solution (sandbox-exec on macOS). What if Apple finally drops this after long deprecation?

e1g - an hour ago

Let’s make something so popular and useful that they can’t drop it.

ashishb - 4 hours ago

I built something similar for myself that works on both Linux and Mac OS

https://github.com/ashishb/amazing-sandbox

wek - 2 hours ago

Do you have plans to go cross-platform and offer a solution for Windows?

treexs - 2 hours ago

wow it's interesting how noticeable sites built with claude maybe with the frontend-design skill are now

e1g - 2 hours ago

IYKYK, it’s the new Bootstrap!
The alternative would be “no site”, which is still somehow worse.

dbmikus - 5 hours ago

I like that it's all bash.

How does this compare with Codex's and Claude's built-in sandboxing?

e1g - 5 hours ago

Claude: can escape its sandbox (there are GitHub issues about this) and, when sandboxed, still has full read access to everything on your machine (SSH keys, API keys, files, etc.)
Codex: IIRC, only shell commands are sandboxed; the actual agent runtime is not.
- dbmikus - 3 hours ago
  
  Cool, thanks for explaining!

gozucito - 7 hours ago

so this works the same as Claude Code /sandbox? The innovation being that it's harness-agnostic?

arianvanp - 6 hours ago

That and that the built in sandbox in Claude Code is bad (read only access to everything by default) and tightly coupled (cant modify it or swap it out).
e1g - 6 hours ago

Roughly, yes, but more reliable (and restrictive), as Claude Code has ways to escape its sandbox. This gives more protection and guards across all CLI agents (Amp, Pi, etc)

vivid242 - 6 hours ago

Nice! I'd be interested in the things that went wrong during development. Which loopholes were discovered last, if any?

cjbarber - 4 hours ago

See also various sandbox tools I and others (e.g. jpeeler) have collected: https://news.ycombinator.com/item?id=47102258

nemo44x - 5 hours ago

Supervisor agent frameworks are going to be a big industry soon. You simply can’t have agents executing commands without a trusted supervisory layer examining and certifying actions.

All the issues we get from AI today (hallucinations, goal shift, context decay, etc) get amplified unbelievably fast once you begin scaling agents out due to cascading. The risk being you go to bed and when you wake up your entire infrastructure is gone lol.

naomi_kynes - 6 hours ago

The "full-auto" framing is interesting. What happens when the agent hits something it can't resolve autonomously? Even sandboxed, there's a point where the agent needs to ask a question or get approval.

Most setups handle this awkwardly: fire a webhook, write to a log, hope the human is watching. The sandbox keeps the agent contained, but doesn't give it a clean "pause and ask" primitive. The agent either guesses (risky) or silently fails (frustrating).

Seems like there are two layers: the security boundary (sandbox-exec, containers, etc.) and the communication boundary (how does a contained agent reach the human?). This project nails the first. The second is still awkward for most setups.

e1g - 6 hours ago

Correct, this is for skipping permissions (safely), but does nothing for skipping questions.

gnanagurusrgs - 5 hours ago

This is the right problem to solve. At Arcade, we see the same gap — agents get shell access, API keys, and network by default. The permissions model is backwards.

sandbox-profiles is a solid primitive for local agents. The missing piece in production is the tool layer — even a sandboxed agent can still make dangerous API calls if the MCP tools it has access to aren't individually authed and scoped.

The real stack is: sandbox the runtime (what Agent Safehouse does) + scope the tools (what we do with JIT OAuth at the MCP layer). Neither alone is enough.

Nice work shipping this.

https://www.arcade.dev/blog/ai-agent-auth-challenges-develop...