Show HN: Webctl – Browser automation for agents based on CLI instead of MCP

github.com

101 points by cosinusalpha 21 hours ago


Hi HN, I built webctl because I was frustrated by the gap between curl and full browser automation frameworks like Playwright.

I initially built this to solve a personal headache: I wanted an AI agent to handle project management tasks on my company’s intranet. I needed it to persist cookies across sessions (to handle SSO) and then scrape a Kanban board.

Existing AI browser tools (like current MCP implementations) often force unsolicited data into the context window—dumping the full accessibility tree, console logs, and network errors whether you asked for them or not.

webctl is an attempt to solve this with a Unix-style CLI:

- Filter before context: You pipe the output to standard tools. webctl snapshot --interactive-only | head -n 20 means the LLM only sees exactly what I want it to see.

- Daemon Architecture: It runs a persistent background process. The goal is to keep the browser state (cookies/session) alive while you run discrete, stateless CLI commands.

- Semantic targeting: It uses ARIA roles (e.g., role=button name~="Submit") rather than fragile CSS selectors.
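
To illustrate the semantic-targeting and filter-before-context bullets, here is a minimal Python sketch of how a query such as role=button name~="Submit" could be matched against accessibility nodes and capped before it reaches the model. It is not webctl's implementation. The `Node` type, the `parse_query` and `select` helpers, and the reading of `~=` as a case-insensitive substring match are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for one accessibility-tree entry."""
    role: str
    name: str


# field, operator, then either a quoted value or a bare token
QUERY_PART = re.compile(r'(\w+)(~=|=)("([^"]*)"|\S+)')
ALLOWED_FIELDS = {"role", "name"}


def parse_query(query):
    """Turn 'role=button name~="Submit"' into (field, op, value) triples."""
    parts = []
    for field, op, raw, quoted in QUERY_PART.findall(query):
        if field not in ALLOWED_FIELDS:
            raise ValueError(f"unsupported field: {field}")
        value = quoted if raw.startswith('"') else raw
        parts.append((field, op, value))
    return parts


def matches(node, parts):
    """'=' is an exact match; '~=' is ASSUMED to mean case-insensitive substring."""
    for field, op, value in parts:
        actual = getattr(node, field)
        if op == "=" and actual != value:
            return False
        if op == "~=" and value.lower() not in actual.lower():
            return False
    return True


def select(nodes, query, limit=20):
    """Filter first, then cap the result, similar in spirit to `| head -n 20`."""
    parts = parse_query(query)
    return [node for node in nodes if matches(node, parts)][:limit]
```

Given a list of nodes, `select(nodes, 'role=button name~="Submit"')` returns only the matching buttons, so whatever is handed to the LLM is already narrowed down.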

Disclaimer: The daemon logic for state persistence is still a bit experimental, but the architecture feels like the right direction for building local, token-efficient agents.
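
The daemon pattern itself (a long-lived process that owns the state, with short-lived commands that connect, send one request, and exit) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how webctl works internally: the JSON-lines protocol, the `set-cookie` and `get-cookies` command names, and the `SessionState` class are all hypothetical, and a plain dict stands in for a real browser context.

```python
import json
import socket
import socketserver
import threading


class SessionState:
    """Hypothetical stand-in for the browser context the daemon keeps alive."""

    def __init__(self):
        self.cookies = {}
        self.lock = threading.Lock()


class CommandHandler(socketserver.StreamRequestHandler):
    """Handles exactly one JSON request per connection, then closes."""

    def handle(self):
        request = json.loads(self.rfile.readline())
        state = self.server.state
        with state.lock:
            if request.get("cmd") == "set-cookie":
                state.cookies[request["name"]] = request["value"]
                reply = {"ok": True}
            elif request.get("cmd") == "get-cookies":
                reply = {"ok": True, "cookies": dict(state.cookies)}
            else:
                reply = {"ok": False, "error": "unknown command"}
        self.wfile.write((json.dumps(reply) + "\n").encode())


def start_daemon():
    """Start the background server on a free local port and return it."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), CommandHandler)
    server.daemon_threads = True
    server.state = SessionState()  # state lives as long as the server does
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_command(port, **request):
    """What one stateless CLI invocation would do: connect, ask, disconnect."""
    with socket.create_connection(("127.0.0.1", port)) as conn:
        conn.sendall((json.dumps(request) + "\n").encode())
        return json.loads(conn.makefile().readline())
```

Each `send_command` call opens a fresh connection, yet a cookie set by one call is visible to the next, because the state lives in the server process rather than in the caller.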

It’s basically "Playwright for the terminal."

kovac - an hour ago

I once built a tool (for an employer) that records browser sessions in a Selenium dialect. No LLMs at the time. The use case was to stop end users from coming to us to author scripts:

https://asciimx.com/log/bumblebee/

the_mitsuhiko - 12 hours ago

At this point I'm fully down the path of the agent just maintaining his own tools. I have a browser skill that continues to evolve as I use it. Beats every alternative I have tried so far.

binalpatel - 14 hours ago

Cool to see lots of people independently come to "CLIs are all you need". I'm still not sure if it's a short-term band-aid because agents are so good at terminal use or if it's part of a longer-term trend, but it's definitely felt much more seamless to me than MCPs.

(My contribution, one of many: https://github.com/caesarnine/binsmith)

Agent_Builder - 6 hours ago

Interesting approach. In our experience, most failures weren’t about which interface agents used, but about how much implicit authority they accumulated across steps. Control boundaries mattered more than the abstraction layer.

gregpr07 - 10 hours ago

Creator of Browser Use here. This is cool, a really innovative approach with ARIA roles. One idea we have been playing around with a lot is just giving the LLM raw HTML and a really good way to traverse it: no heuristics, just BS4. It seems to work well, but it is much more expensive than the current prod-ready [index]<div ... notation
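
A rough sketch of what "raw HTML plus a way to traverse it" could look like, so an agent drills down one level at a time instead of receiving the whole page. It uses Python's standard-library `html.parser` in place of BS4 to stay dependency-free, and it is not Browser Use's code. The `Element` class and the `parse`, `children_summary`, and `descend` helpers are hypothetical names.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so the builder must not descend into them.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class Element:
    def __init__(self, tag, attrs, parent=None):
        self.tag = tag
        self.attrs = dict(attrs)
        self.parent = parent
        self.children = []
        self.text = ""  # direct text only; fragments are concatenated without spaces


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Element("document", [])
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Element(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def children_summary(node):
    """One short line per direct child: the cheap view an agent asks for first."""
    return [f"[{i}] <{child.tag}> {child.text[:40]}".rstrip()
            for i, child in enumerate(node.children)]


def descend(node, path):
    """Follow a list of child indexes chosen by the agent."""
    for index in path:
        node = node.children[index]
    return node
```

The agent would call `children_summary` on the root, pick an index, `descend`, and repeat. Every step costs a round trip, which is one plausible reason this style ends up more expensive than a single flattened listing.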

TheTaytay - 7 hours ago

I really like this idea!

I’d like to see this other browser plugin’s API be exposed via your same CLI, so I don’t have to only control a separate browser instance. https://github.com/remorses/playwriter (I haven’t investigated enough to know how feasible it is, but as I was reading about your tool, I immediately wanted to control existing tabs from my main browser, rather than “just” a debug-driven separate browser instance.)

randito - 14 hours ago

If you look at the Elixir keynote for Phoenix.new -- a cool agentic coding tool -- you'll see some hints about browser control via an API tool call. It's called "web" in the video.

Video: https://youtu.be/ojL_VHc4gLk?t=2132

More discussion: https://simonwillison.net/2025/Jun/23/phoenix-new/

renegat0x0 - 14 hours ago

A little bit different, but it also lets you scrape efficiently. It communicates via JSON over HTTP rather than a CLI.

https://github.com/rumca-js/crawler-buddy

More like a framework for other mechanisms

philipbjorge - 15 hours ago

This looks remarkably similar to https://github.com/vercel-labs/agent-browser

How is it different?

grigio - 14 hours ago

Is there a benchmark? There are a lot of scraping agents nowadays.

desireco42 - 13 hours ago

How are you holding the session if every command is issued through the CLI? I assume this is essential for automation.
