Launch HN: BrowserBook (YC F24) – IDE for deterministic browser automation
54 points by cschlaepfer 7 hours ago
Hey HN! We’re Chris, Jorrie, and Evan of BrowserBook, an IDE for writing and debugging Playwright-based web automations. You can download it as a Mac app here: https://browserbook.com, and there’s a demo video at https://www.youtube.com/watch?v=ODGJBCNqGUI.
Why we built this: When we were going through YC, we were a company that automated back-office healthcare workflows. Since the interoperability ecosystem in healthcare is so fragmented, we started using browser agents to automate EMRs, practice management software, and payment portals directly through the web. When we did, we ran into a ton of problems:
- Speed: High latency on LLM calls vs. a scripting approach
- Cost: We burned through tokens with all the context we needed to make the automations reasonably accurate
- Reliability: Even with detailed instructions, context, and tools, agents tended to drift on multi-step tasks in unpredictable ways
- Debuggability: When drift did occur, we were essentially playing whack-a-mole in our prompt and re-running the whole automation to debug issues (see above: speed and cost issues made this quite painful)
More and more, we were just giving our agent scripts to execute. Eventually, we came to the conclusion that scripting is a better approach to web automation for these sorts of use cases. But scripting was also too painful, so we set out to solve those problems with BrowserBook.
Under the hood, it runs a standalone TypeScript REPL wired directly into an inline browser instance, with built-in tooling to make script development quick and easy. This includes:
- A fully interactive browser window directly in the IDE so you can run your code without context switching
- A Jupyter-notebook-style environment - the idea here is you can write portions of your automation in individual cells and run them individually (and quickly reset manually in the browser), instead of having to rerun the whole thing every time
- An AI coding assistant which uses the DOM context of the current page to write automation logic, which helps avoid digging around for selectors
- Helper functions for taking screenshots, data extraction, and managed authentication for auth-required workflows.
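To make the notebook-style idea above concrete, here is a minimal sketch of cells that share state and can be re-run one at a time. Everything in it (`Cell`, `Notebook`, `runCell`, the `Context` record) is a hypothetical illustration, not BrowserBook's actual API; in a real setup the shared context would hold a live Playwright `page` rather than plain values.

```typescript
// Hypothetical names throughout: this is an illustration of the cell model,
// not BrowserBook's implementation.

// State that survives between cell runs. A real notebook would keep the
// live browser page here; this sketch uses a plain record.
type Context = Record<string, unknown>;

interface Cell {
  name: string;
  run: (ctx: Context) => Promise<void>;
}

class Notebook {
  private ctx: Context = {};
  private cells: Cell[];
  readonly log: string[] = [];

  constructor(cells: Cell[]) {
    this.cells = cells;
  }

  // Run one cell by name against the current shared context.
  async runCell(name: string): Promise<void> {
    const cell = this.cells.find((c) => c.name === name);
    if (!cell) throw new Error(`no cell named "${name}"`);
    await cell.run(this.ctx);
    this.log.push(name);
  }

  // Run every cell in order, like executing the whole script.
  async runAll(): Promise<void> {
    for (const cell of this.cells) await this.runCell(cell.name);
  }

  // Read a value that a cell stored in the shared context.
  get<T>(key: string): T | undefined {
    return this.ctx[key] as T | undefined;
  }
}

const notebook = new Notebook([
  { name: "login", run: async (ctx) => { ctx.session = "logged-in"; } },
  {
    name: "open-report",
    run: async (ctx) => {
      if (ctx.session !== "logged-in") throw new Error("run the login cell first");
      ctx.rows = ["row-1", "row-2"];
    },
  },
  { name: "extract", run: async (ctx) => { ctx.count = (ctx.rows as string[]).length; } },
]);
```

After one `notebook.runAll()`, calling `notebook.runCell("extract")` again re-runs only the last step against the state the earlier cells left behind, which is the property that avoids rerunning the whole automation while debugging.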
Once you’ve created your automation, you can run it directly in the application or in our hosted environment via API, so you can use it in external apps or agentic workflows.
At its core, BrowserBook is an Electron app, so we can run a Chrome instance directly in the app without the need for cloud-hosted browsers. For API runs, we use hosted browser infra via Kernel (which is a fantastic product, btw), relying on their bot anti-detection capabilities (stealth mode, proxies, etc.).
Scripted automation can be unpopular because scripts are inherently brittle; unlike “traditional” software development, your code is deployed in an environment you don’t control - someone else’s website. With BrowserBook, we’re trying to “embrace the suck”, and acknowledge this “offensive programming” environment.
We’ve designed BrowserBook from the ground up to assume scripts will break, and we aim to provide the tools that make building and maintaining them easier. In the future, our plan is to leverage AI where it has already shown its strength - writing code - to minimize downtime and quickly repair broken scripts as the deployed environment changes.
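As an illustration of writing scripts that expect to break, here is a small sketch of a selector resolver that tries several candidates and fails with a message listing everything it tried. It is not BrowserBook's implementation: `PageLike`, `SelectorError`, and `resolveSelector` are hypothetical names, and `PageLike` is a stand-in for the one call needed from a real page (with Playwright, roughly `page.locator(selector).count()`).

```typescript
// Hypothetical helper, shown only to illustrate "assume the script will break".

// Minimal stand-in for a page: how many elements match a selector?
interface PageLike {
  count(selector: string): Promise<number>;
}

// Error that records which selectors were attempted, so a broken script
// points straight at the element that moved.
class SelectorError extends Error {
  readonly tried: string[];
  constructor(label: string, tried: string[]) {
    super(`"${label}" not found; tried: ${tried.join(", ")}`);
    this.name = "SelectorError";
    this.tried = tried;
  }
}

// Return the first candidate that matches exactly one element.
// Zero matches or multiple matches both count as "not this one".
async function resolveSelector(
  page: PageLike,
  label: string,
  candidates: string[],
): Promise<string> {
  for (const selector of candidates) {
    if ((await page.count(selector)) === 1) return selector;
  }
  throw new SelectorError(label, candidates);
}

// Fake page for demonstration: only the newer id exists.
const fakePage: PageLike = {
  count: async (selector) => (selector === "#submit-v2" ? 1 : 0),
};
```

With the fake page, `resolveSelector(fakePage, "submit button", ["#submit", "#submit-v2"])` resolves to `"#submit-v2"`, while a list with no match rejects with a `SelectorError` whose `tried` array names each attempted selector.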
Browser agents promised to solve this by handing the reins to an LLM which can handle inconsistency and ambiguity. While we think there are some applications where browser agents can be genuinely helpful, tasks that need to be done reliably and repeatedly are not among them.
We’d love for you to try it out! You can download BrowserBook from our website here: https://browserbook.com (only available for Mac so far, sorry!) And of course, we’d appreciate any feedback and comments you have!
nickstaggs - 3 hours ago
Awesome product! I really liked how the auth profiles work as well. While the primary use case is workflow automation, are there any roadmap items on integrating this with the developer experience? A previous company I was at was fairly fond of e2e tests in playwright and this seems like it would have been a huge boon for writing them quickly.

cschlaepfer - 2 hours ago
Thanks! We're constantly thinking about ways we can improve the dev experience and integrations story around deploying these scripts. Right now we support API executions, and we are adding webhooks soon. We think this will unblock the earliest adopters, and as we learn more about popular use cases/workflows, we'll look to prioritize first-class integrations where it makes sense.

doomerhunter - 6 hours ago
Interesting. Quick question regarding the code generation: do you dump the DOM to provide relevant context to build the automation, or does the agent automatically try to discover relevant segments (like a claude code)?
Edit: Answered in the video, dump of a simplified version of the DOM. How is the discovery of the rest performed?
Super nice, can really see the use cases, even for security testing.

cschlaepfer - 6 hours ago
Appreciate it! Yes exactly - today we send it a simplified version of the DOM, but we're currently building an agent which will be able to discover the relevant DOM elements.

jackconsidine - 6 hours ago
Congrats on the launch. Please port to linux soon (sure it's relatively trivial on Electron :)). Like the idea of the IDE. Seems like it'd make it easy to prototype and launch quickly.
RE: embrace the suck, yeah I'm with you. I prefer the brittleness of scripts to non-deterministic (potentially unhinged) workflows.

cschlaepfer - 6 hours ago
Thanks! Yep, linux is coming soon - now that we have the first version of the IDE out the door we're going to get cross-platform going shortly.

dman - 3 hours ago
What UI framework are you using to build the app?

orliesaurus - 3 hours ago
I've been hacking together my own browser automations... the idea of deterministic scripting resonates with me... but I'm wondering how BrowserBook plans to handle authentication flows that require 2FA or CAPTCHAs. ALSO is there any plan for integrating with CI pipelines... being able to run these scripts headless on servers would be huge. BUT overall it's refreshing to see someone lean into brittle scripts rather than hide behind agent magic...

cschlaepfer - 2 hours ago
BrowserBook allows users to create 'auth profiles' which can be utilized in notebooks for authentication purposes. These profiles currently support username/password and 2FA via TOTP (and we recommend provisioning a service account for your automations). For captchas, we use Kernel's stealth mode which includes a captcha solver.
Re: CI integration, today we support API-based execution, but if you have a specific CI pipeline or set of tools you'd like to see support for, let us know and we can look into it!

jackienotchan - 3 hours ago
Congrats! Could this also be used to generate e2e test automations?
For scraping, how do you handle Cloudflare and Captchas? Do you respect robots.txt instructions of websites?

cschlaepfer - 2 hours ago
Thanks, we appreciate it! Yes, you can use BrowserBook to write e2e test automations, but we don't currently include playwright assertions in the runtime - we excluded these since they are geared toward a specific use case, and we wanted to build more generally. Let us know if you think we should include this though; we're always looking for feedback.
> For scraping, how do you handle Cloudflare and Captchas?
Cloudflare turnstiles/captchas tend to be less of an issue in the inline browser because it’s just a local Chrome instance, avoiding the usual bot-detection flags from headless or cloud browsers (datacenter IPs, user-agent quirks, etc.). For hosted browsers, we use Kernel's stealth mode to similar effect.
> Do you respect robots.txt instructions of websites?
We leave this up to the developer creating the automations.

huntaub - 6 hours ago
This is a super interesting product, guys. I get that agents aren't great for everything right now, but I'd expect that they'll continue to improve over time (like everything in the LLM space). How do you see the product evolving as agents become better and better?

cschlaepfer - 6 hours ago
Thanks, and great question - we think about this a lot and think there are a couple of things here.
First, as models get better, our agent's ability to navigate a website and generate accurate automation scripts will improve, giving us the ability to more confidently perform multi-step generations and get better at one-shotting automations.
We expect browser agents will improve as well, which I think is more along the lines of what you're asking. At scale, we still think scripts will be better for their cost, performance, and debuggability aspects - but there are places where we think browser agents could potentially fit as an add-on to deterministic workflows (e.g., handling inconsistent elements like pop-ups or modals). That said, if we do end up introducing a browser agent in the execution runtime, we want to be very opinionated about how it can be used, since our product is primarily focused on deterministic scripting.

huntaub - 5 hours ago
> At scale, we still think scripts will be better for their cost, performance, and debuggability aspects
This actually makes a ton of sense to me in lots of the LLM contexts (e.g. seeing how we are starting to prefer having LLMs write one-off scripts to do API calls rather than just pointing them at problems and having them try it directly). Thanks!

smt88 - 6 hours ago
Agents have plateaued and will never be as good as deterministic code for browser automation. It's a fundamental issue with the way LLMs work.

poly2it - 6 hours ago
I feel like I've seen a product similar to this quite recently on HN, but it was a standalone agentic workflow which was open source on GitHub. I can't seem to find it right now. Does anyone know what I'm referring to?

jasonjmcghee - 6 hours ago
There have been a bunch https://hn.algolia.com/?dateRange=pastYear&page=0&prefix=fal...

devmor - 5 hours ago
Wow, really cool project. As someone who's not primarily a frontend developer but has had to write a lot of browser-based feature tests, I love the concept and execution.
Why the subscription model though? That's the one thing that concerns me. Is data being sent back to your servers to enable some of the functionality? I don't speak for my employer here, but as someone who works in the healthcare technology industry, if I wanted to get my bosses to buy into this, I would be looking for something that we license to run on our environments.

cschlaepfer - 4 hours ago
Thank you!
> Why the subscription model though?
The subscription model is primarily to cover the costs of creating and running automations at scale (i.e., LLM code gen and browser uptime) and to build a sustainable business around those features. We included the free tier to give users access to the IDE, but we're committed to adding value beyond just the IDE and subscriptions support that.
> Is data being sent back to your servers to enable some of the functionality?
Yes - we save all notebooks in our database. Since we're working to build a lot of value-add features for hosted executions, having notebooks saved online worked in service of that. That said, we're now thinking about the local-only / no sign-up use case as well. We've gotten a lot of feedback about this, so it's something we're taking seriously now that we've gotten all of the core functionality in.
I will also add, we are HIPAA-compliant for healthcare use cases.
Really appreciate the questions!

innagadadavida - 2 hours ago
I'm curious why you use a hosted browser instead of just spinning one up locally, since you already have the Electron app. Why not just use a different Chrome profile for isolation and interact with that?

cschlaepfer - 2 hours ago
Thanks for the question! We only use the hosted browser for running the automations remotely (via API). In the IDE, we use a local chrome browser, where we spin up an anonymous profile for isolation.

rcarmo - 3 hours ago
I like this and find it profoundly weird in equal parts. I get the use case and why it’s being done in “the browser”, but like RPA and similar tech, I have to wonder at the path that the industry took to get here and make this a viable and clever solution. Somewhere there is a timeline where front-ends evolved differently.

timerol - 6 hours ago
> only available for Mac so far, sorry!
Is there a plan to change this? Building on Electron should make it manageable to go cross-platform. 2026 will be the year of the Linux desktop, as the prophecies have long foretold.
Off-topic, but Kernel refers to https://www.onkernel.com. A bit of an awkward name.

jorrie - 6 hours ago
Yes! Going cross-platform is on the roadmap, but right now we're focused on building out the feature set and improving the form factor.
Also yeah, Kernel refers to onkernel - they call it that because they're running the browsers in a unikernel. They're a great product and you should give them a look if you need hosted browsers!

koakuma-chan - 6 hours ago
Why would something be only available for Mac? What are they using?

dang - an hour ago
In the early stage of a startup/product the most important thing is usually fast development loops, to add features and, most importantly, figure out what the product needs to be. It usually makes sense to focus on that, which means deferring later projects (such as porting to other platforms) which are also important, but aren't part of the figuring-out-the-product loop and so can come later.
This of course is frustrating for users who want the product but aren't using that platform. But also a good sign that someone wants the product enough to complain about this!
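
For readers curious about the "2FA via TOTP" mentioned in the auth-profile reply in the thread, here is a minimal sketch of how a time-based one-time code is derived under RFC 6238 with HMAC-SHA1. It is a generic illustration of the standard, not BrowserBook's code; the function name `totp` and its parameters are assumptions, and base32 decoding of the shared secret (how services usually hand it out) is left out.

```typescript
import { createHmac } from "node:crypto";

// RFC 6238 TOTP using HMAC-SHA1. `secret` holds the raw shared key bytes.
// Hypothetical helper for illustration only.
function totp(
  secret: Buffer,
  unixSeconds: number,
  digits: number = 6,
  stepSeconds: number = 30,
): string {
  // Number of whole time steps since the Unix epoch.
  const counter = Math.floor(unixSeconds / stepSeconds);

  // Encode the counter as an 8-byte big-endian integer.
  const msg = Buffer.alloc(8);
  msg.writeUInt32BE(Math.floor(counter / 0x100000000), 0);
  msg.writeUInt32BE(counter >>> 0, 4);

  const hmac = createHmac("sha1", secret).update(msg).digest();

  // Dynamic truncation (RFC 4226): the low nibble of the last byte picks
  // a 4-byte window; the top bit of that window is masked off.
  const offset = hmac[hmac.length - 1] & 0x0f;
  const binary = hmac.readUInt32BE(offset) & 0x7fffffff;

  // Keep the requested number of decimal digits, left-padded with zeros.
  return (binary % Math.pow(10, digits)).toString().padStart(digits, "0");
}

// RFC 6238 Appendix B test secret (ASCII "12345678901234567890").
const rfcSecret = Buffer.from("12345678901234567890", "ascii");
console.log(totp(rfcSecret, 59, 8)); // prints "94287082"
```

An automation that logs in with a service account would compute the code from the current time, for example `totp(secret, Math.floor(Date.now() / 1000))`, and type it into the 2FA field.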