I am building a cloud
crawshaw.io
206 points by bumbledraven 3 hours ago
> Making Kubernetes good is inherently impossible, a project in putting (admittedly high quality) lipstick on a pig.
So well put, my good sir, this describes exactly my feelings with k8s. It always starts off all good with just managing a couple of containers to run your web app. Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.
After spending a lot of time "optimizing" or "hardening" the cluster, cloud spend has doubled or tripled. Incidents have also doubled or tripled, as has downtime. Debugging effort has doubled or tripled as well.
I ended up nuking the cluster, booted up a single VM with debian, enabled the firewall and used Kamal to deploy the app with docker. Despite having only a single VM rather than a cluster, things have never been more stable and reliable from an infrastructure point of view. Costs have plummeted as well, it's so much cheaper to run. It's also so much easier and more fun to debug.
And yes, a single VM really is fine. You can get REALLY big VMs, which are plenty for most business applications like the ones we run. Most business applications only have hundreds to thousands of users. The cloud provider (Google in our case) manages hardware failures. If we need to do an upgrade that requires downtime, we spin up a second VM next to it, provision it, and update the IP address in Cloudflare. Not even any need for a load balancer.
If you replaced k8s with a single app on a single VM then you’ve taken a hype fuelled circuitous route to where you should have been anyway.
> Agents, by making it easiest to write code, means there will be a lot more software. Economists would call this an instance of Jevons paradox. Each of us will write more programs, for fun and for work.
There is already so much software out there that isn't used by anyone. Just take a look at any app store. I don't understand why we are so obsessed with cranking out even more, whereas the obvious use case for LLMs should be to write better software. Let's hope the focus shifts from code generation to something else. There are many ways LLMs can assist in writing better code.
> Let's hope the focus shifts from code generation to something else. There are many ways LLMs can assist in writing better code.
My view is actually the opposite. Software is now cattle, not pets. We should use one-offs. We should use micro-scale snippets. Speaking a language should be equivalent to programming. (I know, it's a bit of a pipe dream.)
In that sense, exe.dev (and Tailscale) are a bit like pet-driven projects.
Sometimes “better” means “customized for my specific use case.” I expect that there will be a lot of custom software that never appears in any app store.
The number of single-purpose scripts in my ~/playground/ folder has increased dramatically over the past year. Super useful, but not in any way shareable. E.g. “parse this Excel sheet I got from my obscure bank and upload it to my budgeting app’s REST API”. Wouldn’t have had the time or energy to do this before; now I have it and it scratches an itch.
This. Just today I added a full on shopping list system to our internal dashboard at work (small business) simply because it was slightly annoying and could be solved in 3 prompts and 15 minutes.
> I don't understand why we are so obsessed with cranking out even more... the obvious usecase for LLMs should be to write better software
I honestly think this is ideal. Video games aside, I think one day we'll look back and realize just how insane it was that we built software for millions or even billions of users to use. People can now finally build the software that does exactly what they've wanted their software to do without competing priorities and misaligned revenue models working against them. One could argue this kind of software, by definition, is higher quality.
Both will likely happen to some degree.
As for the average quality: it’s unclear.
My intuition is that agents lift up the floor to some degree, but at the same time will lead to more software being produced that’s of mediocre quality, with outliers of higher quality emerging at a higher rate than before.
Alas, we shifted from quality to quantity somewhere in the mid 19th century.
Potentially useful context: OP is one of the cofounders of Tailscale.
> Traditional Cloud 1.0 companies sell you a VM with a default of 3000 IOPS, while your laptop has 500k. Getting the defaults right (and the cost of those defaults right) requires careful thinking through the stack.
I wish them a lot of luck! I admire the vision and am definitely a target customer, I'm just afraid this goes the way things always go: start with great ideals, but as success grows, so must profit.
Cloud vendor pricing often isn't based on cost. Some services they lose money on, others they profit heavily from. These things are often carefully chosen: the type of costs that only go up when customers are heavily committed—bandwidth, NAT gateway, etc.
But I'm fairly certain OP knows this.
I was curious, so I actually tested this.
Using fio:

Hetzner (cx23, 2 vCPU, 4 GB)
- ~3900 IOPS (read/write)
- ~15.3 MB/s
- avg latency ~2.1 ms
- 99.9th percentile ≈ 5 ms
- max ≈ 7 ms

DigitalOcean (SFO1 / 2 GB RAM / 30 GB Disk)
- ~3900 IOPS (same!)
- ~15.7 MB/s (same!)
- avg latency ~2.1 ms (same!)
- 99.9th percentile ≈ 18 ms
- max ≈ 85 ms (!!)

Using sequential dd:

Hetzner: 1.9 GB/s
DO: 850 MB/s
Both are low-end plans, but the Hetzner instance is 4 euro and the DO instance is $18.
> Cloud vendor pricing often isn't based on cost.
Business 101 teaches us that pricing isn't based on cost. Call it top-down vs bottom-up pricing, but the first-principles "it costs me $X to make a widget, so I sell the product for $Y = 1.y * $X" is not how pricing works in practice.
Many cloud vendors have you pay through the nose for IOPS and bandwidth.
Edit: I posted this before reading, and these are the same two he points out.
Yes, but you can’t directly compare SAN-style storage with local NVMe. I agree that it’s too expensive, though not nearly as insane as the bandwidth pricing. If you go to a vendor and ask for a petabyte of storage, and it needs to be fully redundant, and you need the ability to take PIT-consistent multi-volume snapshots, be ready to pay up. And this is what’s being offered here.
And yes, IO typically happens in 4 KB blocks, so you need a decent number of IOPS to get the full bandwidth.
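A quick sanity check on the numbers in this subthread: throughput is just IOPS times block size. This is a rough sketch that assumes 4 KiB blocks (the fio block size is not stated above, so that part is an assumption), and the function names are made up for illustration.

```python
# Throughput implied by an IOPS figure at a given block size.
# Assumption: 4 KiB blocks; the thread does not state the fio block size.

BLOCK_SIZE_BYTES = 4 * 1024  # 4 KiB


def throughput_mb_per_s(iops: float, block_size: int = BLOCK_SIZE_BYTES) -> float:
    """Return throughput in MB/s (1 MB = 10**6 bytes)."""
    return iops * block_size / 1_000_000


def iops_needed(target_mb_per_s: float, block_size: int = BLOCK_SIZE_BYTES) -> float:
    """Return the IOPS required to sustain a target throughput in MB/s."""
    return target_mb_per_s * 1_000_000 / block_size


if __name__ == "__main__":
    # IOPS figures quoted in the thread: cloud default, measured VPS, laptop.
    for label, iops in [("cloud default", 3_000), ("measured VPS", 3_900), ("laptop", 500_000)]:
        print(f"{label}: {iops} IOPS -> {throughput_mb_per_s(iops):.1f} MB/s")
    print(f"IOPS needed for 1 GB/s: {iops_needed(1000):.0f}")
```

At 4 KiB, 3,900 IOPS works out to about 16 MB/s (roughly 15.2 MiB/s), which lines up with the ~15.3 to ~15.7 MB/s fio figures above, and sustaining 1 GB/s at that block size needs on the order of 244k IOPS.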
Nice post. exe.dev is a cool service that I enjoyed.
I agree there is opportunity in making LLM development flows smooth, paired with the flexibility of root-on-a-Linux-machine.
> Time and again I have said “this is the one” only to be betrayed by some half-assed, half-implemented, or half-thought-through abstraction. No thank you.
The irony is that this is my experience of Tailscale.
Finally, networking made easy. Oh god, why is my battery doing so poorly. Oh god, it's modified my firewall rules in a way that's incompatible with some other tool, and the bug tracker is silent. Now I have to understand their implementation, oh dear.
No thank you.
I find it difficult to configure Tailscale for my use case because they seem to completely not support making ACL rules based on the identity of the device rather than a part of the address space. I'm not configuring a router here, I'm configuring a peer-to-peer networking layer... or at least I'm supposed to be...
I remember from the docs you can use node names. At the very least you can use tags for sure. Assign tags to nodes and define the ACL based on those.
I just use Hetzner.
Everything the cloud companies provide just costs so much. My own Postgres, running with an HA setup and backups, costs me 1/10th the price of RDS or Cloud SQL, and it has been running in production for over 10 years with no downtime.
I autoscale instances directly off the metrics harvested from Grafana, and it works fine for us. We have the autoscaler configured via webhooks. Very simple, and it has never failed us.
I don't know why I would ever use GCP or AWS anymore.
All my services are fully HA, and backups work like a charm every day.
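For anyone wondering what a webhook-driven autoscaler boils down to, here is a minimal sketch of the decision step only. The commenter's actual setup is not described, so everything here (the `ScalePolicy` name, the CPU thresholds, the instance bounds) is hypothetical; a real handler would read the metric from the alert payload and then call the provider's API.

```python
from dataclasses import dataclass


@dataclass
class ScalePolicy:
    # All values are illustrative assumptions, not taken from the thread.
    min_instances: int = 2
    max_instances: int = 10
    scale_up_cpu: float = 0.75    # add an instance above this average CPU
    scale_down_cpu: float = 0.30  # remove an instance below this average CPU


def desired_instances(current: int, avg_cpu: float, policy: ScalePolicy) -> int:
    """Return the instance count to converge to, one step at a time."""
    if avg_cpu > policy.scale_up_cpu:
        target = current + 1
    elif avg_cpu < policy.scale_down_cpu:
        target = current - 1
    else:
        target = current
    # Clamp so the fleet never drops below the floor or exceeds the ceiling.
    return max(policy.min_instances, min(policy.max_instances, target))
```

Stepping by one instance per alert keeps the logic easy to reason about; the gap between the two thresholds is what prevents flapping.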
Companies buy cloud services because they want to reduce in-house server management and operations. For them it's a trade-off against hiring the right people. But you are right: when you can find the right people, doing it yourself can be a lot cheaper.
I get the feeling that with LLMs in the mix, in-house server management can do a lot more than it used to.
Perhaps it saves some time looking through the docs, but do you really trust an LLM to do the actual work?
Yes, and an LLM checks it as well. I have yet to find a sysadmin task that an LLM couldn't solve neatly.
A nice bonus is that sysadmin tasks tend to be light in terms of token usage, that’s very convenient given the increasingly strict usage limits these days.
Agree, I used to always use Heroku or Render style platforms for my own software, but nowadays I just have a Linux server with Docker Compose and a cron job. Every minute the cron job runs docker compose pull (downloads the latest image) and docker compose up -d (switches to the new version only if there is one). And I put Caddy in front for HTTPS. This has been very cheap and reliable for years now.
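A minimal sketch of that pull-then-up loop, for illustration. The directory `/srv/app` and the function names are assumptions, not the commenter's actual setup; the `run` parameter exists only so the commands can be checked without Docker installed.

```python
import subprocess
from typing import Callable, List

# Assumption: the compose file lives in this (hypothetical) directory.
COMPOSE_DIR = "/srv/app"


def deploy_commands() -> List[List[str]]:
    """Commands run on every cron tick, in order."""
    return [
        # Fetch newer images if the registry has them; quiet keeps cron mail short.
        ["docker", "compose", "pull", "--quiet"],
        # Recreate only the containers whose image or config changed.
        ["docker", "compose", "up", "-d"],
    ]


def run_deploy(run: Callable[..., object] = subprocess.run, cwd: str = COMPOSE_DIR) -> None:
    """Run each command in the compose directory, stopping on the first failure."""
    for cmd in deploy_commands():
        run(cmd, cwd=cwd, check=True)
```

Saved as a script that calls `run_deploy()`, this could be scheduled with a crontab entry such as `* * * * * /usr/bin/python3 /srv/app/deploy.py` (path hypothetical). Because `up -d` leaves unchanged containers alone, running it every minute is effectively a no-op until a new image appears.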
What images are you running that you'd need the latest version up after just a minute?
I'm not the OP but I'd clarify the cron check for new versions is done every minute. So when new images are pushed they're picked up quickly.
OP is not saying they push new versions at such a high frequency that they need checks every minute.
The choice of one minute vs 15 minutes is an implementation detail, and when architected like this it costs nothing.
I hope that helps. Again this is my own take.
Especially these days you can SSH to a baremetal server and just tell Claude to set up Postgres. Job done. You don't need autoscaling because you can afford a server that's 5X faster from the start.
You just use docker.
It is like 4 lines of config for Postgres, the only line you need to change is on which path Postgres should store the data.
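To make the "only change the data path" point concrete, here is a rough sketch that builds the equivalent `docker run` command instead of a compose file. The function name, container name, and the `postgres:16` tag are assumptions for illustration; `/var/lib/postgresql/data` is the data directory used by the official image at that version.

```python
from typing import List


def postgres_run_command(data_dir: str, password: str, image: str = "postgres:16") -> List[str]:
    """Build a docker run invocation for a single Postgres container."""
    return [
        "docker", "run", "-d",
        "--name", "postgres",              # hypothetical container name
        "--restart", "unless-stopped",     # come back up after a reboot
        "-e", f"POSTGRES_PASSWORD={password}",
        # The one line you actually need to change: where the data lives on the host.
        "-v", f"{data_dir}:/var/lib/postgresql/data",
        # Bind to localhost only, so the database is not exposed publicly.
        "-p", "127.0.0.1:5432:5432",
        image,
    ]
```

Pointing `data_dir` at a mount on a separate disk is also the simplest way to keep the database storage apart from the OS disk.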
You also probably want the Postgres storage on a different (set of) disks.
Maybe change the filesystem?
Honestly I like Hetzner a lot, but lately it has been very unstable for us. https://status.hetzner.com/ always has a couple of incidents happening at the same time. I really appreciate the services they provide, but I wish they were more stable.
Because if I have a government service with millions of users, I don’t want the cheap shitter servers to crap out on me.
An employee is going to cost anywhere between 8k and 50k per month. Hiring an employee to save 200/month on servers by using a shitty VPS provider is not saving you any money.
If you have millions of users, you absolutely need to have someone whose whole job is managing servers. Expecting servers or cloud services to not crap out on you without someone with the skills and time to keep things running seems foolish.
I'm still new to cloud computing. I've only ever used Linode. What is this supposed to be? I couldn't really figure out a specific design from the article. Pls help
> The standard price for a GB of egress from a cloud provider is 10x what you pay racking a server in a normal data center.
Oh, that’s too kind. More like 100x to 1000x. Raw bandwidth is cheap.
That's really cool!
One thing I'm confused about is how to create a shared resource, e.g. a Redis server, and connect to it from other VMs. Right now it looks quite cumbersome to set up Tailscale or connect via SSH between VMs. Also, what about egress? My guess is that all traffic is billed at $0.07 per GB. It looks like this cloud is made for running stateful agents and isolated personal projects, and that distributed systems or horizontal scaling aren't a good fit for it?
Also, I'm curious: why not a Railway-like pricing model, billed per resource utilization? It’s very convenient, and I would argue it's made for the agent era.
I set up a Railway project for my friends and family that spawns a VM with a disk (a stateful service) via a tg bot and runs an OpenClaw-like agent. It costs me something like $2 to run 9 VMs like this.
That's insane funding so congrats.
Just shows I'm the Dropbox commentator. I have what exe provides on my own and am shocked by the value these abstractions provide everyone else!! One-off containers on my own hardware spin up, spin down, run async agents, etc., with Tailscale auth, and the team can share or connect easily by name.
Investment is done by relationships, belief in a future vision and team, and growth metrics like number of paying customers.
The technology itself in its current form is not valuable.
Sobering comment for all the little people like myself who dream of owning a business based on a vision of cool tech that just does what it promises (as opposed to all the corporate shovelware out there)
You can still do that. Not every business needs to be a hyperscaling startup.
Much respect for the ambitious plan, I wish I could do such bold thinking. I have been running a small PHP PaaS (fortrabbit) for more than 10 years. For me, it's not only "scratch your own itch", but also "know your audience". So, a limited feature set with a high level of abstraction can also be useful for some users > clear path.
Hahaha! Have fun! I’m doing the same, together with Claude Code. Since August. With HTTPS (mTLS 1.3) everywhere, because I can. Just my money, just my servers, just for me. Just for fun. And what fun it is!
Me too. I already moved our products to it and it is getting fairly robust. Guess many smaller companies got tired of the big guys asking a lot of money for things that should be cheap.
The "one price" is oddly small for a cloud company. I'm sure it's nice and fast but the $20/mo seems smaller than some companies' free tiers, especially for disk.
The main reason clouds offer network block devices is abstraction.
Don’t worry - that will certainly change in the future if they have any kind of success :)
I don't get it, what is this, how is it different?
As I understand, a cloud provider where instead of paying for each VM (with a set of resources), you pay for the resources, and can get as many VMs as you can fit on these resources.
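A toy model of that idea, to show the difference from per-VM billing: you buy one pool of resources and carve VMs out of it until something runs out. This is only a sketch with strict accounting; the class names and sizes are invented, and a real provider may well share or overcommit CPU and memory rather than reserve them.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Resources:
    cpus: int
    memory_gb: int
    disk_gb: int


@dataclass
class Pool:
    """A fixed bundle of resources that any number of VMs can be carved from."""
    capacity: Resources
    vms: Dict[str, Resources] = field(default_factory=dict)

    def used(self) -> Resources:
        return Resources(
            cpus=sum(v.cpus for v in self.vms.values()),
            memory_gb=sum(v.memory_gb for v in self.vms.values()),
            disk_gb=sum(v.disk_gb for v in self.vms.values()),
        )

    def create_vm(self, name: str, size: Resources) -> bool:
        """Add a VM if it fits in what is left of the pool; return success."""
        u = self.used()
        fits = (
            u.cpus + size.cpus <= self.capacity.cpus
            and u.memory_gb + size.memory_gb <= self.capacity.memory_gb
            and u.disk_gb + size.disk_gb <= self.capacity.disk_gb
        )
        if fits and name not in self.vms:
            self.vms[name] = size
            return True
        return False
```

The bill is tied to `capacity`, not to `len(pool.vms)`: ten tiny VMs and one large one cost the same as long as they fit in the same pool.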
Did... did you just scare Microsoft? They now announced a similar thing https://x.com/satyanadella/status/2047033636923568440
Congrats. Just checked your homepage. I love the fact you also show this comment
"That must be worst website ever made"
Made me love the site and style even more
I welcome the initiative, but it’s pretty costly compared to the bare metal cloud providers. So the value has to be in the platform as a service too.
What will happen to my "Grandfathered Plan"? I signed up to test it, and I don't recall if I gave you my credit card.
exe.dev. 111 IN A 52.35.87.134
52.35.87.134 <- Amazon Technologies Inc. (AT-88-Z)
Their first location (PDX) is on Amazon I believe and not accepting new customers. They’ve said it’s much more expensive for them than the others. Their other locations are listed here:
Well yes, because they needed high availability and flexibility and tons of features…
Hey wait a minute!
With LLMs there is no real dev velocity penalty of using high perf. langs like say Rust. A pair of 192 Core AMD EPYC boxes will have enough headroom for 99.9% of projects.
This statement is so off:
"In some tech circles, that is an unusual statement. (“In this house, we curse computers!”) I get it, computers can be really frustrating. But I like computers. I always have. It is really fun getting computers to do things. Painful, sure, but the results are worth it. Small microcontrollers are fun, desktops are fun, phones are fun, and servers are fun, whether racked in your basement or in a data center across the world. I like them all."
The reality: Everyone reading his blog or this HN entry loves computers.
Article doesn’t really tell what fundamental problems will be solved, except fancy VM allocation. Nothing about hardware, networking, reliability, tooling and such. Well, nice, good luck.
You can run several VMs or containers with isolation on your phone hardware, so why even use the cloud when you just want to show your friends?
For me it’s so my coding agent keeps running when I close my laptop lid and it goes to sleep. VM in the cloud because I’m too lazy to set up a computer to be running as a server all the time.
How difficult is it to build a second startup on the side?
Why is an imperative SSH interface a better way of setting cloud resources than something like OpenTofu? In my experience humans and agents work better in declarative environments. If an OpenTofu integration is offered in the future, will exe.dev offer any value over existing cost-effective VPS providers like Hetzner? Technically, Hetzner, for example, also allows you to set up shared disk volumes:
https://github.com/hetzneronline/community-content/blob/mast...
It also has a CLI, hcloud. Am I getting any value with exe.dev I couldn't get with an 80 line hcloud wrapper?
I don't think SSH vs OpenTofu is the core issue here.
For agents, declarative plans are still valuable because they are reviewable. The interesting question is whether exe.dev changes the primitive: resource pools for many isolated VM-like processes, or just nicer VPS provisioning.
I know it's a personal blog, but the writing style is really full of itself. What a martyr, starting a second company.
It's hard to see the scale of what he's doing. Could be:
- I'm building a server farm in my homelab.
- I'm doing a small startup to see if this idea works.
- We're taking on AWS by being more cost effective. Funding secured.
> 100 GB data transfer+
> $20 a month
2025 or 2005, what's the difference?