Instant database clones with PostgreSQL 18

boringsql.com

359 points by radimm 15 hours ago


elitan - 10 hours ago

For those who can't wait for PG18 or need full instance isolation: I built Velo, which does instant branching using ZFS snapshots instead of reflinks.

Works with any PG version today. Each branch is a fully isolated PostgreSQL container with its own port. ~2-5 seconds for a 100GB database.

https://github.com/elitan/velo

Main difference from PG18's approach: you get complete server isolation (useful for testing migrations, different PG configs, etc.) rather than databases sharing one instance.
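The general ZFS trick the tool builds on can be sketched roughly as below. This is not Velo's actual implementation; `tank/pgdata`, the branch naming, and the `pg-branch-start` wrapper are all made-up placeholders, but `zfs snapshot` and `zfs clone` are the real copy-on-write primitives, and both are O(1) regardless of dataset size:

```python
import subprocess

def branch_commands(dataset: str, branch: str, port: int) -> list[list[str]]:
    """Build the ZFS commands for one copy-on-write branch.

    dataset: e.g. "tank/pgdata" (the live instance's data directory).
    branch:  name for the snapshot and clone, e.g. "migration-test".
    """
    snapshot = f"{dataset}@{branch}"
    clone = f"{dataset.rsplit('/', 1)[0]}/branch-{branch}"
    return [
        ["zfs", "snapshot", snapshot],      # point-in-time snapshot, O(1)
        ["zfs", "clone", snapshot, clone],  # writable copy-on-write clone, O(1)
        # A second postgres would then be started against the clone's
        # mountpoint on its own port (hypothetical wrapper script):
        ["pg-branch-start", clone, f"--port={port}"],
    ]

def create_branch(dataset: str, branch: str, port: int) -> None:
    for cmd in branch_commands(dataset, branch, port):
        subprocess.run(cmd, check=True)
```

Because the clone starts as pure metadata and only diverging blocks get copied, branch creation time is independent of database size, which is how a 100GB database branches in seconds.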

sheepscreek - 6 hours ago

I set this up for my employer many years ago when they migrated to RDS. We kept bumping into issues on production migrations that would wreck things. I decided to do something about it.

The steps were basically:

1. Clone the AWS RDS db, or spin up a new instance from a fresh backup.

2. Get the ARN and, from that, the CNAME or public IP.

3. Plug that into the DB connection in your app.

4. Run the migration on pseudo-prod.

This helped us catch many bugs that were specific to the production db or data quirks and would never have been caught locally or even in CI.

Then I created a simple ruby script to automate the above and threw it into our integrity checks before any deployment. Last I heard they were still using that script I wrote in 2016!
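The steps above can be sketched in Python with boto3 (the commenter's script was Ruby, and the snapshot/instance identifiers here are placeholders, not anything from their setup):

```python
def clone_dsn(endpoint: str, db: str = "app") -> str:
    """Step 3: the connection string the app/migration runner gets,
    pointed at the pseudo-prod clone instead of real prod."""
    return f"postgresql://app@{endpoint}:5432/{db}"

def spin_up_pseudo_prod(snapshot_id: str, clone_id: str) -> str:
    """Steps 1-2: restore a throwaway instance from a snapshot and
    return its endpoint hostname. Requires AWS credentials."""
    import boto3  # imported here so the module loads without AWS deps
    rds = boto3.client("rds")
    rds.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier=clone_id,
        DBSnapshotIdentifier=snapshot_id,
    )
    # Restores take a while; block until the instance is reachable.
    rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier=clone_id)
    desc = rds.describe_db_instances(DBInstanceIdentifier=clone_id)
    return desc["DBInstances"][0]["Endpoint"]["Address"]

# Step 4 would be: run your migrations against
#   clone_dsn(spin_up_pseudo_prod("prod-nightly", "pseudo-prod"))
# and fail the deploy if they error.
```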

peterldowns - 8 hours ago

Really interesting article, I didn't know that the template cloning strategy was configurable. Huge fan of template cloning in general; I've used Neon to do it for "live" integration environments, and I have a golang project https://github.com/peterldowns/pgtestdb that uses templates to give you ~unit-test-speed integration tests that each get their own fully-schema-migrated Postgres database.

Back in the day (2013?) I worked at a startup where the resident Linux guru had set up "instant" staging environment databases with btrfs. Really cool to see the same idea show up over and over with slightly different implementations. Speed and ease of cloning/testing is a real advantage for Postgres and Sqlite, I wish it were possible to do similar things with Clickhouse, Mysql, etc.
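For reference, the template trick pgtestdb builds on is a one-liner in Postgres: `CREATE DATABASE ... TEMPLATE` does a file-level copy of an already-migrated template. A minimal sketch (not pgtestdb's actual API; `app_template` is an assumed pre-migrated template database):

```python
import uuid

def per_test_database(template: str = "app_template") -> tuple[str, str, str]:
    """One throwaway database per test, cloned from a migrated template.

    Returns (dbname, create_sql, drop_sql); run the SQL on a maintenance
    connection with any driver. The clone is a file-level copy, so each
    test starts with the full schema in milliseconds. Note the template
    must have no active connections while it is being cloned.
    """
    name = f"test_{uuid.uuid4().hex[:12]}"
    return (
        name,
        f'CREATE DATABASE "{name}" TEMPLATE "{template}"',
        f'DROP DATABASE IF EXISTS "{name}"',
    )
```

Run the migrations once into the template, then every test gets its own fully isolated, fully migrated database without re-running migrations.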

riskable - 7 hours ago

PostgreSQL seems to have become the be-all, end-all SQL database that does everything and does it all well. And it's free!

I'm wondering why anyone would want to use anything else at this point (for SQL).

radarroark - 11 hours ago

In theory, a database that uses immutable data structures (the hash array mapped trie popularized by Clojure) could allow instant clones on any filesystem, not just ZFS/XFS, and allow instant clones of any subset of the data, not just the entire db. I say "in theory" but I actually built this already so it's not just a theory. I never understood why there aren't more HAMT based databases.
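To make the structural-sharing point concrete, here is a toy persistent map along those lines: path copying over a 32-way trie, with plain dicts standing in for the bitmap-compressed nodes a real HAMT would use. "Cloning" a version is just keeping a reference, because an update copies only the handful of nodes on the path to the changed key and shares everything else:

```python
class HAMT:
    """Tiny persistent hash array mapped trie. `set` returns a NEW map
    that shares all untouched subtrees with the old one (path copying)."""
    BITS, MASK = 5, 31  # 32-way branching on 5-bit hash chunks

    def __init__(self, root=None):
        self._root = root or {}

    def set(self, key, value):
        return HAMT(self._set(self._root, key, value, 0))

    def get(self, key, default=None):
        node, shift = self._root, 0
        while True:
            entry = node.get((hash(key) >> shift) & self.MASK)
            if entry is None:
                return default
            if isinstance(entry, dict):          # interior node: descend
                node, shift = entry, shift + self.BITS
                continue
            for k, v in entry:                   # leaf bucket: scan
                if k == key:
                    return v
            return default

    def _set(self, node, key, value, shift):
        idx = (hash(key) >> shift) & self.MASK
        new = dict(node)  # copy only this node; siblings stay shared
        entry = node.get(idx)
        if entry is None:
            new[idx] = [(key, value)]
        elif isinstance(entry, dict):
            new[idx] = self._set(entry, key, value, shift + self.BITS)
        elif shift < 60 and any(k != key for k, _ in entry):
            # slot held by a different key: push it one level down
            child = {}
            for k, v in entry:
                child = self._set(child, k, v, shift + self.BITS)
            new[idx] = self._set(child, key, value, shift + self.BITS)
        else:
            # same key (replace) or hash bits exhausted: linear bucket
            new[idx] = [(k, v) for k, v in entry if k != key] + [(key, value)]
        return new
```

Every `set` yields an independent version, so "cloning" any snapshot of the data is O(1) and filesystem-agnostic, which is exactly the property the comment is pointing at.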

majodev - 10 hours ago

Uff, I had no idea that Postgres v15 introduced WAL_LOG and changed the default from FILE_COPY. For (parallel CI) test envs, it makes so much sense to switch back to the FILE_COPY strategy ... and I previously actually relied on that behavior.

Raised an issue in my previous pet project for doing concurrent integration tests with real PostgreSQL DBs (https://github.com/allaboutapps/integresql) as well.
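For anyone else caught out by this: since v15, `CREATE DATABASE` accepts an explicit `STRATEGY` option, so the pre-15 behavior is one keyword away. A small sketch that just builds the DDL (run it through whatever driver you use; the database/template names are placeholders):

```python
def clone_db_sql(name: str, template: str, strategy: str = "FILE_COPY") -> str:
    """PostgreSQL 15+ CREATE DATABASE with an explicit cloning strategy.

    WAL_LOG (the default since v15) writes every copied block through the
    WAL; FILE_COPY (the pre-15 behavior) copies files and checkpoints
    instead, which tends to be faster for large template databases.
    """
    assert strategy in ("WAL_LOG", "FILE_COPY")
    return f'CREATE DATABASE "{name}" TEMPLATE "{template}" STRATEGY = {strategy}'
```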

BenjaminFaal - 11 hours ago

For anyone looking for a simple GUI for local testing/development of Postgres based applications. I built a tool a few years ago that simplifies the process: https://github.com/BenjaminFaal/pgtt

christophilus - 10 hours ago

As an aside, I just jumped around and read a few articles. This entire blog looks excellent. I’m going to have to spend some time reading it. I didn’t know about Postgres’s range types.

francislavoie - 10 hours ago

Is anyone aware of something like this for MariaDB?

Something we've been trying to solve for a long time is having instant DB resets between acceptance tests (in CI or locally) back to our known fixture state, but right now it takes decently long (like half a second to a couple seconds, I haven't benchmarked it in a while) and that's by far the slowest thing in our tests.

I just want fast snapshotted resets/rewinds to a known DB state, but I need to be using MariaDB since it's what we use in production, we can't switch DB tech at this stage of the project, even though Postgres' grass looks greener.

1f97 - 12 hours ago

aws supports this as well: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...

mvcosta91 - 12 hours ago

It looks very interesting for integration tests

TimH - 11 hours ago

Looks like it would probably be quite useful when setting up git worktrees, to get multiple claude code instances spun up a bit more easily.

wayeq - 5 hours ago

we just build the database, commit it to a container (without volumes attached), and programmatically stop and restart the container per test class (testcontainers.org). the overhead is < 5 seconds and our application recovers to the reset database state seamlessly. it's been awesome.
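That reset pattern can be approximated with plain docker commands, if testcontainers isn't an option. A rough sketch, not the commenter's setup (image and container names are invented); the key idea is that with no volume attached, the fixture data lives in the committed image layer, so a fresh container always comes back to the baked-in state:

```python
import subprocess

def reset_commands(image: str, name: str = "pg-fixture", port: int = 5433):
    """Throw away the container and start a fresh one from the committed
    image. Since the data directory is inside the image (no volume),
    every reset returns to the baked-in fixture state."""
    return [
        ["docker", "rm", "-f", name],
        ["docker", "run", "-d", "--name", name,
         "-p", f"{port}:5432", image],
    ]

def reset(image: str) -> None:
    for cmd in reset_commands(image):
        subprocess.run(cmd, check=False)  # rm may fail before first run
```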

hmokiguess - 6 hours ago

I’ve been a fan of Neon and its branching strategy; really handy for stuff like this.

horse666 - 10 hours ago

This is really cool, looking forward to trying it out.

Obligatory mention of Neon (https://neon.com/) and Xata (https://xata.io/) which both support “instant” Postgres DB branching on Postgres versions prior to 18.

1a527dd5 - 11 hours ago

Many thanks, this solves integration tests for us!

eatsyourtacos - 3 hours ago

I still cannot reliably restore any Postgres DB with the TimescaleDB extension on it.. I've tried a million things but it still fails every time.

oulipo2 - 10 hours ago

Assuming I'd like to replicate my production database for either staging, or to test migrations, etc,

and that most of my data is either:

- business entities (users, projects, etc)

- and "event data" (sent by devices, etc)

where most of the database size is in the latter category, and that I'm fine with "subsetting" those (eg getting only the last month's "event data")

what would be the best strategy to create a kind of "staging clone"? ideally I'd like to tell the database (logically, without explicitly locking it): act as though my next operations only apply to items created/updated BEFORE "currentTimestamp", and then:

- copy all my business tables (any update to those after currentTimestamp would be ignored magically even if it happens during the copy)

- copy a subset of my event data (same constraint)

what's the best way to do this?

tehlike - 6 hours ago

Now i need to find a way to migrate from hydra columnar to pg_lake variants so i can upgrade to PG18.
