How/why to sweep async tasks under a Postgres table

taylor.town

33 points by ostler 5 hours ago


koolba - 3 hours ago

The article says:

> Never Handroll Your Own Two-Phase Commit

And then buried at the end:

> A few notable features of this snippet:

> Limiting number of retries makes the code more complicated, but definitely worth it for user-facing side-effects like emails.

This isn't two-phase commit. This is lock the DB indefinitely while remote system is processing and pray we don't crash saving the transaction after it completes. That locked also eats up a database connection so your concurrency is limited by the size of your DB pool.

More importantly, if the email sends but the transaction to update the task status fails, it will try again. And again. Forever. If you're going to track retries it would have to be before you start the attempt. Otherwise the "update the attempts count" logic itself could fail and lead to more retries.

The real answer to all this is to use a provider that supports idempotency keys. Then when you can retry the action repeatedly without it actually happening again. My favorite article on this subject: https://brandur.org/idempotency-keys

rictic - 3 hours ago

Missing from the article: how to communicate progress and failure to the user?

This is much more complicated with task queues. Doable still! But often skipped, because it's tempting to imagine that the backend will just handle the failure by retrying. But there are lots of kinds of failure that can happen.

The recipient's server doesn't accept the email. The recipient's domain name expired. Actually, we don't have an email address for that recipient at all.

The user has seen "got it, will do, don't worry about it" but if that email is time sensitive, they might want to know that it hasn't been sent yet, and maybe they should place a phone call instead.

nullzzz - 2 hours ago

I can recommend this architecture. So much easier to maintain and understand than using an extra service. The implementation here I didn’t go into much detail, but you can surely roll your own if this doesn’t cut it for you, or use a library like pgboss.

efxhoy - 2 hours ago

I like it! We have a service with a similar postgres task queue but we use an insert trigger on the tasks table that does NOTIFY and the worker runs LISTEN, it feels a bit tidier than polling IMO.

stack_framer - 3 hours ago

> I like slim and stupid servers, where each endpoint wraps a very dumb DB query.

I thought I was alone in this, but I build all my personal projects this way! I wish I could use this approach at work, but too many colleagues crave "engineering."

rgbrgb - 3 hours ago

If you're in TS/JS land, I like to use an open source version of this called graphile-worker [0].

[0]: https://worker.graphile.org

morshu9001 - 3 hours ago

Never do RPCs during an xact like this! Fastest way to lock up your DB. I don't even mean at large scale. I've been forced many times to set up two-phase commit. That way you also get more flexibility and visibility into what it's doing.