Why Strong Consistency?
brooker.co.za65 points by SchwKatze a day ago
65 points by SchwKatze a day ago
I continue to be surprised that in these discussions correctness is treated as some optional highest possible level of quality, not the only reasonable state.
Suppose we're talking about multiplayer game networking, where the central store receives torrents of UDP packets and it is assumed that like half of them will never arrive. It doesn't make sense to view this as "we don't care about the player's actual position". We do. The system just has tolerances for how often the updates must be communicated successfully. Lost packets do not make the system incorrect.
If you’re live streaming video, you can make sure every frame is a P-frame which brings your bandwidth costs to a minimum, but then a lost packet completely permanently disables the stream. Or you periodically refresh the stream with I-frames sent over a reliable channel so that lost packets corrupt the video going forward only momentarily.
Sure, if performance characteristics were the same, people would go for strong consistency. The reason many different consistency models are defined is that there’s different tradeoffs that are preferable to a given problem domain with specific business requirements.
If the video is streaming, people don't really care if a few frames drop, hell, most won't notice.
It's only when several frames in a row are dropped that people start to notice, and even then they rarely care as long as the message within the video has enough data points for them to make an (educated) guess.
P/B frames (which is usually most of them) reference other frames to compress motion effectively. So losing a packet doesn't mean a dropped frame, it means corruption that lasts until the next I-frame/slice. This can be seconds. If you've ever seen corrupt video that seems to "smear" wrong colors, etc. across the screen for a bunch of frames, that's what we're talking about here.
Again - the viewer rarely cares when that happens
Minor annoyance, maybe, rage quit the application? Not a chance.
If you’re never sending an I-frame then it’s permanently corrupt. Sending an I-frame is the equivalent of eventual consistency.
If the area affected literally doesn't change for minutes afterwards it will not get refreshed and fixed.
Okay but now you're explaining that correctness is not necessarily the only reasonable state. It's possible to sacrifice some degree of correctness for enormous gains in performance because having absolute correctness comes at a cost that might simply not be worth it.
Back in the day there were some P2P RTS games that just sent duplicates. Like each UDP packet would have a new game state and then 1 or more repetitions of previous ones. For lockstep P2P engines, the state that needs to be transferred tends towards just being the client's input, so it's tiny, just a handful of bytes. Makes more sense to just duplicate ahead of time vs ack/nack and resend.
I think we should stop calling these systems eventually consistent. They are actually never consistent. If the system is complex enough and there are always incoming changes, there is never a point in time in these "eventually consistent systems" that they are in consistent state. The problem of inconsistency is pushed to the users of the data.
The way you're defining "eventually consistent" seems to imply it means "the current state of the system is eventually consistent," which is not what I think that means. Rather, it means "for any given previous state of the system, the current state will eventually reflect that."
"Eventually consistent," as I understand it, always implies a lag, whereas the way you're using it seems to imply that at some point there is no lag.
Someone else stated this implicitly, but with your reasoning no complex system is ever consistent with ongoing changes. From the perspective of one of many concurrent writers outside of the database there’s no consistency they observe. Within the database there could be pending writes in flight that haven’t been persisted yet.
That’s why these consistency models are defined from the perspective of “if you did no more writes after write X, what happens”.
Changing terminology is hard once a name sticks. But yeah, "eventual propagation" is probably more accurate. I do get the impression that "eventual consistency" often just means "does not have a well-defined consistency model".
They eventually become consistent from the frame of a single write. They would become consistent if you stopped writes, so they will eventually get there
> They are actually never consistent
I don't see it this way. Let's take a simple example - banks. Your employer sends you the salary from another bank. The transfer is (I'd say) eventually consistent - at some point, you WILL get the money. So how it can be "never consistent"?
Because I will have spent it before it becomes available :)
For the record (IMO) banks are an EXCELLENT example of eventually consistent systems.
They're also EXCELLENT for demonstrating Event Sourcing (Bank statements, which are really projections of the banks internal Event log, but enough people have encountered them in such a way that that most people understand them)
If the bank transaction is eventually consistent, it means that the state can flip and the person receiving will "never" be sure. A state that the transaction will be finished later is a consistent state.
Just like Git. Why bother with all these branches, commits and merges?
Just make it so everyone's revision steps forward in perfect lockstep.
> If the system is complex enough and there are always incoming changes
You literally don't understand the definition of eventual consistency. The weakest form of eventual consistency, quiescent consistency, requires [0]:
that in any execution where the updates stop at
some point (i.e. where there are only finitely many updates), there
must exist some state, such that each session converges to that state
(i.e. all but finitely many operations e in each session [f] see that state).
Emphasis on the "updates stop[ping] at some point," or there being only "finitely many updates." By positing that there are always incoming changes you already fail to satisfy the hypothesis of the definition.In this model all other forms of eventual consistency exhibit at least this property of quiescent consistency (and possibly more).
[0] https://www.microsoft.com/en-us/research/wp-content/uploads/...
The GP proposed that the definition should be changed. That in no way implies a lack of understanding of the present definition.
I don't understand this article and It's like the author doesn't really know what they're talking about. They don't want eventual consistency, they want read-your-writes, a consistency level that's stronger than EC yet still not strong.
https://jepsen.io/consistency/models/read-your-writes
Read-your-writes is indeed useful because it makes code easier to write: every process can behave as if it was the only one in the world, devs can write synchronous code, that's great ! But you don't need strong consistency.
I hope developers learn a little bit more about the domain before going to strong consistency.
I am not an expert, but from the examples in the article I think the author is looking for a bit more than read-your-writes.
E.g. They mention reading a list of attachements and want to ensure they get all currently created attachements, which includes the ones created by other processes.
So they want to have "read-all-writes" or something like that.
> read-modify-write is the canonical transactional workload. That applies to explicit transactions (anything that does an UPDATE or SELECT followed by a write in a transaction), but also things that do implicit transactions (like the example above)
Your "implicit transaction" would not be consistent even if there was no replication involved at all. Explicit db transactions exist for a reason - use them.
It's wishful thinking. It's like choosing Newtonian physics over relativity because it's simpler or the equations are neater.
If you have strong consistency, then you have at best availability xor partition tolerance.
"Eventual" consistency is the best tradeoff we have for an AP system.
Computation happens at a time and a place. Your frontend is not the same computer as your backend service, or your database, or your cloud providers, or your partners.
So you can insist on full-ACID on your DB (which it probably isn't running btw - search "READ COMMITTED".) but your DB will only be consistent with itself.
We always talk about multiple bank accounts in these consistency modelling exercises. Do yourself a favour and start thinking about multiple banks.
I keep wondering how the recent 15h outage have affected these eventually consistent systems.
I really hope to see a paper on the effects of it.
The argument seems to rely on the point that the replicas are only valuable if you can send reads to them, which I don't think is true. Eventually-consistent replicated databases are valuable on their own terms even if you can only send traffic to the leader.
in the read after write scenario, why not use something like consistency tokens ? and redirect to primary if the secondary detects it has not caught up ?
Blogs like this make me go on the same rant for the n-th time:
Consistency for distributed systems is impossible without APIs returning cookies containing vector clocks.
The idea is simple: every database has a logical sequence number (LSN), which the replicas try to catch up to -- but may be a little bit behind. Every time an API talks to a set of databases (or their replicas) to produce a JSON response (or whatever), it ought to return the LSNs of each database that produced the query in a cookie. Something like "db1:591284;db2:10697438".
Client software must then union this with their existing cookie, and return the result of that to the next API call.
That way if they've just inserted some value into db1 and the read-after-write query ends up going to a read replica that's slightly behind the write master (LSN 591280 instead of 591284) then the replica can either wait until it sees LSN >= 591284, or it can proxy the query back to the write master. A simple "expected latency of waiting vs proxying" heuristic can be used for this decision.
That's (almost entirely) all you need for read-after-write transactional consistency at every layer, even through Redis caches and stateless APIs layers!
FWIW, I think that’s essentially how Aurora DSQL works, and sort of explained at the end of the article.
For the love of all that’s holy, please stop doing read-after-write. In nearly all cases, it isn’t needed. The only cases I can think of are if you need a DB-generated value (so, DATETIME or UUIDv1) from MySQL, or you did a multi-row INSERT in a concurrent environment.
For MySQL, you can get the first auto-incrementing integer created from your INSERT from the cursor. If you only inserted one row, congratulations, there’s your PK. If you inserted multiple rows, you could also get the number of rows inserted and add that to get the range, but there’s no guarantee that it wasn’t interleaved with other statements. Anything else you wrote, you should already have, because you wrote it.
For MariaDB, SQLite, and Postgres, you can just use the RETURNING clause and get back the entire row with your INSERT, or specific columns.
> please stop doing read-after-write
But that could be applied only in context of a single function. What if I save a resource and then mash F5 in the browser to see what was saved? I could hit a read replica that wasn't fast enough and the consistency promise breaks. I don't know how to solve it.
Yep. Your SQL transactions are only consistent to the extent that they stay in the db.
Mashing F5 is a perfect example of stepping outside the bounds of consistency.
If want to update a counter, do you read the number on your frontend, add 2 then send it back to the backend? If someone else does the same, that's a lost write regardless of how "strongly consistent" your db vendor promises to be.
But that's how the article says programmers work. Read, update, write.
If you thought "that's dumb, just send in (+2)", congrats, that's EC thinking!
Local storage, sticky sessions, consistent hashing cache
I think the point is that read-after-write is exactly the desired property here.
So why isn't the section that needs consistency enclosed in a transaction, with all operations between BEGIN TRANSACTION and COMMIT TRANSACTION? That's the standard way to get strong consistency in SQL. It's fully supported in MySQL, at least for InnoDB. You have to talk to the master, not a read slave, when updating, but that's normal.