Stop Using JWTs
gist.github.com462 points by dzonga a day ago
462 points by dzonga a day ago
Necessary qualifier: for browser-based user sessions.
Plenty of good uses for JWTs for service-to-service communication.
edit: I read some of the linked stuff, e.g. https://paragonie.com/blog/2017/03/jwt-json-web-tokens-is-ba... . Please, if JWTs are such a horrifically insecure standard, go ahead and publish your means for hacking AWS STS's AssumeRoleWithWebIdentity , or don't publish and just exploit it by launching cryptominers in every Fortune 500 production AWS account. Let me know when you inevitably succeed, because JWTs are so insecure, right? /sarcasm
> Necessary qualifier: for browser-based user sessions.
> Plenty of good uses for JWTs for service-to-service communication.
This is the sensible conclusion right there. I agree JWTs are the wrong tool for the use case of user sessions in the browser.
To give some more arguments:
All the signature and encryption stuff in JWTs is complex. While common JWT libraries have now mostly got their stuff together, this has not always been the case. There were plenty of libraries accepting the "none" algorithm [1] or allowing attackers to forge tokens by using a public key as a shared secret [2]. This is the direct result of the complexity criticized in the linked blog post.
JWTs also cannot do some stuff you want for user sessions. You can't invalidate them without keeping a revocation list somewhere. But if you have to check an identifier for revocation on every request you could just use an opaque session ID and look that up on every request instead! Sure, you can use short-lived tokens and refresh them all the time, but why bother with that for a typical application that has to keep some state anyway?
All that being said, I wholeheartedly agree that there are use cases in distributed systems and machine-to-machine communication where signed tokens can be useful. Just please don't confuse the two cases.
[1] https://nvd.nist.gov/vuln/detail/cve-2022-23540
[2] https://nvd.nist.gov/vuln/detail/CVE-2024-54150 (just a random example from googling, I don't know what library made this one infamous)
> if you have to check an identifier for revocation on every request you could just use an opaque session ID and look that up on every request instead!
One reason could be the size. A revocation list only needs to keep session IDs of recently logged-out sessions, for which the token's TTL hasn't yet expired. It may be a much smaller list than a list of every active session.
Also, a JWT (or a Macaroon, etc) can store a large amount of details about the session in a cryptographically secure, unforgeable way. This rids you of the necessity to store all that in your active session database, again cutting the size.
As someone who operates a PostgreSQL database containing 27 billion SSL certificates, each 1-2kb each, with a bunch of secondary indexes that get inserted in random order, I find it pretty incredible that people see the need to optimize their session database. At what scale does the size of the session database actually matter?
Those stateless tokens may be "unforgeable", but they are replayable, and if you're not mindful of that you can have security vulnerabilities.
I think one meaningful case is when you have services in very different locations and you would rather than having to make a request to a session store in a single location, replicate the data to each location for better latency, so in this case a revocation list.
We have a single harbor instance at work, backed by PG. The folks ran into some sort of performance issue during their testing (not documented) and their solution was to use cloud native pg, run two postgres, and add additional second layer of connection pooling.
We eventually hit a bug in a combination of certain images, docker desktop (not even supported by us), and permissions. The error looked related to the connections, so I suggested the pooler be bypassed as a test. They ignored the suggestion for two weeks until it was the last thing to test.
Sure enough bypassing the pooler helped stabilize the connection from harbor during image uploads.
Now I need to talk to them about properly vertical scaling PG before we try and run two with replication, in the same kube cluster, in the same physical data center.
You should do some basic optimizations. Fixed length table and indexes on the unique string for fast lookups. I also like to do a rolling delete for old sessions after 30 days unless mobile session that is logged in. Those get to live forever.
Fair enough, but those optimizations are basically free. People think stateless tokens are free but they really are not.
The cost of the stateless token is basically the CPU usage for signing the message and checking the signature with the public key on the client. Example: Google Compute Instance asks metadata server for OIDC token (which is a JWT). The metadata server respond with the token that basically says "here's the machine service account, here's the machines ID, this token is proof that I am service account abc123 and it's valid for 20 seconds". This is one of the most common uses of JWTs in enterprise. You don't store them. They actually are free.
Lots of web devs get tricked into using them as primary session tokens and it's a huge anti pattern. I see it all the time and people get aggressive about it.
The cost is the vigilance required to use them safely. It's not just compute/storage costs.
I didn't downvote you. You're absolutely right. Implementation of anything is work.
> Fair enough, but those optimizations are basically free. People think stateless tokens are free but they really are not.
Strawman.
The only requirement for a JWT is posting the JSON Web Key set with the public keys used to verify the JWTs signature. That's the full cost of a no-frills JWT implementation of you exclude IAM.
If you want to have one-time JWTs you need to maintain a revocation list. This is literally a set of IDs. If you go nuts and use GUIs for JWT IDs that means each entry takes as much space as 4 ints, and all you need is a set membership check on said integer. Even at FANG scale you can handle that scale in a memory cache service such as ValKey running on a COTS desktop.
Now show us your alternative.
Yes we have heard this before, React is only 30kb! But that misses the enormous amount of infra you need to even just do a basic fetch. (Read the post by the React Query author on whether you need React Query or not)
Likewise with JWTs for sessions you need to handle cache invalidation, revocation lists, key rotation, the list of difficult comp sci problems really does go on!
The same issue as always plaguing the frontend world. Up front “simplicity”, enormous actual complexity
> Yes we have heard this before, React is only 30kb!
Not quite. You might be surprised to know, but the whole JOSE standard, and JWT in particular, specify a very limited set of fields. Whenever anyone starts requiring more than that, the responsibilities start to be offloaded to the likes of OpenID Connect.