In praise of memcached

249 points by j03b 17 hours ago

There's nothing about memcache that avoids these problems.

Back in the mid-2000s I worked on a scaled system that used memcache, and developers fell victim to all the exact same problems that are cited with Redis in the article.

- Developers attempting to end-run every one of the laws of distributed systems by using memcache.

- We had cache addiction, so the fleets got sized on the assumption that memcache was up, and then memcache had a problem, and suddenly we were DDOSed.

- write amplification, where one host would nuke a high-TPS key and every other host would DDOS a dependency to repopulate the key.

- hot keys which led to hot hosts and because we cohosted memcached with the service daemons, it meant mystery CPU spikes.

- stickiness from stale DNS entries causing memcache calls to blackhole.

Every single one was avoidable by using memcache in a better way, but the temptation to abuse it was too strong.

downsplat - 8 hours ago

Redis is a great piece of tech but it suffers from trying to be good at two different jobs (persistent data structures, volatile cache) which should not be combined. And indeed in Redis itself they don't combine well - persistence is globally on or off.

Personally I'd use memcached or some equivalent strictly for caching, and then bring on Redis with persistence if you need its data structures for e.g. scoreboards.

At $WORK we never imported either, our cache layer for slow operations keeps its data in both the filesystem and a db table (used as a k/v store). The database helps coordinate thundering herd problems - this operation is being calculated by another thread, so just wait for it. Reads from the same server just hit the filesystem, and reads from another server hit the db once and then keep it in the filesystem. We could change the fs layer to memcached but so far it's working great.
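The "this operation is being calculated by another thread, so just wait for it" idea can be sketched in a few lines. This is only an in-process illustration using threads, not the db-table coordination described above; `SingleFlightCache` and the `compute` callback are hypothetical names.

```python
import threading

class SingleFlightCache:
    """In-process cache where concurrent misses on one key share one computation."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}    # key -> cached value
        self._inflight = {}  # key -> Event set when the leader finishes

    def get(self, key, compute):
        while True:
            with self._lock:
                if key in self._values:
                    return self._values[key]
                event = self._inflight.get(key)
                leader = event is None
                if leader:
                    # First caller for this key does the work.
                    event = threading.Event()
                    self._inflight[key] = event
            if leader:
                try:
                    value = compute()
                    with self._lock:
                        self._values[key] = value
                    return value
                finally:
                    with self._lock:
                        del self._inflight[key]
                    event.set()
            else:
                # Someone else is computing: wait, then re-check the cache.
                event.wait()
```

If the leader's `compute` raises, the waiters wake up, find no value, and one of them becomes the new leader. Across several servers the same role would have to be played by a shared store, which is what the db table does in the setup above.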

roryirvine - 6 hours ago

I had some experience wrangling Memcachedb (memcache + bdb for persistence) in the late 2000s, and came to much the same conclusion.
Redis was definitely more featureful, and antirez is both an engaging character and admirably humble, so I can see why redis overtook it in popularity - but, for me, memcached has always been the pinnacle of "choose boring technology".
As a platform engineer, I'm happy to support either - but when developers start using some of the more advanced redis features (persistence, replication, clustering), I try to make sure that they've fully understood the downsides of that decision.
0xCAP - 7 hours ago

> We could change the fs layer to memcached but so far it's working great.
This so much. (Ab)Using a db table as a k/v store + the FS can do so much before even considering paying the price of setting up a dedicated caching store. I’ve fought countless foes in the engineering world when proposing solutions like yours just because (incompetent) people feel like caching should live in its dedicated store.
- ritcgab - 4 hours ago
  
  Because a database is a kv store. Most workloads won't tell the performance difference as long as the store works.

kylewpppd - 15 hours ago

I think I've seen all of the Redis/Valkey issues the author mentioned in production.

* Outages where Valkey had no memory policy, ate all the memory, and then caused write errors to its append-only file. Bonus points for another one where the disk itself was full, and AOF writes failed.

* 500s where Redis was fully expected to be live, running, and populated with data for every user, and no fallback to a slower path.

* Creative uses of sorted sets and other data structures which depended on the sets never being evicted.
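To make the "no fallback to a slower path" failure concrete, here is a rough sketch of a read path that treats the cache as optional. `DictCache`, `CacheUnavailable`, and `get_with_fallback` are made-up names standing in for a real client; the only point is that a miss or an outage degrades to the source instead of a 500.

```python
class CacheUnavailable(Exception):
    """Raised by the (hypothetical) cache client when the server is unreachable."""

class DictCache:
    """Stand-in for a real cache client; `up` simulates an outage."""
    def __init__(self):
        self.data = {}
        self.up = True

    def get(self, key):
        if not self.up:
            raise CacheUnavailable(key)
        return self.data.get(key)

    def set(self, key, value):
        if not self.up:
            raise CacheUnavailable(key)
        self.data[key] = value

def get_with_fallback(cache, key, load_from_source):
    """Return the cached value, or use the slow path on a miss or an outage."""
    try:
        value = cache.get(key)
        if value is not None:
            return value
    except CacheUnavailable:
        # Cache is down: serve from the source and skip the write-back.
        return load_from_source(key)
    value = load_from_source(key)
    try:
        cache.set(key, value)
    except CacheUnavailable:
        pass  # failing to populate the cache must not fail the request
    return value
```

The fallback only helps if the source can actually absorb the traffic when the cache disappears, which is the fleet-sizing problem raised at the top of the thread.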

Despite the observations from the field, I think it's still hard to recommend memcache ahead of Redis. It can be difficult to architect an app to have a memcache-friendly cache layout.

I'd almost guarantee a large enough team using memcache will find a way to need Redis. And then we're maintaining 2 cache technologies.

calpaterson - 8 hours ago

Once someone decides they want to use redis as something other than a cache, you sort of do have 2 cache technologies anyway. You can't use a redis instance that is configured for caching for any other purpose (caching instance must have eviction, non-caching instance must not have eviction). You need a second redis with a different configuration.
Honestly designing your app to have a "memcache-friendly cache layout" is the same thing as designing it to have a redis-friendly cache layout. The pattern for this kind of application cache is identical: "get, and if not there, calculate and set".
- tracker1 - 3 hours ago
  
  Redis can also cache sets with correlated sub-data as part of the model/eviction pattern.. this can give you trends beyond simple k/v
tracker1 - 3 hours ago
I tend to write an abstraction interface, if there isn't already one, where you request a key and pass an async function/lambda that will return the value from source in case of a cache miss.
```
    var value = cache.lookup<T>(
      keyname, 
      () => db.query<T>(...), 
      TimeSpan.FromMinutes(5) // or CacheOptions
    );
```
This way it can fallback/insert on a cache miss directly...
teacpde - 12 hours ago

Not maintaining 2 cache technologies is always a winning argument.

nasretdinov - 12 hours ago

One other feature of memcache that is rarely mentioned is that all operations are O(1) by design, which is a conscious design choice from the authors: yes, it is limiting, but it also ensures no random stalls on simple operations, whereas Redis with its single-threaded core design can't guarantee that since you can run operations of arbitrary complexity (which surely as a developer makes you feel very smart about it) and everything else will be waiting for them to complete.

jdw64 - 11 hours ago

This kind of thing tends to happen a lot with open source projects or programs that are maintained long term. As the codebase grows, it inevitably starts supporting things that weren't part of the original plan.

More features mean more users. Some stick to the old stuff, some embrace the new, and eventually certain values become the de facto default, not really optional anymore.

Take Redis. Turn off AOF and it works as a volatile in memory cache. But most of us don't even think about it that way. So there is this argument that fewer features and simpler is better.(Memcached is such an example in this context) The so called 'straitjacket' approach. That makes total sense for big teams. But on the other hand, open source projects need regular updates to keep getting funding or contributions, so there is a built in tension.

And sometimes that leads to specialized forks or spin offs that excel in one niche area. My personal take? There is no right answer. It all depends on the context. Communication itself isn't free, after all

stuaxo - 9 hours ago

"Communication itself isn't free, after all"
Off topic, but that's my problem with microservices, devs seem to be totally unaware of this.
- inigyou - 5 hours ago
  
  That's a decade old take. I don't think people are doing microservices any more.
  - ericyd - 18 minutes ago
    
    oh yeah definitely glad we have no more microservices in prod anywhere, that would be a mess
  - tomnipotent - 3 hours ago
    
    More than two decades, folks were grumbling about this in the SOA days.
kawsper - 10 hours ago

I think the clearest example of that is that people think Redis can only function as a cache that loses data on crash or shutdown.
I think that’s because people replaced Memcached with Redis, and expect the same from it.
a34729t - 4 hours ago

AOF at scale causes failures, so you turn it off. Still makes a great cache though.

AussieWog93 - 13 hours ago

I've done a bunch of Flask work over the past couple of years - not full time but as part of the tech stack for my small eCommerce business. Have run into all kinds of footguns and weirdness with MongoEngine, SQLAlchemy, Celery (seriously, if you value your sanity, don't use Celery!), the Python stacks for Google, eBay and Shopify but never Redis.

Perhaps that's because I'm not giving admin access to random people who think that Redis is a persistent storage, but honestly it's one of those technologies I'd describe as absolutely rock solid and well designed. The API is dead basic and every time I need to do something slightly weird, there's a sensible and well thought out way to achieve it.

hosteur - 11 hours ago

I am currently in the process of starting a project with Flask, SQLAlchemy, Celery. Say more about why I should avoid Celery and what to use instead.
- AussieWog93 - 8 hours ago
  
  Things like chaining, groups, named queues just don't work the way you'd think they would. There's a lot of footguns and things require weird workarounds. Error reporting is misleading.
  It's not bad enough that I had to pull it from the old project that used it, but going forward the new ones used a vibecoded queueing system that was genuinely more reliable than Celery but consumed a lot of memory (RSS inflation). Have since shifted to rq and at least for now it seems to "just work". You're better off doing anything custom/complex (like dependencies, or progress updates across multiple tasks) directly yourself in Redis anyway, since half the time Celery's less-well-trodden inbuilt features don't work the way they should.
  - msandford - 6 hours ago
    
    Huh that's interesting! I found celery to mostly match my expectations. I used it in a couple of django apps. My only real foot gun was around having to set an EAGER setting for local development or tasks never got executed.
    How did you find your expectations and celery's actual semantics to be different? I'm trying to document well and it seems like I might have some implicit assumptions that I could make explicit, but I don't know what they are since they're already in my head and matching celery it seems.
- bloppe - 5 hours ago
  
  What are you using Celery for? Do you need to be able to recover from a reboot or crash with your queues intact? Is it a distributed system as opposed to a single machine? Do you have complex multi-step workflows?
  If you answered no to each question, just `from queue import Queue`.
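For the all-no case, a small sketch of what an in-process job queue on the standard library's `queue.Queue` might look like. `run_jobs` and `worker_count` are names made up for the example, and nothing here survives a crash or reboot.

```python
import threading
from queue import Queue

def run_jobs(jobs, worker_count=2):
    """Run callables on a small pool of threads and collect their results."""
    q = Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = q.get()
            if job is None:  # sentinel: no more work for this worker
                q.task_done()
                return
            try:
                outcome = job()
            except Exception as exc:
                outcome = exc  # record the failure instead of killing the worker
            with results_lock:
                results.append(outcome)
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for job in jobs:
        q.put(job)
    for _ in threads:
        q.put(None)  # one sentinel per worker
    q.join()
    for t in threads:
        t.join()
    return results
```

Results arrive in completion order, not submission order. Anything that needs durable queues or multi-machine workers is outside what this can do.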
  - hosteur - 4 hours ago
    
    > What are you using Celery for?
    Things like provisioning, deploying, and eventually destroying cloud instances (VMs) on-demand when a user buys a specific service.
    > Do you need to be able to recover from a reboot or crash with your queues intact?
    Yes, I expect the queue to be durable.
    > Is it a distributed system as opposed to a single machine?
    Currently, everything runs on a single machine. But I expect it will eventually have to be split up. Although I do not expect it to be massively distributed or very complex.
    > Do you have complex multi-step workflows?
    Depends on what you mean by complex but Multi-step, yes.
alt227 - 11 hours ago

> every time I need to do something slightly weird, there's a sensible and well thought out way to achieve it.
In my world cache systems like memcached and redis are just that, a cache to put and get from. Possibly use some invalidation system like tagging.
What can you do with a cache system that is 'weird'? What are people doing with caches other than just caching data?
Genuinely interested.
- WJW - 6 hours ago
  
  Non-caching things I regularly see people do with Redis:
  - Rate limits for API endpoints via the leaky bucket algorithm
  - Feature flags and stats tracking
  - Websocket pub/sub
  - Background job queue
  In general, lots of things that need to survive deploys (so they can't be in-memory in the app) and/or they need to be coordinated across multiple horizontally scaled servers and/or things that prefer to be in a data structure which is slightly awkward to stick in a database table.
- smw - 3 hours ago
  
  Can do fantastically weird stuff with Lua scripts in Redis/Valkey
- AussieWog93 - 8 hours ago
  
  No, you're right. Nothing crazy. But things like counting API usage across threads with INCRBY, or debounced HTML cache clears, or even an actual light db with persistence (AOF), and everything just working.
- kawsper - 10 hours ago
  
  We had Rails writing to memcached, and nginx pulling from memcached for full page caching.
  At some point someone decided to gzip all writes into memcached, and our site looked really fun for a while.
- boesboes - 10 hours ago
  
  I’ve done moving window rate limiting using redis to do atomic rate calculations etc.
  That requires some weirdness
  - alt227 - 10 hours ago
    
    > moving window rate limiting
    So does that mean you are tracking how many times data is being entered into redis, and rejecting it if the entry rate is too high?
    Why would you not track this before, at the point of calculating the data to enter into redis, rather than querying redis to see how much data is entered in a given timeframe?
    Again, genuinely curious as to the reason for architectural decisions.
    
    WJW - 6 hours ago
    
    Not GP, but I think they mean usecases like limiting how many times any given IP address can access an API to a certain amount of calls per minute. For example, you might want to restrict login attempts to at most 10 per minute per IP to prevent people trying out lists of common passwords.
    This is fairly easy to do if your app runs on a single server, but many companies run multiple servers and load balance requests among them. Those servers need some sort of coordination mechanism to keep track of the rate limits and their current state. Redis has dedicated instructions these days to do this, and in the old days there was a plethora of libraries that used embedded Lua scripts to do the same thing.
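The login limit described above (at most 10 per minute per IP) can be sketched as a sliding-window limiter. This version keeps its state in a local dict, so it only covers the single-server case; with multiple servers the same bookkeeping would have to live in a shared store such as Redis. `SlidingWindowLimiter` and its parameters are illustrative names, not any library's API.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` calls per key within the last `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.hits = defaultdict(deque)  # key -> timestamps of accepted calls

    def allow(self, key):
        now = self.clock()
        q = self.hits[key]
        # Drop timestamps that have slid out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True
```

Usage would be something like `limiter.allow(client_ip)` at the top of the login handler, rejecting the request when it returns `False`. The `clock` parameter exists only so the window can be tested without sleeping.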
    
    boesboes - 3 hours ago
    
    Spot on! It was for a decently sized SaaS app; 10k+ requests per minute and a LOT of spam traffic from China, among others, that we needed to limit. The app ran across 10+ servers; this is also why we put it in the app using redis and not with something like nginx rate limiting.
    I don't exactly remember how i implemented it, but it basically did a single call to redis to count the request for the IP and check the limits.
    Another use case where the more advanced data types & operations of redis are useful is job queues, since you can atomically move a job from the 'queue' to the 'processing' list, thus preventing losing jobs if the processor crashes after pulling it or so. But we do run all those on persisted redis stores, for safety :)
    And if i would do it all again, i'd probably just use postgres for anything i want to keep when things crash. Redis just kinda lives between a 'real' database and a pure volatile kv-cache

dyogenez - 2 hours ago

Memcached was a savior for caching when it launched. I love that it was created in 2003 by Brad Fitzpatrick for LiveJournal. Each post on a user's feed could have different access restrictions, and this allowed posts (or entire pages) to be cached.

I used it with Ruby on Rails for many years. It sped up pages, and just worked.

The downside (and upside for speed) is (and always was) that cache was saved in memory not disk. This meant hosting would be expensive if you have a large scale site with a wide amount of data to cache.

Solid cache has been a savior for those cases for me. We have over 100gb of cache for a project I'm working on, and it's stored in postgres on disk, with fast lookups with an index and expirations that happen automatically in Rails to delete those rows.

If I had a smaller cache need and was already using Redis, I'd probably just use that. But if speed was the number 1 factor, I'd try benchmarking Memcached vs Redis.

bawolff - 13 hours ago

I like memcached, but it's really not redis's fault if you set it up as a volatile cache but people treat it as a persistent data store.

The comparison is especially weird as memcached is also not persistent.

roncesvalles - 13 hours ago

At many companies (I want to say most), Redis is seen as an actual durable production database and operated that way, not just as a cache that can disappear at any time. It's not unreasonable for a new dev to assume this unless told otherwise.
- hnlmorg - 11 hours ago
  
  That’s not been my experience.
  Ultimately though, regardless of whether your experience is true for the wider industry or not, if you’re giving a junior dev who refuses to read product documentation the responsibility of architecting production systems, then your problem isn’t Redis.
- bawolff - 13 hours ago
  
  Sure, but that is an internal documentation failure not a redis failure. It feels incredibly unfair to blame redis for that.
  - inigyou - 5 hours ago
    
    No one assumes memcached is persistent or Postgres isn't. Why does only Redis/Valkey have this problem?
    
    stackskipton - 5 hours ago
    
    Because it can be both depending on the command line flags sent to it.
    Also, because it's so easy to set up, most DevOps/SREs/Ops just chuck it into production without reading about which flags to set, because we are not informed it's a requirement until the 11th hour.
    
    nightpool - 4 hours ago
    
    You're making inigyou & OP's point for them. Redis is a great technology, but its design (supporting both persistence and non-persistence modes) makes it much easier to misuse and much more likely to be misused compared to Postgres and Memcached. That's a design issue, not just an internal documentation issue.

psadri - 3 hours ago

The comment about memcached being ephemeral is orthogonal to whether people will use it as if it is persistent. If the cache appears to get hit 99.9% of the time and is always there, sooner or later people will write code that relies on that behavior.

Maybe the client libraries can help by returning nulls 10% of time, in dev mode?
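A sketch of what that dev-mode behaviour could look like as a thin wrapper. No client library I know of ships this; `FlakyDevCache` and `miss_rate` are invented for illustration, and the backend here is just a dict.

```python
import random

class FlakyDevCache:
    """Wraps a dict-like cache and turns a fraction of reads into misses."""

    def __init__(self, backend, miss_rate=0.10, rng=None):
        self.backend = backend
        self.miss_rate = miss_rate
        self.rng = rng or random.Random()

    def get(self, key):
        if self.rng.random() < self.miss_rate:
            return None  # simulated miss, even if the key is present
        return self.backend.get(key)

    def set(self, key, value):
        self.backend[key] = value
```

Code that silently depends on the cache always being populated should then start failing in development rather than during the first production eviction.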

tracker1 - 3 hours ago

I remember one "fun" feature in Memcached is that each client did its own hashing/sharding system... and when trying to share cached values across platforms/languages in order to further reduce resources, that was fun... writing a custom .Net/C# client to match the Java implementation. This was in the .Net 1/2 timeframe around 2002-2003 before NuGet.
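To illustrate why clients had to match exactly: with client-side sharding, the key-to-server mapping is whatever the client computes. The two functions below are arbitrary examples (not the actual algorithms of any .NET or Java client), and the server names are made up.

```python
import hashlib
import zlib

SERVERS = ["cache-a:11211", "cache-b:11211", "cache-c:11211"]  # hypothetical hosts

def pick_server_crc32(key):
    """One client's choice: CRC32 of the key, modulo the server count."""
    return SERVERS[zlib.crc32(key.encode()) % len(SERVERS)]

def pick_server_md5(key):
    """Another client's choice: first 4 bytes of MD5, modulo the server count."""
    digest = hashlib.md5(key.encode()).digest()
    return SERVERS[int.from_bytes(digest[:4], "big") % len(SERVERS)]

def disagreements(keys):
    """Keys the two clients would look for on different servers."""
    return [k for k in keys if pick_server_crc32(k) != pick_server_md5(k)]
```

Any key in `disagreements(...)` is written to one server by the first client and looked up on another by the second, so it shows up as a permanent miss for the second client.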

That said, it's interesting when you learn in practice that sometimes an N+1 problem is actually faster as N+1 than trying to query across separate DBMS systems.

bel8 - 2 hours ago

> it's interesting when you learn in practice that sometimes an N+1 problem is actually faster as N+1 than trying to query across separate DBMS systems.
This is especially visible in SQLite workflows.
The roundtrip for querying a local SQLite file is so fast that it's passable to execute N SELECTS inside a loop instead of a single SELECT with a JOIN, for example.
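A small sketch of the two shapes being compared, using Python's built-in `sqlite3` with an in-memory database. The `authors`/`posts` schema is invented for the example, and nothing here measures speed; it only shows that both approaches return the same data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
""")

def titles_n_plus_one(conn):
    """One query for the authors, then one query per author (the N+1 shape)."""
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    out = {}
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id", (author_id,)
        ).fetchall()
        out[name] = [title for (title,) in rows]
    return out

def titles_join(conn):
    """A single query with a JOIN. Authors without posts would be omitted here."""
    out = {}
    rows = conn.execute(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    )
    for name, title in rows:
        out.setdefault(name, []).append(title)
    return out
```

With a networked database each iteration of the loop pays a round trip; with a local SQLite file it is an in-process call, which is why the loop can be acceptable there.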

jitl - 6 hours ago

memcached is about a bazillion times faster than redis at doing the simple KV cache job. it’s got threads. it’s highly optimized to do its one job super well, where redis is more an arbitrary shared Python heap kinda thing with all the data structures and single thread and whatnot.

at notion we use redis for a lot of things, but actual caching we leave to memcached

foobarian - 4 hours ago

Can confirm it's not that much faster at k/v. 300 vs 350 microseconds per read on average. The single thread thing doesn't matter much since it's not cpu bound, it's reactive I/O
citrin_ru - 5 hours ago

Threads are not free - they allow you to use more CPU cores, but if your load is not too high then with a single thread memcached uses less CPU than with multiple.

freediddy - 4 hours ago

Burning memory for a pure memory/RAM service like memcached in today's environment is not going to work given the price of memory and especially for larger customers. Especially in cloud environments, it's going to be inordinately expensive so having hybrid solutions like Redis and their flash memory solution is probably going to be the compromise going forward.

dijit - 4 hours ago

Sometimes you just need some networked RAM, man.

jessinra98 - 3 hours ago

I inherited a Django app once where Redis was doing everything and a single bad pub/sub message locked up the whole thing. We pulled the cache layer into memcached.

drchaim - 8 hours ago

WHEN do you move to Redis/Memcached? None of my projects have exceeded 1000 rps at peak, and in none of them have I felt the need to move from unlogged PostgreSQL tables to Redis.

Just trying to get a sense of where people draw the line.

calpaterson - 8 hours ago

Mostly there is no rule; adding a cache can just save you from having to buy a bigger database instance in many cases.
The most common first thing to cache is getting the current user, because this ends up being a very hot path for most stateless systems. Because you need to get the current user for almost every request, it's quite easy for getting the current user to be 50% of database load: first you get the user, then you do the thing. tada, user lookup is now half your app by volume
foobarian - 4 hours ago

You could ask the opposite question too, could you move your bespoke Redis/mcd workload to a boring old database? We have some workloads from 100k-1M ops/s that I would love to streamline but the load is too high. So... 100k in this case? :-)
hylaride - 4 hours ago

As always it depends. Are there noticeable bottlenecks/latency in the app? No? Why pre-optimize then?
Look at this image: https://cs61.seas.harvard.edu/site/img/storage-hierarchy.png
At scale, the timing and order of magnitude increases in latency can add up. Caching the most requested data the higher you go can keep up performance (at added cost). On a busy website, that could be things like session tokens or other data that is part of every request. On a landing page, it could be images or other static data (I mean, you'd use a CDN for this, but you get the idea). Database calls can be expensive (computationally and IO wise), so if you can recalculate and cache certain operations, you can keep up.
Also, do you really need memcached/redis? If you have session affinity, you can also have nodes each keep their own caches in memory, with the caveat that if there's a failover, you'll have to re-fetch the data. Redis/memcached would be more of a shared cache, for things that you may not want to interrupt the user if they hit a different front-end endpoint.
It also doesn't have to be a cache. We've used redis for distributed task coordination as a shared state with the caveat that if something happened to redis, we'd just restart the task.
TL;DR: move when you need a SHARED cache and your app has slowed down enough that the cost of caching makes up for it.

inigyou - 5 hours ago

Redis got replaced by Valkey FYI. It still has the same problems except for being driven by AI marketing, and even if you are doing AI you will find it has useful features, like vector lookup.

dosint21h - 10 hours ago

Perhaps you should try aerospike which provides a data-in-memory mode and reliable persistence and of course, automatic scale-out. Your mum will stop worrying about you and your job once and for all.

noirscape - 4 hours ago

Great post. Redis is just kinda overkill whenever I've had to use it. Memcached by contrast is very simple, fast and works without needing to do much fiddling with it.

One big tip I should recommend is to increase the default memory size limit to something more realistic for modern hardware (and arguably this should just be increased on the upstream's side as well, instead of making everyone reconfigure shitty defaults). It's very easy to exceed memcached's default item size limit, since it's just 1 MB; the default memory limit for memcached as a whole is 64 MB, which is similarly very low. Outside of that, it works very well and the lack of persistence is great at making it not do things it's not supposed to do (which is a big problem with Redis' feature creep; the project's main page promoting AI drivel alone should point towards that).

kijin - 16 hours ago

Redis works great as a cache, but there are a few things you need to do in order to use it reliably as a cache.

1) Wrap your client library so that it's impossible to store anything without an expiry date. You don't want 6-months-old data suddenly coming up in your app!

2) Either turn off persistence, or use a separate database for the cache. In other words, don't mix volatile data with stuff you actually care about.

3) Set up a reasonable maxmemory value with an appropriate maxmemory-policy, so that Redis doesn't eat up all your RAM.

4) Resist the urge to use complex data structures. If you try to update a single field on an expired hash, you will end up with an incomplete object.

If you don't want all that hassle, then yes, Memcached probably works better out of the box.
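Point 1 above could look roughly like this. The sketch uses an in-memory dict rather than a real Redis client, and `ExpiringCache` and `ttl_seconds` are names chosen for the example; the idea is only that the write method has no way to be called without an expiry.

```python
import time

class ExpiringCache:
    """Cache facade whose only write method requires a positive TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def set(self, key, value, *, ttl_seconds):
        # ttl_seconds is keyword-only and has no default, so omitting it
        # is a TypeError at the call site.
        if ttl_seconds is None or ttl_seconds <= 0:
            raise ValueError("ttl_seconds must be a positive number")
        self._store[key] = (self._clock() + ttl_seconds, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # lazily drop expired entries
            return None
        return value
```

A wrapper around a real client would presumably forward the TTL to the server on every write instead of tracking it locally. Points 2 and 3 are server configuration rather than client code, so they are not shown here.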

dvt - 16 hours ago

> 1) Wrap your client library so that it's impossible to store anything without an expiry date. You don't want 6-months-old data suddenly coming up in your app!
No need for this client-side complexity, as you should be using `allkeys-lru`. FWIW, should likely be doing this anyway, as (generally speaking) all data stored in Redis is usually regarded as volatile because of what Redis actually is.
- kijin - 14 hours ago
  
  > as (generally speaking) all data stored in Redis is usually regarded as volatile because of what Redis actually is.
  If you know this already, then you didn't need to read OP or any of this thread. :)
  The problem is that Redis tries very hard to position itself as a persistent data store, with defaults that lean toward persistence (no default eviction policy). Beginners need to fight these defaults every step of the way if all they want is a cache.
  - dvt - 14 hours ago
    
    > The problem is that Redis tries very hard to position itself as a persistent data store
    What are you talking about? On their website, the top 3 use cases (under the Platform menu) are: caching, streaming, and session management. Literally all of these three are volatile.
    
    FridgeSeal - 12 hours ago
    
    I have encountered a _shocking_ number of people who treat Redis as a persistent store. The mere mention that it has some kind of persistence machinery is enough to convince some that it is therefore durable and stable and should be treated like a DB.

deepsun - 12 hours ago

To me the only difference that mattered is that Redis allows to do range queries, while Memcached only by key. Aka TreeMap vs HashMap. Or B-tree index vs Hash index.
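The TreeMap vs HashMap distinction in miniature, using only the Python standard library. `SortedIndex` is a made-up helper that stands in for an ordered structure; a plain `dict` plays the hash-index role.

```python
import bisect

class SortedIndex:
    """Keeps keys in sorted order so a range can be sliced out directly."""

    def __init__(self, keys):
        self._keys = sorted(keys)

    def range(self, lo, hi):
        """Return all keys k with lo <= k < hi."""
        start = bisect.bisect_left(self._keys, lo)
        end = bisect.bisect_left(self._keys, hi)
        return self._keys[start:end]

# Hash-style lookup: exact key only.
scores_by_player = {"ann": 50, "bob": 10, "cy": 30}

# Ordered lookup: everything between two bounds.
score_index = SortedIndex(scores_by_player.values())
```

`scores_by_player["ann"]` answers "what is this key's value", while `score_index.range(20, 60)` answers "which scores fall in this interval". The second question has no efficient form against a pure key/value store short of scanning every key.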

jszymborski - 16 hours ago

An article praising memcached and no mention of the feral bunny mascots.

abound - 15 hours ago

> And look at those cute little mascots at the top!
- jszymborski - 14 hours ago
  
  My bad!

tempest_ - 16 hours ago

I stopped using memcached a decade ago in favour of Redis and now use valkey.

Never felt the need to go back to memcached except when a legacy dependency needed it.

jimbokun - 15 hours ago

OK.
What do you think of the argument made in the article?
- tempest_ - 15 hours ago
  
  I don't want my cache to silently fail.
  Clustering redis is not that hard even if you do it manually and I have only had to do it once.
  I never use redis persistence and have a max size set with LRU or whatever the application requires.
  With memcached I remember having to mess around with LD_LIBRARY_PATH to link whatever python module I was using at the time.
  - crabmusket - 14 hours ago
    
    > silently fail
    Mature ops would be tracking cache hit ratios right?
    It sounds like memcached would be really good in a use case where you really just need an optional stateless pure cache with absolutely zero rope to hang yourself on. A use case where "cache hit ratio" is the goal, not "fiddly in-memory data store".
    
    foobarian - 4 hours ago
    
    > absolutely zero rope to hang yourself on
    Yeah I thought so too. Google "memcache slab starvation" if you want the long story
    
    tempest_ - 14 hours ago
    
    > Mature ops would be tracking cache hit ratios right?
    Sure, and sentry integrates well with redis in python which is what I use primarily with redis.
    I don't think memcached is bad, I just think it's old and industry has moved to redis because it offers more while covering the previous use case.
    Calling redis fiddly is a mischaracterization. For many use cases I have not had to think more than 30s to set up redis.
    (also when I say redis I mean Valkey at this point, even if they are starting to diverge)
    
    hparadiz - 13 hours ago
    
    There's basically zero reason to use redis. Pretty much every rdbms like mariadb, postgres, etc is just as fast. So then why redis? It's basically needless complexity in your system.
    
    robotresearcher - 13 hours ago
    
    Postgres etc are more complex than Redis, are they not?
    Does your argument assume you already have a database, so you might as well use it for your cache mechanism?
    
    hparadiz - 12 hours ago
    
    Modern rdbms databases already have an in-memory cache. For 99% of projects there's no actual difference. The round trip will end up around 12-22 ms in all best possible cases.
    
    nchmy - 10 hours ago
    
    If you're getting 12-22ms latency for your cache reads, the network is your bottleneck. If stored locally, you would get many orders of magnitude faster than that.