Golang's big miss on memory arenas

avittig.medium.com

145 points by andr3wV 8 days ago


bilbo-b-baggins - 17 hours ago

Man this person is mediocre at best. You can do fully manual memory management in Go if you want. The runtime is full of tons of examples where they have 0-alloc, Pools, ring buffers, Assembly, and tons of other tricks.

If you really want arena-like behavior you could allocate a byte slice and use unsafe to cast it to literally any type.
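
A minimal sketch of that byte-slice approach, under stated assumptions: `Arena`, `NewArena`, `Alloc` and `point` are hypothetical names, and the arena is only suitable for types that contain no Go pointers, since the garbage collector does not scan a `[]byte` for references.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Arena is a hypothetical bump allocator over a single byte slice.
// Only store pointer-free types in it: the GC does not scan a []byte,
// so a Go pointer kept inside the arena would not keep its target alive.
type Arena struct {
	buf []byte
	off uintptr // offset of the first free byte
}

func NewArena(size int) *Arena { return &Arena{buf: make([]byte, size)} }

// Alloc carves a zeroed T out of the arena and returns nil when it is full.
func Alloc[T any](a *Arena) *T {
	var zero T
	size, align := unsafe.Sizeof(zero), unsafe.Alignof(zero)
	if size == 0 {
		return new(T) // zero-sized types need no arena space
	}
	base := uintptr(unsafe.Pointer(unsafe.SliceData(a.buf)))
	// Round the next free address up to T's alignment.
	start := (base+a.off+align-1)&^(align-1) - base
	if start+size > uintptr(len(a.buf)) {
		return nil
	}
	a.off = start + size
	p := (*T)(unsafe.Pointer(&a.buf[start]))
	*p = zero // memory may be dirty after Reset
	return p
}

// Reset makes the whole buffer reusable; pointers handed out earlier
// must not be used afterwards.
func (a *Arena) Reset() { a.off = 0 }

type point struct{ X, Y float64 }

func main() {
	a := NewArena(1 << 10)
	p := Alloc[point](a)
	p.X, p.Y = 1, 2
	q := Alloc[point](a)
	fmt.Println(*p, *q)
	a.Reset() // "frees" every object at once
}
```

Nothing here stops a caller from using `p` after `Reset`, which is exactly the use-after-free risk that makes this "unsafe" in Go's terms.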

But like… the write up completely missed that manual memory management exists, and Golang considers it “unsafe” and that’s a design principle of the language.

You could argue that C++ RAII overhead is “bounded performance” compared to C. Or that C’s stack frames are “bounded performance” compared to a full in-register assembly implementation of a hot loop.

But that’s bloody stupid. Just use the right tool for the job and know where the tradeoffs are, because there’s always something. The tradeoff boundary for an individual project or person is just arbitrary.

tptacek - a day ago

The vibe I get from this post is of someone who hasn't routinely used arenas in the past and thinks they're kind of a big deal. But a huge part of the point of an arena is how simple it is. You can just build one. Meanwhile, the idea that arena handles were going to be threaded through every high-allocation path in the standard library is fanciful.

ideal_gas - a day ago

> By killing Memory Arenas, Go effectively capped its performance ceiling.

I'm still optimistic about potential improvements. (Granted, I doubt there will be anything landing in the near future beyond what the author has already mentioned.)

For example, there is an ongoing discussion on "memory regions" as a successor to the arena concept, without the API "infection" problem:

https://github.com/golang/go/discussions/70257

cafxx - 15 hours ago

There's a bunch of activity ongoing to make things better for memory allocation/collection in Go. GreenTeaGC is one that has already landed, but there are others like the RuntimeFree experiment that aims to progressively reduce the amount of garbage generated by enabling safe reuse of heap allocations, as well as other plans to move more allocations to the stack.

Somehow concluding that "By killing Memory Arenas, Go effectively capped its performance ceiling" seems quite misguided.

sc68cal - 14 hours ago

This does make me appreciate some of the decisions that Zig has made, about passing allocators explicitly and also encouraging the use of the ArenaAllocator for most programs.

Since Zig built up the standard library where you always pass an allocator, they avoided the problem that the article mentions, about trying to retrofit Go's standard library to work with an arena allocator.

Although, that's not the case for IO in Zig. The most recent work has actually been reworking the standard library to be where you explicitly pass IO like you pass an allocator.

But it's still a young language so it's still possible to rework it.

I really do enjoy using the arena allocator. It makes things really easy if your program follows a cyclical pattern where you allocate a bunch of memory and then, when you're done, just free the entire arena.

silisili - a day ago

I'm a bit split on this one.

Simple arenas are easy enough to write yourself, even if it does make unidiomatic code as the author points out. Pretty much anything that allocates tons of slices sees a huge performance bump from doing this. I -would- like that ability in an easier fashion.
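
As a rough illustration of the "write it yourself" case for slices, here is a sketch with no `unsafe` at all. `SliceArena`, `Make` and `Reset` are hypothetical names, and the heap fallback when the arena is full is an assumption of this sketch rather than anything from the article.

```go
package main

import "fmt"

// SliceArena hands out sub-slices of one preallocated backing array.
type SliceArena[T any] struct {
	buf []T
}

func NewSliceArena[T any](capacity int) *SliceArena[T] {
	return &SliceArena[T]{buf: make([]T, 0, capacity)}
}

// Make returns a zeroed slice of length n. It falls back to the regular
// heap when the arena has no room left.
func (a *SliceArena[T]) Make(n int) []T {
	if n > cap(a.buf)-len(a.buf) {
		return make([]T, n)
	}
	start := len(a.buf)
	a.buf = a.buf[:start+n]
	// Cap the result so a later append reallocates instead of
	// overwriting the next slice in the arena.
	s := a.buf[start : start+n : start+n]
	clear(s) // the region may hold stale data after Reset
	return s
}

// Reset releases every slice at once; earlier results must not be reused.
func (a *SliceArena[T]) Reset() { a.buf = a.buf[:0] }

func main() {
	arena := NewSliceArena[float64](1 << 16)
	for request := 0; request < 3; request++ {
		samples := arena.Make(1000)
		weights := arena.Make(1000)
		for i := range samples {
			samples[i], weights[i] = float64(i), 0.5
		}
		fmt.Println(request, len(samples), len(weights))
		arena.Reset() // everything from this iteration dies together
	}
}
```

The unidiomatic part shows up at the call sites: every function that wants to benefit has to accept the arena and call `arena.Make` instead of `make`.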

On the other hand, new users will abuse arenas and use them everywhere because "I read they are faster", leading to way worse code quality and bugs overall.

I do agree it would become infectious. Once people get addicted to microbenchmarking code and seeing arenas a bit faster in whatever test they are running, they're going to ask that all commonly used allocating functions (especially everything in http and json) have the ability to use arenas, which may make the code more Zig-like. Not a dig at Zig, but that would either make the language rather unwieldy or double the number of functions in every package as far as I can see.

notepad0x90 - 15 hours ago

What's the downside of having one API to pre-allocate memory to be used by the GC, and a second API to suspend/resume GC operations? When you run out of pre-allocated memory, it will resume GC operations automatically.

I'm naively thinking, the performance bottleneck is not with tracking allocations but constantly freeing them and then reallocating. Let the GC track allocations, but prevent it from doing anything else so long as it is under the pre-allocated memory limit for the process. When resumed, it will free unreferenced memory. That way, the program can suspend GC before a performance-sensitive block and resume it afterwards. APIs don't need to change at all that way.

gethly - 17 hours ago

I was considering a few use cases where an arena would make sense and I encountered the "abandoned" arena library in the standard library and then read on why it was never enabled. And yes, in those extremely rare situations, it would be nice to have them. But generally, they make little sense for Go and projects Go is used in. So I definitely do not share any of the opinions from the blog post.

There is Odin, Zig or Jai (likely next year) as new kids on the block and alternatives to the cancer that is Rust or the more mainstream C, C++, Java or even C#.

Go definitely does not have to try and replace any of them. Go has its own place and has absolutely no reason to be fearful of becoming obsolete.

After all, in rare/extreme cases, one can always allocate a big array and use flatbuffers for data structures to put into it.

yxhuvud - 3 hours ago

> When your software team needs to pick a language today, you typically weigh two factors: language performance and developer velocity.

There are obviously other factors in play as well, or languages that are really good at both but weak in other areas (like adoption and mind share) would dominate. And I for sure don't see a lot of Crystal around.

rkerno - 14 hours ago

I think the overall sentiment with this post is sound, but arenas aren't the answer to Go's performance challenges. From my perspective, possibly in an effort to keep the language simple, Go's designers didn't care about performance. 'Let the GC handle it' was the philosophy, and as a result you see poor design choices all the way through the standard library. Abstracting everything through interfaces then compounds the issue, because the compiler's escape analysis can't see through the interface. The standard library is just riddled with unnecessary allocations. Just look at the JSON parser, for instance, and the recent work to improve it.
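
A small sketch of the interface effect, with hypothetical names (`filler`, `xFiller`, `dynamic`). The exact allocation counts depend on the compiler version and its inlining and devirtualization decisions, so treat the printed numbers as something to check locally (for example with `go build -gcflags=-m`) rather than a guarantee.

```go
package main

import (
	"fmt"
	"testing"
)

type filler interface{ Fill(p []byte) int }

type xFiller struct{}

func (xFiller) Fill(p []byte) int {
	for i := range p {
		p[i] = 'x'
	}
	return len(p)
}

// A package-level interface value: the compiler generally cannot tell
// which concrete type sits behind it at the call site.
var dynamic filler = xFiller{}

func fillConcrete() int {
	var buf [64]byte
	return xFiller{}.Fill(buf[:]) // callee is known, so buf can stay on the stack
}

func fillInterface() int {
	var buf [64]byte
	return dynamic.Fill(buf[:]) // unknown callee, so buf is typically assumed to escape
}

func main() {
	fmt.Println("concrete allocs/op: ", testing.AllocsPerRun(1000, func() { fillConcrete() }))
	fmt.Println("interface allocs/op:", testing.AllocsPerRun(1000, func() { fillInterface() }))
}
```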

There are some interesting proposals on short-term allocations, being able to specify that a local allocation will not leak.

Most recently, I've been fighting with the ChaCha20-Poly1305 implementation because someone in their 'wisdom' added a requirement for contiguous memory for the implementation, including extra space for a tag. Both ChaCha20 and Poly1305 are streaming algorithms, but the Go authors decided 'you cannot be trusted' - here's a safe one-shot interface for you to use.
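
For readers who haven't used it, this is roughly what the one-shot shape looks like through the standard `cipher.AEAD` interface. To stay inside the standard library, the sketch swaps in AES-GCM in place of ChaCha20-Poly1305; `newDemoAEAD` and `sealOneShot` are hypothetical helpers, and the all-zero key and nonce are for demonstration only.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// newDemoAEAD builds an AES-GCM AEAD with an all-zero key (demo only).
func newDemoAEAD() (cipher.AEAD, error) {
	block, err := aes.NewCipher(make([]byte, 32))
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealOneShot needs the whole plaintext in one contiguous slice, and the
// output buffer must hold the ciphertext plus the authentication tag.
func sealOneShot(aead cipher.AEAD, nonce, plaintext []byte) []byte {
	dst := make([]byte, 0, len(plaintext)+aead.Overhead())
	return aead.Seal(dst, nonce, plaintext, nil)
}

func main() {
	aead, err := newDemoAEAD()
	if err != nil {
		panic(err)
	}
	nonce := make([]byte, aead.NonceSize()) // fixed nonce, demo only
	msg := []byte("hello arena")
	sealed := sealOneShot(aead, nonce, msg)
	opened, err := aead.Open(nil, nonce, sealed, nil)
	fmt.Println(len(msg), len(sealed), bytes.Equal(opened, msg), err)
}
```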

Go really needs a complete overhaul of their Standard Library to fix this, but I can't see this ever getting traction due to the focus on not breaking anything.

Go really is a great language, but should include performance / minimise the GC burden as a key design consideration for its APIs.

nu11ptr - a day ago

A better route for something like Go IMO is to move to a compacting collector, this would allow them to move to a bump allocator like Java for super fast allocations and would make deallocation effectively "free" by only moving live objects. They may need to make it generational so they aren't constantly moving long lived objects, but that is a memory vs cpu trade off (could be one more GC flag?). If I recall, the previous objection was because of CGo, which would require pinning (since C wouldn't tolerate moved pointers), but every Go dev I know hates CGo and generally avoids it, plus I see they added "runtime.Pinner" in 1.21 which should solve that I suspect (albeit it would suddenly be required I expect for pointers retained in C). Is anyone aware of what other challenges there are moving to a compacting collector/bump allocator?
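
For reference, a minimal sketch of how `runtime.Pinner` is used today. `sample` and `withPinned` are hypothetical names; with the current non-moving collector the pin mainly matters for the cgo pointer-passing rules, so this only shows the shape of the API.

```go
package main

import (
	"fmt"
	"runtime"
)

type sample struct{ id int }

// withPinned keeps s pinned for the duration of use, the way a pointer
// would need to be while foreign code retains it.
func withPinned(s *sample, use func(*sample)) {
	var pinner runtime.Pinner
	pinner.Pin(s)
	defer pinner.Unpin() // releases every object pinned through this Pinner
	use(s)
}

func main() {
	s := &sample{id: 7}
	withPinned(s, func(p *sample) {
		fmt.Println("pinned object id:", p.id)
	})
}
```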

pkulak - a day ago

My guess is that when you measure, an arena is not worth the trouble when you run a generational GC, which essentially uses an arena for the eden space already. And if you have an arena, it's probably very short lived and would otherwise live entirely in eden.

acmj - 19 hours ago

Going from something like "Go lacks a builtin arena allocation" to "Go risks becoming the COBOL" is a long stretch. First, Go is slower than C/C++/Rust even in code without complex memory allocation. Introducing an arena allocator won't fix that. Second, arena allocation often doesn't work for a lot of allocation patterns. Third, a plain arena allocator is easy to implement when needed. Surely a builtin one would be better, but Go won't fall without it.

didibus - a day ago

Interesting that it never talks about direct competitors to the "middle ground" as well, like Java, C#, Erlang, Haskell, various Lisps, etc.

aktau - 8 hours ago

(I agree with other commenters' assessment about the importance of the author's complaints, and recommend others check out the Go memory regions proposal.)

For those interested, here's an article where Miguel Young implements a Go arena: https://mcyoung.xyz/2025/04/21/go-arenas/. I couldn't find references to Go's own experimental arena API in this article, which is a shame, since it'd be interesting if this knowledgeable author traded them off. IIUC, Miguel's version and the Go experimental version do have some important differences even apart from the API. IIRC, the Go experimental version doesn't avoid garbage collection. Its main performance benefit is that the Go runtime's view of allocated memory is decreased as soon as `arena.Free` is called. This delays triggering the garbage collector (meaning it will run less frequently, saving cycles).

bborud - 7 hours ago

> If Go refuses to add complexity to gain performance and cannot engineer its way around the GC, it effectively resigns from the pursuit of the high-performance tier.

I'm completely okay with that. In fact I much prefer it.

Writing high performance code is expensive in any language. Expensive in terms of development time, maintenance cost, and risk. It doesn't really matter what language we are talking about. The language usually isn't the limiting factor. Performance is usually lost in the design stage - when people pick the wrong strategies for solving a particular problem.

Not all code needs to be as fast as it can be. The priority for any developer should always be:

  1. Correct
  2. Understandable
  3. Performant

If you haven't achieved 1, then 2 and 3 don't matter. At all. If you haven't achieved 2, then the lifetime cost and risk introduced by your code may not be acceptable. When I was inexperienced I only focused on 3. The code needed to be fast. I didn't care if it was impossible for others to maintain. That works if you want no help. Ever. But that isn't how you create lasting value.

Good programmers achieve all three and respect the priority. The programmers you don't really want on your team only focus on 3. Their code will be OK in the short term, but in the long term it tends to be a liability. I have seen several commercial products have to rewrite huge chunks of code that was impenetrable to anyone but the original author. And I have seen original authors break under the weight of their own code because they can no longer reason about what it does.

Go tries to not be complex. That is its strength. Introducing complexity that isn't needed by the vast majority of developers is a very bad idea.

If I need performance Go can't deliver, there are other languages I could turn to. So far I haven't needed to.

(From the other comments I surmise that there are plenty of tricks one can use in Go for the cases where you do need higher performance. So it seems that what you are asking for isn't even needed.)

nemothekid - 14 hours ago

>If you choose lower-level languages like Rust, your team will spend weeks fighting the borrow checker, asynchronicity, and difficult syntax.

It's interesting the author decides to describe Rust in this way, but then spends the next 90% of the article lambasting the Go authors for having the restraint to not turn Go into Rust.

Arenas are simple to write, and if you need one, there are a lot of implementations available. If you want the language to give you complete flexibility on memory allocations then Go is the wrong language to use. Rust and Zig are right there; you pay upfront for that power with "difficult syntax".

nmilo - 14 hours ago

Implicit context [1] was one of the coolest features of a programming language I’ve ever seen that no one has ever implemented. And I’m really not sure why. Not just Go but most languages have this context-passing problem with varying degrees of solution quality; making this implicit and built in could have opened up so many possibilities, more than just arenas.

[1]: https://youtu.be/ciGQCP6HgqI

andrewcamel - a day ago

Philosophical question, but after reaching critical mass, should languages even aspire to more? I.e. do you risk becoming "master of none"? What's wrong with specialist languages? I.e. best of breed vs best of suite?

I agree with the author that Go is getting squeezed, but it has its use cases. "COBOL of cloud native" implies it's not selected for new things, but I reach for it frequently (Go > Java for "enterprise software" backends, Go > others for CLI tools, obviously cloud native / terraform / CI ecosystem, etc.).

However in "best of suite" world, ecosystem interop matters. C <> Go is a pain point. As is WASM <> Go. Both make me reach for Rust.

yunnpp - 15 hours ago

> One concern was that Arenas introduced “Use-After-Free” bugs, a classic C++ problem where you access memory after the arena has been cleared, causing a crash.

In Rust, can the lifetime of objects be tied to that of the arena to prevent this?

Asking as a C/C++ programmer with not much Rust experience.

RohMin - a day ago

I wish Odin could gain more traction

Yokohiii - 21 hours ago

I am a bit confused about the API pollution issue with arenas. I think it's a valid point to think about, but at the same time I don't think the average dev will do any extra steps for the faster thing to do.

thorn - a day ago

I would like to see a reference to the place/proposal where the Go team has actually rejected the idea of arenas. I have never seen this in their issues.

tpolzer - a day ago

I wonder whether it would be possible to retrofit Arena allocation transparently (and safely!) onto a language with a moving GC (which IIUC Go currently is not):

You could ask the programmer to mark some callstack as arena allocated and redirect all allocations to there while active and move everything that is still live once you leave the arena marked callstack (should be cheap if the live set is small, expensive but still safe otherwise).

kgeist - 21 hours ago

>Instead of asking the runtime for memory object-by-object, an Arena lets you allocate a large pool of memory upfront. You fill that pool with objects using a simple bump pointer (which is CPU cache-friendly), and when you are done, you free the entire pool at once

>They have been trying to prove they can achieve Arena-like benefits with, for example, improved GC algorithms, but all have failed to land

The new Green Tea GC from Go 1.25 [0]:

  Instead of scanning objects we scan whole pages. Instead of tracking objects on our work list, we track whole pages. We still need to mark objects at the end of the day, but we’ll track marked objects locally to each page, rather than across the whole heap.

Sounds like a similar direction: "let's work with many objects at once". They mention better cache-friendliness and all.

[0] https://go.dev/blog/greenteagc

willtemperley - a day ago

Isn’t a memory arena an application level issue? Like with Arrow I can memory map a file and expose a known range to an array as a buffer.

cyberax - a day ago

Go now has a proposal for memory regions, an automatic form of arenas: https://go.googlesource.com/proposal/+/refs/heads/master/des...

I think the deeper issue is that Go's garbage collector is just not performant enough. And at the same time, Go is massively parallel with a shared-everything memory model, so as heaps get bigger, the impact of the imperfect GC becomes more and more noticeable.

Java also had this issue, and they spent decades on tuning collectors. Azul even produced custom hardware for it, at one point in time. I don't think Go needs to go in that direction.

Pet_Ant - 21 hours ago

> The real reason was the “Infectious API” problem. To get performance benefits, you can’t just create an arena locally; you have to pass it down the call stack so functions can allocate inside it. This forces a rewrite of function signatures.

Sorry, but it doesn't seem that difficult (famous last words). Add a new implicit parameter to all objects just like "this" called "thisArena". When a call to any allocation is made, pass "thisArena" implicitly, unless something else is passed explicitly.

That way the arena is viral all the way down and you can create sub-arenas. It also does not require actually passing the arena as parameter.

You don't even need to rewrite any existing code, just recompile it.

stlava - a day ago

At the end of the day there has to be a tradeoff between ease of use and performance. Having spent a lot of time optimizing high throughput services in Go, it always felt like I was fighting the language. And that's because I was... sure they could add arenas but that just feels like what it is, a patch over the fact you're working alongside a GC.

tmaly - 20 hours ago

What's to prevent someone from implementing arenas in the user space as a stand alone module?

catigula - a day ago

>If you choose TypeScript or Python, you’ll hit a performance wall the moment you venture outside of web apps, CRUD servers, and modeling.

This really isn't very accurate. It is for Python, but JavaScript is massively performant. It's so performant that you can write game loops in it provided you work around the garbage collector, which, as noted, is a foible golang shares.

The solution is the same, to pre-allocate memory.

YetAnotherNick - 7 hours ago

> It’s brilliant code, but it’s not the kind of Go most teams write or can maintain.

Minimizing allocation inside a loop is not a huge insight, nor very rare in any language including Python.

openasocket - 18 hours ago

Frankly, it’s not a lack of arenas that is holding Go back. It’s the fact that, in 2025, we have a language with a runtime that is neither generational nor compacting. I can’t trust the runtime to perform well, especially in memory-conscious, long-running programs.

bluecalm - a day ago

Arenas are one of those patterns that are very easy to underestimate. I didn't know about them when I started programming, and I ran into a huge performance issue where I needed to deallocate a huge (sometimes tens of GBs consisting of millions of objects) structure just to make a new one. It was often faster to kill the process and start a new one, but that had other downsides. At some point we added a simple hand-written arena-like allocator and used it along with malloc. The arena was there for objects in that big structure that would all die at the same point, and malloc was for all the other things.

The speed-up was impossible to measure because deallocation that used to take up to 30 seconds (especially after repeat cycles of allocating/deallocating) was now instant.

Even though we had very little experience, it was trivial to do in C. IMO it's critical for a performance-oriented language to make using multiple allocators convenient. GC is a known performance killer, but so is malloc in some circumstances.

jeffrallen - a day ago

The author is confused about how performance tuning works. Step one, get it right. Step two, see if it's fast enough for the problem at hand.

There is almost never a step three.

But if there is, it's this: Step three: measure.

Now enter a loop of "try something, measure, go to step 2".

Of the things you can try, optimizing GC overhead is but one of many options. Arenas are but one of many options for how to do that.

And the thing about performance optimizations is that they can be intensely local. If you can remove 100% of the allocations on just the happy path inside of one hot loop in your code, then when you loop back to step two, you might find you are done. That does not require an arena allocator with global applicability.
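
A sketch of what such a local fix can look like, with hypothetical function names. Whether the naive version really allocates on every iteration depends on the compiler version and its escape analysis, which is why the program prints measurements instead of asserting them; that is the "measure" step.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"testing"
)

// checksumNaive builds a fresh buffer for every record.
func checksumNaive(records [][]byte) uint32 {
	var sum uint32
	for _, r := range records {
		buf := make([]byte, 0, len(r)+1)
		buf = append(buf, r...)
		buf = append(buf, '\n')
		sum += crc32.ChecksumIEEE(buf)
	}
	return sum
}

// checksumReuse keeps one buffer and truncates it each iteration, so it
// only grows when a record is longer than anything seen before.
func checksumReuse(records [][]byte) uint32 {
	var sum uint32
	var buf []byte
	for _, r := range records {
		buf = append(buf[:0], r...)
		buf = append(buf, '\n')
		sum += crc32.ChecksumIEEE(buf)
	}
	return sum
}

func main() {
	records := make([][]byte, 1000)
	for i := range records {
		records[i] = []byte(fmt.Sprintf("record-%04d with some payload attached", i))
	}
	fmt.Println("same result:", checksumNaive(records) == checksumReuse(records))
	fmt.Println("naive allocs/run:", testing.AllocsPerRun(100, func() { checksumNaive(records) }))
	fmt.Println("reuse allocs/run:", testing.AllocsPerRun(100, func() { checksumReuse(records) }))
}
```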

Go gives realistic programmers the right tools to succeed.

And Go's limitations give people like the author plenty of ammunition to fight straw men that don't exist. Tant pis.

convolvatron - a day ago

one question that always plagues me when we talk about mixing manual and automatic memory systems is...how does it work? if we have a mixed graph of automatic and manual objects, it seems like we don't have a choice except to have garbage collection enabled for everything and make a new root (call it the programmer) that keeps track of whether or not the object has been explicitly freed.

since we still have the tracing overhead and the same lifetimes, we haven't really gained that much by having manual memory.

D's best take at this is a compile-time assert that basically forbids us from allocating GC memory in the affected region (please correct me if I'm wrong), but that is pretty limited.

does anyone else have a good narrative for how this would work?

pjmlp - a day ago

They could probably learn one or two things on how Java and .NET do arenas, just saying.

ErroneousBosh - a day ago

Back in the 1960s, my parents were one of the "first generation" of what we'd call "sport climbers" now. They and their friends climbed all over Scotland, and once they'd done that they climbed in Italy and Austria. They packed everything in, and they packed it all back out, camping, bothying, and bivvying in all conditions.

They and their friends spoke disdainfully of the "short toothbrush brigade". These were the climbers who sawed the handles off their toothbrushes, to save like four grammes in their backpack weight. Massively inconveniencing themselves but they sure were a teaspoon lighter!

This feels like that. Really do you think that playing childish pranks on the garbage collector is going to speed up anything? Pick a faster sorting algorithm or something.