Finding and fixing Ghostty's largest memory leak

mitchellh.com

621 points by thorel 3 days ago


quantummagic - 3 days ago

This is great news! Well done to everyone who helped sort it out. It was a problem noted by users in a thread here just last week, https://news.ycombinator.com/item?id=46460319

While Claude Code might have been the reason more people triggered this bug, some of us were hitting it without ever having used Claude Code at all. Maybe the assumption about what makes a page non-standard isn't as black-and-white as presumed. And I wonder if the leak would have been triggered more often for people who use scrollback-limit = 0, or something very small.

Probably not a huge deal, but it does seem the fix will needlessly delete and recreate non-standard pages in the case where the new page needs to be non-standard, and the oldest one (that needs to be pruned) already is non-standard and could be reused.

jrpelkonen - 3 days ago

Great write-up. And, thanks mitchellh for Ghostty, I switched to it last year, and have not regretted it.

However, I am somewhat surprised that the fix is reserved for a feature release in a couple of months. I would have expected this to be included in a bug fix release.

reactordev - 2 days ago

The moment you started talking about pages, I was like: “Ok, obviously memory pooled” and yup, it is. Then I said “obviously ring buffered” and yeah, essentially your scrollback reuse. Then I knew exactly where the bug was before getting to that part: not freeing the pages' memory properly, and sure enough - bingo! With some great looking diagrams of memory space alignment.

Kudos, that was a good read. Just remember that every time you do something novel, there’s potential for leaks :D

neobrain - 3 days ago

Funny timing, I moved to Ghostty this week and just today I ran into OOM crashes in Ghostty while developing a terminal UI app. Coincidentally this TUI has a tab bar that looks like this, where UTF8 icons are used for recognizability and activity indicators (using © and € as placeholders here):

    1|Flakes ©    2|Installed ©    3|Store © €    4|Security © €
    ──────────────────────────────────────────────────────────────

This works fine normally, but resizing the terminal would quickly trigger the crash - easy to avoid but still annoying!

I was already preparing myself to file a bug report with the easy repro, but this sounds suspiciously close to what the blog post is describing. Fingers crossed :)

(EDIT: HN filters unicode, booo :( )

jhhh - 2 days ago

This feels like a case of guessing at something you could know. There are two types of allocations that each have a size and a free method. The free method is polymorphic over the allocation's type. Instead of using a tag to know absolutely which type an object is, you guess based on some other factor, in this case a size invariant which was violated. It also doesn't seem like this invariant was ever codified, otherwise the first time a large alloc was modified to a standard size it would've blown up. It's worth asking yourself if your distinguishing factor is the best you can use or whether there is a better test. Maybe in this case a tag would've been too expensive.
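
A minimal Python sketch of the contrast described above, guessing the allocation type from a size field versus reading an explicit tag. Every name and the page size here are hypothetical; Ghostty itself is written in Zig and none of this is taken from its code.

```python
from dataclasses import dataclass
from enum import Enum, auto

STANDARD_SIZE = 4096  # hypothetical standard page size in bytes


class Kind(Enum):
    POOLED = auto()  # fixed-size page owned by the pool
    HEAP = auto()    # oversized page allocated individually


@dataclass
class Page:
    kind: Kind  # explicit tag, set once at allocation time
    size: int   # size metadata, which later code may rewrite


def free_by_size(page: Page) -> str:
    """Guess the allocation type from the current size field."""
    return "unmap" if page.size > STANDARD_SIZE else "return_to_pool"


def free_by_tag(page: Page) -> str:
    """Decide from the tag, independent of the current size field."""
    return "unmap" if page.kind is Kind.HEAP else "return_to_pool"


# An oversized page whose size metadata is reset to standard during reuse.
page = Page(kind=Kind.HEAP, size=3 * STANDARD_SIZE)
page.size = STANDARD_SIZE

assert free_by_size(page) == "return_to_pool"  # wrong path: the allocation is never unmapped
assert free_by_tag(page) == "unmap"            # tag still identifies the real allocation
```

The tag costs one extra field per page, which is the expense the comment wonders about.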

hotpotat - 3 days ago

@mitchellh what did you use for the memory visualizations? Looks nice, and the website plays well with mobile. What's the stack?

stephc_int13 - 3 days ago

I've been following the development of Ghostty for a while and while I have the feeling that there is a bit of over-engineering in this project, I find this kind of bug post mortem to be extremely valuable for anyone in love with the craft.

bryancoxwell - 3 days ago

Super accessible write-up, speaking as someone unfamiliar with Ghostty and terminal emulators in general. Thanks!

kepano - 3 days ago

Reliable reproductions are so valuable.

andrewaylett - 2 days ago

Let me see if I can understand this properly:

There's a linear buffer of pages, most of which come from the pool. It's not clear to me under what conditions these are returned to the pool? Is it when the specific session terminates?

When a non-standard page reaches the point of being recycled, it'll instead be re-added to the list but with a standard size. That effectively leaks the extra space above the standard size. But when the buffer is released (because the session ends?) the pool is also released, which releases all the standard sized pages but leaks the custom-sized ones?

Which suggests that the issue may be even rarer than it initially looked to me: I tend to open a small number of sessions and then use them continuously, rather than starting new sessions during the lifetime of the process. If I never terminated a session, I would never fully leak the memory?
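
One way to model the reading above is the toy Python sketch below. It encodes only this comment's understanding (size metadata reset on recycle, release decided by recorded size at session end); the class, method names, and sizes are all hypothetical and are not Ghostty's actual code.

```python
STANDARD = 4096  # hypothetical standard page size in bytes


class Session:
    """Toy model: tracks bytes mapped outside the pool vs. bytes released."""

    def __init__(self):
        self.pages = []
        self.mapped = 0    # bytes allocated outside the pool
        self.unmapped = 0  # bytes released back to the OS

    def add_page(self, size):
        if size > STANDARD:
            self.mapped += size  # oversized pages bypass the pool
        self.pages.append({"size": size})

    def recycle_oldest(self):
        """Reuse the oldest page as the newest one, resetting its metadata."""
        page = self.pages.pop(0)
        page["size"] = STANDARD  # the underlying allocation may be larger
        self.pages.append(page)

    def close(self):
        """Release pages, deciding by the recorded size alone."""
        for page in self.pages:
            if page["size"] > STANDARD:
                self.unmapped += page["size"]
        self.pages.clear()

    @property
    def leaked(self):
        return self.mapped - self.unmapped


s = Session()
s.add_page(3 * STANDARD)  # one oversized page
s.add_page(STANDARD)
s.recycle_oldest()        # oversized page now looks standard
s.close()
assert s.leaked == 3 * STANDARD
```

In this model the bytes only count as leaked once `close()` runs, which matches the question of whether a never-terminated session would ever fully leak.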

drob518 - 3 days ago

Why not just use a circular buffer for the scrollback? Why use blocks at all if you’re just going to recycle them anyway? That said, great write-up.
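
For reference, the circular-buffer idea in the question can be sketched in Python with a bounded `collections.deque`. The limit of 3 lines is a hypothetical value chosen for illustration, and the sketch is not a claim about how Ghostty should or does store scrollback.

```python
from collections import deque

# Bounded deque: appending past maxlen silently drops the oldest entry.
scrollback = deque(maxlen=3)  # hypothetical scrollback limit of 3 lines

for line in ["a", "b", "c", "d"]:
    scrollback.append(line)

assert list(scrollback) == ["b", "c", "d"]
```

The sketch stores one Python string per line, so it sidesteps the question of how rows with different memory needs would fit into fixed-size slots.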

dangoodmanUT - 3 days ago

waiting for someone to say "this wouldn't have happened if you chose rust"