Rendering Crispy Text on the GPU

osor.io

416 points by ibobev 4 days ago


osor_io - 4 days ago

Author here, didn't expect the post to make it here! Thanks so much to everyone who's reading it and participating in the interesting chat <3

vecplane - 4 days ago

Subpixel font rendering is critical for readability but, as the author points out, it's a tragedy that we can't get pixel layout specs from the existing display standards.

kvemkon - 4 days ago

GTK4 moved rendering to the GPU and gave up on RGB subpixel rendering. I've heard that this GPU-centric decision made it impractical to continue with RGB subpixel rendering. The article shows it is possible. So perhaps GTK had another reason, or the presented solution has disadvantages, or it just doesn't integrate into the stack...

xiaoiver - 4 days ago

If you're interested in how to implement SDF and MSDF in WebGL / WebGPU, take a look at this tutorial I wrote: https://infinitecanvas.cc/guide/lesson-015#msdf.

dcrazy - 4 days ago

The Slug library [1] is a commercial middleware that implements such a GPU glyph rasterizer.

[1]: https://sluglibrary.com/

vFunct - 4 days ago

I still don't understand why we need text rendered offline and stored in an atlas alongside tricks like SDFs, when GPUs have like infinite vertex/pixel drawing capabilities. Even the article mentions writing glyph curves to an atlas. Why can't the shaders render text directly? There has to be a way to convert Bezier curves to triangle meshes. I'm about to embark on a GPU text renderer for a CAD app and I hope to figure out why soon.

shmerl - 4 days ago

> One of those new OLEDs that look so nice, but that have fringing issues because of their non-standard subpixel structure

From what I understood, it's even worse. OLEDs are not just non-standard; they come in multiple incompatible subpixel layouts. That's the reason FreeType didn't implement subpixel rendering for OLEDs, and it's a reason to avoid OLEDs when you need to work with text. But it's also not limited to FreeType; a lot of things like GUI toolkits (Qt, GTK, etc.) need to play along too.

Not really sure if there is any progress on solving this.

> I really wish that having access to arbitrary subpixel structures of monitors was possible, perhaps given via the common display protocols.

Yeah, this is a good point. Maybe this should be communicated in EDIDs.

oofabz - 4 days ago

Very impressive work. For those who aren't familiar with this field, Valve invented SDF text rendering for their games. They published a groundbreaking paper on the subject in 2007. It remains a very popular technique in video games with few changes.

In 2012, Behdad Esfahbod wrote Glyphy, an implementation of SDF that runs on the GPU using OpenGL ES. It has been widely admired for its performance and enabling new capabilities like rapidly transforming text. However it has not been widely used.

Modern operating systems and web browsers do not use either of these techniques, preferring to rely on 1990s-style TrueType rasterization. This is a lightweight and effective approach but it lacks many capabilities. It can't do subpixel alignment or arbitrary subpixel layout, as demonstrated in the article. Zooming carries a heavy performance penalty, and more complex transforms like skew, rotation, or 3D transforms can't be done in the text rendering engine. If you must have rotated or transformed text, you are stuck resampling bitmaps, which looks terrible as it destroys all the small features that make text legible.

Why the lack of advancement? Maybe it's just too much work and too much risk for too little gain. Can you imagine rewriting a modern web browser engine to use GPU-accelerated text rendering? It would be a daunting task. Rendering glyphs is one thing but how about handling line breaking? Seems like it would require a lot of communication between CPU and GPU, which is slow, and deep integration between the software and the GPU, which is difficult.

meindnoch - 4 days ago

Impressive work!

But subpixel AA is futile in my opinion. It was a nice hack in the aughts when we had 72dpi monitors, but on modern "retina" screens it's imperceptible. And for a teeny tiny improvement, you get many drawbacks:

- it only works over opaque backgrounds

- can't apply any effect on the rasterized results (e.g. resizing, mirroring, blurring, etc.)

- screenshots look bad when viewed on a different display
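
To make the first drawback concrete, here is a minimal Python sketch of subpixel compositing. It assumes the rasterizer outputs one coverage value per color channel; the function name `blend_subpixel` and the sample values are made up for illustration.

```python
def blend_subpixel(text_rgb, background_rgb, coverage_rgb):
    """Blend text over a background using a separate coverage per channel.

    Each channel is blended independently, so the background color must be
    known. A single alpha value cannot store three different coverages,
    which is why the result cannot simply be kept as a transparent layer.
    """
    return tuple(
        c * t + (1.0 - c) * b
        for t, b, c in zip(text_rgb, background_rgb, coverage_rgb)
    )


# Black text on a white background, glyph edge covering only part of the pixel.
pixel = blend_subpixel((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (1.0, 0.5, 0.0))
```

The resulting pixel is tinted rather than gray, which only looks right when the physical subpixel order matches the order the coverage was computed for.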

EnPissant - 4 days ago

It's important to point out that SDFs compute a pixel distance to the closest edge, while a more traditional font renderer computes pixel coverage. Pixel coverage is optimal. For small fonts, SDFs can look bad in places where edges meet. Maybe this is less of an issue on high PPI displays. Source: I implemented an SDF renderer and it looked worse than FreeType.
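
A small Python sketch of the difference, under simplifying assumptions: the shape is a filled quadrant (x <= 0, y <= 0), pixels are unit squares, and distance is mapped to coverage with the common `0.5 - d` clamp. All names are hypothetical, and a real SDF renderer samples a texture rather than an analytic shape.

```python
def clamp01(v):
    return min(max(v, 0.0), 1.0)


def quadrant_sdf(x, y):
    """Signed distance to the filled quadrant x <= 0, y <= 0 (negative inside)."""
    if x > 0 and y > 0:
        return (x * x + y * y) ** 0.5  # nearest point is the corner
    return max(x, y)                   # nearest point lies on one of the edges


def coverage_from_distance(d):
    """Distance-based estimate: treats the nearest edge as an infinite line."""
    return clamp01(0.5 - d)


def exact_coverage(cx, cy):
    """Exact area of a unit pixel centered at (cx, cy) inside the quadrant."""
    return clamp01(0.5 - cx) * clamp01(0.5 - cy)


# Pixel centered on a straight edge, far from the corner: both agree.
edge_estimate = coverage_from_distance(quadrant_sdf(-3.0, 0.0))
edge_exact = exact_coverage(-3.0, 0.0)

# Pixel centered on the corner, where two edges meet: the estimate is too high.
corner_estimate = coverage_from_distance(quadrant_sdf(0.0, 0.0))
corner_exact = exact_coverage(0.0, 0.0)
```

On the straight edge both give 0.5. At the corner the distance-based estimate still gives 0.5 while the true coverage is 0.25, which is one way corners end up looking rounded or heavy at small sizes.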

tuna74 - 4 days ago

To all people that want sub-pixel rendering: unless you know the sub-pixel grid of the display, it is going to look worse. Therefore the only good UX is to ask the user, for every display they use, whether they want to turn it on for that specific display. The OS also has to handle rotations etc.

adamrezich - 4 days ago

Nobody here seems to have noticed but the “pseudocode” in the article is in fact Jai code, which you can tell by the `xx` in

    base_slot_coordinates := decode_morton2_16(xx index);

which in Jai means “autocast”.

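
For readers unfamiliar with Morton codes, here is a Python sketch of what a function with that name might do. Only the name `decode_morton2_16` comes from the snippet above; the body, the (x, y) return order, and the assumption that each coordinate is 16 bits wide are guesses, not the article's code.

```python
def compact_bits_16(v):
    """Keep the even-indexed bits of a 32-bit value and pack them into 16 bits."""
    v &= 0x55555555
    v = (v | (v >> 1)) & 0x33333333
    v = (v | (v >> 2)) & 0x0F0F0F0F
    v = (v | (v >> 4)) & 0x00FF00FF
    v = (v | (v >> 8)) & 0x0000FFFF
    return v


def decode_morton2_16(index):
    """Split a 2D Morton (Z-order) index into its x and y coordinates.

    Assumes x occupies the even bits and y the odd bits of the index.
    """
    return compact_bits_16(index), compact_bits_16(index >> 1)
```

With this bit layout, indices 0, 1, 2, 3 map to (0, 0), (1, 0), (0, 1), (1, 1), so consecutive slots stay close together in 2D.
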
rossant - 4 days ago

I can't find the link to the code. Is it available?

cchance - 4 days ago

After seeing the cursive, all I immediately thought was "who the fuck ever thought cursive was a good idea" lol

pjmlp - 4 days ago

While the article is great, I am missing a WebGL/WebGPU demo to go along with the article, instead of videos only.

z3t4 - 4 days ago

When making a text editor from scratch my biggest surprise was how slow/costly text rendering is.

favorited - 4 days ago

I watched a conference talk[0] about using MSDFs for GPU text rendering recently, really interesting stuff!

[0] https://www.youtube.com/watch?v=eQefdC2xDY4

Bengalilol - 4 days ago

Amazing read, I am so envious of being able to go down such "holes".

As a side note, from the first "menu UI" until the end, I had the Persona music in my head ^^ (It was a surprise reading the final words)

neurostimulant - 4 days ago

I wonder if editors that use GPU text rendering, like Zed, would use something like this to improve their text rendering. Or maybe they already do?

dustbunny - 4 days ago

This is incredibly well written, interesting and useful.

jbrooks84 - 4 days ago

Love high ppi retina displays for crispy text

b0a04gl - 4 days ago

if we can stream video textures to gpu in real time, why can’t we stream sdf glyphs the same way? what makes text rendering need so much prep upfront?

eptcyka - 4 days ago

Would be great if the videos in the article were muted so that iOS didn’t stop playing my music whilst reading this.

exDM69 - 4 days ago

Very cool stuff, text rendering is a really hairy problem.

I also got nerd sniped by Sebastian Lague's recent video on text rendering [0] (also linked to in the article) and started writing my own GPU glyph rasterizer.

In the video, Lague makes a key observation: most curves in fonts (at least for the Latin alphabet) are monotonic. A monotonic Bezier curve is contained within the bounding box of its end points (this applies to any monotonic curve, not just Beziers). The curves that are not monotonic are very easy to split: solve for the zeros of the derivative (a linear equation for a quadratic Bezier) and split the curve at those points. This is also where Lague went astray and attempted a complex procedure using geometric invariants, when it's trivially easy to split Beziers using de Casteljau's algorithm as described in [1]. It made for entertaining video content but I was yelling at the screen for him to open Pomax's Bezier curve primer [1] and just get on with it.
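
As a rough illustration of that splitting step, here is a Python sketch for quadratic Beziers. It is not the code from this comment or from the video; the function names and the tolerance are assumptions.

```python
def lerp(a, b, t):
    return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)


def split_quadratic(p0, p1, p2, t):
    """de Casteljau split of a quadratic Bezier at parameter t."""
    a, b = lerp(p0, p1, t), lerp(p1, p2, t)
    m = lerp(a, b, t)
    return (p0, a, m), (m, b, p2)


def extrema_ts(p0, p1, p2):
    """Parameters in (0, 1) where the x or y derivative is zero."""
    ts = set()
    for axis in (0, 1):
        denom = p0[axis] - 2 * p1[axis] + p2[axis]
        if denom != 0:
            t = (p0[axis] - p1[axis]) / denom  # root of the linear derivative
            if 0 < t < 1:
                ts.add(t)
    return sorted(ts)


def split_monotonic(p0, p1, p2):
    """Split a quadratic into pieces that are monotonic in both x and y."""
    pieces, rest, prev = [], (p0, p1, p2), 0.0
    for t in extrema_ts(p0, p1, p2):
        local = (t - prev) / (1.0 - prev)  # remap t onto the remaining piece
        left, rest = split_quadratic(*rest, local)
        pieces.append(left)
        prev = t
    pieces.append(rest)
    return pieces


def is_monotonic(p0, p1, p2, eps=1e-9):
    """A quadratic is monotonic per axis iff the control point lies between the ends."""
    return all(
        min(p0[k], p2[k]) - eps <= p1[k] <= max(p0[k], p2[k]) + eps
        for k in (0, 1)
    )


arch = split_monotonic((0, 0), (1, 2), (2, 0))  # one extremum in y at t = 0.5
```

A quadratic has at most one extremum per axis, so each input curve yields at most three pieces.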

For monotonic curves, it is computationally easy to solve the winding number for any pixel outside the bounding box of the curve. It's +1 if the pixel is to the right or below the bounding box, -1 if left or above and 0 if outside of the "plus sign" shaped region off to the diagonals.
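
A simplified Python sketch of the bounding-box shortcut, for illustration only. It uses a single horizontal ray toward +x and its own sign convention, which is an assumption and not the two-axis scheme described above; the function names are hypothetical.

```python
import math


def crossing_fast(p0, p2, px, py):
    """Bounding-box shortcut for one monotonic curve and a ray toward +x.

    Returns the signed crossing count when the box alone decides it,
    or None when the pixel is inside the box and needs the exact test.
    """
    (x0, y0), (x2, y2) = p0, p2
    # Half-open y range so a shared endpoint is counted by one curve only.
    if not (min(y0, y2) <= py < max(y0, y2)):
        return 0                      # ray passes above or below the curve
    direction = 1 if y2 > y0 else -1  # sign convention is an assumption
    if px < min(x0, x2):
        return direction              # whole curve is to the right: one hit
    if px > max(x0, x2):
        return 0                      # whole curve is to the left: no hit
    return None


def crossing_exact(p0, p1, p2, px, py):
    """Full test: fall back to solving y(t) = py only when the box cannot decide."""
    fast = crossing_fast(p0, p2, px, py)
    if fast is not None:
        return fast
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    a, b, c = y0 - 2 * y1 + y2, 2 * (y1 - y0), y0 - py
    if abs(a) < 1e-12:
        t = -c / b                    # curve is linear in y
    else:
        root = math.sqrt(max(b * b - 4 * a * c, 0.0))
        candidates = ((-b + root) / (2 * a), (-b - root) / (2 * a))
        # A monotonic curve has one root in [0, 1]; pick the candidate nearest to it.
        t = min(candidates, key=lambda v: abs(min(max(v, 0.0), 1.0) - v))
    x = (1 - t) ** 2 * x0 + 2 * (1 - t) * t * x1 + t * t * x2
    direction = 1 if y2 > y0 else -1
    return direction if x > px else 0
```

Summing `crossing_exact` over all monotonic pieces of a glyph gives the winding number at a pixel, and only pixels inside a curve's bounding box pay for the square root.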

Furthermore, this can be extended to computing the winding number for an entire axis-aligned box. This can be done for an entire GPU warp (32 to 64 threads): each thread in the warp looks at one curve and checks whether the winding number is the same for the whole warp. If it is, the thread accumulates it; if not, it sets a bit marking that this curve needs to be evaluated per thread.

In this way, very few pixels actually need to solve the quadratic equation for a curve in the contour.

There's still one optimization I haven't done: solving the quadratic equation for 2x2 pixel quads. I solve both the vertical and horizontal winding numbers for good robustness on horizontal and vertical lines. But the solution of the horizontal quadratic for a pixel and the pixel below it is the same +/- 1, and ditto for vertical. So you can solve the quadratic for two curves (a square root and a division, expensive arithmetic ops) for the price of one if you do it for 2x2 quads and use a warp-level swap to exchange the results and add or subtract 1. This can only be done in orthographic projection without rotation, but the rest of the method also works with perspective, rotation and skew.

For a bit of added robustness, Jim Blinn's "How to solve a quadratic equation?" [2] can be used to get rid of some pesky numerical instability.
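
For illustration, here is the widely used cancellation-free form of the quadratic formula in Python. It is a textbook formulation and may differ in detail from what [2] recommends; the function name is made up.

```python
import math


def solve_quadratic(a, b, c):
    """Real roots of a*t^2 + b*t + c = 0, avoiding catastrophic cancellation.

    The naive formula subtracts two nearly equal numbers when b*b >> 4*a*c.
    Here the square root is always added to b with the same sign, and the
    second root comes from the product of roots (c / a).
    A double root is returned twice.
    """
    if a == 0:
        return [] if b == 0 else [-c / b]  # degenerate: linear equation
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    q = -0.5 * (b + math.copysign(math.sqrt(disc), b))
    if q == 0:
        return [0.0]                       # only happens when b == 0 and c == 0
    return sorted([q / a, c / q])


roots = solve_quadratic(1.0, -1e8, 1.0)    # true roots are close to 1e-8 and 1e8
```

With the naive formula the small root of this example loses most of its digits in double precision; here both roots keep close to full precision.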

I'm not quite done yet, and I've only got a rasterizer, not the other parts you need for a text rendering system (font file i/o, text shaping etc).

But the results are promising: I started at 250 ms per frame at a 4k rendering of a '@' character with 80 quadratic Bezier curves, evaluating each curve at each pixel, but I got down to 15 ms per frame by applying the warp vs. monotonic bounding box optimizations.

These numbers are not very impressive because they are measured on a 10 year old integrated laptop GPU. It's so much faster on a discrete gaming GPU that I could stop optimizing here if it was my target HW. But it's already fast enough for real time in practical use on the laptop, because I was drawing entire screen-sized glyphs for the benchmark.

[0] https://www.youtube.com/watch?v=SO83KQuuZvg

[1] https://pomax.github.io/bezierinfo/#splitting

[2] https://ieeexplore.ieee.org/document/1528437

elia_42 - 4 days ago

Really interesting!

pimlottc - 4 days ago

“Crisp text” would be more accurate. I thought maybe this was going to be about rendering intentionally degraded text, like in memes.

enriquto - 4 days ago

> You might want to place the glyph at any position on the screen, not necessarily aligned with the pixel grid

No. I don't. This is a horrifying concept. It implies that the same character may look different every time it is printed! This is extremely noticeable and really awful. For example, when you align equal signs on consecutive lines of code, you notice straight away whether the characters are different.

Nowadays pixels are so small that I don't understand why we don't all just use good quality bitmap fonts. I do, and couldn't be happier with them. They are crisp to a fault, and their correct rendering does not depend on the gamma of the display (which is a serious problem that TFA does not even get into).