What I learned designing a barebones UI engine
madebymohammed.com
58 points by teleforce 11 hours ago
Immediate mode GUI is the way to go.
Retaining state is a pain and causes bugs. Trying to get fancy à la React and diffing the tree for changes makes no sense. That was a performance hack because changing the DOM in JS used to be slow as hell. You don't need that.
Just redraw the whole thing every frame. Great performance, simple, less bugs.
This works for simple apps, utilities, and demos/mvps. Not great for actual applications.
What about when you're embedding your GUI into an existing application, or running it on an already taxed system? (Audio plugins come to mind.)
What if something is costly, that you need to compute dynamically, but not often, makes it into the frame? Do you separately now create a state flag for that one render object?
> What if something is costly, that you need to compute dynamically, but not often, makes it into the frame? Do you separately now create a state flag for that one render object?
The point of immediate mode UIs is not necessarily that there is no state specific to the UI, but rather that the state is owned by user code. You can (and, in these more complex cases, should) retain state between frames. The main difference is that the state is still managed by your code, rather than the UI system ("library", whatever).
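To make the "state is owned by user code" point concrete, here is a minimal Python sketch. It assumes a hypothetical draw list of plain tuples rather than any real UI library, and the names `AppState`, `expensive_label`, and `draw_frame` are invented for illustration. The costly value is cached by the application and recomputed only when its input changes, while the UI is still described from scratch every frame.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AppState:
    """State owned by the application, not by the UI layer."""
    samples: list = field(default_factory=lambda: [1.0, 2.0, 3.0])
    cached_label: Optional[str] = None   # retained between frames by our own code
    recomputes: int = 0                  # counts how often the costly path runs

    def set_samples(self, samples):
        self.samples = list(samples)
        self.cached_label = None         # invalidate only when the input changes

    def expensive_label(self):
        # Stand-in for a costly computation; runs only after invalidation.
        if self.cached_label is None:
            self.recomputes += 1
            mean = sum(self.samples) / len(self.samples)
            self.cached_label = f"mean={mean:.2f}"
        return self.cached_label


def draw_frame(state, draw_list):
    """Describe the whole UI for one frame as plain draw commands."""
    draw_list.append(("text", "Stats"))
    draw_list.append(("text", state.expensive_label()))


state = AppState()
for _ in range(3):                       # three frames with unchanged data
    frame = []
    draw_frame(state, frame)
```

The cache flag lives next to the data it depends on, so the UI code itself stays stateless.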
Immediate mode UI optimizes for the worst case. That is the case you care about most for real time applications.
Retained mode is more efficient when not much changes, but it can be worse when a lot of stuff changes at once. So for real-time applications like your audio example or games, you want immediate mode. Retained mode is, or at least can be, better for saving battery, though.
For interest's sake, have a look at the Flutter engine. It does this kind of diff on each build (meaning, each time the UI tree gets modified and triggers a rebuild); they split their objects into stateful and stateless, and then in your own code you have to make sure not to unnecessarily trigger rebuilds for expensive objects. So it kind of forces you to think about and separate cheap and expensive UI objects.
> Do you separately now create a state flag for that one render object?
That can be a reasonable choice sometimes. Note that the point is that you introduce state where necessary, rather than stateful UI being the default as with retained mode.
> Just redraw the whole thing every frame. Great performance, simple, less bugs.
And in low power applications? Like on a smartphone?
When the UI is highly dynamic/animated it needs to be redrawn each frame also in a 'retained mode' UI framework.
When the UI is static and only needs to change on user input, an immediate mode UI can 'stop' too until there's new input to process.
For further low-power optimizations, immediate mode UI frameworks could skip describing parts of the UI when the application knows that this part doesn't need to change (contrary to popular belief, immediate mode UI frameworks do track and retain state between frames, just usually less than retained mode UIs - but how much state is retained is an internal implementation detail).
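The "stop until there's new input" idea can be sketched as a loop that only re-describes the UI when something happened. This is a simplified Python illustration, not how any particular framework schedules frames; `run` and its `ticks` argument (one list of input events per loop iteration) are hypothetical.

```python
def run(ticks, draw):
    """Redraw only on ticks that carry input; return the number of redraws."""
    redraws = 0
    dirty = True                # draw once at startup
    for events in ticks:
        if events:
            dirty = True        # new input: the UI description may have changed
        if dirty:
            draw(events)        # describe and render the whole UI
            redraws += 1
            dirty = False       # idle until the next input arrives
    return redraws


# Five loop ticks, only two of which carry input.
redraw_count = run([[], [], ["click"], [], ["key"]], lambda events: None)
```

A real loop would block on the event queue instead of polling, and would also mark itself dirty while an animation is running.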
The problem is that widgets still need to store state somewhere, and that storage space needs to be reclaimed at some point. How does the system know when that can be done? I suppose the popular approach is to just reclaim space that wasn't referenced during a draw.
However ...
When you have a listbox of 10,000 rows and you only draw the visible rows, then the others will lose their state because of this.
Of course there are ways around that but it becomes messy. Maybe so messy that retained mode becomes attractive.
> How does the system know when that can be done?
At the earliest, in the first frame in which the application's UI description code doesn't mention a UI item. That means UI items need a persistent id; in Dear ImGui this is a string hash, usually created from the item's label (which can have a hidden `##` identifier to make it unique), plus a push/pop-id stack for hierarchical namespacing.
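A rough Python sketch of that scheme follows. It is not Dear ImGui's actual implementation: the `UiContext` class, the use of CRC32 as the hash, and the per-frame `touched` set are all assumptions made for illustration. Ids are hashed from the full label, seeded by the enclosing scope's id, and state for any id not mentioned during a frame is reclaimed at the end of that frame.

```python
import zlib


class UiContext:
    def __init__(self):
        self.id_stack = [0]     # hash seeds for hierarchical namespacing
        self.state = {}         # widget id -> state retained by the UI layer
        self.touched = set()    # ids mentioned during the current frame

    def make_id(self, label):
        # Hash the full label (including any "##suffix"), seeded by the parent id.
        return zlib.crc32(label.encode("utf-8"), self.id_stack[-1])

    def push_id(self, name):
        self.id_stack.append(self.make_id(name))

    def pop_id(self):
        self.id_stack.pop()

    @staticmethod
    def visible_text(label):
        # Everything after "##" only affects the id and is never displayed.
        return label.split("##", 1)[0]

    def widget_state(self, label, default):
        wid = self.make_id(label)
        self.touched.add(wid)
        return self.state.setdefault(wid, default)

    def end_frame(self):
        # Reclaim state of widgets the application did not mention this frame.
        for wid in list(self.state):
            if wid not in self.touched:
                del self.state[wid]
        self.touched.clear()


ctx = UiContext()
ctx.widget_state("Settings", {"open": True})
ctx.widget_state("Log", {"open": False})
ctx.end_frame()                  # both mentioned: both survive
ctx.widget_state("Settings", {"open": True})
ctx.end_frame()                  # "Log" was not mentioned: its state is dropped
```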
> then the others will lose their state because of this
Once an item is visible, the state must have been provided by the application's UI description code, when the item is invisible, that state becomes irrelevant.
> when the item is invisible, that state becomes irrelevant.
What happens when the item moves out of view, e.g. because the user scrolls down?
State should be preserved, because the user might scroll back up.
The job of the immediate UI is to just draw the things. Where and how you manage your state is completely up to you.
It seems you assume some sort of OO model.
> When you have a listbox of 10,000 rows and you only draw the visible rows, then the others will lose their state because of this.
Well keep the state then.
Immediate mode really just means you have your data as an array of things or whatever and the UI library creates the draw calls for you. Drawing and data are separate.
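As a small Python sketch of "keep the state then": the rows and the selection live in ordinary application data, and the draw function only emits commands for the visible window. `draw_listbox` and the tuple-based draw list are hypothetical, not taken from any library.

```python
def draw_listbox(rows, selected, scroll_top, visible_count, draw_list):
    """Emit draw commands for the visible rows only; return the first row drawn."""
    # Clamp the scroll position so the window never runs past the end.
    first = max(0, min(scroll_top, len(rows) - visible_count))
    last = min(first + visible_count, len(rows))
    for i in range(first, last):
        draw_list.append(("row", i, rows[i], i in selected))
    return first


rows = [f"item {i}" for i in range(10_000)]
selected = {3, 9_000}            # selection is owned by the application

top_frame = []
draw_listbox(rows, selected, 0, 20, top_frame)          # user is at the top
far_frame = []
draw_listbox(rows, selected, 8_990, 20, far_frame)      # user scrolled far down
back_frame = []
draw_listbox(rows, selected, 0, 20, back_frame)         # user scrolled back up
```

Row 3 is still selected after scrolling away and back, because nothing about the selection was ever stored in the rows that were drawn.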
> The job of the immediate UI is to just draw the things. Where and how you manage your state is completely up to you.
This is a bit oversimplified. For instance Dear ImGui needs to store at least the window positions between frames since the application code doesn't need to track window positions.
> And in low power applications? Like on a smartphone?
Doesn't make a difference. If the page is static, there is no redraw happening. If the page is dynamic, the redraw is happening at the frequency of the change (once per second, or once per frame, or whatever).
Whether you're doing a diff of the DOM or redrawing the whole DOM, typical pages (i.e. not two-sigmas past the median) aren't going to redraw something on every frame anyway.
That really depends on the kind of user interface no?
If you just have a lot of text and a few rectangles and no animation, immediate mode would work well...
But if you have a lot of images, animation etc ... You'd anyway have to track all the textures uploaded to the GPU to not reupload them. Might as well retain as much of the state as possible? (Eg. QtQuick)
Isn't it the other way around?
The more dynamic/animated an UI is, the less there's a difference between a retained- and immediate-mode API, since the UI needs to be redrawn each frame anyway. Immediate mode UIs might even be more efficient for highly dynamic UIs because they skip a lot of internal state update code - like creating/destroying/showing/hiding/moving widget objects).
Immediate-mode UIs can also be implemented to track changes and retain the unchanged parts of the UI in baked textures, it's just usually not worth the hassle.
The key feature of immediate mode UIs is that the application describes the entire currently visible state of the UI for each frame which allows the UI code to be 'interleaved' with application state changes (e.g. no callbacks required), how this per-frame UI description is translated into pixels on screen is more or less an implementation detail.
> The more dynamic/animated an UI is, the less there's a difference between a retained- and immediate-mode API, since the UI needs to be redrawn each frame anyway. Immediate mode UIs might even be more efficient for highly dynamic UIs because they skip a lot of internal state update code - like creating/destroying/showing/hiding/moving widget objects).
That depends on the kind of animations - typically for user interfaces, it's just moving, scaling, playing with opacity etc.. that's just updating the matrices once.
So you describe the scene graph once (this rectangle here, upload that texture there, this border there) using DOM, QML etc..., and then just update the item properties on it.
As far as the end user/application developer is concerned, this is retained mode. As far as the GPU is concerned, it can be redrawing the whole UI every frame.
> it's just moving, scaling, playing with opacity etc.. that's just updating the matrices once.
...any tiny change like this will trigger a redraw (e.g. the GPU doing work) that's not much different from a redraw in an immediate mode system.
At most the redraw can be restricted to a part of the visible UI, but here the question is whether such a 'local' redraw is actually any cheaper than just redrawing everything (since figuring out what needs to be redrawn might be more expensive than just rendering everything from scratch - YMMV of course).
It's not only about what gets redrawn but also about how much of the UI state is still retained (by the GPU). Imagine having to reupload all the textures and meshes to the GPU every frame.
Something like a lot of text? Probably easier to redraw everything in immediate mode.
Something like a lot of images just moving and scaling around? Easier to retain that state in GPU and just update a few values here and there...
> Easier to retain that state in GPU and just update a few values here and there
It's really not that trivial to estimate, especially on high-dpi displays.
Rendering a texture with a 'baked UI' to the framebuffer might be "just about as expensive" as rendering the detailed UI elements directly to the framebuffer.
Processing a pixel isn't inherently cheaper than processing a vertex, but there are a lot more pixels than vertices in typical UIs (a baked texture might still win when there's a ton of alpha-blended layers though).
Also, of course, you'd need to aggressively batch draw calls (e.g. Dear ImGui only issues a new render command when the texture or clipping rectangle changes, so a whole window will typically be rendered in one or two draw calls).
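The batching rule described here (start a new render command only when the texture or clip rectangle changes) can be sketched in a few lines of Python. This is an illustration of the idea, not Dear ImGui's code; the `(texture, clip, vertex_count)` tuples and `batch_draw_commands` are made up for the example.

```python
def batch_draw_commands(primitives):
    """Merge consecutive primitives that share a texture and clip rectangle."""
    batches = []
    for texture, clip, vertex_count in primitives:
        last = batches[-1] if batches else None
        if last and last["texture"] == texture and last["clip"] == clip:
            last["vertex_count"] += vertex_count    # extend the current draw call
        else:
            batches.append(
                {"texture": texture, "clip": clip, "vertex_count": vertex_count}
            )
    return batches


window_clip = (0, 0, 400, 300)
primitives = [
    ("font_atlas", window_clip, 6),    # window background
    ("font_atlas", window_clip, 24),   # title text
    ("font_atlas", window_clip, 6),    # button frame
    ("user_image", window_clip, 6),    # texture change forces a new batch
]
batches = batch_draw_commands(primitives)
```

Here four primitives collapse into two draw calls, which matches the "one or two draw calls per window" figure as long as most widgets sample from the same font atlas texture.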
What are you even basing this on? I did an experiment a few days ago, where a text input on a web page reacts to text input. Once it hits a certain count, the colour changes, like a validation error.
In the initial crappy implementation the code was assigning the same class over and over to the text input, rather than only when required. Despite that being an obvious bug, I could literally feel the difference in typing speed and how that was hammering the page.
Once the bug was fixed, and it only assigned it once correctly, the problem went away.
"redraw everything the whole frame" and "don't do any diffing" sound insane in this regard.
> "redraw everything the whole frame" and "don't do any diffing" sound insane in this regard.
You need to consider that a web browser with its millions of lines of code in the DOM and rendering engine is pretty much the worst case for "redrawing a complex UI each frame". Add React on top and the whole contraption might still be busy with figuring out what has changed and needs to be redrawn at the time an immediate mode UI sitting directly on top of a 3D API is already done rendering the entire UI from scratch.
A native immediate mode UI will easily be several hundred times less code (for instance Dear ImGui is currently just under 50kloc 'orthodox C++').
> While it’s far from perfect, writing it taught me more about UI systems than I ever would have learned by sticking to established solutions alone.
This is a great attitude to have. Keep up the great work.
I had a similar itch during my game development with libGDX, and eventually ended up with almost the same architecture.
I found that there are two different ways to construct a UI layout, top-down and bottom-up, and the two can contradict each other. I wonder how one could solve this; it seems like a common problem in all the frameworks I've seen. Flutter, for example, just fails with an on-screen error if it can't satisfy the constraints in such a conflict, and others just show gibberish.
rectcut is a good API for layout if you have a fixed viewport (eg eink display)
the API is a very simple one where you slice parts off an initial Rect. the only feature it provides is that it tracks (x, y, h, w) for you.
it doesn't work well with intrinsic sizes - it's more of a top down, fixed size thing.
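For readers who haven't seen the pattern, here is a minimal Python sketch of the slicing idea. The method names `cut_left`, `cut_right`, `cut_top`, and `cut_bottom` are assumptions for this example rather than a reference to a specific library. Each cut shrinks the remaining rect in place and returns the piece that was sliced off.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: int
    y: int
    w: int
    h: int

    def cut_left(self, amount):
        amount = min(amount, self.w)          # never cut more than is left
        piece = Rect(self.x, self.y, amount, self.h)
        self.x += amount
        self.w -= amount
        return piece

    def cut_right(self, amount):
        amount = min(amount, self.w)
        self.w -= amount
        return Rect(self.x + self.w, self.y, amount, self.h)

    def cut_top(self, amount):
        amount = min(amount, self.h)
        piece = Rect(self.x, self.y, self.w, amount)
        self.y += amount
        self.h -= amount
        return piece

    def cut_bottom(self, amount):
        amount = min(amount, self.h)
        self.h -= amount
        return Rect(self.x, self.y + self.h, self.w, amount)


screen = Rect(0, 0, 800, 480)        # fixed viewport
toolbar = screen.cut_top(40)
sidebar = screen.cut_left(200)
status = screen.cut_bottom(20)
content = screen                     # whatever remains is the content area
```

Because every size is given up front, nothing here can ask a child how big it wants to be, which is the intrinsic-size limitation mentioned above.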
I actually use a very similar paradigm successfully in a game [1] whose (immediate-mode) UI is fully responsive. I allow more operations than just cutting to do that, but the basic idea seems to be the same. The code may look like a bit of a mess at a first glance [2], but I still find it easier to work with and make it do what I actually want with some very basic vector maths, than with the layout-container rules of most UI frameworks.
[1]: https://fruitsandtails.fghj.cz/
[2]: https://codeberg.org/spiffyk/FruitsAndTails/src/branch/main/...
author here - thanks for posting :D would love to hear any thoughts or questions.
At what point do you think you’d give in and use an existing solution?
I’m going through similar “hell” (my words). I just wanted some simple UI for WebXR but the dipshits that designed XR for the web fucking threw all the web parts out so you cannot just put a few simple HTML elements up in XR. You have to write your own UI library from scratch. It’s so mind-bogglingly stupid.
In any case, having to write it, like you I started small and then it quickly ballooned because even simple things get complicated quickly, all the while I’m cursing under my breath that there is a perfectly usable system but TPTB chose not to offer it >:(
existing solutions for my specific use case were limited, i intentionally wanted something barebones so that I could hack at the raw surfaces / display buffers to cut corners for performance. one part learning experience, one part stubbornness.
WebXR sounds like a different beast entirely. do you have to write your own rendering backend in WebGL for that?
WebGL or WebGPU, but for UI you can just draw to a canvas and copy that canvas into WebGL/WebGPU. That means you can easily iterate on the UI system in 2D with mouse input, similar to what you’ve been doing in Python.
But, as an example of similar complexity. I hacked together the first Ui I needed. It was 6 buttons and a slider. Then I realized I needed two more Ui panels and started to write more real Ui classes to abstract stuff out and, it just starts getting more and more complicated.
A slider needs to “capture the pointer”, meaning as you drag it, if you drag past the end of the slider and the pointer is now over another widget, the events should still go to the slider. It’s not hard to implement, it’s just frustrating for me that I don’t actually want to write it. I just want to make my panel, and if I could just use HTML I’d be done. Instead I’m spending my limited free time making this UI system. I don’t have that much, and I would be much further along in my actual goals except for this roadblock.
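The pointer-capture behaviour described above can be sketched roughly as follows in Python. All class and method names (`Widget`, `Slider`, `Button`, `Panel`, `pointer_down`, and so on) are hypothetical; this is one possible shape for the logic, not the commenter's code. The panel remembers which widget received the pointer-down and keeps routing events to it until the pointer is released.

```python
class Widget:
    def __init__(self, x, y, w, h):
        self.x, self.y, self.w, self.h = x, y, w, h

    def contains(self, px, py):
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h

    def on_down(self, px, py): pass
    def on_move(self, px, py): pass
    def on_up(self, px, py): pass


class Slider(Widget):
    value = 0.0

    def on_down(self, px, py):
        self.on_move(px, py)

    def on_move(self, px, py):
        # Clamp so dragging past either end pins the value at 0.0 or 1.0.
        self.value = min(1.0, max(0.0, (px - self.x) / self.w))


class Button(Widget):
    hover_events = 0

    def on_move(self, px, py):
        self.hover_events += 1


class Panel:
    def __init__(self, widgets):
        self.widgets = widgets
        self.captured = None            # widget that currently owns the pointer

    def _target(self, px, py):
        if self.captured is not None:   # capture wins over hit testing
            return self.captured
        return next((w for w in self.widgets if w.contains(px, py)), None)

    def pointer_down(self, px, py):
        self.captured = self._target(px, py)
        if self.captured is not None:
            self.captured.on_down(px, py)

    def pointer_move(self, px, py):
        target = self._target(px, py)
        if target is not None:
            target.on_move(px, py)

    def pointer_up(self, px, py):
        target = self._target(px, py)
        if target is not None:
            target.on_up(px, py)
        self.captured = None            # release the capture


slider = Slider(0, 0, 100, 20)
button = Button(120, 0, 50, 20)
panel = Panel([slider, button])
panel.pointer_down(50, 10)              # press in the middle of the slider
panel.pointer_move(140, 10)             # drag past its end, over the button
```

While the drag is in progress the button never sees the move events; after `pointer_up` the panel goes back to ordinary hit testing.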