How fast is a macOS VM, and how small could it be?

249 points by moosia a day ago

>Starting with 4 virtual cores and 8 GB vRAM, where the VM ran perfectly briskly with around 5 GB of memory used, I stepped down to 3 cores and 6 GB, to discover that memory usage fell to 3.9 GB and everything worked well. With just 2 cores and 4 GB of memory only 3.1 GB of that was used, and the VM continued to handle those lightweight tasks normally.

Good reminder that there's a certain amount of memory tied up with each core (probably mainly page cache and concurrency handling etc).
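The quoted figures can be laid out side by side. This is a minimal Python sketch using only the three configurations from the quote. The dictionary keys are made up for illustration, and because cores and RAM were reduced together, the numbers alone cannot separate a per-core effect from a total-memory effect.

```python
# Figures from the quoted article: (virtual cores, allocated GB, used GB).
configs = [(4, 8.0, 5.0), (3, 6.0, 3.9), (2, 4.0, 3.1)]

def summarize(rows):
    """Return the used fraction and used GB per core for each configuration."""
    out = []
    for cores, allocated_gb, used_gb in rows:
        out.append({
            "cores": cores,
            "used_fraction": round(used_gb / allocated_gb, 3),
            "used_gb_per_core": round(used_gb / cores, 2),
        })
    return out

for row in summarize(configs):
    print(row)
```

The used fraction rises as the VM shrinks (roughly 63%, 65%, 78%), so the absolute drop in usage does not by itself show a fixed per-core cost.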

adrian_b - 20 hours ago

As a general rule, the amount of physical memory installed in a computer should also be proportional to the number of hardware threads provided by its CPU.
Besides the fact that the operating system may allocate some memory for each thread, when you launch a multi-threaded application that is able to use all available threads, for instance the compilation of a big software project, it will frequently allocate working memory in proportion to the number of worker threads.
I have encountered many multi-threaded applications that need up to 2 GB per thread to work well.
This corresponds to having 64 GB for a desktop CPU with 32 threads, like Ryzen 9 9950X.
For the compilation example, I have seen software projects, like Chrome/Chromium and its derivatives, where if you do not have enough memory, proportional to the number of hardware threads, e.g. when you have only 32 GB for a 16 core/32 thread CPU, you must reduce the number of concurrent compilations, e.g. with an appropriate parameter to "make -j", leaving some threads and cores idle, because otherwise you may encounter out-of-memory errors.
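The rule of thumb above can be turned into a small calculation for picking a "make -j" value. This is a rough sketch, not a measured model: the function name, the 2 GB reserve for the OS, and the per-job figure are assumptions, and real per-job memory varies a lot between projects.

```python
import os

def max_parallel_jobs(total_ram_gb, gb_per_job, hw_threads=None, reserve_gb=2.0):
    """Cap the job count so that jobs * gb_per_job fits in RAM minus a reserve."""
    if hw_threads is None:
        hw_threads = os.cpu_count() or 1
    usable_gb = max(total_ram_gb - reserve_gb, 0.0)
    by_memory = int(usable_gb // gb_per_job)
    # Always allow at least one job, never more than the hardware threads.
    return max(1, min(hw_threads, by_memory))

# 32 hardware threads at 2 GB per job, as in the comment above.
print(max_parallel_jobs(64, 2.0, hw_threads=32))
print(max_parallel_jobs(32, 2.0, hw_threads=32))
```

With 32 GB the memory limit, not the thread count, decides the result, which is the case where some cores end up idle.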
- embedding-shape - 17 hours ago
  
  Compiling flash-attn (Flash Attention) is another great stress test for CPU+RAM, as just using 16 threads can balloon you into 128GB RAM usage territory already. Same thing with needing to not do too much concurrency when compiling it.
  - cjbgkagh - 17 hours ago
    
    I have this problem with NixOS as one of my build servers doesn’t have enough RAM. There doesn’t seem to be a way to know if a compilation is likely to be RAM heavy and either use a tagged server with more RAM or use fewer threads on servers with less RAM.
- Neywiny - 11 hours ago
  
  It's an important point. I went from 4c/8t and 32GB to 16/32 and 96GB. Dramatically less memory per thread. Some software (looking at you, Vivado) can take incredible amounts of memory per parallel job thus mandating some projects can only run with a subset of my cores. At least until I stepped up my work laptop to 10.66 GB/thread. That seems to be manageable
- realo - 19 hours ago
  
  Yes! I have also observed that with compilation VMs on a big server.
fulafel - 21 hours ago

I'd bet for the null hypothesis: the memory behaviour changes would hold if the core count was kept constant and only the VM's memory size was adjusted.
- brookst - 21 hours ago
  
  Agreed. This is the OS adapting to available memory.
  Similarly if you started with 4GB and there was 900MB available for user apps, I expect you could launch apps that consume 1500MB just fine; the OS is leaving enough to launch anything, and making use of unused memory for cache/etc.
- dmitrygr - 19 hours ago
  
  There is a per-cpu data structure in the xnu kernel, but it is not big enough to tilt the scales when you are talking about RAM in units of gigabytes.
  - pdpi - 18 hours ago
    
    It’s not just the kernel. I wouldn’t be surprised if there’s a fair few userspace services spawning a thread per core.
wutwutwat - 21 hours ago

There is some overhead per-core, you're right, but imo this reduction in usage is likely from how the kernel allocates available memory, which is being reduced as well. The kernel will keep read caches around longer with more memory, it'll prefer to compress memory instead of swap to disk if it has more, it'll purge/cleanup reclaimable memory less often with more memory, etc. It even scales its internal buffer sizes and vnode tables depending on total memory.
All good things imo, it dynamically makes the most of what is available, at the expense of making it harder to see a true baseline of hard min requirement to operate.
Fun things to check, `vm_stat`
$ vm_stat
Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free: 230295.
Pages active: 1206857.
Pages inactive: 1206361.
Pages speculative: 31863.
Pages throttled: 0.
Pages wired down: 470093.
Pages purgeable: 18894.
"Translation faults": 21635255.
Pages copy-on-write: 1590349.
Pages zero filled: 11093310.
Pages reactivated: 15580.
Pages purged: 50928.
File-backed pages: 689378.
Anonymous pages: 1755703.
Pages stored in compressor: 0.
Pages occupied by compressor: 0.
Decompressions: 0.
Compressions: 0.
Pageins: 832529.
Pageouts: 225.
Swapins: 0.
Swapouts: 0.
edit: no code fence markdown support or am I doing something wrong?
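The page counts in the vm_stat output above are easier to read once they are converted to bytes. This is a small Python sketch that parses text in that format. The function names are made up for illustration, the sample is a subset of the output above, and the page size is read from the header rather than assumed, since it differs between machines.

```python
import re

SAMPLE = """Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free: 230295.
Pages active: 1206857.
Pages wired down: 470093.
File-backed pages: 689378.
"""

def parse_vm_stat(text):
    """Parse vm_stat text into (page_size_bytes, {counter name: value})."""
    m = re.search(r"page size of (\d+) bytes", text)
    if m is None:
        raise ValueError("page size not found in header")
    page_size = int(m.group(1))
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip().rstrip(".")
        # The header line has a non-numeric value and is skipped here.
        if sep and value.isdigit():
            counters[name.strip().strip('"')] = int(value)
    return page_size, counters

def pages_to_gib(pages, page_size):
    """Convert a page count to GiB."""
    return pages * page_size / 2**30

page_size, counters = parse_vm_stat(SAMPLE)
for name in ("Pages free", "Pages wired down"):
    print(name, round(pages_to_gib(counters[name], page_size), 2), "GiB")
```

On a real machine the text could come from running vm_stat through subprocess instead of the embedded sample.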
- schrodinger - 17 hours ago
  Single inline backticks like `this` aren't recognized (although still useful in my opinion, they just don't change the rendering).
  Triple backticks also aren't recognized. However, if you indent by I believe 4 spaces, it formats it in a fixed width font presuming it's code.
  Let's try (4 spaces):
      func main() {
          fmt.Println("Hello, HN!")
      }
  None for comparison:
  func main() { fmt.Println("Hello, HN!") }
  - wutwutwat - 11 hours ago
    
    Seems I missed the window to be able to edit my message, but I'll remember this info for next time, thanks!

Havoc - a day ago

Got an M5 Air recently - my first dive into macOS land, so trying to figure this out too.

Seems essentially impossible to get:

* pytorch

* GPU acceleration

* VM/container like isolation

The virtio-gpu layer gets closest, but it seems to pass through only the graphics side of the GPU, not compute, so no pytorch.

plufz - a day ago

I need this too, and looked into it quite a lot a year ago. I haven’t had time to check out the recent developments with Docker Model Runner (vllm-metal) or podman libkrun. Did neither of those work for you?
- Havoc - 21 hours ago
  
  vllm-metal isn't GPU access but rather an OpenAI-compatible endpoint, which I can already get via the LM Studio endpoint over the network
  >podman libkrun
  Haven't tried it but research suggests it's really shaky still. podman libkrun exposes Vulkan while torch expects MPS on Macs. Sounds like one can force Vulkan but that's apparently slow and beta-ish?
emmelaich - 21 hours ago

I got torch to run in a Cirruslabs Tart instance.
- Havoc - 17 hours ago
  
  By "Instance" do you mean their cloud platform?
  - emmelaich - 8 hours ago
    
    Nah, just locally on my macair.
    TBF, I only got to the point that using device=mps_device didn't fail. I used Sonoma at the time and the image for the vm was ghcr.io/cirruslabs/macos-sequoia-xcode:16.2-beta-3. Python 3.12, as well, because torch didn't work with later versions.
        import torch
        mps_device = torch.device("mps")
        print('device is', mps_device)
        x = torch.ones(1, device=mps_device)
        print(x)
  - adastra22 - an hour ago
    
    brew install tart

mgaunard - a day ago

My only experience with VMs on macOS is colima+docker, and it's relatively painful and inefficient (but usable).

woadwarrior01 - 21 hours ago

Try Apple's container CLI. I moved a project of mine from colima+docker to it relatively easily, a couple of weekends ago.
https://github.com/apple/container
- highpost - 17 hours ago
  
  Here's an example of how to build a simple Alpine Linux container using Apple's containerization CLI. It also demonstrates how to connect to the container through Tailscale SSH using a Tailscale auth key stored in Apple Keychain:
  https://github.com/highpost/tailscale-macos-container
- sagarm - 17 hours ago
  
  Does this project aim for docker cli and api compatibility? Searching for Docker on that page yields no results. Though in their example, they do show an example of a Dockerfile referencing docker.io without shame.
  Typical Apple behavior, I guess, but grating to see in an OSS tool.
  - troad - 8 hours ago
    
    This is a weird take, imho. Should they feel shame for using Dockerfiles in their OCI-standard-compliant tool? Would you be happier if they introduced subtly incompatible Applefiles?
    Why are they obliged to emulate the Docker CLI? This limits them to just shadowing someone else's product. Just use Docker if you want their CLI/API, it uses the same virtualization framework under the hood on Macs.
- copperx - 16 hours ago
  
  I'm curious to know what kind of project is macOS exclusive?
  - troad - 8 hours ago
    
    You're surprised that a project by Apple Inc that is basically a wrapper around the Mac virtualisation framework [0] is Mac exclusive?
    [0] https://developer.apple.com/documentation/virtualization
- pram - 15 hours ago
  
  container is really good, I've been using it to sandbox some CLI tools and it starts up in less than a second
- ngai_aku - 14 hours ago
  
  AFAIK no support for Compose though
- yokoprime - 19 hours ago
  
  Thank you for this, will check it out!
embedding-shape - a day ago

Recently got a Mac Mini for local CI purposes (together with Forgejo Actions), took a broad look at the ecosystem and decided to just roll with "build on host" instead. Setting up signing/notarization together with isolating it from the host just looked like an insurmountable task, even with agents. At least the macOS builds are really fast now and the signing/notarization is just ~200 lines of Bash...
- latexr - a day ago
  
  > the signing/notarization just ~200 lines of Bash
  200 lines?! That’s two orders of magnitude too many. What exactly are you doing that you need so much code for signing and notarisation?
  - embedding-shape - 20 hours ago
    
    Off the top of my head: unlocking the keychain, finding the right identity, notarizing two parts (the binary itself and the .dmg that the .app ships in), and some other stuff I'm sure. Can do a deeper look in a bit when I can. Most of the hassle is because it's 100% unattended and I had to do stuff to avoid GUI prompts for passwords/unlocks, and because the Forgejo Runner has a different security context.
    
    latexr - an hour ago
    
    > unlocking the keychain, finding the right identity
    You don’t need to do that, you can give options to the CLI to define what profile to use.
    > Most of the hassle is because it's 100% unattended and I had to do stuff to avoid GUI-prompts for passwords/unlocks
    I have a shell function to which I point my code and it compiles, signs, and notarises it without any more intervention, GUI or password prompts, and I’m pretty sure signing and notarising are literally two lines.
    Unfortunately I’m not at my computer now or I’d paste them, but from your description that script is definitely too long.
    
    saagarjha - a minute ago
    
    I assume you're using notarytool but I doubt that it will work unless you have your keychain unlocked
    
    hamandcheese - 19 hours ago
    
    This matches my experience. Keychain + fully unattended increases the complexity and adds a bunch of landmines that need to be dodged (e.g. GUI prompts like you mentioned).
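For reference, the core signing and notarization steps discussed in this sub-thread can be sketched as three commands. This is a hedged Python sketch that only builds and prints the command lines. It is not anyone's actual script from the thread: the paths, the identity string, and the profile name are hypothetical, and it assumes a notarytool credential profile was stored beforehand with `xcrun notarytool store-credentials`. It does not cover keychain unlocking or the runner's security context, which is where the disagreement above lies.

```python
import shlex

def signing_commands(app_path, dmg_path, identity, keychain_profile):
    """Build (not run) the command lines for signing and notarizing."""
    return [
        # Sign with the hardened runtime and a secure timestamp.
        ["codesign", "--force", "--options", "runtime", "--timestamp",
         "--sign", identity, app_path],
        # Submit the disk image and wait for the notary service's result.
        ["xcrun", "notarytool", "submit", dmg_path,
         "--keychain-profile", keychain_profile, "--wait"],
        # Attach the notarization ticket to the disk image.
        ["xcrun", "stapler", "staple", dmg_path],
    ]

cmds = signing_commands(
    "build/Example.app",                           # hypothetical path
    "build/Example.dmg",                           # hypothetical path
    "Developer ID Application: Example (TEAMID)",  # hypothetical identity
    "ci-notary",                                   # hypothetical profile name
)
for cmd in cmds:
    print(shlex.join(cmd))
```

Keeping the commands as argument lists avoids shell quoting problems with identities that contain spaces and parentheses. An unattended setup would still need whatever keychain handling the CI user requires.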
- yohannparis - a day ago
  
  Could you share your recipe please ? I’m interested
isityettime - 20 hours ago

OrbStack is pretty good. I don't find it inefficient, really.
- CraigJPerry - 16 hours ago
  
  OrbStack is impressive on the performance and energy efficiency fronts. I'm not aware of anything that comes close. But they're doing something funky under the covers. You can't just start any OS in a VM. It has to be somehow mangled to suit their VM. Thankfully NixOS is available so I'm fine for my use cases. It's still remarkable how efficient it is.
  - isityettime - 15 hours ago
    
    Yeah, it's like WSL. It starts just one VM and then your individual "machines" are LXC containers underneath. If you peek at the vendor-supplied file your NixOS OrbStack Machine includes you can see some of it.
    They're constantly doing other optimizations in other ways, too. But that's the one you were pointing at, I think.
    
    mgaunard - 14 hours ago
    
    That's also what Colima does.
    OrbStack isn't open-source though and I can't justify buying a license for every single person in my company just for something functionally equivalent but performing better.
    These kinds of things should just be provided by Apple as a first-class thing.

dhruv3006 - a day ago

https://github.com/trycua/cua/tree/main/libs/lume had an interesting take on this.

nottorp - a day ago

> Starting with 4 virtual cores and 8 GB vRAM, where the VM ran perfectly briskly with around 5 GB of memory used

But... if you start applications inside your VM, it will want the full 8 GB you've allocated, not the 5 GB it uses at startup?

stingraycharles - a day ago

I don’t assume that macOS virtualization is advanced enough to support memory ballooning, or is that not what you’re referring to?
Edit: I stand corrected!
- pyth0 - a day ago
  
  I don't assume anything either, but a single Google search is enough to dispel that [1]
  [1] https://developer.apple.com/documentation/virtualization/vzv...
- sgt - a day ago
  
  macOS is generally pretty amazing at efficient memory usage and VM (virtual memory subsystem) handling. So even an 8GB machine can run pretty impressive workloads without having the user think the machine is underpowered.
  - stingraycharles - 20 hours ago
    
    Important caveat: that’s mostly the case for desktop workloads when you’re multitasking a lot, and not as much for server workloads where you actually need all memory.
  - p_ing - a day ago
    
    Not really. Larger page sizes mean more potential for wasted memory and it has had a long standing memory leak in some core component to where even Calculator can cause an OOM event.
    
    jdiff - 21 hours ago
    
    GP is pretty accurate in my experience. Up until last year I was still running an Intel MacBook Pro with 8GB of RAM and successfully multitasked with Blender, Illustrator, Unity, VS Code, and Firefox quite often. The math doesn't make sense, but all stayed responsive even with frequent hops between them. The only OOM events I ran into were memory leaks from Firefox, I believe from an extension.
    
    p_ing - 21 hours ago
    
    There's nothing particularly interesting about that. Linux distro-of-your-choice can run the equivalents fine, as can Windows.
    Browse /r/macos if you dare to wade into the uninformed cesspool; it's full of OOTB apps causing OOMs (among 3rd party apps) with the past at least two major versions of macOS.
    
    jdiff - 20 hours ago
    
    I think there is something interesting there. I was running lighter workloads on similar RAM when I daily drove Debian and was frequently brought to my knees by swapping to death. I had to make conscious choices and manage my RAM usage to avoid it, and still occasionally got T-boned by something I overlooked. I have never had to worry about that with macOS.
    I admit I don't have much experience with how Windows handles constrained memory since XP, and XP was abysmal at it just by virtue of being far more bloated than an equivalent Linux distro. It's certainly far more bloated nowadays, but maybe it handles memory pressure better.
    None of this should be construed to say that macOS doesn't have serious issues or that it's not in dire need of a Snow Leopard-esque "0 new features" release. That's tangential to its memory handling, where I haven't seen the issues you describe.
    
    p_ing - 20 hours ago
    
    Even NT4 handles memory pressure better than modern-day Linux. It's just not a fair comparison; Linux has never dealt with userspace OOM well.
    As for macOS...
    https://old.reddit.com/r/MacOS/comments/1njf1aj/bravo_apple_...
    https://old.reddit.com/r/MacOS/comments/1nxh08n/impressive_m...
    https://old.reddit.com/r/MacOS/comments/1jo5pnq/passwords_ap...
    https://old.reddit.com/r/MacOS/comments/1gkwxe4/how_is_memor...
    https://old.reddit.com/r/MacOS/comments/1seq0ij/freeform_has...
    There are _plenty_ more. There is some fundamental library leaking given the range of impacted apps.
    
    sgt - 19 hours ago
    
    Seeing there are thousands running those apps (incl. Freeform) without memory leaks, it could be something else at play here.