macOS Container Machines
github.com | 1072 points by timsneath 17 hours ago
To clarify a few comments here: this is not only OCI containers: container machines add support for persistence and filesystem mounting, making container machines a great lightweight Linux environment for developers using macOS. More details here: https://developer.apple.com/videos/play/wwdc2026/389
> container runs containers differently. Using the open source Containerization package, it runs a lightweight VM for each container that you create. This approach has the following properties:
> - Security: Each container has the isolation properties of a full VM, using a minimal set of core utilities and dynamic libraries to reduce resource utilization and attack surface.
> - Privacy: When sharing host data using container, you mount only necessary data into each VM. With a shared VM, you need to mount all data that you may ever want to use into the VM, so that it can be mounted selectively into containers.
> - Performance: Containers created using container require less memory than full VMs, with boot times that are comparable to containers running in a shared VM.
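The privacy point in the quoted overview can be made concrete with a toy model. This is only a sketch: the container names and host paths are made up for illustration, and nothing here calls the real container tool.

```python
# Toy model of the "Privacy" bullet: which host paths does each VM get to see?
# Container names and paths are hypothetical.

wanted_mounts = {
    "web": {"/Users/me/src/site"},
    "db": {"/Users/me/data/pg"},
    "scratch": set(),
}

def exposure_per_container_vm(mounts):
    """One VM per container: each VM sees only its own container's mounts."""
    return {name: set(paths) for name, paths in mounts.items()}

def exposure_shared_vm(mounts):
    """One shared VM: the VM must see the union of all mounts,
    so that they can be re-mounted selectively into containers."""
    union = set()
    for paths in mounts.values():
        union |= paths
    return union

per_vm = exposure_per_container_vm(wanted_mounts)
shared = exposure_shared_vm(wanted_mounts)

for name in sorted(per_vm):
    print(name, "sees", sorted(per_vm[name]))
print("shared VM sees", sorted(shared))
```

In the per-container model the "scratch" VM sees no host data at all, while a shared VM would hold both paths regardless of which container asked for them.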
More details, including technical limitations (they’re looking for bug reports and contributions): “Container: Technical Overview” https://github.com/apple/container/blob/main/docs/technical-...
> ... highly integrated Linux environment that works seamlessly on your Mac. ...
Which kernel is running, and is it hosted in hypervisor.framework, as is done with UTM (when not using the qemu mode)?
> filesystem mounting
How is this different to bind mounts?
Very different: Linux running in a virtual machine can't bind mount into a macOS host's filesystem. So they use virtiofs.
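One way to see the difference from inside a Linux guest is to look at the filesystem type of each mount. A minimal sketch, assuming the standard /proc/mounts layout; the sample text, the `share0` tag and the `/mnt/host` path are hypothetical:

```python
def shared_fs_mounts(mounts_text, fs_types=("virtiofs", "9p")):
    """Return (mount_point, fs_type) pairs for host-shared filesystems.

    mounts_text uses the /proc/mounts layout:
    device mount_point fs_type options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in fs_types:
            found.append((fields[1], fields[2]))
    return found

# Hypothetical guest mount table, for illustration only.
sample = """\
/dev/vda1 / ext4 rw,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec 0 0
share0 /mnt/host virtiofs rw,relatime 0 0
"""

print(shared_fs_mounts(sample))
```

In a real guest you would pass `open("/proc/mounts").read()` instead of the sample. A bind mount, by contrast, reports the type of the underlying filesystem (e.g. ext4), because it is just another view into the same kernel's VFS.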
macOS container filesystem/IO has been bog slow, preventing even some basic dev container use cases. Hopefully this will fix the issue.
It's not substantially different from previous approaches (9pfs vs. virtiofs).
My suggestion: Don't use the host filesystem from the guest at all. It'll be faster, and better isolated. It's a false convenience.
This applies to both containers and container machines though, right?
Containers (those popularised on Linux by Docker) are built on Linux primitives like cgroups and namespaces, so they're running directly on the same kernel, same VFS, often the same FS, etc. Their isolation properties rely on (a) all those Linux features working as expected, and (b) the container runtime setting them up properly.
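To see the "same kernel" point for yourself, you can list the namespaces a process belongs to. A minimal sketch, assuming a Linux /proc filesystem; the helper name is made up, and it returns an empty dict on systems without /proc/<pid>/ns (such as macOS itself):

```python
import os

def namespace_ids(pid="self"):
    """Map namespace name -> identifier such as 'mnt:[4026531841]'.

    Returns an empty dict where /proc/<pid>/ns is unavailable.
    """
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return ids
    for name in sorted(names):
        try:
            # Each entry is a symlink whose target names the namespace.
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            pass  # some entries may not be readable
    return ids

for name, ident in namespace_ids().items():
    print(name, ident)
```

Two processes that print the same identifier for a given entry share that namespace. A containerised process differs from the host only in the namespaces its runtime chose to unshare; everything else is the one shared kernel.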
Depending on your threat model, that's fine, but a lot of people (including me) will say that containers are not a security mechanism.
But macOS requires[1] virtualisation for containers anyway; the security is just a bonus.
[1] at least for a real Linux kernel...
The surface of an OS is definitely larger than that of many hypervisors, which is e.g. why browsers often provide their own much narrower sandbox.
On the other hand, in other scenarios, people trust the security boundaries of their OS to work as expected all the time, no? This is the basis of e.g. Android app isolation (every app runs under its own Linux UID/GID), and true multi-user Unix systems that trust the OS's security boundaries to hold have decades of history.
Different threat models. Your typical Android device (and Linux server for that matter, at home or at scale) is not usually running security-sensitive general workloads for multiple tenants in the same OS instance. :-)
Ah, the Darwin/BSD Subsystem for Linux.
Not quite, it’s still a VM. And while it supports virtio balloon for growing RAM, it doesn’t yet support releasing that RAM back to the host. And there isn’t a convenient way to shrink the sparse disk images as they grow yet, either.
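For anyone unfamiliar with sparse images, the gap between a file's apparent size and the space it actually occupies is easy to demonstrate. A minimal sketch for POSIX systems (it relies on `st_blocks`); the helper name and the 64 MiB size are arbitrary choices, and the result depends on whether the temp directory's filesystem supports sparse files:

```python
import os
import tempfile

def sparse_file_sizes(apparent_bytes=64 * 1024 * 1024):
    """Create a mostly-empty file; return (apparent size, bytes allocated)."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        # Seek past the end and write one byte, leaving a hole before it.
        f.seek(apparent_bytes - 1)
        f.write(b"\0")
    try:
        st = os.stat(path)
        # st_blocks counts 512-byte units on POSIX systems.
        return st.st_size, st.st_blocks * 512
    finally:
        os.remove(path)

apparent, allocated = sparse_file_sizes()
print("apparent:", apparent, "allocated:", allocated)
```

On a filesystem with sparse-file support the allocated figure is typically far smaller than the apparent one. The allocated figure grows as the guest writes, and it does not go back down just because the guest later deletes files, which is the shrinking problem described above.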
And a limited VM at that. For example, looking at the documentation, it's not possible to share USB devices with the VM, which makes it perfectly useless for embedded development, where you have to connect to the boards over USB. I will continue to use UTM for that reason...
Virtualization.framework just gained USB passthrough support in macOS 27. It might be a niche feature for containers to add, but other VM software will likely add support soon.
Isn't the Windows subsystem for Linux (the reference there) also a VM?
Only WSL2; WSL1 was an actual subsystem.
WSL1 was so cool, WSL2 made it boring and isolated.
WSL1 was very conceptually appealing, and ended up working very poorly because of the poor matching between Linux syscalls and the Windows kernel. Git suffered terribly as a result. The inverse is also somewhat true - there have been cases where Wine is much slower than native Windows because Linux simply doesn't provide a simple way to achieve the same outcome, and interestingly the Wine developers have had reasonable (if tediously slow) success in making it possible to express the same semantics to Linux and have it handle things fast. It would be fascinating to know whether WSL1 developers didn't have enough traction to get Windows internals altered to match, or whether it's just way harder to do the same under Windows.
It did work quite well. The filesystem problem could have been solved by optimizing the Windows kernel, which would also have benefited programs run outside WSL, by the way (NTFS has performance problems and Microsoft knows it, and as far as I know even provided a kind of solution with the developer FS, or whatever they call it).
What I don't like about WSL2 is that it is just a VM, and a very limited one. Working in embedded development, I often need to use serial ports or USB devices, which WSL2 cannot do (except by going through USB/IP, which has compatibility issues, especially for things like debuggers that need precise timing), and which WSL1 could do, at least for serial ports. This limitation keeps me from using WSL. The same goes for any other software that wants native access to the machine's peripherals (e.g. a GPU or another PCI card, something that, to be fair, is as far as I know not even doable with hypervisors on Windows, but is completely doable with hypervisors running on Linux, where through the IOMMU you can pass any PCI device of the host to the VM).
WSL1 was a great idea; too bad Microsoft abandoned it for something that is only good for web application development.
> (NTFS has performance problems and Microsoft knows it, and as far as I know even provided a kind of solution with the developer FS, or whatever they call it)
NTFS does not have performance problems. The difference between DevDrive, which uses ReFS (arguably a more 'resilient' file system than NTFS due to journaling), and a standard NTFS volume is that the file system filters are either removed or, in the case of Defender, put in async mode.
The file system filter architecture is the performance problem, not the file systems. It's a trade-off to have a more extensible I/O stack.