Parallel ./configure
tavianator.com
219 points by brooke2k 8 months ago
The other issue is that people seem to just copy configure/autotools scripts over from older or other projects, either because they are lazy or because they don't understand them well enough to write their own. The result is that even with relatively modern code bases that only target something like x86, ARM and maybe MIPS, and only gcc/clang, you still get checks for the size of an int, or which header is needed for printf, or whether long long exists... And then the entire code base never checks the generated macros in a single place, uses int64_t, and never checks for stdint.h in the configure script...
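To make the complaint concrete, here is a minimal sketch (using autoconf's standard HAVE_* convention; the fallback typedefs are illustrative only) of what actually honoring such a check looks like:

/* config.h is generated by configure; AC_CHECK_HEADERS([stdint.h]) defines this: */
/* #define HAVE_STDINT_H 1 */
#include "config.h"

#ifdef HAVE_STDINT_H
#include <stdint.h>                   /* the normal case on anything modern */
#else
typedef long long int64_t;            /* the classic pre-C99 fallback */
typedef unsigned long long uint64_t;
#endif

If nothing in the tree ever looks at HAVE_STDINT_H, the check was pure ritual.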
I don't think it's fair to say "because they are lazy or don't understand". Who would want to understand that mess? It isn't a virtue.
A fairer criticism would be that they don't have the sense to use a saner build system. CMake is a mess but even that is faaaaar saner than autotools, and probably more popular at this point.
I took the trouble (and even spent the money) to get to grips with autotools in a structured and detailed way by buying a book [1] about it and reading as much as possible. Yes, it's not trivial, but autotools is not witchcraft either; as written elsewhere, it's a masterpiece of engineering. I approached it without prejudice and since then I have been more of a fan of autotools than a hater. Anyway, I highly recommend the book, and yes, after reading it I think autotools is better than its reputation.
Autotools uses M4 to meta-program a bash script that meta-programs a bunch of C(++) sources and generates C(++) sources that utilize meta-programming for different configurations; after which the meta-programmed script, again, meta-programs monolithic makefiles.
This is peak engineering.
Yes, that sounds ridiculous, but it is that way so that the user can modify each intermediate step, which is the main selling point. As a user I really prefer that experience, which is why I as a developer put up with the nonsense of M4. (Which I think is more due to M4 being a macro language than to inherent language flaws.)
Sounds like a headache. Is there a nice Python lib to generate all this M4-mumbo-jumbo?
"Sounds complicated. I want it to throw exceptions and have significant whitespace on top of all that complexity!"
Oh it has significant white space. Make generally doesn't handle paths with spaces, so if you put the build or source directory somewhere where the absolute path has a space, all bets are off.
autotools is the worst, except for all the others.
I'd like to think of myself as reasonable, so I'll just say that reasonable people may disagree with your assertion that cmake is in any way at all better than autotools.
Nope, autotools is actually the worst.
There is no way in hell anyone reasonable could say that Autotools is better than CMake.
My experience with cmake, though dated, is that it's simpler because it simply cannot do what autotools can do.
It really smelled of "oh, I can do this better", and you rewrite it, and as part of rewriting it you realise: oh, this is why the previous solution was complicated. It's because the problem is actually more complex than I thought.
And then of course there's the problem where you need to install on an old release. But the thing you want to install requires a newer cmake (autotools doesn't have this problem because it's self contained). But this is an old system that you cannot upgrade, because the vendor support contract for what the server runs would be invalidated. So now you're down a rabbit hole of trying to get a new version of cmake to build on an unsupported system. Sigh. It's less work to just try to construct `gcc` commands yourself, even for a medium sized project. Either way, this is now your whole day, or whole week.
If only the project had used autotools.
No, CMake can do everything Autotools does, but a hell of a lot simpler and without checking for a gazillion flags and files that you don't actually need, but you're checking them anyway because you copied the script from someone who copied the script from... all the way back to the 90s, when C compilers actually existed that didn't have stdint.h or whatever.
CMake is easy to upgrade. There are binary downloads. You can even install it with pip (although recently the Python people in their usual wisdom have broken that).
CMake can't do everything autotools does, but the stuff autotools does which CMake doesn't isn't relevant anymore in today's world.
The fundamental curse of build systems is that they are inherently complex beasts that hardly anybody has to work with full-time, and so hardly anybody learns them to the necessary level of detail.
The only way out of this is to simplify the problem space. Sometimes for real (by reducing the number of operating systems and CPU architectures that are relevant -- e.g. CMake vs. Autotools) and sometimes by somewhat artificially restricting yourself to a specific niche (e.g. Cargo).
It is relevant still, because sometimes you get a vendor system under support contract (can't be upgraded as a whole).
If you only support x64 Linux and at least as new as latest Debian stable, then I don't feel like you should be talking about these things being too complex.
I don't laugh at plumbers for having a van full of obscure tools, when they just needed a wrench to fix my problem.
Binary downloads? Backward compatibility may allow you to run a 5 year old binary on a system from today, but running a new binary on a 5 year old system is not even a goal.
Choke is easy to upgrade on a modern system, maybe. But that defeats the point: you could just upgrade normally then.
Or maybe, maybe an old Linux x86. But if that's all you were trying to support then what was the point of cmake in the first place.
It was a few years ago now, so I don't remember the scenario, but no it was absolutely not easy to install/upgrade cmake.
You complain about support for 90s compilers, but it's really helpful when you're trying to install on something obscure. Almost always autotools just works. Cmake, if it's not a Linux or Mac, good luck.
"Choke" is autocorrect for "CMake". Not any intentional diss.
The comment is too old to edit now.
I've seen programs replicate autotools in their Makefiles. That's actually worse. I've also used the old Visual Studio build tooling.
Autotools is terrible, but it's not the worst.
Configure-and-make is easier for someone else to use; configuring a cmake-based project is slightly harder. In every other conceivable way I agree 100% (until someone can convince me otherwise).
And presumably the measure by which they are judged to be reasonable or not is if they prefer CMake over Autotools, correct? :D
Correct. I avoid autotools and cmake as much as I can; I'd rather write Makefiles by hand. But when I need to deal with them, I prefer cmake. I can modify a CMakeLists.txt in a meaningful way and get the results I want. I wouldn't touch an autotools build system because I was never able to figure out which of the files is the configuration meant to be edited by hand and which are generated by scripts in other files. I tried to dig through the documentation but I never made it.
> CMake is a mess but even that is faaaaar saner than autotools, and probably more popular at this point.
Having done a deep dive into CMake I actually kinda like it (really modern cmake is actually very nice, except the DSL but that probably isn't changing any time soon), but that is also the problem: I had to do a deep dive into learning it.
Someone who doesn't want to understand a huge mess should probably not be bringing it into their project.
In software you sometimes have to have the courage to reject doing what others do, especially if they're only doing it because of others.
This.
Simple projects: just use plain C. This is dwm, the window manager that spawned a thousand forks. No ./configure in sight: <https://git.suckless.org/dwm/files.html>
If you run into platform-specific stuff, just write a ./configure in simple and plain shell: <https://git.suckless.org/utmp/file/configure.html>. Even if you keep adding more stuff, it shouldn't take more than 100ms.
If you're doing something really complex (like say, writing a compiler), take the approach from Plan 9 / Go. Make a conditionally included header file that takes care of platform differences for you. Check the $GOARCH/u.h files here:
<https://go.googlesource.com/go/+/refs/heads/release-branch.g...>
(There are also some simple OS-specific checks: <https://go.googlesource.com/go/+/refs/heads/release-branch.g...>)
This is the reference Go compiler; it can target any platform, from any host (modulo CGO); later versions are also self-hosting and reproducible.
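Roughly, the idea looks like this (an illustrative sketch with hypothetical file names, in the spirit of Plan 9's u.h, not the actual Go headers): the build selects one small per-platform header and everything else includes it.

/* u_linux_amd64.h -- picked by the build (e.g. a symlink or an -include flag) */
typedef signed char        i8;
typedef unsigned char      u8;
typedef int                i32;
typedef unsigned int       u32;
typedef long long          i64;
typedef unsigned long long u64;
typedef unsigned long long uintptr;   /* pointer-sized on this platform */

No "how long is a long" probe needed at build time; the answer is written down once per platform and reviewed by a human.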
I want to agree with you, but as someone who regularly packages software for multiple distributions I really would prefer people using autoconf.
Software with custom configure scripts are especially dreaded amongst packagers.
Why, again, does software in the Linux world have to be packaged for multiple distributions? On the Windows side, if you make an installer for Windows 7, it will still work on Windows 11. And to boot, you don't have to go through some Microsoft-approved package distribution platform and its approval process: you can, of course, but you don't have to; you can distribute your software by yourself.
> Why, again, does software in the Linux world have to be packaged for multiple distributions?
Because a different distribution is a different operating system. Of course, not all distributions are completely different and you don't necessarily need to make a package for any particular distribution at all. Loads of software runs just fine being extracted into a directory somewhere. That said, you absolutely can use packages for older versions of a distribution in later versions of the same distribution in many cases, same as with Windows.
> And to boot, you don't have to go through some Microsoft-approved package distribution platform and its approval process: you can, of course, but you don't have to; you can distribute your software by yourself.
This is the same with any Linux distribution I've ever used. It would be a lot of work for a Linux distribution to force you to use some approved distribution platform even if it wanted to.
As michaelmior has already noted, Linux is not an OS. Anyone is free to take the sources and do as they wish (modulo GPL), which is what a lot of people did. Those people owe you nothing.
But consider FreeBSD. Contrary to Linux, it is a full, standalone operating system, just like Windows or macOS. It has pretty decent compatibility guarantees for each major release (~5 years of support). It also has an even more liberal license (it boils down to "do as you wish but give us credit").
Consider macOS. Apple keeps supporting 7yro hardware with new releases, and even after that keeps the security patches flowing for a while. Yet still, they regularly cull backwards compatibility to keep moving forward (e.g. ending support for 32-bit Intel executables to pave the way for Arm64).
Windows is the outlier here. Microsoft is putting insane amounts of effort into maintaining backwards compatibility, and they are able to do so only because of their unique market position.
> On the Windows side, if you make installer for Windows 7, it will still work on Windows 11.
Do you speak from experience or from anecdotes ?
Interesting that you would bring up Go. Go is probably the most head-desk language of all for writing portable code. Go will fight you the whole way.
Even plain C is easier.
You can have a whole file be for OpenBSD, to work around that some standard library parts have different types on different platforms.
So now you need one file for all platforms and architectures where Timeval.Usec is int32, and another file for where it is int64. And you need to enumerate in your code all GOOS/GOARCH combinations that Go supports or will ever support.
You need a file for Linux 32-bit ARM (int32/int32), one for Linux 64-bit ARM (int64/int64), one for OpenBSD 32-bit ARM (int64/int32), etc.… Maybe you can group them, but this is just one difference, so in the end you'll have to do one file per combination of OS and arch. And all you wanted was a pluggable "what's a Timeval?" Something that all build systems solved a long time ago.
And then maybe the next release of OpenBSD they've changed it, so now you cannot use Go's way to write portable code at all.
So between autotools, cmake, and the Go method, the Go method is by far the worst option for writing portable code.
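For contrast, a sketch of the C pattern being alluded to: the system headers own the typedef (suseconds_t here), so a single source file covers every OS/arch combination and the width difference disappears behind one cast.

#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    /* tv_usec is suseconds_t: 32-bit on some ABIs, 64-bit on others.
       Casting once at the print boundary beats keeping one file per OS/arch. */
    printf("%lld.%06lld\n", (long long)tv.tv_sec, (long long)tv.tv_usec);
    return 0;
}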
I have specifically given an example of u.h defining types such as i32, u64, etc to avoid running a hundred silly tests like "how long is long", "how long is long long", etc.
> So now you need one file for all platforms and architectures where Timeval.Usec is int32, and another file for where it is int64. And you need to enumerate in your code all GOOS/GOARCH combinations that Go supports or will ever support.
I assume you mean [syscall.Timeval]?
$ go doc syscall
[...]
Package syscall contains an interface to the low-level operating system
primitives. The details vary depending on the underlying system [...].
Do you have a specific use case for [syscall], where you cannot use [time]?
Yeah I've had specific use cases when I need to use syscall. I mean... if there weren't use cases for syscall then it wouldn't exist.
But not only is syscall an example of portability done wrong for APIs, as I said it's also an example of it being implemented in a dumb way causing needless work and breakage.
Syscall as implementation leads by bad example because it's the only method Go supports.
Checking for GOARCH+GOOS tuple equality for portable code is a known anti pattern, for reasons I've said and other ones, that Go still decided to go with.
But yeah, autotools scripts often check for way more things than actually matter. Often because people copy paste configure.ac from another project without trimming.
Maybe explain how you would have exposed the raw syscall interface in a high-level, GC'd language with userspace scheduling? Genuinely curious (I'm a bit of a PL nerd)
Well, for one I think it's completely unnecessary, or maybe I should say exceedingly lazy, to expose such a 1:1 mapping. Does syscall.Select need to take a struct that's exactly equal in member types to select(2)?
Who is that for? Someone fuzztesting the kernel? You know what, if you're fuzztesting the kernel then maybe you can implement this yourself, instead of forcing needless unportability onto everyone who is not fuzztesting the kernel.
And when I say exceedingly lazy, I mean the comment in the offending file saying "// THIS FILE IS GENERATED BY THE COMMAND AT THE TOP; DO NOT EDIT".
Of course you could ask why I even need syscall.Select. One example is that I needed to check if a read() would block before reading. Shouldn't I instead use goroutines and a synchronous read? Maybe. Sometimes. But the file descriptor could have come from a library, and the read is in a callback, and leaving a pending read after returning from the callback could be undefined or a race condition.
Ok, so wrap it with os.NewFile, set a read deadline, try to read, then set it back. But "if the file descriptor is in non-blocking mode, NewFile will attempt to return a pollable File (one for which the SetDeadline methods work)". And it seems that NewFile "takes ownership" of the fd, closing it when the finalizer runs.
I guess I could Dup() it first, and handle all the edge cases to prevent fd leaks.
Dude, I just want to call select(). Not rely on if it's in non-blocking mode, and fight os.File.
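For reference ("even plain C is easier"), the check being described is a few lines of portable C. A sketch using poll(2) instead of select(2): same idea, but without the FD_SETSIZE limit.

#include <poll.h>

/* Returns 1 if read(fd, ...) would not block right now, 0 if it would,
   -1 on error. The zero timeout makes this a pure readiness check. */
static int readable_now(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int r = poll(&pfd, 1, 0);
    if (r < 0)
        return -1;
    return (r > 0 && (pfd.revents & (POLLIN | POLLHUP))) ? 1 : 0;
}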
There's a trending post right now for printf implemented on bare metal, and my first thought was "finally, all that autoconf code that checks for printf can handle the use case where it doesn't exist".
> either they are lazy or don't understand them enough to do it themselves.
Meh, I used to keep printed copies of autotools manuals. I sympathize with all of these people and acknowledge they are likely the sane ones.
I've had projects where I spent more time configuring autoconf than actually writing code.
That's what you get for wanting to use a glib function.
It’s always wise to be specific about the sizes you want for your variables. You don’t want your ancient 64-bit code to act differently on your grandkids 128-bit laptops. Unless, of course, you want to let the compiler decide whether to leverage higher precision types that become available after you retire.
Noticed an easter egg in this article. The text below "I'm sorry, but in the year 2025, this is ridiculous:" is animated entirely without Javascript or .gif files. It's pure CSS.
This is how it was done: https://github.com/tavianator/tavianator.com/blob/cf0e4ef26d...
Unfortunately it forgets to HTML-escape the <wchar.h> etc.
I did something like the system described in this article a few years back. [1]
Instead of splitting the "configure" and "make" steps though, I chose to instead fold much of the "configure" step into the "make".
To clarify, this article describes a system where `./configure` runs a bunch of compilations in parallel, then `make` does stuff depending on those compilations.
If one is willing to restrict what the configure step can detect/do to writing header files (rather than setting variables examined/used in a Makefile), then one can have `./configure` generate a `Makefile` (or in my case, a ninja file), and then the "run the compiler to see what defines to set" and "run the compiler to build the executable" steps can both happen in a single `make` or `ninja` invocation.
The simple way here results in _almost_ the same behavior: all the "configure"-like stuff running and then all the "build" stuff running. But if one is a bit more careful/clever and doesn't depend on the entire "config.h" for every "<real source>.c" compilation, then one can start to interleave the work perceived as "configuration" with that seen as "build". (I did not get that fancy)
Nice! I used to do something similar, don't remember exactly why I had to switch but the two step process did become necessary at some point.
Just from a quick peek at that repo, nowadays you can write
#if __has_attribute(cold)
and avoid the configure test entirely. Probably wasn't a thing 10 years ago though :)
The problem is that the various `__has_foo` aren't actually reliable in practice - they don't tell you if the attribute, builtin, include, etc. actually works the way it's supposed to without bugs, or if it includes a particular feature (accepts a new optional argument, or allows new values for an existing argument, etc.).
> #if __has_attribute(cold)
You should use double underscores on attribute names to avoid conflicts with macros (user-defined macros beginning with double underscores are forbidden, as identifiers beginning with double underscores are reserved):
#if __has_attribute(__cold__)
# warning "This works too"
#endif

static void __attribute__((__cold__))
foo(void)
{
    // This works too
}
yep. C's really come a long way with the special operators for checking if attributes exist, if builtins exist, if headers exist, etc.
Covers a very large part of what is needed, making fewer and fewer things need to end up in configure scripts. I think most of what's left is checking for items (types, functions) existence and their shape, as you were doing :). I can dream about getting a nice special operator to check for fields/functions, would let us remove even more from configure time, but I suspect we won't because that requires type resolution and none of the existing special operators do that.
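A sketch of the kind of checks that can now live in the source instead of in a configure script (these operators exist in current GCC and Clang; older compilers need the outer #ifdef guards shown here, and the macro names are placeholders):

#ifdef __has_include
#  if __has_include(<stdatomic.h>)
#    include <stdatomic.h>
#    define HAVE_C11_ATOMICS 1
#  endif
#endif

#ifdef __has_builtin
#  if __has_builtin(__builtin_expect)
#    define likely(x)   __builtin_expect(!!(x), 1)
#    define unlikely(x) __builtin_expect(!!(x), 0)
#  endif
#endif
#ifndef likely
#  define likely(x)   (x)             /* fallback when the builtin is absent */
#  define unlikely(x) (x)
#endif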
You still need a configure step for the "where are my deps" part of it, though both autotools and CMake would be way faster if all they were doing was finding, and not any testing.
True. That isn't something the compiler can do.
That said, that (determining the c flags and ld flags for dependencies) is something that might be able to be mixed into compilation a bit more than it is now. Could imagine that if we annotate which compilation units need a particular system library, we could start building code that doesn't depend on that library while determining the library location/flags (ie: running pkg-config or doing other funny business) at the same time.
Or since we're in the connected era, perhaps we're downloading the library we require if it's not found and building it as an embedded component.
With that type of design, it becomes more clear why moving as much to the build stage (where we can better parallelize work because most of the work is in that stage) and more accurately describing dependencies (so we don't block work that could run sooner) can be helpful in builds.
Doing that type of thing requires a build system that is more flexible though: we really would need to have the pieces of "work" run by the build system be able to add additional work that is scheduled by the build system dynamically. I'm not sure there are many build systems that support this.
Download/build on demand is cute when it works, but it's a security nightmare and a problem for Nix which runs the build in an environment that's cut off from the network.
This is already a problem for getting Bazel builds to run nicely under Nix; the current solution (predownload everything into a single giant "deps" archive in the store and then treat that as a fixed-output derivation with a known hash value) is deeply non-optimal. Basically, I hope that any such schemes have a well-tested fallback path for bubbling the "thing I would download" information outward in case there are reasons to want to separate those steps.
I agree that there are problems when laying multiple build systems on top of one another, and I see that often as a user of nix (it's also bad with rust projects that use cargo, and though there are a variety of options folks have written they all have tradeoffs).
To some extent, the issue here is caused by just what I was discussing above: Nix derivations can't dynamically add additional derivations (ie: build steps not being able to dynamically add additional build steps makes things non-optimal).
I am hopeful that Nix's work on dynamic derivations will improve the situation for nix (with respect to bazel, cargo, and others) over time, and I am hopeful that other build systems will recognize how useful dynamically adding build steps can be.
It's true— fundamentally, nothing about a build realizing partway through that it needs more stuff breaks the Nix philosophy, assuming the build is holding a hash for whatever it is it wants to pull so that everything stays hermetic. It's a bit annoying to not know upfront exactly what your build graph looks like but honestly it's not the worst— like, you already don't know how long each derivation is going to take.
In any case, the tvix devs have definitely understood the assignment on this and are making not only ifd a first class citizen, but also the larger issue of allowing the evaluation step to decompose, and for the decomposed pieces to run in parallel with each other and with builds— and that really is the game-changer, particularly with a cluster-backed build, to be able to start work immediately rather than waiting out a 30-60 second single-threaded eval.
GNU Parallel seems like another convenient approach.
It has no concept of dependencies between tasks, or doing a topological sort prior to running the task queue. GNU Make's parallel mode (-j) has that.
I've spent a fair amount of time over the past decades to make autotools work on my projects, and I've never felt like it was a good use of time.
It's likely that C will continue to be used by everyone for decades to come, but I know that I'll personally never start a new project in C again.
I'm still glad that there's some sort of push to make autotools suck less for legacy projects.
You can use make without configure. If needed, you can also write your own configure instead of using auto tools.
Creating a Makefile is about 10 lines, and it's the lowest-friction way for me to get programming in any environment. Familiarity is part of that.
It's a bit of a balance once you get bigger dependencies. A generic autoconf is annoying to write, but rarely an issue when packaging for a distro. Most issues I've had to fix in nixpkgs were for custom builds unfortunately.
But if you don't plan to distribute things widely (or have no deps).. Whatever, just do what works for you.
Write your own configure? For an internal project, where much is under your own control, sure. But for the thousands of projects trying to be multi-platform and/or support many flavours/versions: oh gosh.
It depends on how much platform specific stuff you are trying to use. Also in 2025 most packages are tailored for the operating system by packagers - not the original authors.
Autotools is going to check every config from the past 50 years.
>Also in 2025 most packages are tailored for the operating system by packagers - not the original authors.
No? Most operating systems don't have a separate packager. They have the developer package the application.
Yes? Each operating system is very different and almost every package has patches or separate install scripts.
To extend on sibling comments:
autoconf is in no way, shape or form an "official" build system associated with C. It is a GNU creation and certainly popular, but not to a "monopoly" degree, and its share is declining. (Plain make, meson and cmake are popular alternatives.)
I've stopped using autotools for new projects. Just a Makefile, and the -j flag for concurrency.
cmake ftw
CMake also does sequential configuration AFAIK. Is there any work to improve on that somewhere?
Meson and cmake in my experience are both MUCH faster though. It’s much less of an issue with these systems than with autotools.
Just tried reconfiguring LLVM:
27.24s user 8.71s system 99% cpu 36.218 total
Admittedly the LLVM build time dwarfs the configuration time, but still. If you're only building a smaller component then the config time dominates:
ninja libc  268.82s user 26.16s system 3246% cpu 9.086 total
You mean cargo build
... can cargo build things that aren't rust? If yes, that's really cool. If no, then it's not really in the same problem domain.
No it can't.
It can build a Rust program (build.rs) which builds things that aren't Rust, but that's an entirely different use case (building non-Rust library to use inside of Rust programs).
There's GprBuild (an Ada tool) that can build C (not sure about C++). It also has a more elaborate configuration structure, but I haven't used it extensively enough to say exactly what it does and how. In combination with Alire it can also manage dependencies Cargo-style.
cmake uses configure, or configure-like too!
Same concept, but completely different implementation.
Still slow, even if it uses multiple processes: lots of "discovery" of things that are only on your machine, or whatever the CI machine happened to have. Instead it should be "you should expect this, that and that", and if that turns out to be wrong, the build simply fails.
Discovery is the wrong thing to do nowadays.
I don't think I'd want a "fail late with a possibly unclear error" build system. There is also the problem that finding the path of dependencies happens at the same time as finding if you have them, and removing one without the other doesn't seem to be very useful.
At best, I think you could have a system that defers some / most dependency discovery until after configure time, but still aborts the build with "required libfoo >= 0.81.0 not found" if necessary.
And no, you are not going to be able to tell everyone exactly where everything needs to be installed unless it's an internal product.
Sure, to each their own needs. In my case I want reproducible builds, but to start with I first want a build that at least passes the same compiler flags around and links the same things no matter what the computer is (it can still differ when it comes to OS).
And on macOS, the notarization checks for all the conftest binaries generated by configure add even more latency. Apple reneged on their former promise to give an opt-out for this.
Very nice! I always get annoyed when my fancy 16 thread CPU is left barely used as one thread is burning away with the rest sitting and waiting. Bookmarking this for later to play around with whatever projects I use that still use configure.
Also, I was surprised when the animated text at the top of the article wasn't a gif, but actual text. So cool!
Autoconf can use cache files [1], which can greatly speed up repeated configures. With cache, a test is run at most once.
[1] https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/a...
Sadly the cache files don’t record enough about the environment to be usable if you change configure options. They are generally unreliable.
CMake also needs this, badly...
Agreed! The CMake Xcode generator is extremely slow because not only is it running the configure tests sequentially, but it generates a new Xcode project for each of them.
I get the impression configure not only runs sequentially, but incrementally, where previous results can change the results of tests run later. Were it just sequential, running multiple tests as separate processes would be relatively simple.
Also, you shouldn’t need to run ./configure every time you run make.
No, but if you are doing something like rebuilding a distro's worth of packages from source from scratch, the configure step starts to dominate. I build around 550, and it takes around 6 hours on a single node.
Most checks are common, so what can help is having a shared cache for all configure scripts so if you have 400 packages to rebuild, it doesn't check 400 times if you should use flock or fcntl. This approach is described here: https://jmmv.dev/2022/06/autoconf-caching.html
It doesn't help that autoconf is basically abandonware, with one forlorn maintainer trying to resuscitate it, but creating major regressions with new releases: https://lwn.net/Articles/834682/
> It doesn't help that autoconf is basically abandonware
A far too common tragedy of our age.
I don't disagree with that general premise but IMO autotools being (gradually?) abandoned is logical. It served its purpose. Not saying it's still not very useful in the darker shadows of technology but for a lot of stuff people choose Zig, Rust, Golang etc. today, with fairly good reasons too, and those PLs usually have fairly good packaging and dependency management and building subsystems built-in.
Furthermore, there really has to be a better way to do what autotools is doing, no? Sure, there are some situations where you only have some bare sh shell and nothing much else but I'd venture to say that in no less than 90% of all cases you can very easily have much more stuff installed -- like the `just` task runner tool, for example, that solves most of the problems that `make` usually did.
If we are talking in terms of our age, we also have to take into account that there's too much software everywhere! I believe some convergence has to start happening. There is such a thing as too much freedom. We are dispersing so much creative energy for almost no benefit of humankind...
>The purpose of a ./configure script is basically to run the compiler a bunch of times and check which runs succeeded.
Wait is this true? (!)
Historically, different Unixes varied a lot more than they do today. Say you want your program to use the C library function foo on platforms where it’s available and the function bar where it isn’t: You can write both versions and choose between them based on a C preprocessor macro, and the program will use the best option available for the platform where it was compiled.
But now the user has to set the preprocessor macro appropriately when he builds your program. Nobody wants to give the user a pop quiz on the intricacies of his C library every time he goes to install new software. So instead the developer writes a shell script that tries to compile a trivial program that uses function foo. If the script succeeds, it defines the preprocessor macro FOO_AVAILABLE, and the program will use foo; if it fails, it doesn’t define that macro, and the program will fall back to bar.
That shell script grew into configure. A configure script for an old and widely ported piece of software can check for a lot of platform features.
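Spelled out as a sketch (foo, bar and FOO_AVAILABLE are the placeholder names from the comment above, not anything real):

/* conftest.c -- the throwaway probe configure compiles and links.
   If this builds, configure writes "#define FOO_AVAILABLE 1" into config.h. */
char foo(void);                       /* deliberately vague prototype; we only care whether it links */
int main(void) { return (int)foo(); }

/* In the program itself, the macro decides which call gets compiled: */
#include "config.h"
#ifdef FOO_AVAILABLE
#  define frob(x) foo(x)
#else
#  define frob(x) bar(x)              /* portable fallback path */
#endif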
I'm not saying we should send everyone a docker container with a full copy of ubuntu, electron and foo.js whether they have foo in their c library or not, but maybe there is a middle ground?
I think this is a gigantic point in favor of interpreted languages.
JS and Python wouldn't be what they are today if you had to `./configure` every website you want to visit, lmao.
> JS and Python wouldn't be what they are today if you had to `./configure` every website you want to visit, lmao.
You just gave me a flashback to the IE6 days. Yes, that's precisely what we did. On every page load.
It's called "feature detection", and was the recommended way of doing things (the bad alternative was user agent sniffing, in which you read the user agent string to guess the browser, and then assumed that browser X always had feature Y; the worst alternative was to simply require browser X).
The closer and deeper you look into the C toolchains the more grossed out you’ll be
Hands have to get dirty somewhere. "As deep as The Worker's City lay underground, so high above towered the City of Metropolis."
The choices are:
1. Restrict the freedom of CPU designers to some approximation of the PDP11. No funky DSP chips. No crazy vector processors.
2. Restrict the freedom of OS designers to some approximation of Unix. No bespoke realtime OSes. No research OSes.
3. Insist programmers use a new programming language for these chips and OSes. (This was the case prior to C and Unix.)
4. Insist programmers write in assembly and/or machine code. Perhaps a macro-assembler is acceptable here, but this is inching toward C.
The cost of this flexibility is gross tooling to make it manageable. Can it be done without years and years of accrued M4 and sh? Perhaps, but that's just CMake and CMake is nowhere near as capable as Autotools & friends are when working with legacy platforms.
There is no real technical justification for the absolute shit show that is the modern C toolchain
Technical? No. Social? Yes.
Fixing it would require an unprecedented level of cooperation across multiple industries.
I’d argue there isn’t a social justification for the shit show, but rather just social reasons
The justification emerges when you start thinking about how to build a true "package management" system for C libraries which embraces the fact that a world exists beyond GNU/Linux on x86_64.
At some point you realize that not even the (in)famous "host triple" has sufficient coverage for the fractal complexity of the real world.
C and its toolchain are where so many of these tedious little differences are managed. Making tools which account only for the 80% most common use-case defeats the purpose.
How would one encode e.g. "this library targets only the XYZ CPU in Q mode for firmware v5-v9 and it compiles only with GCC v2.95"?
The tools in use need to be flexible enough to handle cases like this with grace or you just end up reinventing GNU Autotools and/or managing your build with a hairy shell script of your own making.
The GNU Autotools system organically grew from dealing with the rapidly-changing hardware and software landscape of the 1980s-1990s. DIY shell scripts were organized, macros were written, and a system was born.
If embedded folk have to start writing their own scripts to handle the inevitable edge cases which WILL come up, then what is a new build tool really accomplishing-- Autotools, but in Python?
That is not a _justification_ - merely an _explanation_ of how we got here. I fully agree that a historical understanding of how we got to where we are is essential for understanding why we have what we have. However, it's critical to not conflate the history and evolution as a justification for why the current system sucks so much! I agree there is a massive amount of complexity and branching to consider, but autotools and the whole C ecosystem does a terrible job of tackling that complexity, and introduces a huge amount of accidental complexity into the solution space.
The co-evolution of hardware, software, and all the other moving targets has landed us in a fairly abysmal local maximum. More recently developed toolchains (e.g. Zig, Rust, etc.) show us that there are much better ways to tackle these problems. Of course they introduce other ones, but we can do so much better.
I like C/C++ a lot, A LOT, and I agree with your comment.
Man, if this got fixed it would be one of the best languages to develop for.
My wishlist:
* Quick compilation times (obv.) or some sort of tool that makes it feel like an interpreted language, at least when you're developing, then do the usual compile step to get an optimized binary.
* A F...... CLEAR AND CONSISTENT WAY TO TELL THE TOOLCHAIN THIS LIBRARY IS HERE AND THIS ONE IS OVER THERE (sorry but, come on ...).
* A single command line argument to output a static binary.
* Anything that gets us closer to the "build-once run-anywhere" philosophy of "Cosmopolitan Libc". Even if an intermediate execution layer is needed. One could say, "oh, but this is C, not Java", but it is already de facto a broken Java, because you still need an execution layer, call it stdlib, GLIB, whatever, if those shared libraries are not on your system with their exact version matching, your program breaks ... Just stop pretending and ship the "C virtual machine", lmao.
I've implemented a configuration caching mechanism for myself (in one important project) which stores configuration artifacts in a cache directory, associated by the commit hash. It works as a git hook:
$ git bisect good
Bisecting: 7 revisions left to test after this (roughly 3 steps)
restored cached configuration for 2f8679c346a88c07b81ea8e9854c71dae2ade167
[2f8679c346a88c07b81ea8e9854c71dae2ade167] expander: noexpand mechanism.
The "restored cached configuration" message is from the git hook. What it's not saying is that it also saved the config for the commit it is navigating away from.I primed the cache by executing a "git checkout" for each of a range of commits.
Going forward, it will populate itself.
This is the only issue I would conceivably care about with regard to configure performance. When not navigating in git history, I do not often run configure.
Downstream distros do not care; they keep their machines and cores busy by building multiple packages in parallel.
It's not ideal because the cache from one host is not applicable to another; you can't port it. I could write an intelligent script to populate it, which basically identifies commits (within some specified range) that have touched the config system, and then assumes that for all in-between commits, it's the same.
The hook could do this. When it notices that the current sha doesn't have a cached configuration, it could search backwards through history for the most recent commit which does have it. If the configure script (or something influencing it) has not been touched since that commit, then its cached material can be populated for all in-between commits right through the current one. That would take care of large swaths of commits in a typical bisect session.
The right way to do this is not to rely on the git hashes, but to hash the inputs into the configuration system (those that are in version control, not the implicit environmental inputs from the platform).
For instance, if the only input to the configuration system is the body of the configure script, then we hash that. That is then our key to the generated materials.
On the topic* of having 24 cores and wanting to put them to work: when I were a lad the promise was that pure functional programming would trivially allow for parallel execution of functions. Has this future ever materialized in a modern language / runtime?
x = 2 + 2
y = 2 * 2
z = f(x, y)
print(z)
…where x and y evaluate in parallel without me having to do anything. Clojure, perhaps?
*And superficially off the topic of this thread, but possibly not.
Superscalar processors (which include all mainstream ones these days) do this within a single core, provided there are no data dependencies between the assignment statements. They have multiple arithmetic logic units, and they can start a second operation while the first is executing.
But yeah, I agree that we were promised a lot more automatic multithreading than we got. History has proven that we should be wary of any promises that depend on a Sufficiently Smart Compiler.
Eh, in this case not splitting them up to compute them in parallel is the smartest thing to do. Locking overhead alone is going to dwarf every other cost involved in that computation.
Yeah, I think the dream was more like, “The compiler looks at a map or filter operation and figures out whether it’s worth the overhead to parallelize it automatically.” And that turns out to be pretty hard, with potentially painful (and nondeterministic!) consequences for failure.
Maybe it would have been easier if CPU performance didn’t end up outstripping memory performance so much, or if cache coherency between cores weren’t so difficult.
I think it has shaken out the way it has, is because compile time optimizations to this extent require knowing runtime constraints/data at compile time. Which for non-trivial situations is impossible, as the code will be run with too many different types of input data, with too many different cache sizes, etc.
The CPU has better visibility into the actual runtime situation, so can do runtime optimization better.
In some ways, it’s like a bytecode/JVM type situation.
If we can write code to dispatch different code paths (like has been used for decades for SSE, later AVX support within one binary), then we can write code to parallelize large array execution based on heuristics. Not much different from busy spins falling back to sleep/other mechanisms when the fast path fails after ca. 100-1000 attempts to secure a lock.
For the trivial example of 2+2 like above, of course, this is a moot discussion. The commenter should've led with a better example.
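A sketch of that kind of size-based dispatch in plain C (the threshold is made up and machine-dependent; compile with -fopenmp or the pragma is simply ignored):

#include <stddef.h>

#define PAR_THRESHOLD 100000          /* hypothetical cut-over point */

void scale(double *a, size_t n, double k)
{
    if (n >= PAR_THRESHOLD) {
        /* Large array: fan the loop out across cores. */
        #pragma omp parallel for
        for (size_t i = 0; i < n; i++)
            a[i] *= k;
    } else {
        /* Small array: thread start-up would cost more than the loop itself. */
        for (size_t i = 0; i < n; i++)
            a[i] *= k;
    }
}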
Sure, but it’s a rare situation (by code path) where it will beat the CPU’s auto optimization, eh?
And when that happens, almost always the developer knows it is that type of situation and will want to tune things themselves anyway.
What kind of CPU auto-optimization? Here specifically I envisioned a macro-level optimization, when an array is detected to have length on the order of thousands/tens of thousands. I guess some advanced sorting algorithms do extend their operation to multi-thread in such cases.
For CPU machine code it's the compilers doing the hard work of reordering code to allow ILP (instruction-level parallelism), eliminate false dependencies, inlining and vectorization; whatever else it takes to keep the pipeline filled and busy.
I'd love for the sentiment "the dev knows" to be true, but I think this is no longer the case. Maybe if you are in a low-level language AND have time to reason about it? Add to this the reserved smile when I see someone "benchmarking" their piece of code in a "for i to 100000" loop, without other considerations. Next, suppose a high-level language project: the most straightforward optimization for new code is to apply proper algorithms and fitting data structures. And I think this is too much to ask nowadays, because it takes time, effort, and even knowing that such options exist in the first place.
Spawning threads or using a thread pool implicitly would be pretty bad - it would be difficult to reason about performance if the compiler was to make these choices for you.
I think you’re fixating on the very specific example. Imagine if instead of 2 + 2 it was multiplying arrays of large matrices. The compiler or runtime would be smart enough to figure out if it’s worth dispatching the parallelism or not for you. Basically auto vectorisation but for parallelism
Notably - in most cases, there is no way the compiler can know which of these scenarios are going to happen at compile time.
At runtime, the CPU can figure it out though, eh?
I mean, theoretically it's possible. A super basic example would be if the data is known at compile time, it could be auto-parallelized, e.g.
int buf_size = 10000000;
auto vec = make_large_array(buf_size);
for (const auto& val : vec)
{
do_expensive_thing(val);
}
this could clearly be parallelised; in the C++ world that doesn't exist, but we can see that it's valid. If I replace it with

int buf_size = 10000000;
cin >> buf_size;
auto vec = make_large_array(buf_size);
for (const auto& val : vec)
{
    do_expensive_thing(val);
}

the compiler could generate some code that looks like:

if buf_size >= SOME_LARGE_THRESHOLD { DO_IN_PARALLEL } else { DO_SERIAL }

With some background logic for managing threads, etc. In a C++-style world where "control" is important it likely wouldn't fly, but if this was Python...

arr_size = 10000000
buf = [None] * arr_size
for x in buf:
    do_expensive_thing(x)

could be parallelised at compile time.
Which no one really does (data is generally provided at runtime). Which is why ‘super smart’ compilers kinda went nowhere, eh?
I dunno. I was promised the same things when I started programming and it never materialised.
It doesn’t matter what people do or don’t do because this is a hypothetical feature of a hypothetical language that doesn’t exist.