The .a file is a relic: Why static archives were a bad idea all along

medium.com

71 points by eyalitki 4 days ago


TuxSH - 19 hours ago

> This design decision at the source level, means that in our linked binary we might not have the logic for the 3DES building block, but we would still have unused decryption functions for AES256.

Do people really not know about `-ffunction-sections -fdata-sections` & `-Wl,--gc-sections` (doesn't require LTO)? Why is it used so little when doing statically-linked builds?
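For reference, those flags put each function and data object in its own section so the linker can drop anything unreferenced. A minimal sketch (function names are illustrative; build lines assume GCC/binutils):

```c
/* Build sketch (assumed GCC/binutils):
 *   gcc -ffunction-sections -fdata-sections -c crypto.c
 *   gcc main.o crypto.o -Wl,--gc-sections -o prog
 * With per-function sections, any function nothing references
 * (here des3_decrypt) is garbage-collected at link time.
 */
int aes256_decrypt(int x) { return x ^ 0x42; } /* referenced somewhere: kept */
int des3_decrypt(int x)   { return x ^ 0x13; } /* unreferenced: section dropped */
```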

> Let’s say someone in our library designed the following logging module: (...)

Relying on static initialization order, and on runtime static initialization at all, is never a good idea IMHO

amiga386 - 19 hours ago

> Yet, what if the logger’s ctor function is implemented in a different object file?

This is a contrived example akin to "what if I only know the name of the function at runtime and have to dlsym()"?

Have a macro that "enables use of" the logger, which the API user must place in global scope so it can emit "extern ctor_name;". Or have library-specific additions to LDFLAGS, such as --undefined=ctor_name
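A sketch of that macro workaround, with hypothetical names (the ctor is stubbed out here; in practice it would be defined in the library's logger object file):

```c
#include <stdio.h>

/* stand-in for the constructor defined in the library's logger.o */
void mylib_logger_ctor(void) { puts("logger enabled"); }

/* The user places this at global scope; the reference forces the
   linker to pull in the object file defining mylib_logger_ctor. */
#define MYLIB_ENABLE_LOGGER() \
    void (*mylib_logger_ctor_ref)(void) = mylib_logger_ctor

MYLIB_ENABLE_LOGGER();
```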

There are workarounds for this niche case, and it doesn't add up to ".a files were a bad idea", that's just clickbait. You'll appreciate static linkage more on the day after your program survives a dynamic linker exploit

> Every non-static function in the SDK is suddenly a possible cause of naming conflict

Has this person never written a C library before? Step 1: make all globals/functions static unless they're for export. Step 2: give all exported symbols and public header definitions a prefix, like "mylibname_", because linkage has a global namespace. C++ namespaces are just a formalisation of this

kazinator - 16 hours ago

.a archives can speed up linking of very large software. This is because of assumptions as to the dependencies and the way the traditional Unix-style linker deals with .a files (by default).

When a bunch of .o files are presented to the linker, it has to consider references in every direction. The last .o file could have references to the first one, and the reverse could be true.

This is not so for .a files. Every successive .a archive presented on the linker command line in left-to-right order is assumed to satisfy references only in material to the left of it. There cannot be circular dependencies among .a files and they have to be presented in topologically sorted order. If libfoo.a depends on libbar.a then libfoo.a must be first, then libbar.a.

(The GNU Linker has options to override this: you can demarcate a sequence of archives as a group in which mutual references are considered.)
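A sketch of the ordering rule, assuming foo_fn lives in libfoo.a and bar_fn in libbar.a (names and build lines are illustrative, GNU toolchain):

```c
/* Link order matters for archives:
 *   gcc main.o libfoo.a libbar.a -o prog   # ok: libfoo's needs met to its right
 *   gcc main.o libbar.a libfoo.a -o prog   # undefined reference to bar_fn
 * GNU ld's escape hatch for mutual references:
 *   gcc main.o -Wl,--start-group libfoo.a libbar.a -Wl,--end-group -o prog
 */
int bar_fn(void) { return 7; }        /* would live in libbar.a */
int foo_fn(void) { return bar_fn(); } /* would live in libfoo.a */
```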

This property of archives (or of the way they are treated by linking) is useful enough that at some point when the Linux kernel reached a certain size and complexity, its build was broken into archive files. This reduced the memory and time needed for linking it.

Before that, Linux was linked as a list of .o files, same as most programs.

rixed - 18 hours ago

Do people who write these kinds of pieces, with such peremptory titles, really believe they've finally come to understand everything better after decades of collective ignorance?

Chesterton’s Fence yada yada?

EE84M3i - 19 hours ago

Something I've never quite understood is why you can't statically link against an .so file. What specific information was lost during the linking phase that created the shared object that prevents that machine code from being placed into a PIE executable?

tux3 - 19 hours ago

I actually wrote a tool to fix exactly this asymmetry between dynamic libraries (a single object file) and static libraries (actually a bag of loose objects).

I never really advertised it, but what it does is take all the objects inside your static library, and tells the linker to make a static library that contains a single merged object.

https://github.com/tux3/armerge

The huge advantage is that with a single object, everything works just like it would for a dynamic library. You can keep a set of public symbols and hide your private symbols, so you don't have pollution issues.

Objects that aren't needed by any public symbol (recursively) are discarded properly, so unlike --whole-archive you still get the size benefits of static linking.

And all your users don't need to handle anything new or to know about a new format, at the end of the day you still just ship a regular .a static library. It just happens to contain a single object.

I think the article's suggestion of a new ET_STAT is a good idea, actually. But in the meantime the closest to that is probably to use ET_REL, a single relocatable object in a traditional ar archive.

cryptonector - 15 hours ago

It's not that .a files and static linking are a relic, but that static linking never evolved like dynamic linking did. Static linking is stuck with 1978 semantics, while dynamic linking has grown features that prevent the mess that static linking made. There are legit reasons for wanting static linking in 2025, so we really ought to evolve static linking like we did dynamic linking.

Namely we should:

  - make the -l and -rpath options do
    something during .a generation:
    record that metadata in the .a

  - make link-edits use the metadata
    recorded in .a files by the
    previous item

I.e., start recording dependency metadata in .a files so that we can stop flattening dependency trees onto the final link-edit.

This will allow static linking to have the same symbol conflict resolution behaviors as dynamic linking.

dzaima - 19 hours ago

How possible would it be to have a utility that merges multiple .o files (or equivalently a .a file) into one .o file, via changing all hidden symbols to local ones (i.e. alike C's "static")? Would solve the private symbols leaking out, and give a single object file that's guaranteed to link as a whole. Or would that break too many assumptions made by other things?

stabbles - 19 hours ago

Much of the dynamic section of shared libraries could just be translated to a metadata file as part of a static library. It's not a breaking change: the linker skips files in archives that are not object files.

binutils implemented this with `libdep`, it's just that it's done poorly. You can put a few flags like `-L /foo -lbar` in a file `__.LIBDEP` as part of your static library, and the linker will use this to resolve dependencies of static archives when linking (i.e. extend the link line). This is much like DT_RPATH and DT_NEEDED in shared libraries.

It's just that it feels a bit half-baked. With dynamic linking, symbols are resolved and dependencies recorded as you create the shared object. That's not the case when creating static libraries.

But even if tooling for static libraries with the equivalent of DT_RPATH and DT_NEEDED were improved, static archives would still have the limitations mentioned in the article, in particular those related to symbol visibility.

kazinator - 13 hours ago

> Yet, what if the logger’s ctor function is implemented in a different object file? Well, tough luck. No one requested this file, and the linker will never know it needs to link it to our static program. The result? crash at runtime.

If you have spontaneously called initialization functions as part of an initialization system, then you need to ensure that the symbols are referenced somehow. For instance, a linker script which puts them into a table that is in its own section. Some start-up code walks through the table and calls the functions.

This problem has been solved; take a look at how U-boot and similar projects do it.
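A minimal sketch of such a registration table, assuming GNU ld's automatically generated __start_/__stop_ symbols for custom sections (U-Boot and the Linux kernel achieve the same thing with explicit linker scripts):

```c
#include <stdio.h>

typedef void (*initcall_t)(void);

/* Place a pointer to each init function in a dedicated section; the
   "used" attribute keeps the entry alive even though nothing
   references it by name. */
#define REGISTER_INIT(fn) \
    static initcall_t fn##_entry __attribute__((used, section("initcalls"))) = fn

static void logger_init(void) { puts("logger ready"); }
REGISTER_INIT(logger_init);

/* GNU ld synthesizes these for sections named like C identifiers. */
extern initcall_t __start_initcalls[], __stop_initcalls[];

void run_initcalls(void) {
    for (initcall_t *p = __start_initcalls; p < __stop_initcalls; ++p)
        (*p)();
}
```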

This is not an archive problem because the linker will remove unused .o files even if you give it nothing but a list of .o files on the command line, no archives at all.

dale_glass - 16 hours ago

Oh, static linking can be lots of "fun". I ran into this interesting issue once.

1. We have libshared. It's got logging and other general stuff. libshared has static "Foo foo;" somewhere.

2. We link libshared into libfoo and libbar.

3. libfoo and libbar then go into application.

If you do this statically, what happens is that the Foo constructor gets invoked twice, once from libfoo and once from libbar. And also gets destroyed twice.

flohofwoe - 14 hours ago

Library files are not the problem, deploying an SDK as precompiled binary blobs is ;)

(I bet that .a/.lib files were originally never really meant for software distribution, but only as intermediate file format between a compiler and linker, both running as part of the same build process)

jhallenworld - 13 hours ago

On the private symbol issue... there is probably a solution to this already. You can partially link a bunch of object files into a single object file (see ld -r). After this is done, 'strip' the file except for those symbols marked with non-hidden visibility- I've not tried to do this, maybe 'strip -x' does the right thing? Not sure.
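GNU objcopy's --localize-hidden appears to be the right tool for that last step, so a sketch of the whole flow might look like this (commands assumed, GNU binutils; the visibility attributes below are what --localize-hidden keys off):

```c
/* Sketch (assumed GNU binutils):
 *   ld -r a.o b.o -o merged.o            # partial link into one object
 *   objcopy --localize-hidden merged.o   # hidden symbols become local,
 *                                        # like C's "static"
 */
__attribute__((visibility("hidden")))
int helper(void) { return 1; }            /* private after objcopy */

__attribute__((visibility("default")))
int mylib_api(void) { return helper(); }  /* stays exported */
```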

SanjayMehta - an hour ago

Unix originated on the PDP-11, a machine with very limited memory and disk space. At that time, this was not only the right solution, it was probably the only solution.

Calling it “a bad idea all along” is undeserved.

layer8 - 16 hours ago

> Something like a “Static Bundle Object” (.sbo) file, that will be closer to a Shared Object (.so) file, than to the existing Static Archive (.a) file.

Is there something missing from .so files that wouldn’t allow them to be used as a basis for static linking? Ideally, you’d only distribute one version of the library that third parties can decide to either link statically or dynamically.

harryvederci - 19 hours ago

Minor suggestion: the article refers to a RHEL 6 developer guide section about static linking. Maybe a more recent article can be used (if their viewpoint hasn't changed).

parpfish - 15 hours ago

relic isn't the right word.

relics are really old things that are revered and honored.

i think they just want archaic which are old things that are likely obsolete

benreesman - 19 hours ago

It is unclear to me what the author's point is. It seems to center on the example of DPDK being difficult to link (and it is a bear, I've done it recently).

But it's full of strawmen and falsehoods, the most notable being the claims about the deficiencies of pkg-config. pkg-config works great, it is just very rarely produced correctly by CMake.

I have tooling and a growing set of libraries that I'll probably open source at some point for producing correct pkg-config from packages that only do lazy CMake. It's glorious. Want abseil? -labsl.

Static libraries have lots of game-changing advantages, but performance, security, and portability are the biggest ones.

People with the will and/or resources (FAANGs, HFT) would laugh in your face if you proposed DLL hell as standard operating procedure. That shit is for the plebs.

It's like symbol stripping: do you think maintainers trip an assert and see a wall of inscrutable hex? They do not.

Vendors like things good for vendors. They market these things as being good for users.

uecker - 13 hours ago

Isn't this what partial linking is for, combining object files into a larger one?

high_na_euv - 18 hours ago

.so .o .a .pc holy shit, what a mess

Why are things that are solved in other programming ecosystems, like a sane build system, impossible in the C/C++ world?