Contracts for C
gustedt.wordpress.com
107 points by joexbayer 9 days ago
There's a bit of an impedance mismatch with Contracts in C because C++ contracts exist partially to rectify the fact that <cassert> is broken in C++.
Let's say you have a header lib.h:
inline int foo(int i) {
assert(i > 0);
//...
}
In C, this function is unspecified behavior that will probably work if the compiler is remotely sane. In C++, including this in two C++ translation units that set the NDEBUG flag differently creates an ODR violation. The C++ solution to this problem was a system where each translation unit enforces its own pre- and postconditions (potentially 4x evaluations), and contracts act as carefully crafted exceptions to the vast number of complicated rules added on top. An example is how observable behavior is a workaround for C++ refusing to adopt C's fix for time-traveling UB. Lisa Lippincott did a great talk on this last year: https://youtu.be/yhhSW-FSWkE
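To make the NDEBUG point concrete, here is a minimal single-file sketch of what two translation units effectively see when they include the same header with different settings. The macros CHECK_DEBUG and CHECK_RELEASE and both function names are hypothetical stand-ins for the two expansions of assert; a real build would have one name, foo, with two different bodies, which is the situation the comment describes.

```c
#include <assert.h> /* the real assert; the macros below imitate its two modes */

static int failed_checks = 0;

/* Hypothetical stand-in for assert() without NDEBUG: records a failure
   instead of aborting, so the difference can be observed. */
#define CHECK_DEBUG(cond) ((cond) ? (void)0 : (void)++failed_checks)

/* Hypothetical stand-in for assert() with NDEBUG: the condition vanishes. */
#define CHECK_RELEASE(cond) ((void)0)

/* foo as a translation unit compiled without NDEBUG sees it. */
static int foo_seen_by_debug_tu(int i) {
    CHECK_DEBUG(i > 0);
    return i * 2;
}

/* foo as a translation unit compiled with -DNDEBUG sees it. */
static int foo_seen_by_release_tu(int i) {
    CHECK_RELEASE(i > 0);
    return i * 2;
}
```

Calling both with -1 returns the same value, but only the debug variant notices the violated condition. Which body ends up in the final program is up to the toolchain.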
There's not much left of Contracts once you strip away the stuff that doesn't make sense in C. I don't think you'd miss anything by simply adding hygienic macros to assert.h as the author here does, except for the 4x caller/callee verification overhead that they enforce manually. I don't think that should be enforced in the standard though. I find hidden, multiple evaluation wildly unintuitive, especially if some silly programmer accidentally writes an effectful condition.
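The effectful-condition worry can be sketched as follows. This is only an illustration under assumptions: CHECK, FOO_PRE, read_sensor and the foo wrapper macro are hypothetical names, not the article's macros, and only the precondition half (2 of the 4 possible evaluations) is modelled.

```c
#include <assert.h>

static int violations = 0;
static int sensor_reads = 0;

/* Effectful helper: every call changes program state. */
static int read_sensor(void) {
    ++sensor_reads;
    return 42;
}

/* Hypothetical check macro: counts violations instead of aborting. */
#define CHECK(cond) ((cond) ? (void)0 : (void)++violations)

/* The precondition of foo, accidentally written with a side effect. */
#define FOO_PRE(i) ((i) > 0 && read_sensor() == 42)

/* Callee-side check. */
static int foo_callee(int i) {
    CHECK(FOO_PRE(i));
    return i + 1;
}

/* Caller-side check wrapped around the call. */
#define foo(i) (CHECK(FOO_PRE(i)), foo_callee(i))
```

A single call foo(1) reads the sensor twice, once per side, even though the source shows one call and one condition.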
I think the main point of pre- and postconditions is that the compiler can see them and prove that they match and will never be triggered. There will probably be a compiler flag for outputting all non-proved pre/postconditions; there is already -fanalyzer.
I think these conditions should be part of the type signature, different to what was suggested in the otherwise good talk you cited.
Adding conditions to the type signature would be an ABI breaking change in C++ and have nasty interactions with templates.
In general though, the compiler can't optimize across the translation unit boundary without something like LTO. The code for the callee might have already been generated by the time the caller sees that the precondition is statically satisfied.
My suggestion was for C where types are for example not encoded in the name, so I thought it only matters for type checking and optimization.
> In general though, the compiler can't optimize across the translation unit boundary
Which is why I would put it in the function signature, so it is available in both translation units. Making the code match the function signature is currently generally the responsibility of the caller. For example, when I declare an argument of type double and write an integer in the call, the compiler will convert it to a double on the caller's side. I think the safety story will be similar to a printf call today. A dumb compiler does nothing; the smart compiler adds a warning/error when the precondition fails.
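The double example mentioned above looks like this in today's C. It is a small sketch with hypothetical function names; the point is only that the conversion is generated in the caller, driven by the prototype it can see.

```c
#include <assert.h>

/* The prototype is all the caller sees. */
double half(double x);

/* The int literal 3 is converted to 3.0 here, at the call site. */
static double call_with_int(void) {
    return half(3);
}

/* The callee only ever receives a double. */
double half(double x) {
    return x / 2.0;
}
```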
My understanding is that on the callee's side this case is simply undefined behaviour. It is much like today, where you can't pass NULL everywhere: doing so may be declared UB, but currently this is only documented and internal to the callee, not expressed in the function signature.
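The closest existing approximation of putting the NULL rule into the declaration is probably the GNU nonnull attribute, which GCC and Clang accept. The sketch below assumes such a compiler (the macro expands to nothing elsewhere); ARGS_NONNULL and label_length are hypothetical names, and this is an attribute, not the contract syntax discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GCC and Clang accept this attribute; elsewhere the macro expands to nothing. */
#if defined(__GNUC__)
#define ARGS_NONNULL __attribute__((nonnull))
#else
#define ARGS_NONNULL
#endif

/* The declaration carries the precondition, so callers can be warned. */
size_t label_length(const char *label) ARGS_NONNULL;

/* The definition relies on it: passing NULL here is undefined behaviour. */
size_t label_length(const char *label) {
    return strlen(label);
}
```

With the attribute visible, these compilers can warn about a literal NULL argument at the call site, and they may also assume the pointer is non-null inside the callee.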
PS: This does not conflict with my other comment (https://news.ycombinator.com/user?id=1718627440), that this can't be implemented as a macro that invokes UB:
callee (void * p, [...])
contract_assume (nonnull (p), "p can't be NULL!")
{
p[0] = foo; -> UB
}
The access through p simply becomes UB like it always was. But the contract_assume can't be UB, since then the check and diagnostic would be omitted or reordered.
The ABI thing is because Lisa's talk was about C++. In C, a function can have multiple declarations as long as they're compatible (as opposed to identical).
So these declarations might coexist without issue even though they have different signatures:
extern int foo(a, b); // in include/lib.h
int foo(int a, int b); // in src/foo.h
whereas this would be incompatible:
const int foo(int a, int b); // "nearly" compatible
If you attach things to the prototype, then you need to sort out the compatibility rules. If contract_assume(a > 0) changes the type, the extern shouldn't be compatible. This is frequently used to allow linking against libraries compatible with older language standards while allowing newer code to benefit from newer standards like C99, C11, or C23.
The C23 committee ran into this issue when introducing attributes. Their solution was to just exclude attributes from the signature and say they're always compatible:
Although semantically attached to a function type, the attributes described are not part of the prototype of such a function, and redeclarations and conversions that drop such an attribute are valid and constitute compatible types.
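A minimal sketch of what that rule permits, using nodiscard as the attribute. The NODISCARD macro and the function add are hypothetical; the macro falls back to the GNU warn_unused_result attribute (or to nothing) on compilers that do not report C23, on the assumption that redeclarations there merge in a similar way.

```c
#include <assert.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
#define NODISCARD [[nodiscard]]
#elif defined(__GNUC__)
#define NODISCARD __attribute__((warn_unused_result))
#else
#define NODISCARD
#endif

/* A newer header declares the function with the attribute... */
NODISCARD int add(int a, int b);

/* ...while an older declaration omits it; the two remain compatible. */
int add(int a, int b);

int add(int a, int b) {
    return a + b;
}
```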
Maybe I'm missing something, but how does this change anything to how it's now?
I can happily declare two completely incompatible functions with the same symbol name, as long as they are in separate TUs and I don't use -flto, neither the compiler nor the linker will complain and my program will simply be garbage. This won't change with incompatible contracts.
When I both show them to the compiler, when they contradict, the compiler will complain, that also doesn't change.
Of course this will not work:
extern int foo(a, b);
int foo(int a, int b) contract_assume(a > 0);
However this will:
extern int foo(a, b) contract_assume(a > 0);
int foo(int a, int b);
But this isn't a problem, since this is precisely the feature we want to introduce contracts for: catching function call mismatches that are not yet expressible in the language.
> while allowing newer code to benefit from newer standards
Having no contract specified should of course result in no additional restrictions being exposed besides those already present now. This wouldn't be possible:
foo(unsigned int a) contract_assume(possible(a < 0))
But I don't think anybody is arguing for that.
> the compiler can see them and prove that they match and will never be triggered
This is a huge challenge for a C-like language with pervasive global state. Might be more feasible for something like Rust, but still very difficult.
You only need to check between caller and callee. If their constraints always match, the contract can't be triggered; if they always contradict, it will be triggered. Otherwise it can't be decided, and the programmer can add annotations if they care, or the compiler can check what happens when the caller is inlined into its own callers, if that is trivial enough.
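The three cases can be sketched with an ordinary runtime check standing in for a contract. All names here are hypothetical, and the check counts failures instead of aborting so that every case can run in one program.

```c
#include <assert.h>

static int contract_failures = 0;

/* Callee with precondition d != 0; a failure is counted, not fatal,
   so all three cases can run in one program. */
static int checked_div(int n, int d) {
    if (d == 0) {
        ++contract_failures;
        return 0;
    }
    return n / d;
}

/* Always satisfied: d is the constant 4, so the check is provably dead. */
static int quarter(int n) { return checked_div(n, 4); }

/* Always violated: a compiler could reject this at compile time. */
static int broken(int n) { return checked_div(n, 0); }

/* Undecidable locally: depends on what this function's callers pass. */
static int ratio(int n, int d) { return checked_div(n, d); }
```

For ratio, the options are the ones named above: annotate ratio with its own precondition, or let the compiler look one level further up the call chain.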
One of the biggest problems I find with contracts whenever contracts are mentioned is that nobody seems to have a really clear definition of what exactly a contract 'is' or 'should be' (with the exception of languages where contracts are a formal part of the language, that is).
I find the general concept incredibly useful, and apply it in the more general sense to my own code, but there's always a bit of "what do I actually want contracts to mean / do here" back-and-forth before they're useful.
PS: I do like how D does contracts; though I admit I haven't used D much yet, to my great regret, so I can't offer my experience of how well contracts actually work in D.
No wonder it looks less than awesome to you. A contract is just a hack. Ideally, it should not exist because the type system already covers the programmer's intent. Languages that have shitty types which cannot express very much must work around the problem with contracts.
Partially agree, but only for a very narrow definition of what is a contract, which again is the problem stated above.
A good contract system may in fact rely on type-safety as part of its implementation, but types do not necessarily cover all aspects of contracts (unless you're referring to the full gamut of theoretical generalisations of things like dependent types, substructural types, constraint logic programming, etc), and are also not necessarily limited to things that only apply at compile-time.
Aren't contracts a feature of the type system? They encode a specific type that differs from the base type by a more complex predicate, like a constraint check in SQL.
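One way to read "a type that differs from the base type by a predicate" in plain C is a wrapper struct with a checking constructor. This is only a sketch under assumptions: positive_int, make_positive and foo are hypothetical names, and C cannot stop code from filling in the struct directly, so the guarantee holds by convention only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical refined type: an int that is known to be > 0. */
typedef struct { int value; } positive_int;

/* The only intended way to build one; the predicate is checked once here. */
static bool make_positive(int raw, positive_int *out) {
    if (raw <= 0) return false;
    out->value = raw;
    return true;
}

/* No precondition check needed: the parameter type already implies i > 0. */
static int foo(positive_int i) {
    return 100 / i.value;
}
```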
Digital Mars C++ has had contracts since, oh, the early 1990s?
> Digital Mars C++ has had contracts since, oh, the early 1990s?
I think that implementations trying out their own experimental features is normal and expected. Ideally, standards would be pull-based instead of push-based.
The real question is what prevented this feature from being proposed to the standardization committee.
It was proposed by Walter and denied by Stroustrup, probably to save C++. Karma hits back and he is trying to save C++ from Rust.
D is the result of lack of interest by the C++ committee, and I had little interest in spending literally years trying to get useful things adopted into C++.
Ironically, over the years, C++ has adopted many features popularized by D.
(like contracts!)
C++ should adopt a few more D features, like https://www.digitalmars.com/articles/C-biggest-mistake.html, compile time expression evaluation (C++ did it wrong), forward references, and modules that work. C++ should also deprecate the preprocessor, a stone-age kludge that has long been obsolete.
> D is the result of lack of interest by the C++ committee, and I had little interest in spending literally years trying to get useful things adopted into C++.
I think you are leaving out the fact that your comment applies to the post-C++98/pre-C++11 hiatus.
Once C++11 was released, the truth of the matter is that whatever steam D managed to build up, it fizzled out.
I'm also not sure if it's accurate to frame the problem with C++0x as picking up features from D. As I recall, D's selling point was that it was scrambling to provide the features covered by C++0x but users weren't forced to wait for a standard to be published to be able to use them. Once they could, there was no longer any compelling reason to bother with D anymore.
C++ is still trying to catch up with:
- compile time function execution
- modules
- no preprocessor
- memory safe arrays
- preprocessor replacement
- ranges
and so on.
> C++ is still trying to catch up with (...)
C++ modules are indeed a mess, but you are fooling yourself if you believe that the preprocessor of all things is a compelling reason to switch. In fact, I think you unwittingly proved my point on how interest in D fizzled out the moment C++11 was released.
> if you believe that the preprocessor of all things is a compelling reason to switch
The preprocessor is an unhygienic, ugly mess. Just look at the system .h files, which should be a showcase on how to use it correctly. I stand by my assessment of it.
As I am oblivious to D, may I ask if there are suitable GUI toolkits for it, or bindings? I typically use wxWidgets in C++ land.
People keep forgetting C++ design is driven by 300+ people, and the features that get into the language go to elections, that they have to win.
Stroustrup has one vote, not everything he advocates for wins votes, including having a saner C++ (Remember the Vasa paper).
> It was proposed by Walter and denied by Stroustrup, probably to save C++.
Citation needed.
For starters, where is the paper?
Well, there's this list Stroustrup offers, of systems in C++ that he would reject: [0]
[0] https://open-std.org/JTC1/SC22/WG21/docs/papers/2018/p0977r0...
One vote, C++ isn't a BDFL driven language.
Also WG14 famously killed Dennis Ritchie's proposal to add fat pointers to C.
Language authors only have symbolic value once they relinquish control to a standards body.
I don't know what C++ is trying to do, but does everyone know about frama-c[1]?
I like Eiffel.
But if I want to use Eiffel, I’ll use Eiffel (or Sather).
I’d rather C remained C.
Maybe that’s just me?
Ada / SPARK has contracts, too, that can be proven at compile-time. In fact, Ada alone suffices, it has pre- and post-conditions. I think Ada has more libraries than Eiffel does. Does anyone even write Eiffel? I am really curious if it is still alive anywhere, in some form.
Languages are software products like everything else in computing, either they evolve or they wither and die.
C especially was designed with lots of security defects, and had it not been for UNIX being available for free, it would probably never have taken off.
More likely they evolve AND they wither and die. The amount of software I have stopped using due to bad updates is much higher than that with not enough updates.
Truly I agree, but if we can add features to improve C codebases without rewriting them then that's a win, and you can just ignore them if you don't like them (as I will), but to the people where this has benefit they can be used.
Java 24 and C# 9 resemble little of their first versions. C++ might as well not even be the same language at this point. Why are we so conservative with C but then so happily liberal with every other language?
The complexity of C# and C++ should be a warning, not something to strive towards. C++ has 3 reasonable implementations, C has hundreds, for all sorts of platforms, where you don't get anything else.
Most C developers don't want a modern C, they want a reliable C. WG14 should be pushing for clarifications on UB, the memory and threading model, documenting where implementations differ, and what parts of the language can be relied on and what not.
Nobody really needs a new way to do asserts, case ranges, or a new way to write the word "NULL".
> The complexity of C# and C++ should be a warning, not something to strive towards.
I think this talk about "complexity" is a red herring. C++ remains one of the most popular languages ever designed, and one of the key reasons is that since C++11 the standardization effort picked up steam and started including features that the developer community wanted and was eager to get.
I still recall the time that randos criticized C++ for being a dead language and being too minimalistic and spartan.
> C++ has 3 reasonable implementations, C has hundreds, for all sorts of platforms, where you don't get anything else.
I don't understand what point you are trying to make. Go through the list of the most popular programming languages, and perhaps half of them are languages which only have a single implementation. What compelled you to criticize C++ for having at least 3 production-quality implementations?
> Most C developers don't want a modern C, they want a reliable C.
You speak only for yourself. Your personal opinion is based on survivorship bias.
I can tell you that as a matter of fact a key reason why the likes of Rust took off was that people working with low-level systems programming were desperate for a C with better developer experience and a sane and usable standard library.
> Nobody really needs a new way to do asserts, case ranges, or a new way to write the word "NULL".
Again, you speak for yourself, and yourself alone. You believe you don't need new features. That's fine. But you speak for yourself.
The vast majority of C programmers will agree that they don't care for any of the new features, as is clearly evident by the fact that almost nobody elects to use the latest standards.
The "most popular programming languages" are irrelevant here.
C and C++ are standardized languages, and also the tools we use for code that actually matters. A standard that can't be implemented is worthless, and even the "3 high quality" implementations of C/C++ haven't fully implemented the latest 2 editions of either language.
There's a lot more riding on these two languages than you give credit for, and they should be held to a higher standard. C is not the language to experiment with shiny new features, it's the language that works.
> I can tell you that as a matter of fact a key reason why the likes of Rust took off
So what's the problem? If Rust is gaining traction on C/C++, and people are excited about what it brings to the table, use it. We'll both do our thing, let it play out - we'll see which approach yields better software in 10 years.
> The vast majority of C programmers will agree that they don't care for any of the new features,(...)
I think this belief is based on faulty assumptions, such as survivorship bias.
C++ became popular largely because it started off by extending C with the introduction of important features that the developer community wanted to use. The popularity of C++ over C attests how much developers wanted to add features to C.
C++ also started being used over C in domains where it was not an excellent fit, such as embedded programming, because the C community preferred to deal with C++'s higher cognitive load as an acceptable tradeoff to leverage important features missing from C.
The success of projects such as Rust and even Zig, Nim also comes at the expense of C's inability to improve the developer experience.
Not to mention the fact that some projects are still developed in C because of a mix of inertia and lack of framework support.
So to claim that the C programmers do not want change, first you need to ignore the vast majority that do want but already dropped C in favor of languages that weren't frozen in time.
It's also unbelievable to claim that a language that precedes the concept of developer experience represents the apex of language design. This belief lands somewhere between Stockholm syndrome and being mentally constrained to not look beyond a tool.
C++ became popular because in the late 80s, 90% of programming was done on the PC. Zortech C++, with the first native compiler, provided the most powerful metal programming language available.
Zortech C++ is what gave C++ critical mass to succeed.
P.S. Before ZTC++, the traffic in the Usenet C++ newsgroup was neck-and-neck with the Objective-C newsgroup. After ZTC++ was released, traffic in the C++ newsgroup took off and the Objective-C one faded away. Borland saw our success, and pivoted away from their nascent attempt at an OOP language towards implementing Borland C++. Microsoft then also abandoned their OOP C project (called C) in favor of developing C++.
(I've never been able to get any information about C, I was just told about it by a Redmondian.)
> So to claim that the C programmers do not want change, first you need to ignore the vast majority that do want but already dropped C...
Good, we can ignore them. It's not a language for everybody, and if you're happily using C++, or Zig, or Nim, keep doing that.
Developer experience is a weighted sum of many variables. For you, cool syntax features may play a huge role in that; for most C programmers, a simple language with clear and understandable semantics is much more important.
There are many languages with cool syntax and shiny features, and very few of the latter kind. C belongs to the latter, and it also happens to be running a vast majority of the world's most important software.
You keep bringing up Rust as an example. It's probably the most famous of the new-age systems languages. If it's such a great language, when will we see a useful program written in it?
> Good, we can ignore them.
Who do you think you're representing? At best you only speak for yourself. It's perfectly fine if you choose to never update any tool you use, but that's just your personal opinion. You are free to stick with older standard versions or even compiler releases, but that is no justification for preventing everyone around you from improving their developer experience.
> It's not a language for everybody (...)
You might believe it isn't, but that's hardly a sane or rational belief.
Reality isn't on your side.
1. A lot of people use the old C standards.
2. Not a lot of people use the new ones.
3. A lot of useful software is written in C.
4. Not a lot of useful software is written in any of the other languages you've listed in this conversation, despite the fact that you can hardly call them "new" at this point.
I'm done with you, I'll leave you to puzzle out the obvious conclusion of these 4 points.
You write software your way, I'll write it mine, and in 10 years we can check our homework. The first 10 years of Rust haven't really given us any results software-wise, but I'm sure with language design powerhouses such as yourself on the case, and just a few more pieces of syntax sugar, you can turn it around.
> If it's such a great language, when will we see a useful program written in it?
I think it should have been simple enough to find examples, though I suppose there might be some dependence on what you mean by "useful".
For standalone stuff, some examples might be Ripgrep, ruff, uv, Alacritty, and Polars. Rust is also used internally by some major companies, such as Amazon, Dropbox, Mozilla, Microsoft, Google, Volvo, Discord, and CloudFlare.
> there might be some dependence on what you mean by "useful".
I should've been clearer about that, but what I mean by that is pretty much what a normal non-technical person would consider a useful piece of software - Photoshop, Figma, Excel, Chrome, Windows, Android, Blender, AutoCAD, Unreal Engine, any Office Suite...
Since this is a technical forum I think we'd both easily agree on a bunch of very technically impressive software that the average person hasn't heard of - ffmpeg, qemu, LLVM, Linux, Postgres, V8, etc.
It would be a stretch to put any of those tools on either of those lists. Given the popularity of Rust, and that it's now over 10 years old, I'd expect at least one major program that can serve as an example of "here's this very useful, complex software package, as proof that our methodology works and you can do cool things this way."