Xr0 verifier, guarantee the safety of C programs at compile time

114 points by Alifatisk 3 months ago

Previously:

Show HN: Xr0 – Vanilla C Made Safe with Annotations (https://news.ycombinator.com/item?id=37536186, 2023-09-16, 13 comments)

Xr0 Makes C Safer than Rust (https://news.ycombinator.com/item?id=39858240, 2024-03-28, 41 comments)

Xr0: C but Safe (https://news.ycombinator.com/item?id=39936291, 2024-04-04, 144 comments)

Show HN: Xr0 is a Static Debugger for C (https://news.ycombinator.com/item?id=40472051, 2024-05-05, 4 comments)

tialaramex - 3 months ago

Thanks! I thought this felt familiar. 2023 is a long time to still be in this "Initial signs are promising" mode. There are no code commits since March 2025.
I've had projects which stalled for a few months or even a year, but generally if I said I'll get to it "soon" and two years later I haven't, that's not getting done. There are two boxes of books I planned to unbox "soon" after discovering that the bookshelves for my new flat were full & so I had nowhere to put them. That was when Obama was still President of the United States of America. Don't expect me to ever get around to unboxing those books, I clearly haven't even missed them.
- muldvarp - 3 months ago
  
  The initial sign for Xr0 never seemed promising for anyone with experience in formal verification. Neither the code nor the ideas they presented were new. I asked them multiple times to clarify how their project differed from the dozens of already existing options for formal verfication of C programs and never got a concrete answer.
  - tgv - 3 months ago
    
    As I see it: tracking (de)allocation in a very simple, understandable way. Unfortunately, that seems to be all it does. It's a start, certainly if you don't want to/cannot use a more complete system, since they can be quite complex. I'm not following this space professionally, only out of interest a bit, but do you know of a system that is so simple?
    
    muldvarp - 3 months ago
    
    Xr0 isn't any simpler than for example Frama-C. In fact one of the simplest (but still useful) systems for statically tracking ownership is Rusts borrow checker, which the authors of Xr0 say is _too_ simple.
- rurban - 3 months ago
  
  Well, apparently someone rewrote it in rust, which was better than their version. Which they needed some time to process how to continue.
  And they got stuck with the bounds checker and loops.
  But other such checkers are far more advanced, with a better contract syntax.

Animats - 3 months ago

Cute. They've introduced ownership to C. The annotation system is expressing ownership, in a rather clunky way. You have to explicitly declare that functions create or release allocated space. That's ownership tracking.

But "We currently model all buffers as infinitely-sized. We will be adding out-of-bounds checking soon." That's the hard problem.

rwmj - 3 months ago

Oddly they didn't even reuse the annotation for this that GCC already has: https://stackoverflow.com/questions/18485447/gcc-attribute-m...
LiamPowell - 3 months ago

> But "We currently model all buffers as infinitely-sized. We will be adding out-of-bounds checking soon." That's the hard problem.
It's not particularly difficult for the prover. You essentially just need to do a translation from C in to your ATP language of choice, with a bunch of constraints that check undefined behaviour never occurs. Tools such as Frama-C and CBMC have supported this for a long time.
The difficult part is for the user as they need to add assertions all over the place to essentially keep the prover on the right track and to break down the program in to manageable sections. You also want a bunch of tooling to make things easier for the user, which is a problem that can be as difficult as you want it to be since there's always going to be one more pattern you can detect and add a custom handler/error message for.
- Animats - 3 months ago
  
  > It's not particularly difficult for the prover.
  I know that. Did that decades ago. I wanted to see how they did the annotations and proofs. What can you express? Partially filled arrays are moderately hard. Sparsely initialized arrays need a lot of scaffolding. This is an area where most "safe C" variants have foundered.
1718627440 - 3 months ago

> Cute. They've introduced ownership to C.
Ownership semantics are described in every serious C interface. Linters for checking it have also existed for decades. I find the notion that Rust invented it to be incredible stupid. Rust just has different ownership semantics and makes it an enforced part of the language (arguable a good idea). And yes they of course also do bounds-checking.
- pjmlp - 3 months ago
  
  Rust did not invent affine types of course, those that affirm that lack background in type systems.

WalterBright - 3 months ago

From the tutorial:

    #include <stdio.h>
    int main()
    {
        int i;
        return i;
    }
    the behaviour is undefined because i’s value is indeterminate.

D solves that problem by initializing variables with the default initializer (in this case 0) if there is not an explicit initializer.

That was in the first version of D because I have spent days tracking down an erratic bug that turned out to be an uninitialized variable.

dadoum - 3 months ago

I think that the right thing to do is to error out though. When the behaviour of some code cannot be guaranteed, that code should just be ruled out imo. Manually initializing a variable generally doesn't clutter the code, arguably it's making it clearer.
- Alifatisk - 3 months ago
  
  Dart does this, you can mark a variable as "late" which tells the compiler that you know for certain the variable will be written too before read. If something reads the variable before it's initialized, then the runtime will error. Maybe even on compile time if it can be caught, I am not certain.
rwmj - 3 months ago

You can do that in GCC now (-ftrivial-auto-var-init=zero). There are a few projects that decided to enable this by default.
- WalterBright - 3 months ago
  
  Should add it to the C standard.
pseudohadamard - 3 months ago

Pretty much any existing C compiler will also solve that problem by telling you that it's uninitialized.
- WalterBright - 3 months ago
  I tried:
  int test() { int i; return i; }
  using clang on my Mac mini, and:
  clang -c test.c
  and it compiled without complaint.
  - pseudohadamard - 3 months ago
    
    I always build with -Wall so I'm used to seeing the warning:
    > clang -Wall test.c test.c:4:16: warning: variable 'i' is uninitialized when used here [-Wuninitialized] 4 | return i; | ^ test.c:3:14: note: initialize the variable 'i' to silence this warning 3 | int i; | ^ | = 0 1 warning generated.
    For the oldest compiler I have access to, VC++ 6.0 from 1998:
    warning C4700: uninitialized local variable 'i' used
    
    WalterBright - 3 months ago
    
    The trouble with warnings is every compiler has a different set of warnings. It balkanizes the language. Many D features are the result of cherry picking warnings from various compilers and making them standard features.
    
    pseudohadamard - 3 months ago
    
    I see this as a feature, not a bug. I compile my code with a range of different compilers because each one catches things the others don't. The creeping gcc-isation of everything is what I'd see as a bug.
    
    1718627440 - 3 months ago
    
    > It balkanizes the language.
    Not really, as C has had even more diverse implementations per-standardization. I would say the situation is now, much less diverse under the rule of GCC and Clang. (Yeah MSVC also exists.)
    
    WalterBright - 3 months ago
    
    Every switch that changes the language semantics creates a separate language. If you have n such switches, your compiler is supporting n x n languages. I've also had troubles writing portable C code with all warnings enabled as different compilers contradicted each other on what was acceptable.
    I tried pretty hard to make D a warning-less language, but still some crept in grump grump.
    Have fun with this one:
    for (int i = 0; i < end; ++i); foo(i);
    One of the best programmers I know came up to me with this loop and told me my C compiler was broken because the loop was only executed once. I pointed at the ; and you can guess the rest.
    I added a warning for that in the C compiler, and for D simply disallowed it. I've noticed that some C compilers have since added a warning for that as well. The C folks should just make it illegal.
    I've also fixed printf in D so that:
    char* p; printf("%d\n", p);
    gives an error message, and the right format to use for `p`. It was a little thing, but it sure found a lot of incorrect formats in my code.
    
    1718627440 - 3 months ago
    
    > The C folks should just make it illegal.
    I often have code, which looks like this:
    for (ptr = start; random_condition (*ptr); ptr = ptr->next); for (ptr = ptr->next; other_condition (*ptr); ptr = ptr->prev); ... [do action] for (ptr = end; to_be_deleted (*ptr) && (delete (ptr), TRUE); ptr = ptr->prev);
    I wouldn't be happy about your policy.
    > I've also fixed printf in D so that [...] gives an error message
    Just last week I had the case that the C compiler complained, I should use %lld for long long, but the printf implementation shipped with the compiler doesn't support that. Thus, using %ld, even if undefined behaviour was the correct action. I wouldn't like my language making up more work for me for no reason.
    
    WalterBright - 3 months ago
    
    Rewrite as:
    for (ptr = start; random_condition (*ptr); ptr = ptr->next) { }
    Then anyone reading your code will know the empty loop was intentional. BTW, many C compilers warn about the ; on a for loop.
    Have you ever discovered this bug:
    if (condition); doThis();
    It's a hard one to see in a code review. Yes, that's also disallowed in D.
    > I should use %lld for long long, but the printf implementation shipped with the compiler doesn't support that.
    Weird that your compiler would support long long, but with a broken Standard library. I don't see that as a feature. You can always cast the argument to long, and at least you won't get undefined behavior.