The compiler is your best friend
blog.daniel-beskin.com150 points by based2 13 hours ago
150 points by based2 13 hours ago
> How many times did you leave a comment on some branch of code stating "this CANNOT happen" and thrown an exception? Did you ever find yourself surprised when eventually it did happen? I know I did, since then I at least add some logs even if I think I'm sure that it really cannot happen.
I'm not sure what the author expects the program to do when there's an internal logic error that has no known cause and no definite recovery path. Further down the article, the author suggests bubbling up the error with a result type, but you can only bubble it up so far before you have to get rid of it one way or another. Unless you bubble everything all the way to the top, but then you've just reinvented unchecked exceptions.
At some level, the simplest thing to do is to give up and crash if things are no longer sane. After all, there's no guarantee that 'unreachable' recovery paths won't introduce further bugs or vulnerabilities. Logging can typically be done just fine within a top-level exception handler or panic handler in many languages.
Ideally, if you can convince yourself something cannot happen, you can also convince the compiler, and get rid of the branch entirely by expressing the predicate as part of the type (or a function on the type, etc.)
Language support for that varies. Rust is great, but not perfect. Typescript is surprisingly good in many cases. Enums and algebraic type systems are your friend. It'll never be 100% but it sure helps fill a lot of holes in the swiss cheese.
Because there's no such thing as a purely internal error in a well-constructed program. Every "logic error" has to bottom out in data from outside the code eventually-- otherwise it could be refactored to be static. Client input is wrong? Error the request! Config doesn't parse? Better specify defaults! Network call fails? Yeah, you should have a plan for that.
Not every piece of logic lends itself to being expressed in the type system.
Let's say you're implementing a sorting algorithm. After step X you can be certain that the values at locations A, B, and C are sorted such that A <= B <= C. You can be certain of that because you read the algorithm in a prestigious journal, or better, you read it in Knuth and you know someone else would have caught the bug if it was there. You're a diligent reader and you've convinced yourself of its correctness, working through it with pencil and paper. Still, even Knuth has bugs and perhaps you made a mistake in your implementation. It's nice to add an assertion that at the very least reminds readers of the invariant.
Perhaps some Haskeller will pipe up and tell me that any type system worth using can comfortably describe this PartiallySortedList<A, B, C>. But most people have to use systems where encoding that in the type system would, at best, make the code significantly less expressive.
Yes, this has been my experience too! Another tool in the toolbox is property / fuzz testing. Especially for data structures, and anything that looks like a state machine. My typical setup is this:
1. Make a list of invariants. (Eg if Foo is set, bar + zot must be less than 10)
2. Make a check() function which validates all the invariants you can think of. It’s ok if this function is slow.
3. Make a function which takes in a random seed. It initializes your object and then, in a loop, calls random mutation functions (using a seeded RNG) and then calls check(). 100 iterations is usually a good number.
4. Call this in an outer loop, trying lots of seeds.
5. If anything fails, print out the failing seed number and crash. This provides a reproducible test so you can go in and figure out what went wrong.
If I had a penny for every bug I’ve found doing this, I’d be a rich man. It’s a wildly effective technique.
Until you have a bit flip or a silicon error. Or someone changed the floating point rounding mode.
Sometimes the "error" is more like, "this is a case that logically could happen but I'm not going to handle it, nor refactor the whole program to stop it from being expressable"
>Further down the article, the author suggests bubbling up the error with a result type, but you can only bubble it up so far before you have to get rid of it one way or another. Unless you bubble everything all the way to the top, but then you've just reinvented unchecked exceptions.
Not necessarily. Result types are explicit and require the function signature to be changed for them.
I would much prefer to see a call to foo()?; where it's explicit that it may bubble up from here, instead of a call to foo(); that may or may not throw an exception my way with no way of knowing.
Rust is absolutely not perfect with this though since any downstream function may panic!() without any indication from its function signature that it could do so.
> At some level, the simplest thing to do is to give up and crash if things are no longer sane.
The problem with this attitude (that many of my co-workers espouse) is that it can have serious consequences for both the user and your business.
- The user may have unsaved data - Your software may gain a reputation of being crash-prone
If a valid alternative is to halt normal operations and present an alert box to the user saying "internal error 573 occurred. please restart the app", then that is much preferred IMO.
Crashing is bad, but silently continuing in a corrupt state is much worse. Better to lose the last few hours of the user's work than corrupt their save permanently, for example.
> Your software may gain a reputation of being crash-prone
Hopefully crashing on unexpected state rather than silently running on invalid state leads to more bugs being found and fixed during development and testing and less crash-prone software.
> If a valid alternative is to halt normal operations and present an alert box to the user saying "internal error 573 occurred. please restart the app", then that is much preferred IMO.
You can do this in your panic or terminate handler. It's functionally the same error handling strategy, just with a different veneer painted over the top.
This is what rust's `unreachable()!` is for... and I feel hubris whenever I use it.
You should prefer to write unreachable!("because ...") to explain to some future maintenance engineer (maybe yourself) why you believed this would never be reached. Since they know it was reached they can compare what you believed against their observed facts and likely make better decisions.
But at least telling people that the programmer believed this could never happen short-circuits their investigation considerably.
A comment "this CANNOT happen" has no value on itself. Unless you've formally verified the code (including its dependencies) and have the proof linked, such comments may as well be wishes and prayers.
Yes, sometimes, the compiler or the hardware have bugs that violate the premises you're operating on, but that's rare. But most non pure algorithms (side effects and external systems) have documented failure cases.
> A comment "this CANNOT happen" has no value on itself.
I think it does have some value: it makes clear an assumption the programmer made. I always appreciate it when I encounter comments that clarify assumptions made.
>> A comment "this CANNOT happen" has no value on itself.
> I think it does have some value: it makes clear an assumption the programmer made.
To me, a comment such as the above is about the only acceptable time to either throw an exception (in languages which support that construct) or otherwise terminate execution (such as exiting the process). If further understanding of the problem domain identifies what was thought impossible to be rare or unlikely instead, then introducing use of a disjoint union type capable of producing either an error or the expected result is in order.
Most of the time, "this CANNOT happen" falls into the category of "it happens, but rarely" and is best addressed with types and verified by the compiler.
But if you spell that `assert(false)` instead of as a comment, the intent is equally clear, but the behavior when you're wrong is well-defined.
I agree that including that assert along with the comment is much better. But the comment alone is better than nothing, so isn't without value.
Better yet, `assert(false, message)`, with the message what you would have written in the comment.
`assert(false)` is pronounced "this can never happen." It's reasonable to add a comment with /why/ this can never happen, but if that's all the comment would have said, a message adds no value.
Oh I agree, literally `assert(false, "This cannot happen")` is useless, but ensuring message is always there encourages something more like, `assert(false, "This implies the Foo is Barred, but we have the Qux to make sure it never is")`.
Ensuring a message encourages people to state the assumptions that are violated, rather than just asserting that their assumptions (which?) don't hold.
what language are we talking about? If it's cpp then the pronounciation depends on compiler flags (perhaps inferred from CMAKE_BUILD_TYPE)
At least on iOS, asserts become no-ops on release builds
It really depends on the language you use. Personally I like the way rust does this:
- assert!() (always checked),
- debug_assert!() (only run in debug builds)
- unreachable!() (panics)
- unsafe unreachable_unchecked() (tells the compiler it can optimise assuming this is actually unreachable)
- if cfg!(debug_assertions) { … } (Turns into if(0){…} in release mode. There’s also a macro variant if you need debug code to be compiled out.)
This way you can decide on a case by case basis when your asserts are worth keeping in release mode.
And it’s worth noting, sometimes a well placed assert before the start of a loop can improve performance thanks to llvm.
> debug_assert!() (only run in debug builds)
debug_assert!() (and it's equivalent in other languages, like C's assert with NDEBUG) is cursed. It states that you believe something to be true, but will take no automatic action if it is false; so you must implement the fallback behavior if your assumption is false manually (even if that fallback is just fallthrough). But you can't /test/ that fallback behavior in debug builds, which means you now need to run your test suite(s) in both debug and release build versions. While this is arguably a good habit anyway (although not as good a habit as just not having separate debug and release builds), deliberately diverging behavior between the two, and having tests that only work on one or the other, is pretty awful.
I hear you, but sometimes this is what I want.
For example, I’m pretty sure some complex invariant holds. Checking it is expensive, and I don’t want to actually check the invariant every time this function runs in the final build. However, if that invariant were false, I’d certainly like to know that when I run my unit tests.
Using debug_assert is a way to do this. It also communicates to anyone reading the code what the invariants are.
If all I had was assert(), there’s a bunch of assertions I’d leave out of my code because they’re too expensive. debug_assert lets me put them in without paying the cost.
And yes, you should run unit tests in release mode too.
But how do you test the recovery path if the invariant is violated in production code? You literally can’t write a test for that code path…