map::operator[] should be nodiscard

quuxplusone.github.io

84 points by jandeboevrie 4 months ago

For the Rust inclined, [[nodiscard]] is #[must_use], if you were confused.

Anyway, this article illustrates a great reason why C++ is a beautiful mess. You can do almost anything with it, and that comes at a cost. It's the polar opposite ethos of "there should be one clear way to do something" and this sort of thing reminds me why I have replaced all of my systems language needs with Rust at this point, despite having a very long love/hate relationship with both C and C++.

Totally agree it should be marked as nodiscard, and the reasoning for not doing so is a good example of why other languages are taking over.

m-schuetz - 3 months ago

I'm not a fan of nodiscard because it's applied way too freely, even if the return value is not relevant. E.g. WebGPU/WGSL initially made atomics nodiscard simply because they return a value, but half the algorithms that use atomics only do so for the atomic write, without needing the return value. But due to nodiscard you had to make a useless assignment to an unused variable.
- fingerlocks - 3 months ago
  
  Disagree. This is the default in Objective-C & Swift, and it’s great. You have to explicitly annotate when discards are allowed. It’s probably my all time favorite compiler warning in terms of accidental-bugs-caught-early, because asking the deeper question about why a function returns useless noise invites deep introspection.
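
To make the two positions concrete, here is a minimal C++ sketch (assuming C++17 or later). `fetch_increment` and `count_events` are hypothetical names, not part of any standard library or of WGSL; the wrapper is marked [[nodiscard]] only to mimic the situation described above, where the caller wants the atomic write and not the returned value.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper (not a standard API): returns the previous value and
// is marked [[nodiscard]], mimicking an atomic op that insists on being used.
[[nodiscard]] inline int fetch_increment(std::atomic<int>& counter) {
    return counter.fetch_add(1, std::memory_order_relaxed);
}

// Counts n events. Only the atomic write matters here, not the old value.
inline int count_events(int n) {
    assert(n >= 0);
    std::atomic<int> counter{0};
    for (int i = 0; i < n; ++i) {
        // fetch_increment(counter);     // compilers typically warn: result discarded
        (void)fetch_increment(counter);  // explicit discard states the intent
    }
    return counter.load();
}
```

Whether the explicit `(void)` is useful documentation or noise is exactly the disagreement in this subthread.
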
bayesnet - 3 months ago

It’s also worth noting that in Rust you don’t need to be as worried about marking a function #[must_use] if there is sometimes a valid reason to discard the value. One can just write `let _ = must_use_fn()`, which discards the value and silences the warning. I think this makes the intent clearer than casting to void, as TFA discusses.
- on_the_train - 3 months ago
  
  There is one in C++, too (std::ignore). Not sure why the author decided to go with the ancient void cast.
  - vitus - 3 months ago
    
    std::ignore's behavior outside of use with std::tie is not specified in any finalized standard.
    https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p29... aims to address that, but that won't be included until C++26 (which also includes _ as a sibling commenter mentions).
  - aw1621107 - 3 months ago
    
    I believe C++26 now allows _ as a placeholder name [0]:
    > We propose that when _ is used as the identifier for the declaration of a variable, non static class member variable, lambda capture or structured binding. the introduced name is implicitly given the [[maybe_unused]] attribute.
    > In contexts where the grammar expects a pattern matching pattern, _ represents the wildcard pattern.
    Some of the finer details (e.g., effect on lifetime and whether a _ variable can be used) differ, though.
    [0]: https://github.com/cplusplus/papers/issues/878
    
    tialaramex - 3 months ago
    
    Specifically, Rust's _ is not a variable; it is a pattern that matches anything, so let _ = isn't an assignment, it's the explicit choice not to bind this value. If we wrote a "dummy" variable, the compiler would be forbidden from dropping the value early: that "dummy" is alive until it leaves scope, whereas if we never bound the value it's dropped immediately.
    In modern Rust you don't need the let here because you're allowed to do the pattern match anywhere, and as I said _ is simply a pattern that matches anything. So we could omit the let keyword, but people don't.
  - quuxplusone - 3 months ago
    
    https://quuxplusone.github.io/blog/2022/10/16/prefer-core-ov... does not (yet) include this as one of its examples, but it could (and someday might).
- OptionOfT - 3 months ago
  
  Careful. I don't like the use of `let _ =` in general, as it instantly drops.
  And you don't get a compilation failure when the type on the right hand side changes. This is even more important if it switches from non-drop to drop. There is a clippy lint for this.
  - junon - 3 months ago
    
    Good point. You can always explicitly type the underscore. Clippy even has `let_underscore_untyped` for this.
    Never thought about it, but I'll be adding this to my standard list of deny lints now.
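
The lifetime point in this subthread has a rough C++ analogue, sketched below under the assumption of C++17 or later. `Guard` and `make_guard` are hypothetical names used only for illustration. Discarding with a void cast destroys the temporary at the end of the statement, while binding to a named dummy keeps it alive until the end of the scope.

```cpp
#include <cassert>
#include <string>

// Hypothetical type that records when it is created and destroyed.
struct Guard {
    std::string* log;
    explicit Guard(std::string* l) : log(l) {
        assert(log != nullptr);
        *log += "acquire;";
    }
    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;
    ~Guard() { *log += "release;"; }
};

// Relies on C++17 guaranteed copy elision to return a non-copyable type.
[[nodiscard]] inline Guard make_guard(std::string* log) { return Guard(log); }

inline std::string discard_immediately() {
    std::string log;
    {
        (void)make_guard(&log);  // temporary is destroyed at the end of this statement
        log += "work;";
    }
    return log;  // "acquire;release;work;"
}

inline std::string bind_to_dummy() {
    std::string log;
    {
        [[maybe_unused]] auto dummy = make_guard(&log);  // lives until the closing brace
        log += "work;";
    }
    return log;  // "acquire;work;release;"
}
```

If the returned type is a lock or similar, the first form releases it before the work happens, which is the hazard being described.
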
pjmlp - 3 months ago

I disagree with the conclusion, other languages are taking over because they have the advantage of not having 40 years of production code history, and those adopting them don't care about existing code.
You will find similar examples in Python, Java, C#,... which is why not everyone is so keen on jumping to new language versions.
the_mitsuhiko - 3 months ago

Interestingly Index::index is also usually not marked as `#[must_use]` in Rust either.
- junon - 3 months ago
  
  I don't believe you can mark trait methods with #[must_use] - it has to be on the implementation. Not near a compiler to check at the moment.
  In the case of e.g. Vec, it returns a reference, which by itself is side-effect free, so the compiler will always optimize it out. I do agree that it should still be marked as such though. I'd be curious about the reasons why it's not.
  - steveklabnik - 3 months ago
    
    This is just my take, but I think historically the Rust team was hesitant to over-mark things #[must_use] because they didn't want to introduce warning fatigue.
    I think there's a reasonable position to take that it was/is too conservative, and also one that it's fine.
  - the_mitsuhiko - 3 months ago
    
    But it's also not marked at the implementation for HashMap's Index impl for instance.
    
    tialaramex - 3 months ago
    
    This didn't seem like a footgun to me, hats["Jim"]; will panic if, in fact "Jim" isn't one of the keys, but what did the hypothetical author expect to happen when they write this? HashMap doesn't implement IndexMut so hats["Jim"] = 26; won't even compile.
    
- OptionOfT - 3 months ago
  
  Unsure if I misunderstand:
  Index returns a reference:
  https://doc.rust-lang.org/std/ops/trait.Index.html#:~:text=s...
  If you don't use the reference it just ... disappears.
  Am I missing something here?
  - junon - 3 months ago
    
    Technically without any optimizations this would result in a stray LEA op or something but any optimizing compiler (certainly any that support Rust) would optimize it out even at low levels of debug settings.

nialv7 - 3 months ago

C++ operator [] is poorly designed: index, and index+assignment should be two different operators, and indexing alone should never insert new entries into the map.

Languages like D [0] or Rust [1] get this right.

[0]: https://dlang.org/spec/operatoroverloading.html#index_assign... [1]: https://doc.rust-lang.org/std/ops/trait.IndexMut.html
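
For readers who have not hit this, a minimal sketch of the behaviour being criticised. The function names and the `hats` map are made up for illustration; the std::map calls are standard.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// operator[] on a missing key inserts a value-initialized int (0).
inline std::size_t size_after_bracket_lookup() {
    std::map<std::string, int> hats;
    const int jim = hats["Jim"];  // reads like a lookup, but inserts {"Jim", 0}
    assert(jim == 0);
    return hats.size();
}

// find() on a missing key leaves the map unchanged.
inline std::size_t size_after_find_lookup() {
    std::map<std::string, int> hats;
    auto it = hats.find("Jim");
    assert(it == hats.end());
    return hats.size();
}

// at() on a missing key throws instead of inserting.
inline bool at_throws_on_miss() {
    std::map<std::string, int> hats;
    try {
        (void)hats.at("Jim");
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```
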

gpderetta - 3 months ago

You could (and I would) make the opposite statement: upsert should be the default operator and if you want lookup only or insert only you call different operators.
I find it annoying that I often have to reach to defaultdict in Python to get this behavior.
- tialaramex - 3 months ago
  
  C++ could offer the entry API here, so you can get back a type representing the result of finding where this key would go, and then either it has a key+value pair you can mutate if you want, or it has a blank state allowing you to write a new key+value pair if that's what you want, without redoing the potentially expensive find operation to figure out where to put the new/updated pair
- hmry - 3 months ago
  
  I certainly use defaultdict often in Python too, but not more often than the regular dict. Maybe 90% dict and 10% defaultdict. So from my POV lookup only should definitely be the default.
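
The "find the slot once, then decide" idea from the entry API comment above can be approximated in today's C++ with lower_bound plus emplace_hint. This is only a sketch; `bump` is a hypothetical helper, not a standard function, and it is not a full equivalent of an entry API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: increments the counter for key, inserting 1 on first sight.
// lower_bound finds where the key is or would go; passing that iterator as a hint
// lets the map insert there without repeating the search.
inline void bump(std::map<std::string, int>& counts, const std::string& key) {
    auto pos = counts.lower_bound(key);
    if (pos != counts.end() && pos->first == key) {
        ++pos->second;                          // "occupied": update in place
    } else {
        pos = counts.emplace_hint(pos, key, 1); // "vacant": insert at the found position
    }
    assert(pos->first == key);
}
```

Usage would look like calling `bump(counts, word)` once per word in a word-count loop.
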
Tempest1981 - 3 months ago
iirc, there are 5 ways to put something into a std::map
```
  operator[], insert(), emplace(), try_emplace(), insert_or_assign()
```
And 3 of them (insert(), emplace(), try_emplace()) don't overwrite an existing value.
Lots of people are surprised that insert() can fail. And even more surprised that a RHS [] inserts a default value. I'm not a fan of APIs that surprise.
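
A quick sketch to check which of these calls overwrite an existing value (C++17 or later). The `Map` alias, `seeded`, and the `after_*` helper names are made up for this example; each helper starts from a map holding {"k", "old"} and tries to store "new".

```cpp
#include <cassert>
#include <map>
#include <string>

using Map = std::map<std::string, std::string>;

// Hypothetical helper: a map that already holds {"k", "old"}.
inline Map seeded() {
    Map m;
    m.emplace("k", "old");
    assert(m.size() == 1);
    return m;
}

// These three leave the existing value alone.
inline std::string after_insert()      { Map m = seeded(); m.insert(Map::value_type("k", "new")); return m.at("k"); }
inline std::string after_emplace()     { Map m = seeded(); m.emplace("k", "new");                 return m.at("k"); }
inline std::string after_try_emplace() { Map m = seeded(); m.try_emplace("k", "new");             return m.at("k"); }

// These two replace the existing value.
inline std::string after_insert_or_assign() { Map m = seeded(); m.insert_or_assign("k", "new"); return m.at("k"); }
inline std::string after_bracket_assign()   { Map m = seeded(); m["k"] = "new";                 return m.at("k"); }

// insert() reports the "failure" through the bool in its return value.
inline bool insert_reports_failure() {
    Map m = seeded();
    return !m.insert(Map::value_type("k", "new")).second;
}
```
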
- ruszki - 3 months ago
  
  > I'm not a fan of APIs that surprise.
  It really depends on what you got used to. C++ was the first language that, I would say, I learned, so I’m still surprised today that traversing a set in other languages is not in order by default.
  In this specific case, emplace should be your default option, and you should really know why that’s the case, and why you have this many options.

reactordev - 3 months ago

My pet peeve with c++ is exactly this. Either it’s not wise to call release, or it is (under circumstances) and yet the developer has no idea whether their scenario applies (tip: it doesn’t, 90% of the time).

The stdlib is so bloated with these “Looks good, but wait” logic bombs.

I wish someone would just draw a line in the sand and say “No, from here on out, this is how this works and there are no other scenarios in which a workaround is needed”. This is why other systems languages are taking off (besides the expressiveness or memory safety bandwagon) is because there are clear instructions in the docs on what this does with examples of how to use it properly.

Most c++ codebases I’ve seen the last 10 years are decent (a few are superb) and I get that there’s old code out there but at what point do we let old dogs die?

GuB-42 - 3 months ago

C++ has always been a "kitchen sink" language, it is used in many different ways and drawing any line may alienate an entire industry.
> This is why other systems languages are taking off
Great! It is not a competition. If you think that Rust is a better choice, use Rust, don't make C++ into Rust. Or maybe try Carbon, it looks like it is the language you want. But if you have some old dogs you want to keep alive, then use C++, that's what it is for.
- reactordev - 3 months ago
  
  I get it, I do. There’s a lot of old code out there. My point wasn’t that old dogs are bad. My point was about changing how we care for them.
  If you have old code that you want to compile, use -std=c++98 or whatever to peg it to that. Leave the rest of us alone to introduce more modern ways of doing things. I’d even be happy to see removal of things.
feelamee - 3 months ago

> This is why other systems languages are taking off (besides the expressiveness or memory safety bandwagon) is because there are clear instructions in the docs on what this does with examples of how to use it properly.
I have never seen better documentation for programming languages than cppreference. Can you list such docs?
pjmlp - 3 months ago

> This is why other systems languages are taking off
For the time being, they are still being written with C++ infrastructure, though.
It would be great if those wannabe C++ replacements were fully bootstrapped.
- reactordev - 3 months ago
  
  Go compiles go, not sure what you mean by wannabe c++.
  There’s a frontend to gcc for go and working on rust. Is it the use of gcc you dislike? You’re going to have to explain some more.
  We’re stuck on ASM/ELF. We’re stuck on C of some kind. Maybe in the future LLMs can help us write low-level / high expressiveness code but until we get rid of 1970s “personal computer” decisions in silicon, we’re stuck with it.
  - throwaway17_17 - 3 months ago
    
    What is the proposed replacement for ASM (in particular) and C in the context of the bootstrapping process? Then why lump ELF (unless you don’t mean the executable format) in with the low level language?
    Historically, pjmlp has argued very strongly that languages attempting to take the place of C and C++ at the infrastructure layer cannot claim to have supplanted those two until their own compiler and related infrastructure are no longer dependent on C++ (LLVM in particular). I tend to sympathize with this view: it is really hard to take seriously a language’s claim to have supplanted C++ and to be the only fit-for-use language going forward when it is dependent on millions of lines of the languages it disparages.
    As a counter however, it’s rather difficult to expect a language to overcome thousands of person-years of work on a compiler like LLVM and tens of thousands of person-years on Linux. The newer languages should be able to make an articulate case that throwing away so much work is not a viable approach and just use what exists now but keep the new languages on all greenfield projects.
    
    steveklabnik - 3 months ago
    
    As another counter, Rust has never claimed to have "supplanted" C++. So holding it to that standard is holding it to a non-goal for itself in the first place.
    
    pjmlp - 3 months ago
    
    You see it all over the place; it is not the position of the Rust core team, but it certainly comes from the kind of posts that gave rise to the Rust Freedom Force memes.
    
    throwaway17_17 - 3 months ago
    
    tl;dr - Rust wants to be a foundational language of the computing stack and holding it to the standard of being bootstrapped, instead of relying on another language, is a reasonable critique.
    Clearly Rust the language and the associated organization would not make that claim. Particularly where supplanted is a past tense verb and indicates that it is a completed project.
    However, despite overblown complaints about the RESF, the community both in commentary and in practice has been extremely vocal that any language that does not have Rust’s memory safety model is not suitable for any new project or further use in existing projects. And while the RIIR meme is for the most part a message board strawman, again, the community surrounding Rust is busy reimplementing coreutils in Linux, putting Rust in the kernel, and rewriting the userland executables that most Linux workflows are based around (ripgrep being the most successful in this group).
    It is clear that Rust, the community of users (if not the language as an independent entity) clearly wants to supplant both C and C++ at all levels of the computing stack. The push for Rust in the Linux kernel is enough evidence to support the concept at the most pervasive level.
    Continuous references to widespread adoption and endorsements by ‘big tech’ are used to frame Rust as the only viable option going forward. Blog posts, Reddit threads, and board comments all routinely take the stance that memory-safety (as defined by Rust) is ‘table stakes’ for any development occurring in current year.
    It feels disingenuous to pretend that Rust is not trying to become the industry standard language in the way C and C++ are today and have been for multiple decades. And given that aim, I think it is fair to say that Rust, as the name for both the language and its community of users and supporters, is working to supplant C and C++.
    Given all that, I find it fair to discuss the fact that, while busy trying to maneuver itself into every space in the tech industry (from embedded all the way up to the front end for web apps) and finding some success in doing so, Rust is still reliant on C++ infrastructure, particularly for compilation. I was responding to a pair of comments about the desire to see languages that want to be the bedrock of the computing stack bootstrapped. I think Rust absolutely wants to be such a bedrock language and as such, I don’t think wanting it to be bootstrapped and not reliant on the C++ it wants to replace is an unreasonable standard to hold the language to.
    
    steveklabnik - 3 months ago
    
    "Rust is a good language and we should use it to write software" is not the same thing as "lol C++ sucks and nobody should use it for anything ever."
    Engineering is all about tradeoffs. Responding to perceived zealotry with more, but different, zealotry makes it harder to have actual discussions.
    LLVM is best in class at what it does. Until someone else decides to make something like LLVM in Rust, it's not realistic to use something else. That's just engineering. The choices here directly refute these sorts of zealotry claims, that is, it's not incoherence in what's being done, it's that you are attributing something to a large group of people who have a wide variety of beliefs. Overall, people are more pragmatic than you're giving them credit for, that's why rustc uses LLVM.
    
    LegionMammal978 - 3 months ago
    
    Is offering an alternative to LLVM not precisely one of the purposes of the rustc_codegen_cranelift backend [0]? It still doesn't have 100% feature parity, but I believe it's able to fully bootstrap the compiler at this point. Writing a rustc backend isn't trivial, but it isn't as impossible as you make it out to be.
    [0] https://github.com/rust-lang/rustc_codegen_cranelift
    
    throwaway17_17 - 3 months ago
    
    I’m not sure what I wrote to give the impression that Rust was unable to write a compiler, let alone implied it was impossible. Rust is certainly full featured enough to write a very well performing compiler. I find my comment more an indictment, and viewed uncharitably an accusation of hypocrisy, of the language org’s oversight that they are so heavily invested in LLVM (but if I was leveling such an accusation it would not be just because it’s a C++ project)
    My comment was focused on the fact that Rust is not using a Rust compiler and therefore is relying on deep and complex C++ infrastructure while working to supplant the same at the lowest levels of the computing stack.
    I was also commenting, up the thread, in a chain of comments about a perceived shortcoming of Rust’s implementation (i.e. its not being bootstrapped) and why some people view that as a negative.
    
    Dylan16807 - 3 months ago
    
    Doing a few things at a time is hardly an indictable offense. Self-compilation doesn't have to be anywhere near the start of the todo list. Relying on C++ infrastructure at compile time isn't a problem until you're trying to make all-purpose C++-free installs, and that's an entirely optional goal. The important part is having the runtime written in Rust.
    Pointing out a language still needs C++ at compile time is a reasonable critique of "supplanting C++", but it's not a reasonable critique of "wanting to be a foundational language of the computing stack". Rust is the latter. (And even then it's too early to worry about compilers.) (And Rust is making good progress on compilers anyway.)
    
    tialaramex - 3 months ago
    
    All of the front-end is in fact pure Rust; I know that because I am one of the huge number of authors. The backend, and thus the code generation and many optimisations of the sort AoCO is about, is LLVM.
    We absolutely know that if Rust didn't offer LLVM you'd see C++ people saying "Rust doesn't even have a proper optimiser". So now what you're asking for isn't just "a Rust backend" which exists as others have discussed, but a Rust alternative to LLVM, multiple targets, lots of high quality optimisations, etc. presumably as a drop-in or close approximation since today people have LLVM and are in production with the resulting code.