Banned C++ features in Chromium
chromium.googlesource.com
211 points by szmarczak 20 hours ago
Nothing particularly notable here. A lot of it seems to be 'We have something in-house designed for our use cases, use that instead of the standard lib equivalent'.
The rest looks very reasonable, like avoiding locale-hell.
Some of it is likely options that sand rough edges off of the standard lib, which is reasonable.
It's weird to me, as the former lead maintainer of this page for ten years or so, that this got submitted to both r/c++ and HN on the same day. Like... what's so exciting about it? Was there something on the page that caught someone's eye?
> We have something in-house designed for our use cases, use that instead of the standard lib equivalent
Yea, you encounter this a lot at companies with very old codebases. Don't use "chrono" because we have our own date/time types that were made before chrono even existed. Don't use standard library containers because we have our own containers that date back to before the STL was even stable.
I wonder how many of these (or the Google style guide rules) would make sense for a project starting today from a blank .cpp file. Probably not many of them.
The majority of things Chromium bans would still get banned in green-field use.
Some notable exceptions: we'd have allowed std::shared_ptr<T> and <chrono>. We might also have allowed <thread> and friends.
For the containers in particular this makes a lot of sense because the C++ stdlib containers are just not very good. Some of this is because C++ inherited types conceived as pedagogic tools. If you're teaching generic programming you might want both (singly and doubly linked) extrusive list types for your explanation. But for a C++ programmer asking "Which of these do I want?" the answer is almost always neither.
The specification over-specifies std::unordered_map so that no good modern hash table type could implement this specification, but then under-specifies std::deque so that the MSVC std::deque is basically useless in practice. It requires (really, in the standard) that std::vector<bool> is a bitset, even though that makes no sense. It sometimes feels as though nobody on WG21 has any idea what they're doing, which is wild.
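To make the std::vector&lt;bool&gt; point concrete, a minimal sketch (my illustration, not from the comment above): element access returns a proxy object rather than a bool&, because the standard requires the bitset-style packing, so generic code written against std::vector&lt;T&gt; quietly breaks for T = bool.

```cpp
#include <vector>

int main() {
    std::vector<int> vi{1, 2, 3};
    int& ri = vi[0];          // a real reference into the vector
    ri = 4;

    std::vector<bool> vb{true, false, true};
    // bool& rb = vb[0];      // does not compile: operator[] returns a proxy object,
    //                        // because vector<bool> is required to be a packed bitset
    auto bit = vb[0];         // proxy referring to a packed bit, not a bool&
    bit = false;              // writes through to the underlying bit
    return vb[0] ? 1 : 0;     // now false, so returns 0
}
```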
Linked lists used to be more efficient than dynamic arrays — 40 years ago, before processors had caches.
Intrusive linked lists still firmly have a place in modern code, for reasons other than performance. I don’t know many good reasons for extrusive linked lists, even before caches. There might be a few, but a dynamic array is (and has always been?) usually preferable to an extrusive list.
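For anyone unfamiliar with the distinction, a minimal sketch (invented names, not from any particular codebase): in an intrusive list the links are members of the object itself, so linking needs no extra allocation and unlinking is O(1) given only the object; an extrusive list (std::list-style) allocates a separate node per element.

```cpp
// Intrusive: the links live inside the object itself, so putting an object on a
// list costs no allocation, and removal is O(1) given just the object.
struct Connection {
    Connection* prev = nullptr;
    Connection* next = nullptr;   // the object *is* the list node
    int fd = -1;
};

void Unlink(Connection* c) {
    if (c->prev) c->prev->next = c->next;
    if (c->next) c->next->prev = c->prev;
    c->prev = c->next = nullptr;
}

// Extrusive (std::list-style): the container allocates a separate node per
// element, adding an allocation and an indirection for every insert.
template <typename T>
struct ListNode {
    ListNode* prev = nullptr;
    ListNode* next = nullptr;
    T value{};
};
```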
You're as usual spitting out nonsense. Containers are just fine; as a matter of fact they suit their purpose very well for 99% of the software out there. Do you ever get tired of bashing the exact same topic? I see you've been doing it relentlessly for years here on HN. How bitter must one be ...
> Don't use standard library containers because we have our own containers that date back to before the STL was even stable.
Flashback to last job. Wrote their own containers. Opaque.
You ask for an item from it, you get back a void pointer. It's a pointer to the item. You ask for the previous, or the next, and you give back that void pointer (because it then goes through the data to find that one again, to know from where you want the next or previous) and get a different void pointer. No random access. You had to start with the special function which would give you the first item and go from there.
They screwed up the end, or the beginning, depending on what you were doing, so you wouldn't get back a null pointer if there was no next or previous. You had to separately check for that.
It was called an iterator, but it wasn't an iterator; an iterator is something for iterating over containers, but it didn't have actual iterators either.
When I opened it up, inside there was an actual container. Templated, so you could choose the real inside container. The default was a QList (as in Qt 4.7.4). The million line codebase contained no other uses; it was always just the default. They took a QList and wrapped it inside a machine that only dealt in void pointers and stripped away almost all functionality, safety, and the ability to use the standard algorithms.
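As a rough illustration of the pattern being described (a hypothetical reconstruction with invented names; std::vector stands in for the QList that actually sat inside):

```cpp
#include <algorithm>
#include <vector>

class OpaqueContainer {
public:
    void Add(void* item) { items_.push_back(item); }

    // Not nullptr at the end; a special sentinel you must check for separately.
    static void* End() {
        static int end_marker;
        return &end_marker;
    }

    // The special function you had to start with.
    void* First() const { return items_.empty() ? End() : items_.front(); }

    // Takes the void* you were handed earlier, scans to find it again,
    // then hands back the next one. No random access, no type safety.
    void* Next(void* current) const {
        auto it = std::find(items_.begin(), items_.end(), current);
        if (it == items_.end() || ++it == items_.end()) return End();
        return *it;
    }

private:
    std::vector<void*> items_;
};
```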
I suspect but cannot prove that the person who did this was a heavy C programmer in the 1980s. I do not know but suspect that this person first encountered variable data type containers that did this sort of thing (a search for "generic linked list in C" gives some ideas, for example) and when they had to move on to C++, learned just enough C++ to recreate what they were used to. And then made it the fundamental container class in millions of lines of code.
Time to refactor the codebase so this tumor can be deleted?
The complete refactor, bringing it forwards from VS2008 to VS2022, and from a home-built, source-code edited Qt 4.7.4 to Qt 6.something, took about two years from start to finish.
> home-built, source-code edited Qt 4.7.4
That's scarier than the container craziness you mention
> I wonder how many of these (or the Google style guide rules) would make sense for a project starting today from a blank .cpp file. Probably not many of them.
The STL makes you pay for ABI stability whether you want it or not. For some use cases this doesn't matter, and there are some "proven" parts of the STL that need a lot of justification for substitution, yada yada std::vector and std::string.
But it's not uncommon to see unordered_map substituted with, say, sparsehash or robin_map, and in C++ libraries creating interfaces that allow for API-compatible alternatives to use of the STL is considered polite, if not necessarily ubiquitous.
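A sketch of what "leaving the door open" usually looks like in practice (names like mylib::HashMap and the MYLIB_USE_ROBIN_MAP switch are my inventions; tsl::robin_map is one real drop-in candidate):

```cpp
#include <string>
#include <unordered_map>
// #include <tsl/robin_map.h>   // the third-party drop-in, if it's a dependency

namespace mylib {

// Hypothetical alias: the codebase (or a library's public interface) names its
// hash map in one place, so swapping the implementation is a one-line change.
#if defined(MYLIB_USE_ROBIN_MAP)
template <typename K, typename V>
using HashMap = tsl::robin_map<K, V>;
#else
template <typename K, typename V>
using HashMap = std::unordered_map<K, V>;
#endif

}  // namespace mylib

int main() {
    mylib::HashMap<std::string, int> counts;
    counts["hello"] += 1;
    return counts["hello"] == 1 ? 0 : 1;
}
```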
> I wonder how many of these (or the Google style guide rules) would make sense for a project starting today from a blank .cpp file. Probably not many of them.
That also depends on how standalone the project is. Self-contained projects may be better off with depending on standard library and popular third-party libraries, but if a project integrates with other internal components, it's better to stick to internal libraries, as they likely have workarounds and special functionality specific to the company and its development workflow.
I'd argue that the optimum in the long run is to migrate to the standard version that everyone (e.g. new employees) knows, replacing the usually idiosyncratic (or even weird) in-house flavour.
I know, I know, the long run does not exist in today's investor-dominated scenarios. Code modernization is a fairytale. So far I have seen no exception in my limited set of experiences (but with various codebases going back to the early 90's with patchy upgrades here and there, looking like an old coat fixed many, many times with patches of various sizes, materials and colours).
[flagged]
Look, I even share your language preference but this is still unnecessary.
Are there really any good reasons to start a brand new project in C++ though? No one who can write modern C++ has any trouble with Rust in my experience, and all the other common options are even quicker to pick up. Creating bindings isn't hard anymore if your niche library doesn't have any yet. Syntactic preference I guess, but neither C++ nor Rust is generally considered an elegant or aesthetic choice.
Because "brand new" doesn't mean devoid of context. Within your domain, there will still be common libraries, interfaces, and tools.
C++ is very flexible, with a lot of very mature tooling and incredibly broad platform support. If you're writing some web server to run on the hardware of your choosing, then sure, that doesn't matter. But if you're writing something deeply integrated with platform/OS interfaces, or graphics, or needs to support some less common platforms, then C++ is often your only practical option for combining expressiveness and performance.
This is the sort of info I was trolling for, but what are those platforms and OSes? For targets LLVM doesn't handle, yeah, C++ (or C) makes sense. A sibling mentions Xcode, which makes sense. Graphics seems questionable; Vulkan support is fine. Windows support has seemed fine too; the same GUI has worked as what we wrote for Linux.
Dependencies. There are billions of lines of C++ out there that have been optimized and production hardened over decades that you might want to reuse. Rust lang interoperability with anything but C sucks in practice.
Unreal, Godot, CryEngine, DirectX, PlayStation, Switch, Xbox, CUDA, SYCL, LLVM, GCC, V8,...
Yes, there are plenty of domains where Rust has zero ecosystem.
Not to mention that Rust advocates keep forgetting their compiler is partially written in C++ (LLVM/GCC).
Maybe, maybe not. But either way it's just plain rude to charge into a C++ thread to drop a comment saying how the language sucks and you should use (insert other language) instead.
Rust becomes a significant burden if you need a GUI or hardware-accelerated graphics.
C++ isn't much better for GUI.
C++ was the GUI king during the 1990's, and none of the Rust toolkits is half as good as the surviving frameworks, like C++ Builder, Qt, wxWidgets, heck even MFC has better tooling.
In addition to other reasons given: If you have a team of C++ developers, let them use the language they know best.
Yes. If you're targeting Apple platforms and want to allow clients to use your product in Xcode (the common case) or even need Swift/ObjC interop yourself, using rust or anything not explicitly supported by Apple and Xcode is just too fiddly.
(Shrug) If I want Rust, I'll feed my C++ to an LLM and tell it to port it to Rust. Since we've been assured that Rust magically fixes everything that's wrong, bad, or unsafe about C++, this seems like a sound approach.
We probably aren't that far off actually. Even taking asm with no symbols back into Rust works well. You have ground truth: just have the agent repeat until the asm matches. Doesn't work on giant codebases, but on a few functions it absolutely does. And while the LLM may get the algorithm wrong, the type system does seem to help it generate useful Rust code as a starting place.
Yeah, but then just let the agent generate proper C++ code; unlike a human, it doesn't forget about best practices or how ownership is supposed to be handled.
Except the LLM forgets about that in Rust too; then the agent looks at the ownership errors from the previous iteration and fixes them.
You missed the other take: with AI-assisted coding, you can stay in C++, as it will make sure everything is coded with enough care.
Or why bother with Rust, when the LLM gets to generate C++ code with best practices.
While I like Rust, I think AI as the next abstraction step in programming has kind of taken its relevance away, when computer assisted programming is part of the workflow.
Yeah, good point, I don't know how I missed that possibility.
/s of course... for now, but not for long.
Strange. I wouldn’t trust the output of a coding agent and I would want stronger review of its output. If it passes a strict compiler that gives me more confidence than if it passed a lax one.
But sure, if you trust it to have written C++ to a higher standard than the experts, then go for it.
Somewhat notable is that `char8_t` is banned with very reasonable motivation that applies to most codebases:
> Use char and unprefixed character literals. Non-UTF-8 encodings are rare enough in Chromium that the value of distinguishing them at the type level is low, and char8_t* is not interconvertible with char* (what ~all Chromium, STL, and platform-specific APIs use), so using u8 prefixes would obligate us to insert casts everywhere. If you want to declare at a type level that a block of data is string-like and not an arbitrary binary blob, prefer std::string[_view] over char*.
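To show the friction they're avoiding, a small illustration (my example, not Chromium code): since C++20, u8 literals yield const char8_t*, which does not implicitly convert to const char*, so every call into a char-based API needs a cast.

```cpp
#include <cstdio>

void TakesChar(const char* s) { std::puts(s); }  // what ~all existing APIs look like

int main() {
    TakesChar("hello");                            // plain char literal: just works

#if defined(__cpp_char8_t)                         // C++20 and later
    const char8_t* u = u8"hello";                  // u8 literal: distinct char8_t type
    // TakesChar(u);                               // error: no conversion to const char*
    TakesChar(reinterpret_cast<const char*>(u));   // the cast you'd be inserting everywhere
#endif
    return 0;
}
```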
`char8_t` is probably one of the more baffling blunders of the standards committee.
There is no guarantee `char` is 8 bits, nor that it represents text, or even a particular encoding.
If your codebase has those guarantees, go ahead and use it.
> there is no guarantee `char` is 8 bits, nor that it represents text, or even a particular encoding.
True, but sizeof(char) is defined to be 1. In section 7.6.2.5:
"The result of sizeof applied to any of the narrow character types is 1"
In fact, char and associated types are the only types in the standard where the size is not implementation-defined.
So the only way that a C++ implementation can conform to the standard and have a char type that is not 8 bits is if the size of a byte is not 8 bits. There are historical systems that meet that constraint but no modern systems that I am aware of.
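If a codebase wants to rely on the common case rather than the letter of the standard, the usual move is to state the assumption once, up front (a minimal sketch):

```cpp
#include <climits>

// The standard only guarantees CHAR_BIT >= 8 and sizeof(char) == 1. This pins
// the codebase to the 8-bit-byte platforms it actually intends to support.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
static_assert(sizeof(char) == 1, "always true; stated here only for emphasis");
```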
[1] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n49...
Don't some modern DSPs still have 32-bit as the minimum addressable memory? Or is it a thing of the past?
char8_t also isn't guaranteed to be 8 bits, because sizeof(char) == 1 and sizeof(char8_t) >= 1. On a platform where char is 16 bits, char8_t will be 16 bits as well.
The C++ standard explicitly says that it has the same size, signedness and alignment as unsigned char, but it's a distinct type. So it's pretty useless, and badly named.
wtf
It is pretty consistent. It is part of the C Standard and a feature meant to make string handling better, it would be crazy if it wasn't a complete clusterfuck.
There's no guarantee char8_t is 8 bits either, it's only guaranteed to be at least 8 bits.
> There's no guarantee char8_t is 8 bits either, it's only guaranteed to be at least 8 bits.
Have you read the standard? It says: "The result of sizeof applied to any of the narrow character types is 1." Here, "narrow character types" means char and char8_t. So technically they aren't guaranteed to be 8 bits, but they are guaranteed to be one byte.
Yes, but the byte is not guaranteed to be 8 bits, because on many ancient computers it wasn't.
The poster you replied to has read the standard correctly.
What platforms have char8_t as more than 8 bits?
Well, platforms with CHAR_BIT != 8. In C and C++, char (and therefore a byte) is at least 8 bits, not exactly 8 bits. POSIX does force CHAR_BIT == 8. I think the only place you see otherwise is embedded, and then only on some DSPs or ASIC-like devices. So in practice most code will break on those platforms, and they are very rare, but they are still technically supported by the C and C++ standards. Similar to how C still supported non-two's-complement architectures until C23.
That's where the standard should come in and say something like "starting with C++26, char is always 1 byte and signed, and std::string is always UTF-8." Done, Unicode in C++ fixed.
But instead we get this mess. I guess it's because there's too much Microsoft influence in the standard, and they're the only ones who still don't have UTF-8 everywhere in Windows.
std::string is not UTF-8 and can't be made UTF-8. It's encoding agnostic, its API is in terms of bytes not codepoints.
Of course it can be made UTF-8. Just add a codepoints_size() method and other helpers.
But it isn't really needed anyway: I'm using it for UTF-8 (with helper functions for the 1% of cases where I need codepoints) and it works fine. But with C++20 it's starting to get annoying because I have to reinterpret_cast to the useless u8 versions.
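As an example of the kind of helper meant here (my sketch, assuming well-formed UTF-8): codepoint counting can sit next to std::string without the string type itself knowing anything about encodings.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode codepoints in a UTF-8 encoded std::string by skipping
// continuation bytes (those of the form 10xxxxxx). Assumes well-formed UTF-8.
std::size_t CodepointCount(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;   // count every non-continuation byte
    }
    return n;
}

int main() {
    std::string s = "caf\xC3\xA9";     // "café": 5 bytes, 4 codepoints
    return (s.size() == 5 && CodepointCount(s) == 4) ? 0 : 1;
}
```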
Related: in C at least (C++ standards are tl;dr), type names like `int32_t` are not required to exist. Most uses, in portable code, should be `int_least32_t`, which is required.
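Concretely (shown in C++ spelling; the same distinction holds in C):

```cpp
#include <cstdint>

// std::int32_t is optional: it is only declared on platforms that have an
// exact-width 32-bit type. std::int_least32_t must exist everywhere.
std::int_least32_t counter = 0;   // at least 32 bits, guaranteed available
// std::int32_t exact = 0;        // may simply not be declared on exotic hardware
```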
How many non-8-bit-char platforms are there with char8_t support, and how many do we expect in the future?
Mostly DSPs
Is there a single esoteric DSP in active use that supports C++20? This is the umpteenth time I've seen DSP's brought up in casual conversations about C/C++ standards, so I did a little digging:
Texas Instruments' compiler seems to be celebrating C++14 support: https://www.ti.com/tool/C6000-CGT
CrossCore Embedded Studio apparently supports C++11 if you pass a switch in requesting it, though this FAQ answer suggests the underlying standard library is still C++03: https://ez.analog.com/dsp/software-and-development-tools/cce...
Everything I've found CodeWarrior related suggests that it is C++03-only: https://community.nxp.com/pwmxy87654/attachments/pwmxy87654/...
Aside from that, from what I can tell, those esoteric architectures are being phased out in favor of running DSP workloads on Cortex-M, which is just ARM.
I'd love it if someone who was more familiar with DSP workloads would chime in, but it really does seem that trying to be the language for all possible and potential architectures might not be the right play for C++ in 202x.
Besides, it's not like those old standards or compilers are going anywhere.
Green Hills Software's compiler supports more recent versions of C++ (it uses the EDG frontend) and targets some DSPs.
Back when I worked in the embedded space, chips like ZSP were around that used 16-bit bytes. I am twenty years out of date on that space though.
Cadence DSPs have a C++17-compatible compiler and will be C++20 soon, as will the new CEVA cores (both are clang based). TI C7x is still C++14 (C6000 is an ancient core, yet still got C++14 support as you mentioned). AFAIR the Cadence ASIP generator will give you a C++17 toolchain and C++20 is on the roadmap, but not 100% sure.
But for those devices you use a limited subset of language features, and you would be better off not linking the C++ stdlib, or even the C stdlib, at all (so junior developers don't have space for doing stupid things ;))
> but it really does seem that trying to be the language for all possible and potential architectures might not be the right play for C++ in 202x.
Portability was always a selling point of C++. I'd personally advise those who find it uncomfortable to choose a different PL, perhaps Rust.
> Portability was always a selling point of C++.
Judging by the lack of modern C++ in these crufty embedded compilers, maybe modern C++ is throwing too much good effort after bad. C++03 isn't going away, and it's not like these compilers always stuck to the standard anyway in terms of runtime type information, exceptions, and full template support.
Besides, I would argue that the selling point of C++ wasn't portability per se, but the fact that it was largely compatible with existing C codebases. It was embrace, extend, extinguish in language form.