C++ proposal: There are exactly 8 bits in a byte

open-std.org

151 points by Twirrim 5 hours ago


favorited - 3 hours ago

Previously, in JF's "Can we acknowledge that every real computer works this way?" series: "Signed Integers are Two’s Complement" <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p09...>

pjdesno - 3 hours ago

During an internship in 1986 I wrote C code for a machine with 10-bit bytes, the BBN C/70. It was a horrible experience, and the existence of the machine in the first place was due to a cosmic accident of the negative kind.

WalterBright - 2 hours ago

D made a great leap forward with the following:

1. bytes are 8 bits

2. shorts are 16 bits

3. ints are 32 bits

4. longs are 64 bits

5. arithmetic is 2's complement

6. IEEE floating point

and a big chunk of wasted time trying to abstract these away and getting it wrong anyway was saved. Millions of people cried out in relief!

Oh, and Unicode was the character set. Not EBCDIC, RADIX-50, etc.
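
For comparison, a C++ translation unit can pin down most of the same assumptions at compile time. This is a minimal sketch, not anything taken from D or from the proposal; `is_twos_complement` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// The exact-width typedefs only exist on platforms that provide these widths
// with no padding bits, so merely naming them is already a portability check.
static_assert(std::numeric_limits<std::uint8_t>::digits == 8, "8-bit bytes");
static_assert(std::numeric_limits<std::int16_t>::digits == 15, "16-bit shorts");
static_assert(std::numeric_limits<std::int32_t>::digits == 31, "32-bit ints");
static_assert(std::numeric_limits<std::int64_t>::digits == 63, "64-bit longs");

// Hypothetical helper: in two's complement, -1 has every bit set, so the two
// low bits are both 1. Ones' complement would give 2, sign-magnitude 1.
constexpr bool is_twos_complement() { return (-1 & 3) == 3; }
static_assert(is_twos_complement(), "two's complement arithmetic");

// IEEE 754 (IEC 60559) floating point.
static_assert(std::numeric_limits<float>::is_iec559, "IEEE float");
static_assert(std::numeric_limits<double>::is_iec559, "IEEE double");
```

If any assumption fails, the build stops instead of the program misbehaving at run time.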

MaulingMonkey - 3 hours ago

Some people are still dealing with DSPs.

https://thephd.dev/conformance-should-mean-something-fputc-a...

Me? I just dabble with documenting an unimplemented "50% more bits per byte than the competition!" 12-bit fantasy console of my own invention - replete with inventions such as "UTF-12" - for shits and giggles.

harry8 - 3 hours ago

Is C++ capable of deprecating or simplifying anything?

Honest question, haven't followed closely. rand() is broken, I'm told unfixable, and last I heard it still wasn't deprecated.

Is this proposal a test? "Can we even drop support for a solution to a problem literally nobody has?"

jfbastien - 33 minutes ago

Hi! Thanks for the interest in my proposal. I have an updated draft based on feedback I've received so far: https://isocpp.org/files/papers/D3477R1.html

TrueDuality - 4 hours ago

This is both uncontroversial and incredibly spicy. I love it.

kazinator - 39 minutes ago

What will be the benefit?

- CHAR_BIT cannot go away; reams of code references it.

- You still need the constant 8. It's better if it has a name.

- Neither the C nor C++ standard will be simplified if CHAR_BIT is declared to be 8. Only a few passages will change. Just, certain possible implementations will be rendered nonconforming.

- There are specialized platforms with C compilers, such as DSP chips, that are not byte addressable machines. They are in current use; they are not museum pieces.

kreco - 3 hours ago

I'm totally fine with enforcing that int8_t == char == 8 bits; however, I'm not sure about spreading the misconception that a byte is 8 bits. A byte with 8 bits is called an octet.

At the same time, a `byte` is already an "alias" for `char` since C++17 anyway[1].

[1] https://en.cppreference.com/w/cpp/types/byte
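
A small sketch of how `std::byte` relates to `char` in C++17. Strictly speaking it is a distinct scoped enumeration whose underlying type is `unsigned char`, rather than a type alias. `low_nibble` is a hypothetical helper added only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// std::byte is an enum class, not a typedef of any character type.
static_assert(std::is_enum<std::byte>::value, "std::byte is an enumeration");
static_assert(std::is_same<std::underlying_type<std::byte>::type,
                           unsigned char>::value,
              "its underlying type is unsigned char");
static_assert(!std::is_same<std::byte, char>::value, "distinct from char");
static_assert(!std::is_same<std::byte, unsigned char>::value,
              "distinct from unsigned char");
static_assert(sizeof(std::byte) == 1, "same size as char");

// Hypothetical helper: std::byte supports bitwise operators but no arithmetic.
constexpr std::byte low_nibble(std::byte b) { return b & std::byte{0x0F}; }
```

Converting back to an integer needs an explicit `std::to_integer<int>(b)`, which is the main practical difference from using `unsigned char` directly.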

kazinator - 28 minutes ago

There are DSP chips that have C compilers and do not have 8-bit bytes; the smallest addressable unit is 16 bits (or larger).

Less than a decade ago I worked with something like that: the TeakLite III DSP from CEVA.
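
On such targets `sizeof(char)` is still 1, because `sizeof` counts in units of `char`; only `CHAR_BIT` changes. A minimal sketch of width-agnostic code, where `storage_bits` is a hypothetical helper name:

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: number of bits of storage occupied by T, including any
// padding bits. Correct whether CHAR_BIT is 8, 16, or something else.
template <typename T>
constexpr std::size_t storage_bits() {
    return sizeof(T) * CHAR_BIT;
}

// sizeof(char) is 1 by definition, whatever the width of a char.
static_assert(sizeof(char) == 1, "char is the unit of sizeof");
static_assert(storage_bits<char>() == CHAR_BIT, "a char is CHAR_BIT bits");
```

With a 16-bit `char`, `storage_bits<char>()` would be 16 and a 32-bit integer would have `sizeof` equal to 2.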

bawolff - an hour ago

> We can find vestigial support, for example GCC dropped dsp16xx in 2004, and 1750a in 2002.

Honestly kind of surprised it was relevant as late as 2004. I thought the era of non-8-bit bytes was like the 1970s or earlier.

bobmcnamara - 3 hours ago

I just put static_assert(CHAR_BIT == 8); in one place and move on. It hasn't fired since back when it was the #if equivalent.
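
A sketch of both forms of that guard, using `CHAR_BIT` from `<climits>`. The helper `top_bit` is hypothetical and only there to show code that silently depends on the assumption.

```cpp
#include <cassert>
#include <climits>

// Older, preprocessor-based form of the check.
#if CHAR_BIT != 8
#error "this code assumes 8-bit bytes"
#endif

// Modern form: one line, in one place.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

// Hypothetical helper that relies on the guard above: with 8-bit bytes the
// most significant bit of an unsigned char is bit 7.
inline unsigned top_bit(unsigned char c) { return c >> 7; }
```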

JamesStuff - 4 hours ago

Not sure about that, seems pretty controversial to me. Are we forgetting about the UNIVACs?

donatj - 3 hours ago

So please do excuse my ignorance, but is there a "logic" related reason other than hardware cost limitations ala "8 was cheaper than 10 for the same number of memory addresses" that bytes are 8 bits instead of 10? Genuinely curious, as a high-level dev of twenty years, I don't know why 8 was selected.

To my naive eye, it seems like moving to 10 bits per byte would be both logical and make learning the trade just a little bit easier?

pabs3 - 2 hours ago

Hmm, I wonder if any modern languages can work on computers that use trits instead of bits.

https://en.wikipedia.org/wiki/Ternary_computer

IAmLiterallyAB - 44 minutes ago

I like the diversity of hardware and strange machines. So this saddens me. But I'm in the minority I think.

throwaway889900 - 4 hours ago

But how many bytes are there in a word?

aj7 - 3 hours ago

And then we lose communication with Europa Clipper.

lowbloodsugar - 19 minutes ago

  #define SCHAR_MIN -127
  #define SCHAR_MAX 128
Is this two typos or am I missing the joke?
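
For reference, this is what the limits look like on an implementation with an 8-bit, two's complement `signed char`. This sketch only states those values; it does not settle whether the quoted lines are a typo or a joke.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Assumption: 8-bit bytes and two's complement representation.
static_assert(CHAR_BIT == 8, "assumes 8-bit bytes");

// An 8-bit two's complement value ranges from -128 to 127.
static_assert(SCHAR_MIN == -128, "minimum of signed char");
static_assert(SCHAR_MAX == 127, "maximum of signed char");

// The asymmetry is the two's complement signature: one more negative value.
static_assert(SCHAR_MIN == -SCHAR_MAX - 1, "two's complement range");
```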

masfuerte - 3 hours ago

This is entertaining and probably a good idea but the justification is very abstract.

Specifically, has there ever been a C++ compiler on a system where bytes weren't 8 bits? If so, when was it last updated?

whatsakandr - 2 hours ago

Honestly, I thought this might be an Onion headline. But then I stopped to think about it.

DowsingSpoon - 3 hours ago

As a person who designed and built a hobby CPU with a sixteen-bit byte, I’m not sure how I feel about this proposal.

MrLeap - an hour ago

How many bytes is a devour?

gafferongames - 3 hours ago

Amazing stuff guys. Bravo.

starik36 - 3 hours ago

There are FOUR bits.

Jean-Luc Picard

Quekid5 - 4 hours ago

JF Bastien is a legend for this, haha.

I would be amazed if there's any even remotely relevant code that deals meaningfully with CHAR_BIT != 8 these days.

(... and yes, it's about time.)

hexo - 3 hours ago

Why? Pls no. We've been told (in school!) that a byte is a byte. It's only sometimes 8 bits long (ok, most of the time these days). Do not destroy the last bits of fun. Is network order little endian too?
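
For the record, network byte order is big-endian: the most significant octet goes on the wire first. A minimal sketch that serializes a 32-bit value that way regardless of host endianness; the function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: split a 32-bit value into four octets, most
// significant first. Shifts work on values, so host endianness is irrelevant.
inline std::array<std::uint8_t, 4> to_network_order(std::uint32_t v) {
    return {{static_cast<std::uint8_t>((v >> 24) & 0xFFu),
             static_cast<std::uint8_t>((v >> 16) & 0xFFu),
             static_cast<std::uint8_t>((v >> 8) & 0xFFu),
             static_cast<std::uint8_t>(v & 0xFFu)}};
}

// Hypothetical inverse: rebuild the value from four big-endian octets.
inline std::uint32_t from_network_order(const std::array<std::uint8_t, 4>& b) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) << 8) |
           static_cast<std::uint32_t>(b[3]);
}
```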

scosman - 3 hours ago

Bold leadership

adamnemecek - 3 hours ago

Incredible things are happening in the C++ community.

cyberax - 4 hours ago

But think of ternary computers!

bmitc - 3 hours ago

Ignoring this C++ proposal, especially because C and C++ seem like a complete nightmare when it comes to this stuff, I've almost gotten into the habit of treating a "byte" as an abstract concept. Many serial protocols define their own "byte", and it might be 7, 8, 9, 11, 12, or however many bits long.
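
One way such protocol "bytes" end up on an octet-based machine is by bit packing. The sketch below is an illustration under stated assumptions, not any particular protocol: symbols are packed most significant bit first, the last octet is zero-padded, and `pack_symbols` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack symbols of `width` bits each (1..16) into 8-bit octets, MSB first.
// Any unused low bits of the final octet are left as zero.
inline std::vector<std::uint8_t> pack_symbols(
    const std::vector<std::uint16_t>& symbols, unsigned width) {
    assert(width >= 1 && width <= 16);
    const std::uint32_t mask = (std::uint32_t{1} << width) - 1u;
    std::vector<std::uint8_t> out;
    std::uint32_t acc = 0;  // bit accumulator
    unsigned pending = 0;   // valid bits in acc, always < 8 between symbols
    for (std::uint16_t s : symbols) {
        acc = (acc << width) | (s & mask);
        pending += width;
        while (pending >= 8) {  // flush every complete octet
            pending -= 8;
            out.push_back(static_cast<std::uint8_t>((acc >> pending) & 0xFFu));
        }
        acc &= (std::uint32_t{1} << pending) - 1u;  // keep only unflushed bits
    }
    if (pending > 0) {  // left-align the leftover bits in a final octet
        out.push_back(static_cast<std::uint8_t>((acc << (8 - pending)) & 0xFFu));
    }
    return out;
}
```

Two 9-bit symbols take 18 bits, so they occupy three octets with six padding bits at the end.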

AlienRobot - 3 hours ago

I wish I knew what a 9 bit byte means.

One fun fact I found the other day: ASCII is 7 bits, but when it was used with punch cards there was an 8th bit to make sure you didn't punch the wrong number of holes. https://rabbit.eng.miami.edu/info/ascii.html
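
A sketch of how an eighth parity bit can ride along with a 7-bit ASCII code. Even parity is assumed here purely for illustration (the linked page may describe a different convention), and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: put an even-parity bit in bit 7 of a 7-bit ASCII code,
// so that the resulting octet always has an even number of 1 bits.
inline std::uint8_t with_even_parity(std::uint8_t ascii7) {
    assert(ascii7 < 0x80);
    unsigned ones = 0;
    for (unsigned bit = 0; bit < 7; ++bit) {
        ones += (ascii7 >> bit) & 1u;
    }
    return static_cast<std::uint8_t>(ascii7 | ((ones & 1u) << 7));
}

// Hypothetical helper: an octet passes the check if its count of 1 bits is even.
inline bool parity_ok(std::uint8_t octet) {
    unsigned ones = 0;
    for (unsigned bit = 0; bit < 8; ++bit) {
        ones += (octet >> bit) & 1u;
    }
    return (ones & 1u) == 0;
}
```

A single flipped bit changes the count of 1 bits from even to odd, which is what lets the receiver notice the error.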

CephalopodMD - 3 hours ago

Obviously

38 - 3 hours ago

The fact that this isn't already done after all these years is one of the reasons why I no longer use C/C++. It takes years and years to get anything done, even the tiniest, most obvious, drama-free changes. Contrast with Go, which has had this since version 1, in 2012:

https://pkg.go.dev/builtin@go1#byte

Iwan-Zotow - 4 hours ago

In a char, not in a byte. Byte != char
