Four Column ASCII (2017)
garbagecollected.org
189 points by tempodox 2 days ago
For me it was interesting that all digits in ASCII start with 0x3, e.g. 0x30 is '0', 0x31 is '1', ..., 0x39 is '9'. I thought it was accidental, but it was actually intentional. It made it possible to build simple counting/accounting machines with minimal circuit logic using BCD (Binary Coded Decimal). That was a wow for me ;)
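The low nibble of an ASCII digit is its BCD value, so hardware only has to add or strip the 0011 prefix. A quick Python sketch of that (my own illustration, not from the original comment):

    # ASCII digits are 0x30..0x39: the low nibble is exactly the BCD value.
    for d in range(10):
        assert chr(0x30 | d) == str(d)      # prepend the 0011 prefix to get the character
        assert ord(str(d)) & 0x0F == d      # mask the prefix off to get the value back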
And this is exactly why I find the usual 16x8 at least as insightful as this proposed 32x4 (well, 4x32, but that's just a rotation).
This is by design, so that case conversion and folding are just bit operations.
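For example, bit 5 (0x20) is the only difference between the two cases; a small Python sketch of my own to illustrate:

    # Upper- and lowercase letters differ only in bit 5 (0x20).
    assert ord('a') - ord('A') == 0x20
    assert chr(ord('a') & ~0x20) == 'A'    # clear bit 5 to upcase
    assert chr(ord('A') | 0x20) == 'a'     # set bit 5 to downcase
    assert chr(ord('q') ^ 0x20) == 'Q'     # toggle bit 5 to swap case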
The idea that SOH/1 is "Ctrl-A" or ESC/27 is "Ctrl-[" is not part of ASCII; that idea comes from the way terminals provided access to the control characters: a Ctrl key that simply masked out a few bits.
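A rough Python sketch of that masking (my own illustration; real terminals did it in hardware):

    # Ctrl on old terminals effectively kept only the low 5 bits of the key's code.
    ctrl = lambda c: ord(c) & 0x1F
    assert ctrl('A') == 1     # SOH, i.e. Ctrl-A
    assert ctrl('[') == 27    # ESC, i.e. Ctrl-[
    assert ctrl('M') == 13    # CR,  i.e. Ctrl-M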
I guess it's an age thing, but I thought this was really basic CS knowledge. But I can see why this may be much less relevant nowadays.
Yes, the diagram just shows the ASCII table arranged like the old teletype 6-bit code (and the 5-bit code before it), with the two most significant bits spread over 4 columns to show the extension that happened while going 5→6→7 bits. It makes obvious how simple the bit operations were on the very limited hardware of 70–100 years ago.
(I assume everybody knows that on mechanical typewriters and teletypes the "shift" key physically shifted the carriage or type basket, so that a different glyph would be printed when the typebar struck.)
For whatever reason, there are extraordinarily few references that I come back to over and over, across the years and decades. This is one of them.
Tangentially related: there is much insight about Unix idioms to be gained from understanding the key layout of the terminal Bill Joy used to create vi.
Some of this elegance is discussed from a programmatic point of view.
I still find it weird that they didn't put A, B, ... right after the digits; that would make binary-to-hexadecimal conversion more efficient.
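That gap is what forces the small fixup in every hand-rolled hex parser; a Python sketch (my own, not from the thread):

    def hex_digit(c):
        # '0'..'9' sit at 0x30..0x39, but 'A' starts at 0x41 rather than 0x3A,
        # so letters need an extra adjustment of 7 (0x41 - 0x3A).
        v = ord(c.upper()) - 0x30
        return v if v < 10 else v - 7

    assert [hex_digit(c) for c in "09af"] == [0, 9, 10, 15]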
Going off the timelines on Wikipedia, the first version of ASCII was published (1963) before the 0-9,A-F hex notation became widely used (>=1966):
- https://en.wikipedia.org/wiki/ASCII#History
- https://en.wikipedia.org/wiki/Hexadecimal#Cultural_history
The alphanumeric code points are well placed hexadecimally speaking, though. I don't imagine that was just an accident. For example, they could've put '0' at 050/0x28, but they put it at 060/0x30. That suggests to me that they did have hexadecimal in mind.
It's a binary consideration if you think of it rather than hexadecimal.
If you have to prominently represent 10 things in binary, then it's neat to allocate a slot of size 16 and leave the remaining 6 entries unused. Which is to say, it's neat to proceed from all zeroes:
x x x x 0 0 0 0
x x x x 0 0 0 1
x x x x 0 0 1 0
....
x x x x 1 1 1 1
It's more of a cause for hexadecimal notation than an effect of it.

Currently 'A' is 0x41 and 0101, 'a' is 0x61 and 0141, and '0' is 0x30 and 060. These are fairly simple to remember for converting between alphanumerics and their code points. Seems more advantageous, especially if you might reasonably be looking at punch cards.
I'm not sure if our convention for hexadecimal notation is old enough to have been a consideration.
EDIT: it would need to predate the 6-bit teletype codes that preceded ASCII.
If Ctrl sets bit 6 to 0, and Shift sets bit 5 to 1, the logical extension is to use Ctrl and Shift together to set the top bits to 01. Surely there must be a system somewhere that maps Ctrl-Shift-A to !, Ctrl-Shift-B to " etc.
It's more that shift flips that bit. Also I'd call them bit 0 and 1 and not 5 and 6 as 'normally' you count bits from the right (least significant to most significant). But there are lots of differences for 'normal' of course ('middle endian' :-P )
Also easy to see why Ctrl-D works for exiting sessions.
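Both of those fall out of the column view; a quick Python check (my own sketch, not from the thread):

    # The top two bits pick the column: 00 controls, 01 punctuation/digits,
    # 10 uppercase, 11 lowercase.
    def with_column(c, col):
        return (ord(c) & 0x1F) | (col << 5)

    assert with_column('A', 0b01) == ord('!')   # the hypothetical Ctrl-Shift-A
    assert with_column('D', 0b00) == 4          # Ctrl-D is EOT (end of transmission)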
This is also why the Teletype layout has parentheses on 8 and 9, unlike modern keyboards that have them on 9 and 0 (a layout popularised by the IBM Selectric). The original Apple IIs had this same layout, with a "bell" on top of the G.
Modern keyboards = some modern keyboards. In the Nordic countries, modern keyboards have parentheses on 8 and 9.
What happened to this block and the keyboard key arrangement?
ESC [ { 11011
FS \ | 11100
GS ] } 11101
Also curious why the keys open and close braces, but ... the single and double curly quotes don't open and close, but are stacked. Seems nuts every time I type Option-{ and Option-Shift-{ …

> What happened to this block and the keyboard key arrangement?
There's also these:
| ASCII | US keyboard |
|------------+-------------|
| 041/0x21 ! | 1 ! |
| 042/0x22 " | 2 @ |
| 043/0x23 # | 3 # |
| 044/0x24 $ | 4 $ |
| 045/0x25 % | 5 % |
| | 6 ^ |
| 046/0x26 & | 7 & |

You're no longer talking about ASCII. ASCII has only a double quote, apostrophe (which doubles as a single quote) and backtick/backquote.
Note on your Mac that the Option-{ and Option-}, with and without Shift, produce quotes which are all distinct from the characters produced by your '/" key! They are Unicode characters not in ASCII.
In the ASCII standard (1977 version here: https://nvlpubs.nist.gov/nistpubs/Legacy/FIPS/fipspub1-2-197...) the example table shows a glyph for the double quote which is vertical: it is neither an opening nor closing quote.
The apostrophe is shown as a closing quote, by slanting to the right; approximately a mirror image of the backtick. So it looks as though those two are intended to form an opening and closing pair. Except, in many terminal fonts, the apostrophe is just a vertical tick, like half of a double quote.
The ' being vertical helps programming language '...' literals not look weird.
Related. Others?
Four Column ASCII (2017) - https://news.ycombinator.com/item?id=21073463 - Sept 2019 (40 comments)
Four Column ASCII - https://news.ycombinator.com/item?id=13539552 - Feb 2017 (68 comments)
Where does this character set come from? It looks different in xterm.
for x in range(0x0,0x20): print(chr(x),end=" ")
What are you trying to achieve? None of those characters are printable, and they're definitely not going to show up on the web.
for x in range(0x0,0x20): print(f'({chr(x)})', end =' ')
[output: codes 0–31 each printed in parentheses; most render as blanks, line breaks, or replacement glyphs depending on the terminal]

Just asking why they have different icons in different environments? Maybe it is UTF-8 vs ISO-8859?
They shouldn't show as visual representations, but some "ASCII" charts show the IBM PC character set instead of the ASCII set. IIRC, the code points up to 0xFF are the same in Unicode and ISO-8859-1; the difference is that UTF-8 encodes everything above 0x7F as multi-byte sequences.
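If the goal is just to see the C0 controls, one option (my own suggestion, assuming Python 3) is to print the Unicode Control Pictures block instead of the raw codes:

    # U+2400..U+241F are dedicated visible stand-ins for the C0 control characters.
    for x in range(0x20):
        print(f"{x:2} {chr(0x2400 + x)}", end="  ")
    print()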
Opera AI solved the problem:
If you want to use the symbols for Mars and Venus, for example, they are not in range(0, 0x20). They are in the Miscellaneous Symbols block.
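For instance (Python):

    print(chr(0x2642), chr(0x2640))   # MALE SIGN and FEMALE SIGN, U+2642 and U+2640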
On early bit-paired keyboards with parallel 7-bit outputs, possibly going back to mechanical teletypes, I think holding Control literally tied the upper two bits to zero. (citation needed)
Also explains why there is no difference between Ctrl-x and Ctrl-Shift-x.
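Easy to check with a one-line Python sketch (my own):

    # Masking down to 5 bits erases the case bit, so Ctrl-x and Ctrl-Shift-x collide.
    assert ord('x') & 0x1F == ord('X') & 0x1F == 24    # both map to CAN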