Paper Tape Is All You Need – Training a Transformer on a 1976 Minicomputer

github.com

142 points by rahen 4 days ago


rahen - a day ago

Thanks for reposting! I'm the author of ATTN-11. Happy to answer any questions about the fixed-point arithmetic, the PDP-11 hardware, or the training process.

kristopolous - a day ago

I like how the author's "modern" machine to connect to it is still 20 years old.

With a concave trackpoint, respect.

BTW, I nag Framework at every conference I go to that people want this shell and keyboard. It's been years. I think it's time to go through the effort to figure out how to do the production run of the case myself. Framework actually wants people to do things like this but you know, manufacturing is hard. Anyone wanna help?

arglebarnacle - a day ago

Fascinating. We hear that the leaps in AI have been made possible by orders of magnitude increases in compute and data availability, and of course that’s substantially true—but exactly how true? It’s a nice exercise in perspective to see how much or how little modern machine learning methods would have been capable of if you brought them by time machine to the 70’s and optimized them for that environment.

tcdent - a day ago

5.5 min to train on a PDP-11? You mean to tell me we could have been doing this all along???

mkagenius - a day ago

I had the exact same idea but for AI agent harnesses.

I even created an app to explain it - https://news.ycombinator.com/item?id=47381803 (deleted the app as it got no traction whatsoever)

The idea was that AI models like Opus 4.6 and Codex 5.4 have become so good at trying new ways to attack a problem that even just a Bash() tool is enough.

Continuing the idea, in fact even File() operations are enough.

Again continuing the same line of thought, even just a Tape is enough. Given enough time, Codex and Opus will achieve your target.

ashwinnair99 - a day ago

The fact that it is possible at all says more about how simple transformers actually are underneath than it does about the hardware.

kmoser - a day ago

> I don't have an actual paper tape reader, so the object code is directly deposited in memory through the console.

So, really, a Turing Machine is all you need?

AnimalMuppet - a day ago

Whoa. Dude has a running PDP-11/34 in 2026? Personally, I find that more impressive than the program.
