OpenAI unveils its first custom chip, built by Broadcom
techcrunch.com
800 points by jamdesk a day ago
Announcement: https://openai.com/index/openai-broadcom-jalapeno-inference-...
https://decrypt.co/371971/openai-broadcom-jalapeno-first-cus...
https://www.cnn.com/2026/06/24/tech/openai-broadcom-jalapeno...
> Developed from design to production in nine months, accelerated by OpenAI’s models
> the use of OpenAI models to accelerate parts of the design and optimization process.
I wish there was more about this. As is, I kind of have to assume that this is just meaningless marketing, like saying development was accelerated by Microsoft Office or their 5k LG Ultrafine 40-inch monitors. Like, if this was as big a deal as it kind of vaguely implies, they would be making a bigger deal of it, right?

Chip CEO here. It really depends on what "design" or "production" means. Does "design" mean that the design was complete? Does "production" mean the beginning of production, i.e. tapeout? If measuring from RTL-freeze to tapeout, this is a fairly typical (even somewhat unimpressive) timeline (accounting for some unexpected issues) for a large, complex 3nm chip. If measuring from concept (no RTL at all, block diagram of architecture) to tapeout, this is an amazing timeline. The truth is probably somewhere in between. A more concrete statement would use actual technical milestones and gates.

Not a chip CEO, but I read this article and thought that they're working on some kind of application-specific chip only for serving models. Similar to how an FPGA can optimize certain tasks. Given constant weights / biases of a Transformer / DNN you could use pipelining to feed forward calculations through the array one layer at a time. For DNNs with thousands of layers you might see 1:1 speedup per layer channel. I doubt they would undergo this process for marginal gains.

With a striking lack of numbers, I'm not confident. In my experience, everything underspecified in a marketing release is unflattering. They're also not a chip design company, but they're probably trying to keep up in the eyes of investors. As the article mentions, several of their competitors are chip designers and already have working production inference chips.
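For anyone trying to picture the layer-pipelining idea upthread (fixed weights, activations fed forward one layer at a time), here is a toy software simulation. It is only a sketch under loud assumptions: one pipeline stage per layer, one clock tick per layer, ReLU activations, and at least one layer. The names `run_sequential` and `run_pipelined` are made up for illustration, and nothing here reflects how the announced chip actually works.

```python
# Toy model of a layer pipeline: each stage holds one layer's fixed weights,
# and on every clock tick each stage works on a different in-flight input.
# Assumptions (not from the article): one stage per layer, one tick per layer.

def matvec(weights, x):
    """Multiply a fixed weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def relu(x):
    return [max(0.0, v) for v in x]

def run_sequential(layers, inputs):
    """Reference: push each input through every layer before starting the next."""
    outputs = []
    for x in inputs:
        for weights in layers:
            x = relu(matvec(weights, x))
        outputs.append(x)
    return outputs

def run_pipelined(layers, inputs):
    """Simulate one stage per layer. Returns (outputs, ticks elapsed)."""
    n_stages = len(layers)
    regs = [None] * n_stages          # regs[i] = output register of stage i
    pending = list(inputs)
    outputs = []
    ticks = 0
    while len(outputs) < len(inputs):
        # Update back to front so each stage reads the value its upstream
        # neighbour produced on the previous tick, not the current one.
        for i in reversed(range(n_stages)):
            if i > 0:
                src = regs[i - 1]
            else:
                src = pending.pop(0) if pending else None
            regs[i] = relu(matvec(layers[i], src)) if src is not None else None
        ticks += 1
        if regs[-1] is not None:
            outputs.append(regs[-1])
    return outputs, ticks

# Hypothetical demo data: three 2x2 layers, four input vectors.
demo_layers = [
    [[1.0, -1.0], [0.5, 2.0]],
    [[2.0, 0.0], [-1.0, 1.0]],
    [[1.0, 1.0], [0.0, -3.0]],
]
demo_inputs = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0], [2.0, 2.0]]
demo_outputs, demo_ticks = run_pipelined(demo_layers, demo_inputs)
```

In this model the pipeline takes `len(layers) + len(inputs) - 1` ticks, versus `len(layers) * len(inputs)` layer-steps for the sequential loop, so the gain is in throughput once the pipeline is full. The latency of any single input is still one tick per layer.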
When you have a few billion dollars you can hire chip people and partner with a chip company. That's not to say I expect they'll ship something competitive with Google's custom AI hardware on the first go, since Google has been at it for quite a while, but there are very few technical problems large sums of money won't solve.

Yeah, I'm not sure how competitive it is without any specs. Just from it being "inference only", that puts it on the same level as Google's 2015 TPUv1.

I don't understand what the second paragraph is saying.

In very crude terms, AFAICT, if you have a bunch of matrix multiplications, but one of the matrices (the one with model weights) doesn't change, you can seriously speed up the computation. One thing is that you don't need to re-fetch the elements of the constant matrix; you can keep it near the ALUs. Then you maybe can detect and ignore sparse / empty blocks by marking them once. IDK how the custom hardware exploits this; would love to hear any ideas!

> IDK how the custom hardware exploits this; would love to hear any ideas!
You might like this article [1], titled "FPGA-based CNN Acceleration using Pattern-Aware Pruning". More context and details can be found in the PhD thesis of Léo Pradels [2].
[1]: https://inria.hal.science/hal-04689673/document
[2]: https://theses.hal.science/tel-05021575v1/file/PRADELS_Leo.p...

Random thought. Once models stabilise, could you possibly hardcode the model in gates? Or are they too large for a single chip?

Wow, if they can get something like this working, what happens to all this infrastructure? Hyperscalers have to be assuming the wrong lifespan for that stuff, considering the next gen will be 1000x more efficient.

The question isn’t whether it works (it does); the question is whether there are buyers for hardware that is obsolete the day it ships. Models evolve much more quickly than hardware can keep up.
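Circling back to the "mark the empty blocks once" idea upthread: a minimal software sketch of it, assuming a constant weight matrix split into square tiles. The helper names (`build_block_index`, `sparse_matvec`) and the demo matrix are hypothetical, and this is not a claim about what the custom hardware does. It only shows why a one-time pass over fixed weights can pay off on every later multiply.

```python
# Sketch: scan a constant weight matrix once, remember which tiles are
# non-zero, then skip the all-zero tiles on every subsequent multiply.
# Assumption: weights never change after build_block_index() is called.

def dense_matvec(weights, x):
    """Reference multiply that touches every element."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def build_block_index(weights, block):
    """One-time pass: keep (row offset, col offset, tile) for non-zero tiles."""
    rows, cols = len(weights), len(weights[0])
    kept = []
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            tile = [row[c0:c0 + block] for row in weights[r0:r0 + block]]
            if any(v != 0 for tile_row in tile for v in tile_row):
                kept.append((r0, c0, tile))
    return rows, kept

def sparse_matvec(index, x):
    """Multiply using only the tiles recorded as non-zero."""
    rows, kept = index
    y = [0] * rows
    for r0, c0, tile in kept:
        for dr, tile_row in enumerate(tile):
            x_slice = x[c0:c0 + len(tile_row)]
            y[r0 + dr] += sum(w * v for w, v in zip(tile_row, x_slice))
    return y

# Hypothetical 4x4 weight matrix in which two of the four 2x2 tiles are zero.
demo_weights = [
    [1, 2, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 5, 6],
]
demo_index = build_block_index(demo_weights, block=2)
demo_result = sparse_matvec(demo_index, [1, 1, 1, 1])
```

Here the index keeps 2 of the 4 tiles, so each multiply does half the work of the dense version and gives the same answer. In hardware, the analogous win would be skipping both the arithmetic and the memory fetches for the empty tiles.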
sharkjacobs - a day ago
zgao - a day ago
otterdude - a day ago
kmacdough - 13 hours ago
SwellJoe - 13 hours ago
IX-103 - 10 hours ago
xdavidliu - a day ago
nine_k - a day ago
guyomes - a day ago
cm2187 - a day ago
lsaferite - 18 hours ago
jwHollister - 17 hours ago
otterley - 16 hours ago