Nvidia is proposing a beast of a CPU system for Windows PCs

69 points by tosh 4 hours ago

The Qualcomm Snapdragon X2 Elite Extreme trounces Nvidia's chip in single core CPU performance. It beats Intel and AMD's best, too. It's the only CPU in the same league as Apple's M-series in both CPU performance and power efficiency. And it's available in laptops today, not later this year. People are sleeping on Qualcomm.

captainregex - 13 minutes ago

nice to know we’ve got QC employees in the comments! very convenient. I think you forgot to include TM at the end of the name tho
Danox - 34 minutes ago

Microsoft is sleeping on Qualcomm with their lousy port of Windows to Arm processors…
- reactordev - 33 minutes ago
  
  10000000x this. They have been sleeping on Arm since Windows Phone. I just don’t see them ever having an original thought again.
  They could have had a 128-core Arm chip by now.
  - adabyron - a minute ago
    
    They have original thoughts! It's just that those employees seem to get squashed by other divisions or by having to meet short-term quarterly profit targets.
    There's also the fact that a giant trillion-dollar company doesn't want to invest and let small ideas grow. They only focus on things that move the needle, and not much does at that size.
    Had Microsoft executed and invested, they could have made a comeback imo in search, mobile, and hardware. Unfortunately there's a major lack of leadership, or they just don't want those areas.
- criticalfault - 30 minutes ago
  
  and is Qualcomm sleeping on Linux?
  - embedding-shape - 27 minutes ago
    
    Seems like not? Judging based on https://github.com/qualcomm-linux something is happening, although I can't say how much. They definitely seem awake at least.
    
    justincormack - 11 minutes ago
    
    They run a hypervisor under the OS, and don't support actually running directly on the hardware, it's very odd.
    
    jeroenhd - 13 minutes ago
    
    The problem with these chips on Linux is that something has been happening for months but you still end up needing to download special editions of ARM Linux images to get these devices to work properly.
    Some distros still need extracting Qualcomm firmware from Windows to get Linux to work properly. Audio remains a challenge, like x86 Linux decades ago. Apparently camera stuff works these days but produces images of subpar quality.
    These issues also occur on normal Linux. My experience with my Lenovo+Intel laptop was that it took three months after release for the firmware to work properly (and the Nvidia drivers took much longer, but that's my fault for buying something containing Nvidia hardware). Intel managed to do what Qualcomm did in months rather than years.
    I hope Qualcomm finally sorts this shit out, I really do, but with the prices of computers these days, I'm going to need to see quite the discount before I'll consider buying anything with a Snapdragon.
  - stefan_ - 11 minutes ago
    
    Yes, Ubuntu on the previous gen Snapdragon X is still trash.
darkwater - 29 minutes ago

Is it well supported under Linux?
- diabllicseagull - 9 minutes ago
  
  I've been keeping an eye on the state of Linux on the first gen of X Elite and it's sad that the potential is not fully materialized outside WoA. Take a look at what peeps are going through:
  https://discourse.ubuntu.com/t/ubuntu-concept-snapdragon-x-e...
- modeless - 24 minutes ago
  
  Qualcomm has been upstreaming Linux support for some of their chips but they're not working fast enough and I don't think the latest chips are there yet unfortunately.
bradfa - 40 minutes ago

Qualcomm is a “fool me once, shame on you, fool me twice, you don’t fool me twice” kind of situation. So many horrible experiences in the past that people are going to be hesitant.
Qualcomm are trying harder now it seems. But it will take time to repair their reputation in the PC market.
- thewebguyd - 22 minutes ago
  
  They burned me with the first gen Snapdragon X Elite. Before the various laptops with it were out they promised Linux support. Here we are, years later, still no fully OOTB support. Ironically, the GPU firmware was just mainlined in the kernel 4 months ago, but they still haven't done the same for the 1st gen X Elite.
  Tuxedo computers tried and didn't succeed either.
  I will never buy Qualcomm again. I avoid them on phones as well by just buying Apple. They do not support their hardware beyond the release.
  - jeroenhd - 10 minutes ago
    
    > I avoid them on phones as well by just buying Apple
    To each their own, but I don't recall Apple ever mainlining any of their drivers on Linux. You're rightfully angry on the laptop side of things, but Apple is much worse than Qualcomm when it comes to open source support for their phones.
    Qualcomm probably shouldn't have promised Linux support in the first place. Everyone seems to love Apple's hardware even though you're practically stuck with macOS. Had Qualcomm just stuck to Windows-only, they would've probably received a much better reception by the tech press.
    
    mlinhares - 4 minutes ago
    
    Apple doesn’t sell general purpose computers outside of their own hardware so this doesn’t make any sense.
    
    dismalaf - 8 minutes ago
    
    At least Apple tells you they don't support anything except their own OS, Qualcomm just pretends to offer support.
- derefr - 23 minutes ago
  
  Can you say more? I don't have any memory of Qualcomm-related scandals(?), but I just read the news; I've never really been a user of their chips.
dismalaf - 13 minutes ago

Too bad Qualcomm provides shit drivers for Linux, never updates any of their drivers (had a Samsung/Qualcomm phone with drivers years behind the equivalent Google Pixel phone), etc... They are the absolute worst actor in the entire computing world, don't care how fast their chip is.

infecto - 3 hours ago

"I am not sure how many people will run AI models locally. It still seems like a niche application to me. However, it will make decent machines to play video games."

I don't know who will be the winner but with some of the recent releases from Gemma it seems more probable that you may run some models locally if only from a cost perspective, not even considering business security. Not sure how this type of architecture would make for good gaming though, puts into question the whole statement.

"Ranked in the top 2% of scientists globally (Stanford/Elsevier 2025) and among GitHub's top 1000 developers" - side note but this guy puts this everywhere, gives me probably the inverse of what he is marketing for.

jb1991 - 20 minutes ago

He’s just a braggart. When you see something like this in somebody’s personal bio on social media, it’s basically a banner that means “take everything I say in the context of me promoting myself.”
root-parent - 3 hours ago

"I am not sure how many people will run AI models locally. It still seems like a niche application to me. However, it will make decent machines to play video games..."
This is the 2026 edition of Ken Olsen: "There is no reason anyone would want a computer in their home"
- throw0101a - 2 hours ago
  
  > This is the 2026 edition of Ken Olsen: "There is no reason anyone would want a computer in their home"
  Digging into this:
  > In conclusion, there is evidence that Ken Olsen did doubt the need for computers in the home, but the evidence is based primarily on the testimony of David Ahl who was perturbed when the personal computer project he championed at DEC was not supported by Olsen in 1974.
  > Olsen’s resistance may have been similar to that expressed by another DEC executive, Gordon Bell. In 1980 Bell thought home terminals would act as gateways to remote computers which would provide appropriate services.
  * https://quoteinvestigator.com/2017/09/14/home-computer/
  It was supposedly said in 1977: most computers at that time were not small, and so it would not be surprising that people would not expect the general public to desire a large, power-hungry, noisy apparatus in their house.
  - kristov - 7 minutes ago
    
    We kinda ended up with terminals connected to mainframes anyway. The terminal being the web browser, and the mainframe being SaaS. So it wasn't that far off.
  - wslh - 19 minutes ago
    
    The simple explanation is that predicting the future is generally impossible. It doesn't matter if it's Olsen or anybody else.
  - parineum - an hour ago
    
    It doesn't really need this much explanation.
    People take these quotes out of context all the time. Said in a business context, there was no need, at that time, for someone to have a personal computer.
    There's no business justification in 1977 for a personal computer department at a business. It's similar to the Gates quote about RAM (I think it was 64KB?).
    These statements aren't meant to be forever quotes. They're business plan quotes.
    
    michaelcampbell - 24 minutes ago
    
    > It's similar to the gates quote about RAM (I think it was 64KB?)
    640, and Bill Gates said he either never said that, or at least never remembered having said it. I think there is no evidence anywhere that he did.
    https://www.computerworld.com/article/1563853/the-640k-quote...
    
    glimshe - an hour ago
    
    Or maybe they simply made a mistake. Big deal. This doesn't speak negatively of his other achievements.
    
    shermantanktop - 43 minutes ago
    
    He had a long career and presumably many successes, and is fallible like the rest of us. But a half-remembered zinger with no context makes for zippier posts I guess.
    The early popularity of Minitel, the continued popularity of ssh/tmux, and the web browser itself indicates that bespoke client applications are not the only way. He wasn’t directionally wrong.
- fg137 - 37 minutes ago
  
  You seriously think running LLM is the same thing as general computing?
- AaronAPU - 3 hours ago
  
  That’s too strong of an assertion.
  Local models aren’t deterministically equivalent in capabilities to foundation models. Home computers are turing complete; just like a mainframe. They are just slower. Often not slower enough to matter.
  - sandworm101 - 2 hours ago
    
    Most people are ok with slower. An AI that lets you edit a family picture, in say 30 seconds, locally is preferable to one that is instantaneous but requires you to submit that picture to examination/storage/training/sale in someone else's AI ecosystem. If I want to crop my ex out of family photos, I should not have to first give that photo to Microsoft. If I want an LLM to write a book report for me, I don't want it also alerting my school. And if I write a memo for a client, and I want an LLM to check the spelling, I don't want that memo leaked either.
    
    parineum - an hour ago
    
    > Most people are ok with slower. An AI that lets you edit a family picture, in say 30 seconds, locally is preferable to one that is instantaneous but requires you to submit that picture to examination/storage/training/sale in someone else's AI ecosystem.
    Maybe if you ask them that question, but if you show them two products, they'll definitely prefer the faster one. 30 seconds is a long time to watch a progress bar.
    
    spwa4 - 17 minutes ago
    
    Plus there's the other question. If this thing is slower ... what's the price? The desktop/mini-pc version of this is $3000, after all. At this performance level what is an acceptable price for the laptops?
    People definitely aren't going to accept more expensive + slower ...
    
    Pxtl - 28 minutes ago
    
    I'd like to think so but the existence of Google and Apple and Microsoft's cloud based photo tools with phone integration suggests that's false.
    You could run a pretty good home server on $50 of gear and yet we never saw any real adoption of OwnCloud/NextCloud style products as an alternative to Google Drive/Photos or Apple Cloud.
    Why should LLM/Transformers be any different? Especially when you need a proper expensive GPU to run them instead of a Raspberry Pi?
    
    thewebguyd - 17 minutes ago
    
    Apple's photo tools run on device, and they'll probably ship more on device foundation models at WWDC too.
    On-device AI is going to be important, I think. It doesn't have to take the form of a chatbot UI to be useful.
- joering2 - an hour ago
  
  or "640K ought to be enough for anybody."
  - shermantanktop - 39 minutes ago
    
    https://quoteinvestigator.com/2011/09/08/640k-enough/
    Nobody ever said that, at least not as an assertion or prediction. The actual instances of similar language are from multiple people describing their earlier thoughts before they learned it wasn’t true.
  - throw1234567891 - 39 minutes ago
    
    There’s no public proof this has ever been said, and if it was, if it was not taken out of context.
smcleod - 41 minutes ago

Qwen 3.6 is far ahead of Gemma for most (but not all) things. I've deployed it out across a number of M5 MacBooks and it's genuinely useful for many tasks. It won't replace an Opus or current gen Sonnet sized model but it's still amazingly good for its size and probably as good as or just a bit before Sonnet 4 era. Far more reliable for tool calling, coding, agentic tasks and faster than the Gemma models especially with MTP.
- Pxtl - 33 minutes ago
  
  I've got a Qwen 3.5 running on a 12GB 3060 and it's dumb as a stump but still smart enough to get some useful work done. Since it's my daily driver desktop I haven't jumped to 3.6, since last time I did I quickly ran out of VRAM and locked the desktop environment.
  But yeah, the Qwen line is pretty impressive on commodity hardware.
  - derefr - 17 minutes ago
    
    I must be using LLMs very differently than y'all, because I can't think of a single thing I would rely on an LLM that's "dumb as a stump" to do for me.
    To me, LLMs are for asking research questions + exploring design spaces + pointing at codebases to investigate bugs. And those all benefit from the model being as "smart" (in terms of both fluid intelligence and burned-in knowledge) as possible.
    I'm guessing there exist problems where "intelligence past a certain point" doesn't matter, so these medium-sized models can match the performance of the bigger models. But what problems might those be?
GeekyBear - 9 minutes ago

> However, it will make decent machines to play video games."
Where you will need games to be rewritten for ARM to get full performance, just like on Apple's M series chips.
falsemyrmidon - 29 minutes ago

> this guy puts this everywhere, gives me probably the inverse of what he is marketing for.
Do you think he's in mensa too?
flatline - an hour ago

The HN crowd is, by and large, not the target audience for his self promotion. I guarantee there is one and this is more or less effective.
bespokedevelopr - an hour ago

The security aspect is the main driver why I’m seeing so many businesses investing in local hardware. They know the models aren’t as good (caveat that they also can’t run Chinese models) and that’s ok. Places that really care about security and data governance already aren’t on the bleeding edge. They wait for the nice stable lts version, they lock down dev machines in frustrating ways and have lots of IT admin layers.
But they also want to taste the sweet fruit of AI so the only way to do this that a CISO will approve is on local air gapped hardware. It’s a niche but still a billion dollar niche.
- thewebguyd - 15 minutes ago
  
  Microsoft is working on this with their new execution containers (https://github.com/microsoft/mxc)
unmole - 3 hours ago

> you may run some models locally if only from a cost perspective
I have a hard time believing running a model on a laptop will be cheaper than running it in a datacenter. Why wouldn't economies of scale apply here as with every other computation?
- itishappy - 4 minutes ago
  
  It's cheaper for the AI provider to use your laptop instead of their datacenter.
- dgellow - 3 hours ago
  
  A laptop is really a pretty bad form factor to run LLMs. Worst cooling, more expensive memory that you cannot replace, resale value depreciating fast. It’s fine for tinkering, small scale research, and demos but it’s definitely niche.
  The vision NVIDIA is selling is pure marketing IMHO
- wazdra - 3 hours ago
  
  This is assuming that you'll be charged only for the fraction of computing that you consumed. But you are actually paying for their infrastructure, for the R&D (and also the computation that went into training the model) etc. It is not clear that, for your own small computations, these kinds of costs are needed, but you will still pay your share of the investment the provider made so that they could serve everyone's computation needs.
  - hungryhobbit - an hour ago
    
    But, currently ... you're not. AI companies are operating at a loss, and are being subsidized by their investors.
    Local may or may not be cheaper than remote now, depending on the details, but the factors you describe won't affect the math nearly as much as they will once that subsidization ends.
  - wjnc - 2 hours ago
    
    In that analogy bigtech AI is currently investing in cleaner air for all of us? We _could_ breathe it through their hose, but might as well breathe it outside.
- jerf - an hour ago
  
  What "every other computation"? I seem to have a lot of processing power at my disposal here, between my cell phones, laptops, gaming PCs, various other hardware devices.
  You're going to need to analyze the problem much more deeply because it sounds like the standards you are implicitly applying would result in "economically, everything should be centrally hosted" but that is clearly not the result that obtains. Even a modern mid-grade cell phone is no slouch; you may not be running a current-gen frontier AI on it but you certainly can do a lot of other rather intense things locally that would have been laughable 10 years ago, like surprisingly high powered games.
- TylerE - 35 minutes ago
  
  Because economy of scale isn't really the right metric here. A machine you were going to buy anyway essentially has a TCO of $0.
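The cost arguments in this subthread (per-token cloud pricing versus hardware you may or may not have bought anyway) can be made concrete with a back-of-the-envelope break-even calculation. The sketch below is only an illustration: the function name `break_even_mtok` and every number in it are hypothetical placeholders, not real hardware or API prices.

```python
def break_even_mtok(hardware_cost, local_cost_per_mtok, cloud_price_per_mtok):
    """Millions of tokens after which local inference becomes cheaper.

    All inputs are hypothetical. Returns None if local never catches up.
    """
    saving_per_mtok = cloud_price_per_mtok - local_cost_per_mtok
    if saving_per_mtok <= 0:
        # Cloud is at least as cheap per token, so there is no break-even point.
        return None
    return hardware_cost / saving_per_mtok


# Placeholder numbers, chosen only to show the shape of the argument.
dedicated_box = break_even_mtok(2000.0, 0.5, 4.5)  # hardware bought just for AI
already_owned = break_even_mtok(0.0, 0.5, 4.5)     # machine you were buying anyway
print(dedicated_box, already_owned)
```

With a hardware cost of zero the break-even point is immediate, which is the "TCO of $0" argument; if the cloud price per token is subsidized below the local running cost, the function returns None, which is the economies-of-scale argument.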
jayd16 - 31 minutes ago

Maybe they just mean from a "it can run a lot of DLSS" perspective.
cyanydeez - an hour ago

128GB seems the sweet spot for local models. I can program and install most GitHub projects with opencode and Qwen 32b with mtp.
anyone who's addicted to token throughput is losing operational knowledge and offline capabilities.
if you aren't moving to the AMD 395 or Macs then you're hitching a ride on the expensive calorie ride
- throw1234567891 - 37 minutes ago
  
  If you could buy a 256GB you’d be claiming that 256GB is a sweet spot. But I agree with you. Crack-tokens are not the future.
  - cyanydeez - 3 minutes ago
    
    no, the fact that MACs and x86 and soon ARM are all going to have 128GB models in every sector, yeah, sure.
    But watching everyone flounder because claude goes down or forcing you on API costs.
    I'm programming things that'd take me days with a PC that, without OpenAI's VRAM shenanigans, would cost you $2k.
    It's more than just 'this is what I could do' it's definitely about 'this is what anyone could do with a new PC purchase'.
voidfunc - an hour ago

> "Ranked in the top 2% of scientists globally (Stanford/Elsevier 2025) and among GitHub's top 1000 developers"
This made me laugh. I can only imagine how insufferable this person is to deal with.
unstatusthequo - 40 minutes ago

I hope a family-level AI appliance is a thing later. Local non-cloud assistant that lives in the house, families interact via voice or phones or whatever. Knows the contextual family stuff you need, etc.
- Pxtl - 26 minutes ago
  
  We didn't get people buying family-level file servers for the family photo gallery and documents at any real scale, so i doubt we'll see similar for AI especially when the cost is that much higher for GPUs vs an SBC machine.
sandworm101 - 3 hours ago

Lots of people are already running AI locally. They are the people buying up all the consumer-grade Nvidia GPUs. What are they doing with them? Well, the same things people with home media or email servers are doing: stuff they don't want to share with the general public.
- Zetaphor - 3 hours ago
  
  I want to reduce my dependency on companies like Google, OpenAI, and Anthropic. Aside from the concerns of data sharing I'm also not a fan of how they run their operations, for example Anthropic now using xAI's Colossus data center which is poisoning a marginalized community, or OpenAI getting in bed with the military.
  Not everything I want to use an LLM for requires "PhD level intelligence", and increasingly I'm finding more uses that involve sharing my personal data.
  Yesterday my local model helped me when looking for a doctor who is in-network for my insurance. I threw it a screenshot from the providers search results and it looked up reviews for all of them.
  - pratnala - 2 hours ago
    
    Which model are you running?
    
    Zetaphor - an hour ago
    
    Qwen 3.6 35B-A3B and 27B both at Q8 on a Strix Halo machine
  - sandworm101 - 2 hours ago
    
    My local AI is currently upscaling an old British comedy from sub-DVD quality to 1k. (It is not available other than on DVD.) It looks like it will take about a week for my pair of 5060s to chew through the task.
    
    eszed - an hour ago
    
    Which show?
iLoveOncall - 3 hours ago

> "Ranked in the top 2% of scientists globally (Stanford/Elsevier 2025) and among GitHub's top 1000 developers" - side note but this guy puts this everywhere, gives me probably the inverse of what he is marketing for.
Lol yeah seriously, that stinks of "I ask AI to generate a huge amount of bullshit and upload it to pad irrelevant stats".
Absolute loser.
- dgacmu - 7 minutes ago
  
  He's not a loser; he's done some really fun work that many people use daily. I've used his range mapping trick in multiple projects/papers. It's elegant.
  It sounds like he's gotten bad advice about how to market himself /or/ this is being marketed to people who have bigger checks to write and whom he believes will be responsive to this kind of marketing. As an academic, it rubs me very wrong - I think it's detrimental to the field when we get into h-index stacking contests or citation count comparisons. But I don't know what incentives he's responding to, which seems important for putting this stuff in context.
  (as an aside, it turns out that polars + fastexcel is about 10x faster than pandas + openpyxl for searching that dataset, if anyone else is curious what he was actually talking about. :)
- nkurz - 2 hours ago
  
  I agree that it sends the wrong signal, but actually Daniel is great. He cares tremendously about doing work that is actually real-world useful. I've co-written a few papers with him, and he's really hard working and open to outside suggestions. The danger is that if you send him comments, he'll eventually manage to rope you into writing a new and improved version. Seriously, if you are a non-academic computer scientist with a good idea that you want to publish, he'd be incredibly open to working with you.
  As to why he now has this on his blog? I also cringe when I read it. I presume someone told him he should self-promote more, and this is his lame attempt to do so. He's almost certainly the most cited person in his department, but it's entirely possible that none of his colleagues actually know this. Cut him some slack. Self-promotion is not his strength. He's a nerd's nerd, and not a marketer. I'll mention to him that his attempt here might be backfiring when I'm next in contact with him.
  - hgoel - 14 minutes ago
    
    I kind of get it in the sense that every academic has to make themselves somewhat comfortable with self-promotion even if they don't like it. It's an important part of getting funding, but putting a blurb like that everywhere just hurts his credibility I think.
  - infecto - an hour ago
    
    I cringe calling it out but it just stood out as it was plastered everywhere and I actually have never seen his links before.
  - iLoveOncall - 2 hours ago
    
    > As to why he now has this on his blog?
    He doesn't just have it on his blog, he has it EVERYWHERE. Sometimes 2 or 3 times on the same page.
- netsharc - 3 hours ago
  
  I found his website, https://www.lemire.me/en/ , and the "2%" brag is the very first sentence, geez.
  Being the top x% is what OnlyFans girls brag about, professor...
  And it's not exactly brain surgery, is it? https://www.youtube.com/watch?v=THNPmhBl-8I
  - Zetaphor - 3 hours ago
    
    > Daniel Lemire’s blog is one of the top 50 most popular blogs on Hacker News, the standard tech news aggregation site.
    Citation needed
    
    nkurz - 2 hours ago
    
    https://refactoringenglish.com/tools/hn-popularity/
    
    thg - an hour ago
    
    For posterity: It's rank 34 at the time of this comment
- SkiFire13 - an hour ago
  
  That line looks very cringe indeed, but the guy has some crazy good blogposts on SIMD stuff.
- - 3 hours ago
  
  [deleted]
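For readers wondering about the "range mapping trick" dgacmu mentions above: the comment does not say which technique is meant, so the sketch below assumes it is the multiply-and-shift reduction that maps a 32-bit value into [0, n) without a modulo. The names `fast_range32` and `MASK32` are made up for this illustration.

```python
MASK32 = 0xFFFFFFFF  # keep only the low 32 bits of the input


def fast_range32(x: int, n: int) -> int:
    """Map a 32-bit value x into [0, n) with one multiply and one shift.

    The product (x * n) is below n * 2**32, so its top bits are always < n.
    In C this would be a 64-bit multiply; Python ints do not overflow.
    """
    return ((x & MASK32) * n) >> 32


# Example: pick one of 10 buckets from a 32-bit hash value.
print(fast_range32(0x80000000, 10))
```

Unlike `x % n`, this uses the high bits of `x` rather than the low bits, so it is only a fair mapping when the input is already well mixed (a hash or a random number).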
SwtCyber - 3 hours ago

I think the local-model use case is going to become less niche pretty quickly if the models keep getting smaller and more capable. Even if most people do not care about privacy or offline use, the cost argument is pretty strong

dagmx - 3 hours ago

This feels like fluff to me on the part of the author (whose work I don’t want to trivialize) but I don’t think they’ve actually looked deeper than a paper spec sheet on this.

1. Yes it has the same number of cores as a 5070 mobile. It’s also running at a shared peak of 2/3 the bandwidth and a shared peak of 2/3 the TDP. The GPU by itself will likely perform at half the dedicated unit’s performance

2. Apple may not have SVE2 but they do have the AMX (private) and SME. I don’t see why he thinks the SVE2 will give him more performance than the SME.

3. He mentions a single core type but doesn’t mention the total makeup. We already have known for a year how the DGX Spark compares to Apple chips. For CPU it’s roughly equivalent to an M3 Pro and for GPU compute (not rasterization) it’s between an M4 Pro and M4 Max without considering bandwidth.

The real advantage to these is that they run CUDA. That’s it. Otherwise when they launch they’ll be 2-3 generations behind where Apple is and 1 gen behind AMD.

The other super power of the DGX Spark was the NIC for pairing them together. But that’s been removed here too.

storus - 4 minutes ago

> GPU compute (not rasterization) it’s between an M4 Pro and M4 Max without considering bandwidth
You are likely thinking about token generation which is dependent on memory bandwidth where Apple has an edge. Spark's GPU compute is way higher than even M5 Max, around 2x FP32 TFlops... It's literally 6144 CUDA cores like desktop 5070, slowed down by slow memory.
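The point about token generation depending on memory bandwidth can be illustrated with a common rule of thumb: each generated token has to stream the active weights from memory once, so bandwidth divided by the bytes of active weights gives a rough ceiling on tokens per second. The sketch below assumes that simplification (it ignores KV cache traffic, batching, and compute limits); the function name and the bandwidth and model-size figures are illustrative placeholders, not measured specs of any chip mentioned here.

```python
def decode_tokens_per_s_upper_bound(bandwidth_gb_s, active_params_billions, bytes_per_param):
    """Rough ceiling on single-stream decode speed for a bandwidth-bound model.

    Assumes every active weight is read from memory once per generated token.
    """
    bytes_per_token = active_params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Placeholder figures: a 25B-parameter dense model at 1 byte per weight.
slow_memory = decode_tokens_per_s_upper_bound(250, 25, 1.0)
fast_memory = decode_tokens_per_s_upper_bound(500, 25, 1.0)
print(slow_memory, fast_memory)
```

Under this model, doubling memory bandwidth doubles the ceiling while extra FLOPs change nothing, which is why a GPU with more raw compute can still generate tokens more slowly than one with faster memory.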
llm_nerd - 41 minutes ago

It is absolutely fluff, and the only reason this worthless tweet is on the front page of HN is that this audience has a habit of canonizing certain people, and treating each of their bowel movements as prophetic.
Guy suddenly became aware of a chip that the rest of the industry long knew about, seems completely unaware of the competitors, and posts about how it's a BEAST and will be a GAME CHANGER.
Like the DGX Spark was a game changer? Eh, it has mostly been a massive disappointment. An overpriced nvidia laptop isn't going to change the equation an iota.

amacbride - 13 minutes ago

It's effectively the same as the GB10 in the DGX Spark (Blackwell architecture, 6,144 CUDA cores, perf-wise comparable to an RTX 5070).

I've found it very useful for running big models, but it's not a screaming powerhouse in terms of raw compute.

adamnemecek - 12 minutes ago

They are early versions, wait 4 years.

PeterStuer - 32 minutes ago

"I am not sure how many people will run AI models locally. It still seems like a niche application to me."

Clip me :). You are currently living through the final stages of unrestricted computing in the hands of the 'public'. Our regimes are going to pull up the drawbridge in the name of 'safety'. Download the open models asap and prepare for an airgapped computing environment. That will be your frontier in not extremely neutered AI in the near future.

I am so hoping I'm completely wrong on this btw.

- 25 minutes ago

[deleted]
shevy-java - 31 minutes ago

They have bought the governments, so what you model will probably become true. They just inflate the prices right now.

GuestFAUniverse - 2 hours ago

And who in 2026 is still anal-fixated on a "Windows" PC?

It's just a personal computer. It normally runs multiple operating systems just fine.

Windows PC sounds like people talking about tech who are either paid by M$, or embed pictures into Word documents to send them.

Nobody has to kill the fun those OS-agnostic machines allow by artificially binding them to a shitty OS.

zdragnar - an hour ago

Enterprise, of course. They probably buy more PCs than the rest of the market combined.
Even for personal use, I'd imagine the number of people dual booting Windows and something else is a very tiny minority.
Saying "Windows PC" is a pretty reasonable way to distinguish between "made by Apple" and "made by someone else" because the market of PCs that aren't made by Apple and don't come with Windows is really, really tiny.
To be honest, this seems like a strange hill to take such an aggressive stance upon.
jayd16 - 25 minutes ago

A big push specifically for Windows ARM from Nvidia seems like relevant information.
bigyabai - an hour ago

> It's just a personal computer
Your x86 machines were, but these are ARM SOCs. Many of them don't even support UEFI, let alone the upstream Linux kernel.
- rvba - an hour ago
  
  Getting rid of UEFI is bad?
  - bigyabai - an hour ago
    
    Can you quote where I said that?

SwtCyber - 3 hours ago

The interesting part to me isn't really the Cortex-X925 vs AVX-512 comparison, but Nvidia trying to make the GPU the center of a Windows PC rather than an add-in card

- 3 hours ago

[deleted]

dofm - 3 hours ago

Here is the press release for the actual machine:

https://nvidianews.nvidia.com/news/nvidia-microsoft-windows-...

I have been somewhat surprised at the lack of commentators observing that this is Microsoft and above all NVIDIA launching a device that is fundamentally at odds with the metered cloud model of AI.

When you look at the other announcements and murmurings (better offline BYOK for Copilot, talk of an unmetered AI future) I think it’s clear that these two firms understand that cloud-only AI is not sustainable or inherently in their interests. But their willingness to undermine OpenAI with a product like this is notable.

thewebguyd - 11 minutes ago

Yeah, "unmetered intelligence" was probably the most used phrase at MS BUILD this past week. They are going hard into local AI
tantalor - 3 hours ago

Maybe. Or they are simply hedging their bets.

fg137 - 32 minutes ago

I don't think this is going to get any traction in the general consumer world, even less relevant than Apple Vision Pro.

(HN reaction to Vision Pro back in 2024 is almost hilarious if not ridiculous, looking at it today. I knew it would be a flop and I was so right.)

Waterluvian - 3 hours ago

It’s an opportunity for them to start doing away with the whole ATX thing where owners had freedom to mix and match at their own pleasure.

burnt-resistor - an hour ago

They'll ship a welded-shut box that requires an activation key to power on. Users will get to pick which color sleeve it uses though.

embedding-shape - an hour ago

> up to 6,144 state-of-the-art CUDA cores

A RTX Pro 6000 has ~24K 5th generation tensor cores, I'm guessing this would then be 1/4 of the count but 6th generation? Wasn't clear from the images.

gravypod - 25 minutes ago

What is more important than core count is how the caching architecture is laid out. They could lay out those 6k cuda cores in a layout which provides much larger blocks of cache to smaller number of cores. That would increase the memory bandwidth which would be better for inference.
- embedding-shape - 16 minutes ago
  
  Sounds like the memory bandwidth is worse though;
  > The memory is not as fast as dedicated GPU memory, but it is cheap enough while delivering enough bandwidth to run AI models locally.
  Also "cheap while delivering enough" certainly sounds like someone is trying to temper expectations. It sounds like something sitting in-between GPU+VRAM inference and CPU+RAM one, not as a step above/besides GPU+VRAM.

seanalltogether - 3 hours ago

Is it really unified memory? AMD Strix Halo is "unified" but you still have to allocate memory separately for cpu vs gpu. Apple Silicon is true unified memory.

flakiness - 3 hours ago

My understanding is that this is a limitation of Windows, not of the AMD SoC. There are several internet resources on how to "enable unified memory support" on Linux, e.g. [1].
As a side note, Qualcomm chipsets on Android have been doing this for years (like Apple), so it's not a super unique thing. It's more like there was no need before.
[1] https://www.jeffgeerling.com/blog/2025/increasing-vram-alloc...
- kimixa - 3 hours ago
  
  Even then the "reserved" section is a guaranteed carve-out for things that might need contiguous physical memory (display scan-out buffers and page tables, for example).
  The GPU can still happily use all the rest of the memory for other use cases - which tend to be the bulk of allocations anyway. Though there might be performance implications - for example "moving" buffer ownership to the GPU would need to evict CPU caches, and often 4k pages and tlb lookups can be a pretty inefficient situation for GPU-style accesses.
  That's been pretty standard for any SoC for decades. And "differences" to apple's SoC are more implementation details.
Keyframe - 3 hours ago

yes, but more due to OS limitations than hardware. You can use their GTT which is then _true_ UMA where GPU can grab whatever it wants from the memory pool.
This isn't the first time we have UMA on the PC, btw. When SGI did their PC workstations, their 320 and 540 PC workstations had what they called Cobalt graphics chipset and crossbar with their IVC architecture. They bypassed AGP at the time completely. It was quite unique to see strict UMA on a PC. Haven't seen it since until these new systems we're seeing now on PCs and Mac.
eigenspace - 3 hours ago

That's a software question, not a hardware question.
Some software assumes pre-defined set-aside pools of memory reserved for video purposes, but the chip does actually have access to the whole pool.
SwtCyber - 3 hours ago

For local models, the useful part is not just having 128GB attached to the package. It is whether the GPU can practically use that memory without the usual VRAM-style constraints
glitchc - 3 hours ago

Memory bandwidth is what matters, unified or otherwise. Discrete GPUs don't have unified memory either.
ApatheticCosmos - 3 hours ago

Strix halo is unified memory. The memory allocation set in BIOS is overridden by the operating system if it has the capability.
fc417fc802 - 3 hours ago

> you still have to allocate memory separately for cpu vs gpu
That's an API issue not a hardware issue. Regardless, I believe the major APIs permit seamlessly sharing pointers at this point? (I have no experience doing that though.)
joe_mamba - 3 hours ago

>AMD Strix Halo is "unified" but you still have to allocate memory separately for cpu vs gpu.
IIRC that's to maintain backwards compatibility with BIOS and Windows (+ games & apps), but memory access speeds are the same.
ankurdhama - 3 hours ago

It is unified in the sense that the OS can dynamically assign memory to CPU and GPU. Apple silicon is not alien tech that other silicon vendors cannot implement.

tosh - 3 hours ago

nb: poster is Daniel Lemire (https://lemire.me), who is very skilled in getting performance out of compute hardware (e.g. via simd, cache usage etc)

tempodox - 2 hours ago

Still, Microslop has repeatedly proven their ability to slow everything down to a crawl no matter how powerful the hardware. If you want it to be fast, don’t use Windows.
infecto - 3 hours ago

As he likes to share often, "He ranks among the top 2% of scientists globally (Stanford/Elsevier 2025) and is one of GitHub's top 1000 most followed developers. "
- tosh - 3 hours ago
  
  based on citations and github stars? or what's the context there?
  - infecto - 3 hours ago
    
    I was adding further citation based on his own claims. Not sure what context is missing.

ozgrakkurt - an hour ago

Says running local LLMs isn't relevant. Then says it is decent for games, which is just correct if you compare any GPU remotely similarly priced. I don't understand what point he is making.

derefr - 25 minutes ago

> The game changer is the unified 128 GB memory. That is the path Apple took years ago. Instead of separate memory for the CPU and GPU, everything shares a single pool. It is increasingly popular.

> The memory is not as fast as dedicated GPU memory, but it is cheap enough while delivering enough bandwidth to run AI models locally.

So, the reason "dedicated GPU memory" is fast, isn't because it's "dedicated"; it's because the types of memory built into GPU cards — GDDR and HBM — are designed for throughput over latency.

Which is to say, GDDR and HBM memory could be shared with the CPU in UMA while still being "fast" (for GPU use-cases.) In fact, the PS4/5 and Xbox 360 / One X / Series consoles have UMA architectures that use GDDR memory as their main memory, with no regular DDR memory to be found.

What I don't understand: why don't we see UMA architectures where there's both regular DDR and GDDR/HBM memory mapped into the address space of the CPU+GPU? That seems like the best of both worlds: you'd have some memory that's "tuned" for random-access CPU usage (regular DDR), and some memory that's "tuned" for streaming GPU usage (GDDR/HBM), but either type of memory can still be put to the use it wasn't "tuned" for, just with slightly-worse performance.

I guess you'd need to do a bit of software work:

1. a bit of work in the OS kernel / malloc library to get CPU workloads to "prefer" allocating DDR memory over the GDDR/HBM memory until they've exhausted DDR memory (or maybe not, if you just tell the kernel the GDDR/HBM memory is something like a zswap thinpool);

2. and a bit of work in supported ML frameworks, to teach them about a hybrid strategy between UMA "allocate anywhere, it's all the same" and NUMA "keep assets in VRAM if possible; if you spill assets to RAM, then they must stream into VRAM on access" (i.e. "at allocation time, allocate as if the system were NUMA, VRAM first then spilling to RAM; but at execution time, use the UMA codepaths, no need to copy RAM into VRAM.")

...but once that's done, it's done.
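
The allocation policy in point 1 (with the GPU-side mirror of it from point 2) can be sketched as a toy two-pool allocator. This is only an illustration in Python, not how any real kernel or ML framework does it: the `Pool` and `HybridAllocator` names, the pool sizes and the "prefer one pool, spill to the other" rule are all assumptions made up for the example, and a single allocation is never split across pools.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """One physical memory type mapped into the shared address space."""
    name: str
    capacity_gb: float
    used_gb: float = 0.0

    def free_gb(self) -> float:
        return self.capacity_gb - self.used_gb


@dataclass
class HybridAllocator:
    """Hypothetical UMA allocator backed by both DDR and GDDR/HBM."""
    ddr: Pool
    gddr: Pool

    def allocate(self, size_gb: float, kind: str) -> str:
        # CPU workloads prefer DDR, GPU workloads prefer GDDR/HBM.
        # Either kind spills into the other pool once its preferred one is full.
        order = [self.ddr, self.gddr] if kind == "cpu" else [self.gddr, self.ddr]
        for pool in order:
            if pool.free_gb() >= size_gb:
                pool.used_gb += size_gb
                return pool.name
        raise MemoryError(f"no pool can hold {size_gb} GB")


# Made-up sizes: 64 GB of DDR plus 32 GB of GDDR in one address space.
alloc = HybridAllocator(Pool("ddr", 64.0), Pool("gddr", 32.0))
print(alloc.allocate(24.0, "gpu"))  # fits in the preferred GDDR pool
print(alloc.allocate(16.0, "gpu"))  # GDDR has only 8 GB left, so this spills to DDR
print(alloc.allocate(40.0, "cpu"))  # CPU work lands in DDR
```

Because both pools sit in one address space, a spilled GPU buffer is still directly usable by the GPU; it only pays the bandwidth penalty of the slower memory type rather than a copy.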

AmazingTurtle - 3 hours ago

while unified memory may offer better performance than unsoldered DDR system memory, it still won't be as great as 1.8TB/s bandwidth on high end consumer GPUs right now.

Nvidia's master plan may be making it the new normal to have "only" 400GB/s bandwidth, thus gatekeeping local model usage further behind "more memory but not as fast as the cloud can do it".
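
To put rough numbers on that gap: during token generation, each new token has to stream roughly all of the model's active weights from memory once, so bandwidth divided by weight size gives a ceiling on tokens per second. A minimal back-of-envelope sketch in Python, where the 20 GB model size is a made-up assumption and the two bandwidth figures are the ones quoted above; real throughput will be lower once compute, KV cache reads and software overhead are counted.

```python
def decode_tokens_per_second(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on decode speed if every token reads all active weights once."""
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


# Assumption for illustration only: a model whose active weights take 20 GB.
WEIGHTS_GB = 20.0

for label, bandwidth in [("400 GB/s unified memory", 400.0),
                         ("1.8 TB/s high end GPU", 1800.0)]:
    ceiling = decode_tokens_per_second(bandwidth, WEIGHTS_GB)
    print(f"{label}: at most {ceiling:.0f} tokens/s")
```

The catch is that the faster option only applies while the weights fit in VRAM, which is exactly the trade-off being discussed.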

dangus - 3 hours ago

I think it’s an interesting theory but a bit too conspiracy theory-ish.
Nvidia just wants to sell stuff to everyone.
And I think for professionals doing local AI work, products like Strix Halo and Apple Silicon are a competitive threat.
A big part of maintaining the leading software ecosystem is ensuring you have competitive hardware for all your users.
I also think the RTX Spark product is relatively low effort for Nvidia. Grab a Mediatek CPU and slap an Nvidia GPU on the die. Sure, that’s oversimplifying it, but still.

emsign - 42 minutes ago

They are useless if RAM prices are this high. $800 laptops with a maximum of 8GB are currently the norm, and Windows 11 can't run on them decently. No matter how fast the SoC is, with overpriced RAM they are slow. Systems that can make good use of them with 64-128GB are not affordable anymore thanks to Nvidia and co. This is a smokescreen. They'll probably sell them packaged as compute modules anyway.

alberth - 3 hours ago

Is this essentially an Apple M-Series chip in concept?

shevy-java - 32 minutes ago

And it will be expensive - right?

Nvidia is milking the market now. We need more competition again - currently we have a mafia controlling the prices, not just Nvidia but all the AI companies. The price increases should be paid by them, not by us. The "free market" is being manipulated by them here.

jmyeet - an hour ago

This is the RTX Spark [1].

The obvious comparison here is the M5 Max, where you can buy a MacBook Pro with 128GB of memory that is also unified. Obviously CUDA cores are specific to Nvidia so it's hard to directly compare, but I've seen claims that the M5 Max is roughly equivalent to ~4000 CUDA cores. This obviously depends on workload and whether the hardware supports the precision you want to use (e.g. FP4).

The M5 Max has memory bandwidth of 819GB/s. The RTX Spark I believe is ~600. So it might be slightly better than the current generation of Macs but likely worse than the expected M5 Ultras of the new Mac Studios (likely Q3 2026).

For comparison, a 5090 has >20k CUDA cores and 1800GB/s memory bandwidth with 32GB of VRAM. The RTX 6000 Pro (at ~$10k) has 96GB of VRAM, same bandwidth and ~24k CUDA cores.

We have to see what RTX Spark systems sell for but the DGX Spark is in the Mac Studio price range (~$4k).

I do think Apple has a real opportunity here but their offerings aren't quite there yet. The M5 Ultras might be a really attractive option for local LLMs. I expect them to be in high demand.

[1]: https://news.ycombinator.com/item?id=48352939

bigyabai - an hour ago

> I've seen claims that the M5 Max is roughly equivalent to ~4000 CUDA cores
Who claimed that? The M5 is still a raster focused GPU, dedicated matmul blocks be damned. For some workloads that napkin math might work out, but for many others it's a wild overshoot. Time-to-first-token still favors CUDA, and real-world training workloads aren't getting anywhere near Apple Silicon.
All of the memory bandwidth in the world is useless if you spend 15 minutes processing 64k tokens worth of context prefill. This is where CUDA shines.
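
For scale, the arithmetic behind that example is just prompt length divided by prefill throughput. A quick Python sketch: the 64k tokens and 15 minutes are the figures above (taking 64k as 64,000), while the 2,000 tokens/s comparison rate is a made-up assumption, not a measurement of any real device.

```python
def prefill_seconds(prompt_tokens: int, prefill_tok_per_s: float) -> float:
    """Time to process the prompt before the first output token appears."""
    return prompt_tokens / prefill_tok_per_s


def implied_prefill_rate(prompt_tokens: int, seconds: float) -> float:
    """Prefill throughput implied by a prompt size and an observed wait."""
    return prompt_tokens / seconds


PROMPT_TOKENS = 64_000

# 15 minutes for 64k tokens works out to roughly 71 tokens/s of prefill.
print(f"{implied_prefill_rate(PROMPT_TOKENS, 15 * 60):.0f} tokens/s")

# Hypothetical faster device at 2,000 tokens/s: about half a minute.
print(f"{prefill_seconds(PROMPT_TOKENS, 2000.0):.0f} s")
```

Prefill is mostly compute bound, which is why memory bandwidth alone does not fix it.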

YasuoTanaka - 4 hours ago

128GB of unified memory is a dream come true for local LLMs. VRAM has been the ultimate bottleneck for developers.

zamadatix - 3 hours ago

I have a 128 GB LPDDR5X machine. It's a great workstation laptop (which is why I got it) but the memory bandwidth is just awful if you want to use it for AI. An old Epyc CPU will fare better both in terms of being able to run full-sized larger models as well as having higher memory bandwidth, and that's not a recommendation to go that route either as it's still not worth it.
adrian_b - 3 hours ago

The competitor for this NVIDIA CPU will not be the now old AMD Strix Halo, but its successor (launched recently), which supports up to 192 GB of unified memory. Thus 128 GB is no longer SOTA.
While this NVIDIA system is inferior from the point of view of the memory capacity, its main advantage is that the top models will have a bigger GPU, i.e. with 6144 or 5120 FP32 execution units, compared to 2560 for the AMD GPU (compared to the NVIDIA CPU, the AMD CPU has a better multi-threaded performance for legacy programs, and a much better multi-threaded performance for the applications that use AVX-512).
However, these top models with big GPUs will also be much more expensive than the competing AMD system, while also being much more expensive than a laptop or mini-PC with an equivalent discrete NVIDIA GPU (which has the disadvantage of having direct access only to a much smaller, even if faster, memory).
- christkv - 3 hours ago
  
  I don’t think there is much improvement in compute for the new Strix Halo revision. The next one supposedly adds RDNA4 cores or similar and more memory channels.
avocadoking - 3 hours ago

It could help with exploding external LLM costs. Interesting to see what the adoption will be like, which will mainly depend on the price.
SwtCyber - 3 hours ago

This is what makes it interesting to me as well

PedroBatista - 3 hours ago

Don't want to be too harsh, maybe I'm missing something, but the CPU is at least 2 years old, internally it has been a complete shitshow and that's a minor hiccup when compared to the firmware and software situation.

It's an interesting "newcomer" and the more the better but calling this a "beast" and a "game changer" is ridiculous to say the least.

Then there is the price..

sherazp995 - an hour ago

Wait a minute!

Nvidia going from GPU to CPU now?

npn - an hour ago

Is this somehow satire? This is just the DGX Spark with keyboard and monitor in a convenient format. Since it has more stuff, I'm sure that the price markup will increase too.

Up to $5000 because why not?

With that money you can build a real PC with rtx 5090!

thewebguyd - 6 minutes ago

Not with 128GB (less OS) available to the GPU you can't. The unified memory is the point with this machine (and the dgx spark).

ChrisArchitect - 3 hours ago

A powerful new chapter for Windows PCs, accelerated by Nvidia RTX Spark

https://news.ycombinator.com/item?id=48352693

Nvidia RTX Spark

https://news.ycombinator.com/item?id=48352939

BoredPositron - 3 hours ago

Mediatek and Nvidia, the horsemen of abandoning hardware after a year. The Jetson family still left a bad taste in my mouth.

thewebguyd - 4 minutes ago

Qualcomm is too. They mainlined the GPU firmware for the X Elite 2nd gen, but still have not done so for their 1st gen X Elite which they promised full Linux support for and failed to deliver, and have now moved on pretending they never said that.
burnt-resistor - an hour ago

How dare you question the golden goose egg-laying algorithm for trillions in stock valuation!

llm_nerd - 3 hours ago

Does this person know that this is the same GB chip in the DGX Spark? It isn't some proposed thing, it's a chip loads of people have on their desk right now, and there are endless benchmarks of it.

Decent single core (a long way from Apple level, but decent), but it makes up for it in core count to provide M5 level performance, CPU wise. It is kind of starved for memory bandwidth, at 1/6th that of many GPUs.

They got Microsoft to customize Windows for the RTX Spark, and will likely have to brutally throttle it when running as a laptop (it's literally a 140W TDP chip), and that's neat. It's going to be a very expensive laptop.

SwtCyber - 3 hours ago

This is probably the better way to frame it: not "Nvidia is proposing a new CPU system" but "Nvidia is trying to move an existing GB/Spark-class platform into a Windows PC form factor"
Apreche - 3 hours ago

I heard the memory bandwidth is not just slower than on a GPU, as expected, but is significantly slower than Apple’s unified memory.
- MrBuddyCasino - 3 hours ago
  
  CPU/GPU is decent (800 GB/s or so), memory is slowish (300 GB/s or so). Some Apple M are slower, some are faster.
  - dagmx - 3 hours ago
    
    Where did you get those numbers from?
    DGX Spark has a maximum of 273 GB/s bandwidth in ideal scenarios (hard to reach)
    That puts it between an M5 (153) and M5 Pro (307)
    
    MrBuddyCasino - 32 minutes ago
    
    The 900 GB/s is from the NVLink-C2C interconnect, if you were wondering about that. They quote "up to 900 GB/s of bidirectional bandwidth between GPU and CPU".
    Mind you that's not to/from memory, which indeed only has 273 GB/s.
    
    dagmx - 25 minutes ago
    
    Ah I see. But the only C2C equivalent on the Apple side is the UltraFusion which is 2.5TB/s if I recall correctly.
MrBuddyCasino - 3 hours ago

Plus John Carmack has reviewed it, he was not amazed.

cyberziko - 4 hours ago

good to know, hope the price will be affordable, having a PC is becoming a luxury :)

dgellow - 3 hours ago

I’m not sure if you’re aware but there is a supply chain shortage for pretty much everything needed for a PC that isn’t expected to be solved this year or next year. There is no way that can be affordable
crims0n - 3 hours ago

Certainly not in the year of our lord, 2026. Maybe in a few years though.

cryo32 - 3 hours ago

Yeah when laptops are shipping with 8GB and Microsoft is suddenly interested in native apps, nope.

Tech companies have strangled their own market.

thewebguyd - 2 minutes ago

Laptops shipping with less RAM is exactly the reason to be interested in native apps again. Every app being a chrome/EdgeWebView process is the problem.

sometimelurker - 2 hours ago

can't wait until someone figures out how to run Linux on one of these

thrance - 3 hours ago

Will it support Linux?

2OEH8eoCRo0 - 3 hours ago

Are their enterprise orders slowing down? Why use precious maxed out fab capacity on consumer stuff when it could be an enterprise chip?

zamadatix - 3 hours ago

It uses LPDDR5X instead of VRAM and will still sell for a premium while pushing their presence even further in every side of the AI market. This was one area AMD was ahead in and now Nvidia is probably better off making this to compete on that front while still being better off than making a 5090.
- fc417fc802 - 3 hours ago
  
  That doesn't answer the question. If the high margin enterprise GPUs are saturating the fab capacity you wouldn't expect them to be pushing this. But IIRC those all have oodles of integrated HBM at this point so I wonder if fab capacity for that has become a bottleneck.
  - zamadatix - an hour ago
    
    I believe it does - the reasons why are exactly differences like LPDDR5X vs HBM3e. Not every fab is capable of making any type of chip another fab makes. If you can make a product with different chips and still sell it for a premium why would you not just because the fabs for your DC product's chips are busy?
    Looking at it more, I believe the story repeats with the TSMC processes used for the CPU vs chips like GB200 as well.
    Even if none of the above were the case, the question still isn't "why not make the enterprise GPU" it's "why not make the higher margin per chip area product". If the NV1/GB10 take less die space and cost a lot it's not immediately apparent the enterprise GPU actually nets Nvidia more $ per die or not. That's why it's relevant these will be sold at a premium.
dofm - 3 hours ago

It already is an enterprise chip. This is about Microsoft not having the equivalent of an M3 Max or whatever laptop.
And maybe for NVIDIA and MS it is also about them quietly betting that local models are, in fact, going to be good enough for most tasks pretty soon.

jqpabc123 - 4 hours ago

I am not sure how many people will run AI models locally. It still seems like a niche application to me.

I'd say this relates directly to the cost of running AI models remotely.

And we won't know what the actual cost will be until AI vendors recover the huge pile of cash they've dumped into development (plus interest).

chpatrick - 3 hours ago

I think it's niche now because getting the hardware to run it is expensive and the quantized models don't work as well. If those improve then it would be a no-brainer to pay a one-off for the hardware instead of a fortune for API calls.
- dofm - 3 hours ago
  
  I am not really convinced that four-bit quantisation is that bad; almost certainly six bits will be enough. And Google claims that the QAT tech in Gemma, which they are surely using or testing in Gemini, preserves nearly source-model quality while reducing footprint.
  The hardware for 50 tokens per second with a four bit quantisation of Gemma 4 26B or the sparse Qwen 3.6 is not really that expensive: it’s a secondhand M1 Max.
  Beyond that, I agree. I think moving planning tasks to local is a now thing, not that it really has much impact on token spend. I also think many small coding tasks are fully within the grasp of the above two models.
  The main issue right now is that the software landscape is rather confusing, but I reckon uncomplicated Gemma 4 26B QAT support with MTP is a few weeks away.
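
  As a rough check on why such a model fits on older hardware, weight memory is approximately parameter count times bits per weight divided by eight. A small Python sketch for the 26B case at a few bit widths; it counts weights only and ignores KV cache, activations and runtime overhead, so treat the results as a floor rather than a real requirement.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB: params * bits / 8 (weights only)."""
    return params_billion * bits_per_weight / 8


# 26B parameters at 4-bit, 6-bit and unquantized 16-bit precision.
for bits in (4, 6, 16):
    size = weight_footprint_gb(26.0, bits)
    print(f"{bits}-bit: ~{size:.1f} GB of weights")
```

  That works out to about 13 GB at four bits and about 19.5 GB at six bits, against 52 GB at 16-bit.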
- jqpabc123 - 3 hours ago
  
  AI vendors are attempting to offer the whole apple. And they are spending huge sums of money in the process.
  But most businesses don't really care about most of the apple --- they only need their special bite out of it.
  For example, doctors mainly care about medicine. Nvidia is attempting to provide the hardware needed for local, specialized models.
  - dofm - 3 hours ago
    
    I think it is likely to appeal to video and photo editors who want to use AI tools (the press release has a quote from Blackmagic Design, as well as from Adobe, who I think have no stomach for their own cloud AI).
    But I don’t know about specialised: this could run quite large models with MoE.
dgellow - 3 hours ago

Performance of local models is pretty bad compared to what AI vendors offer; token generation is just too slow to be that useful. And you need to allocate GBs of memory, something that will stay very expensive to buy for a long time.
Running local models will stay niche for a while, unless we see breakthroughs
- jqpabc123 - 3 hours ago
  
  Dumb idea --- how about if we limit local models to specific domains --- medicine for example.
  Most doctors don't care much about engineering or accounting or software development or 10000 other things that big vendor models address.
  This area is yet to be really explored. Nvidia aims to provide the hardware to do so.

sisve - 3 hours ago

> I am not sure how many people will run AI models locally. It still seems like a niche application to me.

Bill Gates had a quote some years ago...

People have still not learned how fast we improve our tech and how much cheaper things get I guess :)

dgellow - 3 hours ago

Memory isn’t getting cheap soon, and you need a lot of it for local models
- sisve - 3 hours ago
  
  All depends. The current technology will be cheaper in a year or two. The best cutting edge stuff will probably be even more expensive. But in 10 years' time... we can run current SOTA models (or models that are equally good) on our local hardware.
  - dgellow - 3 hours ago
    
    Ah yes, if you count in decades, for sure I expect to run them locally
chaostheory - 3 hours ago

We had a thing called globalism that drastically reduced costs. Globalism right now is on life support. Given geopolitics, I don’t see how it’s going to survive.