Nvidia Stock Crash Prediction
entropicthoughts.com
339 points by todsacerdoti 9 hours ago
This article goes more into the technical analysis of the stock rather than the underlying business fundamentals that would lead to a stock dump.
My 30k ft view is that the stock will inevitably slide as AI datacenter spending goes down. Right now Nvidia is flying high because datacenters are breaking ground everywhere but eventually that will come to an end as the supply of compute goes up.
The counterargument to this is that the "economic lifespan" of an Nvidia GPU is 1-3 years depending on where it's used, so there's a case to be made that Nvidia will always have customers coming back for the latest and greatest chips. The problem I have with this argument is that it's simply unsustainable to spend that much every 2-3 years, and we're already seeing this as Google and others extend the depreciation schedules of their GPUs to something like 5-7 years.
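Rough back-of-the-envelope sketch of why the depreciation schedule matters (straight-line method, all numbers hypothetical):

    # Hypothetical: a hyperscaler with $30B of GPUs, straight-line depreciation.
    capex = 30e9

    for useful_life_years in (3, 6):
        annual_expense = capex / useful_life_years
        print(f"{useful_life_years}-year schedule: "
              f"${annual_expense / 1e9:.1f}B hits earnings per year")

    # 3-year schedule: $10.0B hits earnings per year
    # 6-year schedule: $5.0B hits earnings per year
    # Stretching the schedule halves the annual expense, and it also implies
    # you don't plan to replace the fleet nearly as often.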
I hear your argument, but short of major algorithmic breakthroughs I am not convinced the global demand for GPUs will drop any time soon. Of course I could easily be wrong, but regardless I think the most predictable cause for a drop in the NVIDIA price would be that the CHIPS act/recent decisions by the CCP lead a Chinese firm to bring to market a CUDA-compatible and reliable GPU at a fraction of the cost. It should be remembered that NVIDIA's /current/ value is based on their being locked out of their second-largest market (China) with no investor expectation of that changing in the future. Given the current geopolitical landscape, in the hypothetical case where a Chinese firm markets such a chip, we should expect that US firms would be prohibited from purchasing it, while it's less clear that Europeans or Saudis would be. Even so, if NVIDIA were not to lower their prices at all, US firms would be at a tremendous cost disadvantage while their competitors would no longer have one with respect to compute.
All hypothetical, of course, but to me that's the most convincing bear case I've heard for NVIDIA.
Doesn't even necessarily need to be CUDA compatible... there's OpenCL and Vulkan as well, and China will likely throw enough resources at the problem to bring various libraries into closer alignment and ease use/development.
I do think China is still 3-5 years from being really competitive, but even if they hit 40-50% of NVidia's performance, depending on pricing and energy costs, they could still make significant inroads with legal pressure/bans, etc.
People will want more GPUs, but will they be able to fund them? At what point do the venture capital and loans run out? People will not keep pouring hundreds of billions into this if the returns don't start coming.
Not that locked out: https://www.cnbc.com/2025/12/31/160-million-export-controlle...
> short of major algorithmic breakthroughs I am not convinced the global demand for GPUs will drop any time soon
Or, you know, when LLMs don't pay off.
Even if LLMs didn't advance at all from this point onward, there's still loads of productive work that could be optimized / fully automated by them, at no worse output quality than the low-skilled humans we're currently throwing at that work.
Inference requires a fraction of the power that training does. According to the Villalobos paper, the median date for running out of training data is 2028. At some point we won't be training bigger and bigger models every month. We will run out of additional material to train on, things will continue commodifying, and then the amount of training happening will significantly decrease unless new avenues open for new types of models. But our current LLMs are much more compute-intensive than any other type of generative or task-specific model.
> We will run out of additional material to train on
This sounds a bit silly. More training will generally result in better modeling, even for a fixed amount of genuine original data. At current model sizes, it's essentially impossible to overfit to the training data so there's no reason why we should just "stop".
You'd be surprised how quickly improvement of autoregressive language models levels off with epoch count. Diffusion language models, on the other hand, do keep benefiting for much longer, so in that case an architecture switch would be in order first.
I'm just talking about text generated by human beings. You can keep retraining with more parameters on the same corpus
Inference leans heavily on GPU RAM and RAM bandwidth for the decode phase, where an increasingly greater share of time is being spent as people find better ways to leverage inference. So NVIDIA's customers will arguably demand a different product mix as the market shifts away from the current training-friendly products. I suspect there will be more than enough demand for inference that whatever power is freed by a relative slackening of training demand will be more than made up, and then some, by the power demand of a large inference market.
It isn’t the panacea some make it out to be, but there is obvious utility here to sell. The real argument is shifting towards the pricing.
How much of the current usage is productive work that's worth paying for vs personal usage / spam that would just drop off after usage charges come in? I imagine flooding youtube and instagram with slop videos would reduce if users had to pay fair prices to use the models.
The companies might also downgrade the quality of the models to make it more viable to provide as an ad supported service which would again reduce utilisation.
For any "click here and type into a box" job for which you'd hire a low-skilled worker and give them an SOP to follow, you can have an LLM-ish tool do it.
And probably for the slightly more skilled email jobs that have infiltrated nearly all companies too.
Is that productive work? Well if people are getting paid, often a multiple of minimum wage, then it's productive-seeming enough.
Exactly, the current spend on LLMs is based on extremely high expectations and the vendors operating at a loss. It’s very reasonable to assume that those expectations will not be met, and spending will slow down as well.
Nvidia’s valuation is based on the current trend continuing and even increasing, which I consider unlikely in the long term.
> Nvidia’s valuation is based on the current trend continuing
People said this back when Folding@Home was dominated by Team Green years ago. Then again when GPUs sold out for the cryptocurrency boom, and now again that Nvidia is addressing the LLM demand.
Nvidia's valuation is backstopped by the fact that Russia, Ukraine, China and the United States are all tripping over themselves for the chance to deploy it operationally. If the world goes to war (which is an unfortunate likelihood) then Nvidia will be the only trillion-dollar defense empire since the DoD's Last Supper.
China is restricting purchases of H200s. The strong likelihood is that they're doing this to promote their own domestic competitors. It may take a few years for those chips to catch up and enter full production, but it's hard to envision any "trillion dollar" Nvidia defense empire once that happens.
It's very easy to envision. America needs chips, and Intel can't do most of this stuff.
Intel makes GPUs.
Intel's GPU designs make AMD look world-class by comparison. Outside of transcode applications, those Arc cards aren't putting up a fight.
They already are paying off. The nature of LLMs means that they will require expensive, fast hardware that's a large capex.
They aren’t yet because the big providers that paid for all of this GPU capacity aren’t profitable yet.
They continually leapfrog each other and shift customers around, which indicates that the current capacity is already higher than what is required for what people actually pay for.
Google, Amazon, and Microsoft aren’t profitable?
I assume the reference was that AI use cases are not profitable. Those companies are subsidizing them, and OpenAI/Grok are burning money.
Yeah, but OpenAI is adding ads this year for the free versions, which I'm guessing is most of their users. They are probably betting on taking a big slice of Google's advertising monopoly-pie (which is why Google is also now all-in on forcing Gemini, opt-out, into every product they own; they can see the writing on the wall).
Google, Amazon, and Microsoft do a lot of things that aren't profitable in themselves. There is no reason to believe a company will kill a product line just because it makes a loss. There are plenty of other reasons to keep it running.
Do you think it's odd that you only listed companies with already-existing revenue streams, and not companies that started with, and only have, generative algos as their product?
Aren't all Microsoft products OpenAI based? OpenAI has always been burning money.
How many business units have Google and Microsoft shut down or ceased investment for being unprofitable?
I hear Meta is having massive VR division layoffs…who could have predicted?
Raw popularity does not guarantee sustainability. See: Vine, WeWork, MoviePass.
NVIDIA stock tanked in 2025 when people learned that Google used TPUs to train Gemini, which everyone in the community has known since at least 2021. So I think it's very likely that NVIDIA stock could crash for non-rational reasons.
edit: 2025* not 2024
It also tanked to ~$90 when Trump announced tariffs on all goods for Taiwan except semiconductors.
I don't know if that's non-rational, or if people can't be expected to read the second sentence of an announcement before panicking.
The market is full of people trying to anticipate how other people are going to react and exploit that by getting there first. There's a layer aimed at forecasting what that layer is going to do as well.
It's guesswork all the way down.
Personally, I try to predict how others are going to predict that yet others will react.
This was also on top of claims (Jan 2025) that Deepseek showed that "we don't actually need as much GPU, thus NVidia is less needed"; at least it was my impression this was one of the (now silly-seeming) reasons NVDA dropped then.
> I don't know if that's non-rational, or if people can't be expected to read the second sentence of an announcement before panicking.
These days you have AI bots doing sentiment-based trading.
If you ask me... all these excesses are a clear sign for one thing, we need to drastically rein in the stonk markets. The markets should serve us, not the other way around.
Google did not use TPUs for literally every bit of compute that led to Gemini. GCP has millions of high-end Nvidia GPUs, and programming for them is an order of magnitude easier, even for Googlers.
Any claim from Google that all of Gemini (including previous experiments) was trained entirely on TPUs is a lie. What they are truthfully saying is that the final training run was done entirely on TPUs. The market shouldn't react heavily to this, but should instead react positively to the fact that Google is now finally selling TPUs externally and their fab yields are better than expected.
> including all previous experiments
How far back do you go? What about experiments into architecture features that didn’t make the cut? What about pre-transformer attention?
But more generally, why are you so sure that the team that built Gemini didn't exclusively use TPUs while they were developing it?
I think that one of the reasons Gemini caught up so quickly is that they have so much compute at a fraction of the price everyone else pays.
Why should it not react heavily? What’s stopping this from being a start of a trend for google and even Amazon?
I really don't understand the argument that nvidia GPUs only work for 1-3 years. I am currently using A100s and H100s every day. Those aren't exactly new anymore.
It’s not that they don’t work. It’s how businesses handle hardware.
I worked at a few data centers on and off in my career. I got lots of hardware for free or on the cheap simply because the hardware was considered "EOL" after about 3 years, often when the support contracts with the vendor end.
There are a few things to consider.
Hardware that ages produces more errors, and those errors cost, one way or another.
Rack space is limited. A perfectly fine machine that consumes 2x the power for half the output still has a cost. It's cheaper to replace a perfectly fine working system simply because the new one performs better per watt in the same space.
Lastly, there are tax implications in buying new hardware that can often favor replacement.
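To make the power-per-output point concrete, here's a rough sketch with made-up numbers (one new box replacing two older ones at the same total throughput):

    # Hypothetical: new box does the work of two old boxes at the same 10 kW.
    new_box_cost = 250_000.0      # USD, made-up
    power_price = 0.10            # USD per kWh, made-up
    hours_per_year = 24 * 365

    power_saved_kw = 2 * 10 - 10  # retire two 10 kW boxes, add one 10 kW box
    yearly_savings = power_saved_kw * hours_per_year * power_price

    print(f"Power savings: ${yearly_savings:,.0f}/year")
    print(f"Payback on power alone: {new_box_cost / yearly_savings:.1f} years")
    # ~$8,760/year saved, so roughly 28 years to pay back on electricity alone.
    # The real win is the freed rack slot and the extra output per rack, which
    # is exactly why perf per watt in the same space drives the upgrade.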
I’ll be so happy to buy an EOL H100!
But no, there are none to be found; it's a four-year-old, two-generations-old part at this point, and you can't buy one used for cheaper than new.
Well demand is so high currently that it's likely this cycle doesn't exist yet for fast cards.
For servers, I've seen the slightly used equipment sold in bulk to a bidder, who may have a single large client buy all of it.
Then, around the time the second cycle comes around, it's split up into lots and a bunch ends up at places like eBay.
Yea, looking at the 60-day moving average on computeprices.com, H100s have actually gone UP in cost recently, at least to rent.
A lot of demand out there for sure.
There’s plenty on eBay? But at the end of your comment you say “a rate cheaper than new” so maybe you mean you’d love to buy a discounted one. But they do seem to be available used.
> so maybe you mean you’d love to buy a discounted one
Yes. I'd expect 4 year old hardware used constantly in a datacenter to cost less than when it was new!
(And just in case you did not look carefully, most of the eBay listings are scams. The actual products pictured in those are A100 workstation GPUs.)
Not sure why this "GPUs obsolete after 3 years" gets thrown around all the time. Sounds completely nonsensical.
Especially since AWS still has p4 instances with A100s that are 6 years old. Clearly even for hyperscalers these have a useful life longer than 3 years.
I agree that there is a lot of hyperbole thrown around here, and it's possible to keep using some hardware for a long time or to sell it and recover some cost, but my experience planning compute at large companies is that spending money on new hardware and upgrading can often save money long term.
Even assuming your compute demands stay fixed, it's possible that a future generation of accelerator will be sufficiently more power/cooling efficient for your workload that it is a positive return on investment to upgrade, more so when you take into account that you can start depreciating them again.
If your compute demands aren't fixed, you have to work around limited floor space/electricity/cooling capacity/network capacity/backup generators/etc., and so moving to the next generation is required to meet demand without extremely expensive (and often slow) infrastructure projects.
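As a quick illustration of the fixed-facility constraint (entirely hypothetical numbers):

    # Hypothetical: the building's power envelope is fixed by the substation
    # and cooling plant, so capacity only grows via perf-per-watt.
    site_power_mw = 20.0
    old_gen_tflops_per_kw = 50.0     # made-up
    new_gen_tflops_per_kw = 125.0    # made-up (~2.5x perf/watt uplift)

    old_capacity = site_power_mw * 1000 * old_gen_tflops_per_kw  # TFLOPS
    new_capacity = site_power_mw * 1000 * new_gen_tflops_per_kw  # TFLOPS

    print(f"Same building, old gen: {old_capacity / 1e6:.1f} EFLOPS")
    print(f"Same building, new gen: {new_capacity / 1e6:.1f} EFLOPS")
    # 1.0 EFLOPS vs 2.5 EFLOPS out of the same walls, power feed, and cooling.
    # When demand outgrows the old number, swapping accelerators is usually
    # faster than permitting and building a new facility.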
Sure, but I don't think most people here are objecting to the obvious "3 years is enough for enterprise GPUs to become totally obsolete for cutting-edge workloads" point. They're just objecting to the rather bizarre notion that the hardware itself might physically break in that timeframe. Now, it would be one thing if that notion was supported by actual reliability studies drawn from that same environment - like we see for the Backblaze HDD lifecycle analyses. But instead we're just getting these weird rumors.
It's because they run 24/7 in a challenging environment. They will start dying at some point and if you aren't replacing them you will have a big problem when they all die en masse at the same time.
These things are like cars: they don't last forever and they break down with usage. Yes, they can last 7 years in your home computer when you run them 1% of the time. They won't last that long in a data center where they are running 90% of the time.
A makeshift cryptomining rig is absolutely a "challenging environment," and the vast majority of GPUs that went through that are just fine. The idea that the hardware might just die after 3 years' usage is bonkers.
Crypto miners undervolt GPUs for efficiency, and in general crypto mining is extremely lightweight on GPUs compared to AI training or inference at scale.
With good enough cooling they can run indefinitely! The vast majority of failures are either at the beginning due to defects or at the end due to cooling. The idea that hardware with no moving parts (except the HVAC) is somehow unreliable comes out of thin air!
Do you know how support contract lengths are determined? Seems like a path to force hardware refreshes with boilerplate failure data carried over from who knows when.
The common factoid raised in financial reports is that GPUs used in model training degrade thermally because of their high utilization, and ostensibly fail. I have heard anecdotal reports of GPUs used for cryptocurrency mining showing similar wear patterns.
I have not seen hard data, so this could be an oft-repeated but false claim.
It's the opposite, actually: most GPUs used for mining are run at a consistent temperature and load, which is good for long-term wear. Peaky loads, where the GPU goes from cold to hot and back, lead to more degradation because of changes in thermal expansion. This has been known for some time now.
That is a commonly repeated idea, but it doesn't take into account the countless token farms that are smaller than a datacenter. Basically anything from a single motherboard with 8 cards to a small shed full of rigs, all of which tend to disregard common engineering practices and run hardware into the ground to maximize output until the next police raid or difficulty bump. There are plenty of photos on the internet of crappy rigs like that, and no one can guarantee where any given GPU came from.
Another commonly forgotten issue is that many electrical components are rated by hours of operation, and cheaper boards tend to have components with smaller margins. That rated lifetime is actually a curve, where hours decrease with higher temperature. There have been instances of batches of cards failing due to failing MOSFETs, for example.
While I'm sure there are small amateur setups done poorly that push cards to their limits, this seems like a rarer and less efficient approach. GPUs (even used) are expensive, and running them at maximum would mean large costs and the time to replace them regularly, not to mention the increased cost of cooling and power.
Not sure I understand the police raid mention - why are the police raiding amateur crypto mining setups?
I can totally see cards used by casual amateurs being very worn / used though - especially your example of single-mobo miners who were likely also using the card for gaming and other tasks.
I would imagine that anyone purposely running hardware into the ground would be running cheaper / more efficient ASICs rather than expensive Nvidia GPUs, since they are much easier and cheaper to replace. I would still be surprised, however, if most were not prioritising temps and cooling.
Let's also not forget the set of miners who either overclock or don't really care about the long term in how they set up their thermals.
Miners usually don't overclock though. If anything underclocking is the best way to improve your ROI because it significantly reduces the power consumption while retaining most of the hashrate.
Exactly - more specifically undervolting. You want the minimum volts going to the card with it still performing decently.
Even in amateur setups the amount of power used is a huge factor (because of the huge draw from the cards themselves and AC units to cool the room) so minimising heat is key.
From what I remember most cards (even CPUs as well) hit peak efficiency when undervolted and hitting somewhere around 70-80% max load (this also depends on cooling setup). First thing to wear out would probably be the fan / cooler itself (repasting occasionally would of course help with this as thermal paste dries out with both time and heat)
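Rough sketch of the efficiency curve being described, with made-up numbers:

    # Hypothetical points: hashrate per watt usually peaks well below the
    # stock power limit, which is why miners undervolt / power-limit cards.
    settings = {
        # power_watts: hashrate_mh_per_s
        320: 100.0,   # stock
        240: 95.0,    # undervolted / power-limited
        180: 82.0,    # aggressively limited
    }

    for watts, mh in settings.items():
        print(f"{watts:>3} W -> {mh:5.1f} MH/s, {mh / watts * 1000:5.1f} kH/s per W")

    # 320 W -> 100.0 MH/s, 312.5 kH/s per W
    # 240 W ->  95.0 MH/s, 395.8 kH/s per W
    # 180 W ->  82.0 MH/s, 455.6 kH/s per W
    # Giving up ~5% hashrate for ~25% less power is an easy win when
    # electricity (and cooling) is the dominant operating cost.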
The only amateurs I know doing this are trying to heat their garage for free. So long as the heat gain pays for itself, they can afford to heat an otherwise unheated building.
Wouldn't the exact same considerations apply to AI training/inference shops, seeing as gigawatts are usually the key constraint?
> I have heard anecdotal reports of GPUs used for cryptocurrency mining having similar wear patterns.
If this were anywhere close to a common failure mode, I'm pretty sure we'd know that already, given how crypto mining GPUs were usually run to the max in makeshift settings with woefully inadequate cooling and environmental control. The overwhelming anecdotal evidence from people who have bought them is that even a "worn" crypto GPU is absolutely fine.
I can't confirm that fact - but it's important to acknowledge that consumer usage is very different from the high continuous utilization in mining and training. It is plausible that the wear on cards under such extreme usage is as high as reported, considering that consumers may use their cards at most maybe 5% of waking hours; if the endurance loss from running near 100% is only about 3x, that is a believable scale.
1-3 years is too short, but they aren't making new A100s anymore. There are 8 in a server, and when one goes bad, what do you do? You won't be able to renew a support contract. If you want to DIY, eventually you have to start consolidating pick-and-pulls. Maybe the vendors will buy them back from people who want to upgrade and resell them. This is the issue we are seeing with A100s, and we are trying to see what our vendor will offer for support.
They're no longer energy competitive, i.e. they draw more power per unit of compute than what's available now.
It's like if your taxi company bought taxis that were more fuel efficient every year.
Margins are typically not so razor thin that you cannot operate with technology from one generation ago. 15 vs 17 mpg is going to add up over time, but for a taxi company it's probably not a lethal situation to be in.
At least with crypto mining this was the case. Hardware from 6 months ago is useless ewaste because the new generation is more power efficient. All depends on how expensive the hardware is vs the cost of power.
Tell that to the airline industry
And yet they aren't running planes and engines all from 2023 or later. See the MD-11 that crashed in Louisville: nobody has made a new MD-11 in over 20 years. Planes move to less competitive routes, change carriers, and eventually might even stop carrying people and switch to cargo, but the plane itself doesn't drop to zero value when the new one comes out. An airline will want to replace its planes, but a new plane isn't fully amortized in a year or three: it still has value for quite a while.
I don't think the airline industry is a great example from an IT perspective, but I agree with regard to the aircraft.
If a taxi company did that every year, they'd be losing a lot of money. Of course new cars and cards are cheaper to operate than old ones, but is that difference enough to offset buying a new one every one to three years?
>If a taxi company did that every year, they'd be losing a lot of money. Of course new cars and cards are cheaper to operate than old ones, but is that difference enough to offset buying a new one every one to three years?
That's where the analogy breaks down. There are massive efficiency gains from new process nodes, which new GPUs use. Efficiency improvements for cars are glacial, aside from "breakthroughs" like hybrids/EVs.
>offset buying a new one every one to three years?
Isn't that precisely how leasing works? Also, don't companies prefer not to own hardware for tax purposes? I've worked for several places where they leased compute equipment with upgrades coming at the end of each lease.
Who wants to buy GPUs that were redlined for three years in a data center? Maybe there's a market for those, but most people already seem wary of lightly used GPUs from other consumers, let alone GPUs that were burning in a crypto farm or AI data center for years.
> Who wants to buy
Who cares? That's the beauty of the lease. Once it's over, the old and busted gets replaced with the new and shiny. What the leasing company does with it is up to them. It becomes one of those YP-not-an-MP (your problem, not my problem) situations with deprecated equipment.
The leasing company cares - the lease terms depend on the answer. That is why I can lease a car for 3 years for the same payment as a 6-year loan (more or less): the lease company expects someone will want it afterward. If there is no market for it afterward, they will still lease it, but the cost goes up.
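A minimal sketch of that lease math, ignoring interest and using made-up numbers:

    # Hypothetical: the lease only has to cover depreciation down to the
    # residual value the lessor expects to recover by reselling.
    price = 40_000.0
    residual_after_3y = 20_000.0   # expected resale value, made-up

    loan_monthly = price / (6 * 12)                         # repay the full price
    lease_monthly = (price - residual_after_3y) / (3 * 12)  # repay only the depreciation

    print(f"6-year loan:  ${loan_monthly:,.0f}/month")
    print(f"3-year lease: ${lease_monthly:,.0f}/month")
    # Both land around $556/month. If nobody wants the used asset, the residual
    # drops and the lease payment has to rise - same logic for GPUs that were
    # run hard in a data center for three years.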
Depends on the price, of course. I'm wary of paying 50% of new for something run hard for 3 years. It seems an NVIDIA H100 is going for $20k+ on eBay. I'm not taking that risk.
That works either because someone wants to buy the old hardware from the manufacturer/lessor, or because the hardware is EOL in 3 years but it's easier to let the lessor deal with recycling / valuable parts recovery.
If your competitor refreshes their cards and you don't, they will win on margin.
You kind of have to.
Not necessarily if you count capital costs vs operating costs/margins.
Replacing cars every 3 years vs a couple % in efficiency is not an obvious trade off. Especially if you can do it in 5 years instead of 3.
You highlight the exact dilemma.
Company A has taxis that are 5 percent less efficient and for the reasons you stated doesn't want to upgrade.
Company B just bought new taxis, and they are undercutting company A by 5 percent while paying their drivers the same.
Company A is no longer competitive.
The debt company B took on to buy those new taxis means they're no longer competitive either if they undercut by 5%.
The scenario doesn't add up.
But Company A also took on debt for theirs, so that's a wash. You assume only one of them has debt to service?