Matrix Multiplications on GPUs Run Faster When Given "Predictable" Data (2024)

thonking.ai

95 points by tosh 4 days ago


dan_sbl - 2 hours ago

> For example, when the GPU is fully idle, nvidia-smi tells me that it’s only pulling 88W of power.

I haven't used a non-laptop GPU in some time, but that is a crazy amount of "idle" power consumption. Is this normal for cards like this?
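
For anyone who wants to check their own card's idle draw, here is a minimal sketch that polls nvidia-smi. It assumes nvidia-smi is on PATH and uses its `--query-gpu=power.draw` CSV output; the helper names `parse_power_draw` and `read_power_draw` are made up for illustration, and a GPU that does not report power is mapped to None.

```python
import subprocess

def parse_power_draw(csv_text):
    """Parse one power reading (in watts) per line; unreadable values become None."""
    readings = []
    for line in csv_text.strip().splitlines():
        try:
            readings.append(float(line.strip()))
        except ValueError:
            # Some GPUs report "[N/A]" instead of a number.
            readings.append(None)
    return readings

def read_power_draw():
    """Ask nvidia-smi for the current power draw of every GPU (assumes it is installed)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_power_draw(out)

# Parsing can be tried without a GPU by feeding it sample text:
sample = parse_power_draw("88.12\n[N/A]\n")
```

Calling `read_power_draw()` a few times while the machine is idle, and again under load, gives a rough picture of how far apart the two states are on a given card.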

jetsamflotsam - 40 minutes ago

I feel like many of the comments missed the point or didn't read the article. What I believe the article is saying (and I've read this many times during my PhD, for various reasons) is that the input data distribution affects how many transistor state changes occur during multiplication. Since these switching events account for a large portion of energy loss and heat generation, the clocks won't be throttled as much for certain data patterns.

There was a workshop paper from SC24 that did more experiments around this, I believe. I can't find it now, though.
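
To make the "state changes" idea concrete, here is a toy sketch that counts how many bits differ between consecutive float32 values in a stream. This is only a crude proxy chosen for illustration: real switching activity depends on the multiplier circuits and data paths, not just on the operands' bit patterns, and nothing here measures power. The function names and the choice of random, sorted, and constant inputs are assumptions, not anything taken from the article.

```python
import random
import struct

def float32_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def mean_bit_flips(values):
    """Average Hamming distance between consecutive float32 bit patterns."""
    bits = [float32_bits(v) for v in values]
    flips = [bin(a ^ b).count("1") for a, b in zip(bits, bits[1:])]
    return sum(flips) / len(flips)

rng = random.Random(0)  # fixed seed so the comparison is repeatable
n = 10_000
random_data = [rng.gauss(0.0, 1.0) for _ in range(n)]
sorted_data = sorted(random_data)   # same values, more "predictable" order
constant_data = [1.0] * n           # every value identical

flips_random = mean_bit_flips(random_data)
flips_sorted = mean_bit_flips(sorted_data)
flips_constant = mean_bit_flips(constant_data)

print(f"random:   {flips_random:.2f} bit flips per step")
print(f"sorted:   {flips_sorted:.2f} bit flips per step")
print(f"constant: {flips_constant:.2f} bit flips per step")
```

The constant stream never flips a bit, and the sorted stream flips fewer bits per step than the shuffled one because neighbouring values share their sign, exponent, and leading mantissa bits. That is the same direction of effect described above, shown only at the level of operand bit patterns.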

amelius - 2 hours ago

Sounds like a side channel attack waiting to happen.

jayd16 - 2 hours ago

I can't tell from the blog: is this actually verified, or is it a theory plus numbers showing plausibility?

I could certainly come up with alternative theories about memory compression and prefetching if we were talking about texture reads.

nzach - 3 hours ago

I went in expecting to find 'branch prediction'[0] as the answer, but apparently things are even more complex nowadays.

[0] - https://stackoverflow.com/questions/11227809/why-is-conditio...

gdevenyi - 4 hours ago

People have been noticing the effects of this in local LLM inference. Power limiting seems to improve overall performance!

bitwize - 2 hours ago

It wouldn't surprise me to see some ML algorithm in silico somewhere to select faster matmul paths on favorable data. Yo dawg, I heard you like AI, so we put some AI in your AI so you can infer while you're inferring.

evanjrowley - 2 hours ago

Designing for predictable execution flow is one of the advantages of Tenstorrent hardware.

https://clehaxze.tw/gemlog/2025/04-21-programming-tensotrren...

https://clehaxze.tw/gemlog/2026/01-22-the-real-tenstorrent-t...

https://arxiv.org/html/2604.03279
