Helldivers 2 devs slash install size from 154GB to 23GB
tomshardware.com · 315 points by doener 7 hours ago
> With their latest data measurements specific to the game, the developers have confirmed the small number of players (11% last week) using mechanical hard drives will witness mission load times increase by only a few seconds in worst cases. Additionally, the post reads, “the majority of the loading time in Helldivers 2 is due to level-generation rather than asset loading. This level generation happens in parallel with loading assets from the disk and so is the main determining factor of the loading time.”
It seems bizarre to me that they'd have accepted such a high cost (150GB+ installation size!) without entirely verifying that it was necessary!
I expect it's a story that'll never get told in enough detail to satisfy curiosity, but it certainly seems strange from the outside for this optimisation to be both possible and acceptable.
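The developers' quoted claim is easy to sanity-check with arithmetic: when asset I/O and level generation overlap, the slower task sets the total load time, so an I/O slowdown only shows up once it exceeds generation time. A minimal model (all timings below are hypothetical, not from the game):

```python
def mission_load_time(asset_io_s: float, level_gen_s: float) -> float:
    """Asset streaming and level generation run in parallel,
    so total load time is whichever of the two is slower."""
    return max(asset_io_s, level_gen_s)

# Hypothetical numbers: level generation dominates on SSD.
ssd = mission_load_time(asset_io_s=8.0, level_gen_s=20.0)   # 20.0 s
hdd = mission_load_time(asset_io_s=24.0, level_gen_s=20.0)  # 24.0 s
# A 3x I/O slowdown costs only 4 extra seconds end to end.
```

This is why "5x slower disk reads" can translate into "a few seconds" of observed difference.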
> It seems bizarre to me that they'd have accepted such a high cost
They’re not the ones bearing the cost. Customers are. And I’d wager very few check the hard disk requirements for a game before buying it. So the effect on their bottom line is negligible while the dev effort to fix it has a cost… so it remains unfixed until someone with pride in their work finally carves out the time to do it.
If they were on the hook for 150GB of cloud storage per player this would have been solved immediately.
The problem they fixed is that they removed a common optimization to get 5x faster loading speeds on HDDs.
That's why they did the performance analysis and referred to their telemetry before pushing the fix. The impact is minimal because their game is already spending an equivalent time doing other loading work, and the 5x I/O slowdown only affects 11% of players (perhaps less now that the game fits on a cheap consumer SSD).
If someone "takes pride in their work" and makes my game load five times longer, I'd rather they go find something else to take pride in.
> The problem they fixed is that they removed a common optimization to get 5x faster loading speeds on HDDs.
Not what happened. They removed an optimization that, in *some other games* (not their game), gave a 5x speed boost.
And they are changing it now because it turned out all of that was bogus: the speed boost wasn't as high for loading the data itself, and a good part of the level load time wasn't even waiting on the disk, but on terrain generation.
5x space is going to be hard to beat, but one should always be careful about hiding behind a tall tent pole like this. I/O isn't free, it's cheap. So if they could generate terrain with no data loading it would likely be a little faster. But someone might find a way to speed up generation and then either think it's pointless or not get the credit they deserve, because by then disk loading is the tall tent pole.
I’ve worked with far too many people who have done the equivalent in non game software and it leads to unhappy customers and salespeople. I’ve come to think of it as a kind of learned helplessness.
> If someone "takes pride in their work" and makes my game load five times longer, I'd rather they go find something else to take pride in.
And others who wish one single game didn't waste 130GB of their disk space, it's fine to ignore their opinions?
They used up a ton more disk space to apply an ill-advised optimization that didn't have much effect. I don't really understand why you'd consider that a positive thing.
By their own industry data (https://store.steampowered.com/news/app/553850/view/49158394...), data duplication yields up to a 5x performance increase when loading from an HDD. There's a reason so many games are huge, and it's not because they're mining your HDD for HDDCoin.
The "problem" is a feature. The "so it remains unfixed until someone with pride in their work finally carves out the time to do it" mindset suggests that they were simply too lazy to ever run fdupes over their install directory, which is simply not the case. The duplication was intentional, and is still intentional in many other games that could but likely won't apply the same data minimization.
I'll gladly take this update because considerable effort was spent on measuring the impact, but not one of those "everyone around me is so lazy, I'll just be the noble hero to sacrifice my time to deduplicate the game files" updates.
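For reference, the fdupes-style scan mentioned above is just content hashing. A minimal sketch of that kind of duplicate finder (the function name and directory layout are my own, illustrative choices):

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_duplicate_files(root: str) -> list[list[Path]]:
    """Group files under `root` by SHA-256 of their contents,
    returning only groups with more than one member."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Real tools like fdupes first bucket by file size so that most files never need to be read at all; the point is that *finding* the duplicates was never the hard part, which is exactly why "they were too lazy to run fdupes" is the wrong explanation.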
> In the worst cases, a 5x difference was reported between instances that used duplication and those that did not. We were being very conservative and doubled that projection again to account for unknown unknowns.
That makes no goddamn sense. I’ve read it three times and to paraphrase Babbage, I cannot apprehend the confusion of thought that would lead to such a conclusion.
A 5x figure gets resources to investigate; it doesn't get assumed correct and then doubled. Orders of magnitude change implementations, as we see here. And it sounds like they just manufactured one out of thin air.
Seems to me that most of these situations follow an 80/20 rule, and it would be worth someone's time to figure out where that split is.
Getting rid of 80% of that duplication in exchange for a 2x instead of a 5x slowdown would be something.
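The 80/20 idea can be made concrete: if you knew each asset's size and copy count, a greedy pass could reclaim most of the wasted bytes by deduplicating only the worst offenders. A sketch with made-up data (the tuple format, names, and numbers are all hypothetical):

```python
def plan_partial_dedup(assets, target_fraction=0.8):
    """Pick assets to deduplicate, worst waste first, stopping once
    `target_fraction` of all duplicated bytes has been reclaimed.
    `assets` is a list of (name, size_bytes, copies) tuples."""
    by_waste = sorted(assets, key=lambda a: a[1] * (a[2] - 1), reverse=True)
    total_waste = sum(size * (copies - 1) for _, size, copies in assets)
    chosen, reclaimed = [], 0
    for name, size, copies in by_waste:
        if reclaimed >= target_fraction * total_waste:
            break
        chosen.append(name)
        reclaimed += size * (copies - 1)
    return chosen, reclaimed

# Hypothetical install: one huge duplicated asset dominates the waste,
# so deduplicating it alone hits the 80% target.
assets = [("terrain_mesh", 100, 3), ("voice_lines", 10, 2), ("ui_icons", 1, 5)]
```

In practice the tradeoff is messier, since the duplicated copies exist to keep related assets physically adjacent on disk, so which copies you drop matters as much as how many.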
I expect better from HN, where most of us are engineers or engineer-adjacent. It's fair to question Arrowhead's priorities but...
too lazy
Really? I think the PC install size probably should have been addressed sooner too, but... which do you think is more likely? That Arrowhead is a whole company full of "lazy" developers who just don't like to work very hard?
Or do you think they had their hands full with other optimizations, bug fixes, and a large amount of new content while running a complex multiplatform live service game for millions of players? (Also consider that management was probably deciding priorities there and not the developers)
I put hundreds of hours into HD2 and had a tremendous amount of fun. It's not the product of "lazy" people...
> They used up a ton more disk space to apply an ill-advised optimization that didn't have much effect.
The optimization was not ill-advised. It is, in fact, an industry standard, and is strongly advised. Their own internal testing revealed that they are one of the supposedly rare cases where this optimization did not have a noticeably positive effect worth the costs.
23 GiB can be cached entirely in RAM on higher-end gaming rigs these days. 154 GiB probably does not fit into many players' RAM when you still want something left for the OS and game. Reducing how much needs to be loaded from slow storage is itself an I/O speedup, and HDDs are not so bad at seeking that you need to go to extreme lengths to avoid it entirely. The only place where such duplication to ensure linear reads may be warranted is optical media.
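To put rough numbers on the seek-vs-transfer point (all figures below are assumed ballpark values for a consumer HDD, not measurements):

```python
def hdd_read_time_s(total_bytes: float, n_seeks: int,
                    seek_s: float = 0.012,
                    throughput_bps: float = 150e6) -> float:
    """Crude HDD load-time model: each seek costs ~12 ms and
    sequential transfer runs at ~150 MB/s (assumed typical values)."""
    return n_seeks * seek_s + total_bytes / throughput_bps

# Reading 23 GB with 5,000 scattered seeks vs. fully sequentially:
scattered  = hdd_read_time_s(23e9, 5_000)  # ~213 s
sequential = hdd_read_time_s(23e9, 0)      # ~153 s
# Seeks add ~60 s here: real, but nowhere near a 5x difference.
```

Under these assumptions, transfer time dominates for bulk loads, which is consistent with the seek penalty mattering less than the industry rule of thumb suggested.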
They used "industry data" to make performance estimations: https://store.steampowered.com/news/app/553850/view/49158394...
> These loading time projections were based on industry data - comparing the loading times between SSD and HDD users where data duplication was and was not used. In the worst cases, a 5x difference was reported between instances that used duplication and those that did not.
Instead of, y'know, running their own game on an HDD.
It's literally "instead of profiling our own app, we profiled the competition's app and made decisions based on that".
They started off with the competitors' data, and then moved on once they had their own data, though? Not sure what y'all are complaining about.
They made an effort to improve the product, but because everything in tech comes with side effects it turned out to be a bad decision which they rolled back. Sounds like highly professional behavior to me by people doing their best. Not everything will always work out, 100% of the time.
And this might finally reverse the trend of games being >100GB, as other teams will be able to point to this decision as a reason not to implement this particular optimization prematurely.
If I’m being charitable, I’m hoping that means the decision was made early in the development process, when concrete numbers were not available. However, the article linked above kinda says they assumed the problem would be twice as bad as the industry numbers, and that’s… that’s not how these things work.
That’s the sort of mistake that leads to announcing a 4x reduction in install size.
>In the worst cases, a 5x difference was reported between instances that used duplication and those that did not.
Never trust a report that highlights the outliers before even discussing the mean. Never trust someone who thinks that is a sane use of statistics. At best they are not very sharp; at worst they are manipulating you.
> We were being very conservative and doubled that projection again to account for unknown unknowns.
Ok, now that's absolutely ridiculous and treats the reader like a complete idiot. "We took the absolute best-case scenario reported by something we read somewhere, and doubled it without a second thought, because why not? Since this took us 5 seconds to do, we went with that until you started complaining."
Making up completely random numbers on the fly would have made exactly the same amount of sense.
Trying to spin this whole thing into "look at how smart we are that we reverted our own completely brain-dead decision" is the cherry on top.
Are you a working software engineer?
I'm sure that whatever project you're assigned to has a lot of optimization stuff in the backlog that you'd love to work on but haven't had a chance to visit because bugfixes, new features, etc. I'm sure the process at Arrowhead is not much different.
For sure, duplicating those assets on PC installs turned out to be the wrong call.
But install sizes were still pretty reasonable for the first 12+ months or so. I think it was ~40-60GB at launch. Not great but not a huge deal and they had mountains of other stuff to focus on.
I’m a working software developer, and when people who make statements like the one GP quoted prove they cannot do better, I get them demoted from the decision-making process, because they aren’t trustworthy and they’re embarrassing the entire team with their lack of critical-thinking skills.
When the documented worst case is 5x you prepare for the potential bad news that you will hit 2.5x to 5x in your own code. Not assume it will be 10x and preemptively act, keeping your users from installing three other games.
Well, then I'd like to work where you work. Hard to find shops that take performance seriously. You hiring?
In my experience it's always been quite a battle to spend time on perf.
I'll happily take a demotion if I make a 10x performance goof like that. As long as I can get promoted eventually if I make enough 10x wins.