Improving performance of rav1d video decoder

ohadravid.github.io

288 points by todsacerdoti a day ago


mmastrac - a day ago

The associated issue for comparing two u16s is interesting.

https://github.com/rust-lang/rust/issues/140167

nemothekid - a day ago

Interesting to see this article on the performance advantage of not having to zero buffers after this article 2 days ago: https://news.ycombinator.com/item?id=44032680

brookst - a day ago

Title undersells post; it’s actually 2.3% faster with two good optimizations.

robertknight - a day ago

Good post! The inefficient code for comparing pairs of 16-bit integers was an interesting find.

tialaramex - a day ago

All else being equal, codecs ought to be written in WUFFS† rather than Rust. But I can well imagine that it's a much bigger lift to take something as complicated as dav1d and write the analogous WUFFS than to clean up the c2rust translation; if you said it was a thousand times harder, I'd have no trouble believing that. I just think it's worth it for us as a civilisation.

† Or an equivalent special purpose language, but WUFFS is right there

mbeavitt - a day ago

Haha I was just thinking to myself "I wonder if anyone made any progress on that rav1d bounty yet?"

infogulch - a day ago

You know it's a good post when it starts with a funny meme. Seems related to the recent discussion: $20K Bounty Offered for Optimizing Rust Code in Rav1d AV1 Decoder (memorysafety.org) | 108 comments | https://news.ycombinator.com/item?id=43982238

lubesGordi - a day ago

Honestly it's a little surprising the first optimization he found was something fairly obvious just by using perf. I thought they had discussed the zeroing buffers issue in the first post? The second optimization was definitely more involved/interesting but was still pointed at by perf. Don't underestimate that tool!

saagarjha - 10 hours ago

I am very curious what you did to embed the profiler results into your blog post. Literally copy the HTML nodes?

renewiltord - a day ago

Oh this stuff is what’s prompting the ffmpeg Twitter account to make a stand against Rust https://x.com/ffmpeg/status/1924137645988356437?s=46

smallpipe - a day ago

This is really fun. Is there anything stopping rustc from performing the transmute trick?

Edit: If I had read the next paragraph, I'd have learned about [1] before commenting

[1] https://github.com/rust-lang/rust/issues/140167
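For readers who haven't seen the trick, here is a minimal sketch of the idea, not the actual rav1d code. The `Mv` struct, its field names, and the `eq_fast` helper are assumptions made for illustration: a struct of two 16-bit fields is reinterpreted as one `u32`, so equality becomes a single 32-bit comparison instead of two 16-bit ones.

```rust
// Hypothetical two-field struct, standing in for the kind of type discussed above.
// repr(C) fixes the layout: two i16 fields, 4 bytes total, no padding.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(C)]
struct Mv {
    y: i16,
    x: i16,
}

impl Mv {
    // Illustrative helper (name is an assumption): compare both fields at once.
    fn eq_fast(self, other: Self) -> bool {
        // SAFETY: Mv is exactly 4 bytes with no padding, and every bit
        // pattern is a valid u32, so the reinterpretation is sound.
        // transmute itself refuses to compile if the sizes differ.
        let a: u32 = unsafe { std::mem::transmute(self) };
        let b: u32 = unsafe { std::mem::transmute(other) };
        a == b
    }
}
```

The result should always agree with the derived `==`; whether it is actually faster depends on the compiler version and target, so it is worth checking the generated assembly before relying on it.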

Mr_Eri_Atlov - a day ago

AV1 continues to be the most fascinating development in media encoding.

SVT-AV1-PSY is particularly interesting to read up on as well.

jebarker - a day ago

Beautiful work and nice write-up. Profiling and optimization is absolutely my favorite part of software development.

anon-3988 - a day ago

Is skipping initialization of buffers a hard problem for compilers?
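To make the question concrete, here is a rough sketch of doing it by hand, under made-up names (`SCRATCH_LEN`, `sum_squares_zeroed`, `sum_squares_uninit`); it is not taken from rav1d. To skip the zeroing on its own, the compiler would have to prove that every element is written before it is read, which gets hard once the writes happen in other functions or in hand-written assembly.

```rust
use std::mem::MaybeUninit;

// Hypothetical scratch buffer size, chosen only for this example.
const SCRATCH_LEN: usize = 64;

// Safe version: the whole buffer is zeroed before any of it is used.
fn sum_squares_zeroed(input: &[i16]) -> i64 {
    let mut scratch = [0i16; SCRATCH_LEN];
    let len = input.len().min(SCRATCH_LEN);
    scratch[..len].copy_from_slice(&input[..len]);
    scratch[..len].iter().map(|&v| i64::from(v) * i64::from(v)).sum()
}

// Manual version: the buffer starts uninitialized and only the first
// `len` slots are written, so no zeroing pass is needed.
fn sum_squares_uninit(input: &[i16]) -> i64 {
    let mut scratch = [MaybeUninit::<i16>::uninit(); SCRATCH_LEN];
    let len = input.len().min(SCRATCH_LEN);
    for (slot, &v) in scratch[..len].iter_mut().zip(&input[..len]) {
        slot.write(v);
    }
    scratch[..len]
        .iter()
        .map(|slot| {
            // SAFETY: every slot in ..len was written in the loop above.
            let v = i64::from(unsafe { slot.assume_init_read() });
            v * v
        })
        .sum()
}
```

In a toy like this the optimizer may well remove the zeroing by itself; the manual form only matters when it cannot see that all reads are preceded by writes.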

sylware - 6 hours ago

I don't understand this project. dav1d is 99% assembly (x86_64/risc-v 64bits/etc) with very little simple and plain C as coordinating code.

mdf - a day ago

There's something about real optimization stories that I find fascinating – particularly the detailed ones including step-by-step improvements and profiling to show how numbers got better. In some way, they are satisfying to read.

Nicholas Nethercote's "How to speed up the Rust compiler" writings[1] fall into this same category for me.

Any others?

[1] https://nnethercote.github.io/

IgorPartola - a day ago

AV1 is an amazing codec. I really hope it replaces proprietary codecs like h264 and h265. Its performance is similar to h265's, if not better, while being completely free. Currently on an Intel-based MacBook it is only supported in some browsers; however, it seems that newer video cards from AMD, Nvidia, and Intel do include hardware decoders.

TinkersW - a day ago

Interesting, but mostly just sounds like Rust issues, and requiring some nonsense to fix issues that shouldn't have existed in the first place.

Leading me to the conclusion that Rust is a dubious choice for highly optimized SIMD code.