Gzip decompression in 250 lines of Rust

iev.ee

86 points by vismit2000 4 days ago


stgn - 5 hours ago

> so i wrote a gzip decompressor from scratch

After skimming through the author's Rust code, it appears to be a fairly straightforward port of puff.c (included in the zlib source): https://github.com/madler/zlib/blob/develop/contrib/puff/puf...

nayuki - 5 hours ago

Just like that author, many years ago, I went through the process of understanding the DEFLATE compression standard and producing a short and concise decompressor for gzip+DEFLATE. Here are the resources I published as a result of that exploration:

* https://www.nayuki.io/page/deflate-specification-v1-3-html

* https://www.nayuki.io/page/simple-deflate-decompressor

* https://github.com/nayuki/Simple-DEFLATE-decompressor

Lerc - 4 hours ago

The function

  fn bits(&mut self, need: i32) -> i32 { ....

Put me in mind of one of my early experiments in Rust. It would be interesting to compare an iterator-based form that just called .take(need)

I haven't written a lot of Rust, but one thing I did was to write an iterator that took an iterator of bytes as input and provided bits as output. Then used an iterator that gave bytes from a block of memory.

It was mostly a test to see how much of an imprint the high-level abstraction left on the compiled code.

The disassembly showed it pulling in 32 bits at a time and shifting out the bits pretty much the same way I would have written it in ASM.

I was quite impressed. Although I tested that it was working by counting the bits, and someone criticized it for not using popcount, so I guess you can't have everything.
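
For anyone curious what that iterator-based shape might look like, here is a minimal sketch, not the original experiment and not the article's code. The names `Bits` and `read_bits` are made up for illustration. It assumes bits are delivered least-significant first (the order DEFLATE uses within a byte) and that `need` is at most 32.

```rust
/// Hypothetical adapter: wraps any iterator of bytes and yields its bits,
/// least-significant bit of each byte first.
struct Bits<I: Iterator<Item = u8>> {
    bytes: I,
    current: u8,   // byte currently being shifted out
    remaining: u8, // number of bits still unread in `current`
}

impl<I: Iterator<Item = u8>> Bits<I> {
    fn new(bytes: I) -> Self {
        Bits { bytes, current: 0, remaining: 0 }
    }
}

impl<I: Iterator<Item = u8>> Iterator for Bits<I> {
    type Item = bool;

    fn next(&mut self) -> Option<bool> {
        if self.remaining == 0 {
            // Refill from the underlying byte iterator; stop when it is empty.
            self.current = self.bytes.next()?;
            self.remaining = 8;
        }
        let bit = self.current & 1 == 1;
        self.current >>= 1;
        self.remaining -= 1;
        Some(bit)
    }
}

/// Pulls `need` bits with `.take(need)` and packs them into an integer,
/// first bit in the lowest position. Returns None if the input ends early.
/// Assumption: `need` <= 32.
fn read_bits<I: Iterator<Item = bool>>(bits: &mut I, need: usize) -> Option<u32> {
    let mut got = 0;
    let mut value = 0u32;
    for (i, bit) in bits.by_ref().take(need).enumerate() {
        value |= (bit as u32) << i;
        got += 1;
    }
    if got == need { Some(value) } else { None }
}
```

Usage would be something like `let mut bits = Bits::new(data.iter().copied());` followed by `read_bits(&mut bits, 3)`. Whether the optimizer turns this into the same code as a hand-written shift register depends on the compiler version and flags, so check the disassembly rather than trusting the sketch.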

carlos256 - 3 hours ago

> the only flag we care about is FNAME

The specification does not define an encoding for the file name. Different file systems may impose restrictions on certain names, so FNAME should not be used.
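
A rough sketch of what "not using it" could look like: walk past the optional header fields without ever interpreting the name. This is not the article's code; `skip_gzip_header` is a hypothetical helper, and the flag bits and field order are the ones I remember from RFC 1952, so verify them against the spec before relying on this.

```rust
// Flag bits in the FLG byte of the gzip header (per RFC 1952).
const FHCRC: u8 = 0x02;
const FEXTRA: u8 = 0x04;
const FNAME: u8 = 0x08;
const FCOMMENT: u8 = 0x10;

/// Hypothetical helper: returns the offset where the DEFLATE stream starts,
/// or None if the header is malformed or truncated.
/// The file name and comment are skipped as opaque bytes, never decoded.
fn skip_gzip_header(data: &[u8]) -> Option<usize> {
    // Fixed 10-byte part: magic 1f 8b, compression method 8 (DEFLATE),
    // flags, 4-byte mtime, extra flags, OS.
    if data.len() < 10 || data[0] != 0x1f || data[1] != 0x8b || data[2] != 8 {
        return None;
    }
    let flags = data[3];
    let mut pos = 10;

    if flags & FEXTRA != 0 {
        // Two-byte little-endian length, followed by that many bytes.
        let xlen = u16::from_le_bytes([*data.get(pos)?, *data.get(pos + 1)?]) as usize;
        pos += 2 + xlen;
    }
    // FNAME, then FCOMMENT: each is a zero-terminated byte string.
    for flag in [FNAME, FCOMMENT] {
        if flags & flag != 0 {
            let len = data.get(pos..)?.iter().position(|&b| b == 0)?;
            pos += len + 1; // skip the string and its terminator
        }
    }
    if flags & FHCRC != 0 {
        pos += 2; // header CRC16, not verified in this sketch
    }
    if pos <= data.len() { Some(pos) } else { None }
}
```

A decompressor built this way writes to whatever output path the caller chose, so nothing from the archive ever reaches the file system as a name.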

MisterTea - 5 hours ago

> twenty five thousand lines of pure C not counting CMake files. ...

Keep in mind this is also 31 years of cruft and lord knows what.

Plan 9 gzip is 738 lines total:

  gzip.c 217 lines
  gzip.h 40 lines
  zip.c  398 lines
  zip.h  83 lines

Even the zipfs file server that mounts zip files as file systems is 391 lines.

edit - post a link to said code: https://github.com/9front/9front/tree/front/sys/src/cmd/gzip

> ... (and whenever working with C always keep in mind that C stands for CVE).

Sigh.

up2isomorphism - 5 hours ago

Another dev who doesn’t show respect for what has been done and expects a particular language to do wonders for him. Also, I don’t see that this is much better in terms of readability.

jeffrallen - 5 hours ago

But probably without any error checking.

Feels like Rust culture inherited "throw and forget" as an error handling "strategy" from Java

Sigh.