Lite^3, a JSON-Compatible Zero-Copy Serialization Format

github.com

58 points by cryptonector 6 days ago


cryptonector - 6 days ago

Lite^3 is a clever encoding for JSON data that is indexed as-encoded and is mutable in place.

Perhaps I should have posted this URI instead: https://lite3.io/design_and_limitations.html

Lite^3 deserves to be noticed by HN. u/eliasdejong (the author) posted it 23 days ago but it didn't get very far. I'm hoping this time it gets noticed.

bawolff - an hour ago

This is cool, but the headline makes it sound like the wire format is JSON-compatible, which is not the case. I'm not really sure why there is a focus on JSON here at all: it's the least interesting part of this, and the same could be said for almost every serialization format.

Jean-Papoulos - 33 minutes ago

This is nice, but please don't clickbait headlines with straight-up lies. This is not JSON-compatible.

al2o3cr - 6 days ago

The docs mention that space for overwritten variable-sized values in the buffer is not reclaimed:

    The overridden space is never recovered, causing buffer size
    to grow indefinitely.

Is the garbage at least zeroed? Otherwise it seems like it could "leak" overwritten values when sending whole buffers via memcpy.
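To make the concern concrete, here is a minimal Python sketch of an append-only buffer in which an overwritten value is abandoned rather than reclaimed. This is not Lite^3's actual layout or API: `AppendOnlyBuffer`, its `slots` index, and the `zero_old` flag are hypothetical names invented for illustration. It only shows that stale bytes stay readable in the raw buffer unless they are explicitly zeroed.

```python
class AppendOnlyBuffer:
    """Toy model: every write appends; old bytes are never reclaimed."""

    def __init__(self) -> None:
        self.data = bytearray()
        self.slots: dict[str, tuple[int, int]] = {}  # key -> (offset, length)

    def set(self, key: str, value: bytes, zero_old: bool = True) -> None:
        old = self.slots.get(key)
        if old is not None and zero_old:
            off, length = old
            # Scrub the abandoned region so it cannot leak when the
            # whole buffer is copied or sent somewhere else.
            self.data[off:off + length] = bytes(length)
        # The new value always goes at the end (hypothetical policy).
        self.slots[key] = (len(self.data), len(value))
        self.data += value

    def get(self, key: str) -> bytes:
        off, length = self.slots[key]
        return bytes(self.data[off:off + length])


leaky = AppendOnlyBuffer()
leaky.set("token", b"hunter2", zero_old=False)
leaky.set("token", b"redacted-value", zero_old=False)
print(b"hunter2" in leaky.data)  # old value is still present in the raw bytes

safe = AppendOnlyBuffer()
safe.set("token", b"hunter2")
safe.set("token", b"redacted-value")
print(b"hunter2" in safe.data)  # old value has been zeroed out
```

Both buffers end up the same size, so zeroing fixes the leak but not the growth; reclaiming the space would need some form of compaction.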

tarasglek - 2 hours ago

The hash collision limitation for keys is the most questionable part of the design. Usually that's handled by forcing key lookup to verify that the key it found matches the key it was asked to look up. The perf hit from that check could probably be avoided by keeping an extra table of conflicting hashes.
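The usual approach mentioned above (verify the stored key after the hash matches) can be sketched in Python as follows. This is a generic illustration, not how Lite^3 stores keys: `toy_hash` is a deliberately weak, hypothetical hash chosen so that collisions are easy to produce, and `HashIndexedMap` is an invented name.

```python
def toy_hash(key: str) -> int:
    # Deliberately weak: sum of the UTF-8 bytes modulo 256,
    # so "ab" and "ba" collide.
    return sum(key.encode()) % 256


class HashIndexedMap:
    """Index entries by hash, but compare the full key on every lookup."""

    def __init__(self) -> None:
        # hash -> list of (key, value) pairs sharing that hash
        self.buckets: dict[int, list[tuple[str, object]]] = {}

    def put(self, key: str, value: object) -> None:
        bucket = self.buckets.setdefault(toy_hash(key), [])
        for i, (stored_key, _) in enumerate(bucket):
            if stored_key == key:  # same key: overwrite in place
                bucket[i] = (key, value)
                return
        bucket.append((key, value))  # new key, possibly a colliding one

    def get(self, key: str) -> object:
        for stored_key, value in self.buckets.get(toy_hash(key), []):
            if stored_key == key:  # the verification step
                return value
        raise KeyError(key)


m = HashIndexedMap()
m.put("ab", 1)
m.put("ba", 2)  # same toy_hash as "ab", different key
print(m.get("ab"), m.get("ba"))
```

The cost is one key comparison per candidate entry, which is the performance hit the comment refers to; a side table holding only the colliding hashes would be one way to confine that cost to the rare case.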

lsb - 3 hours ago

This is super interesting!

Apache Arrow is trying to do something similar, using FlatBuffers to serialize with zero-copy and zero-parse semantics, and an index structure built on top of that.

Would love to see comparisons with Arrow.

rixed - 4 hours ago

So it's not really a serialization format; it's a compact, modifiable, untyped tree that one can therefore send to another machine with the same architecture, or deserialise into native, language-specific data structures.

Don't get me wrong, I find this type of data structure interesting and useful, but it's misleading to call it "serialization", unless my understanding is wrong.

koolala - 3 hours ago

glTF is like this too (or PLY)? Is the main difference the format of their headers? Just by reading the header you can parse the binary data. I'm surprised that BSON and the other binary JSON formats they list don't support reading the memory layout from a header.