Fast KV Compaction via Attention Matching

arxiv.org

47 points by cbracketdash 9 hours ago


cadamsdotcom - 2 hours ago

Superficially, it sounds like this could create more of a move toward compacting on a continuous basis, or in batches once you hit the context limit, rather than starting fresh with a summary and system prompt.

Feels like high fidelity, fast compaction could be a path to “solving” long context.

speedping - an hour ago

This is big for long-horizon tasks.