Fast KV Compaction via Attention Matching
arxiv.org | 47 points by cbracketdash 9 hours ago
Superficially, it sounds like this could push things toward compacting on a continuous basis, or in batches once you hit the context limit, rather than starting fresh with a summary and system prompt.
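To make the "compact in batches once you hit the limit" idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: `BatchCompactingCache`, `mean_pool`, and the `limit`/`batch` parameters are hypothetical names, and the mean-pooling compactor is only a stand-in, not the attention-matching method from the paper.

```python
from typing import Callable, List

Entry = List[float]  # hypothetical stand-in for one cached key/value entry


def mean_pool(block: List[Entry]) -> List[Entry]:
    """Placeholder compactor: averages a block into a single entry.

    This is NOT attention matching; it only marks where a real
    compaction routine would be plugged in.
    """
    dim = len(block[0])
    return [[sum(e[i] for e in block) / len(block) for i in range(dim)]]


class BatchCompactingCache:
    """Compacts the oldest `batch` entries whenever the cache exceeds `limit`."""

    def __init__(self, limit: int, batch: int,
                 compact: Callable[[List[Entry]], List[Entry]] = mean_pool):
        if not 2 <= batch <= limit:
            raise ValueError("need 2 <= batch <= limit")
        self.limit = limit
        self.batch = batch
        self.compact = compact
        self.entries: List[Entry] = []

    def append(self, entry: Entry) -> None:
        self.entries.append(entry)
        if len(self.entries) > self.limit:
            # Compact only the oldest block; recent entries stay untouched,
            # so nothing is restarted from a summary.
            oldest = self.entries[:self.batch]
            rest = self.entries[self.batch:]
            self.entries = self.compact(oldest) + rest


cache = BatchCompactingCache(limit=4, batch=2)
for value in [1.0, 3.0, 5.0, 7.0, 9.0]:
    cache.append([value])
print(cache.entries)
```

The contrast with summarize-and-restart is that the recent entries are kept verbatim and only the oldest block is replaced by a smaller one.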
Feels like high-fidelity, fast compaction could be a path to “solving” long context.
This is big for long-horizon tasks.