Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal Paper • 2508.05988 • Published Aug 8, 2025 • 19
A Simple and Effective L_2 Norm-Based Strategy for KV Cache Compression Paper • 2406.11430 • Published Jun 17, 2024 • 25
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10, 2024 • 111