Dnotitia Unveils STAR-KV, Achieving UP to 20x KV Cache Compression, Selected as an ICML 2026 Spotlight Paper

Dnotitia Unveils STAR-KV, Achieving UP to 20x KV Cache Compression, Selected as an ICML 2026 Spotlight Paper

Introduces a low-rank-based approach to KV cache compression, one of the key bottlenecks in long-context AI Speeds up attention computation by up to 6.9x and overall generation throughput by up to 3.1x, moving beyond memory savings to faster...

menu
menu