DeepSeek-V3.2-Exp is an experimental model release that introduces a new sparse attention mechanism intended to make long-context text sequences more efficient to process. It is reported to maintain output quality while improving results on various benchmarks relative to its predecessor, V3.1-Terminus. Instructions for local setup and usage are also provided for the community.
deepseek
sparse-attention
transformer
efficiency
benchmarks