The article discusses DeepSeek-OCR, an innovative open-source model designed to enhance large language models' ability to process long contexts by converting text into images and treating them as visual tokens. This method significantly reduces computational costs while preserving document structure and meaning, presenting a promising solution for the limitations faced by traditional token-based approaches in handling extensive text.
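The appeal of the approach is that a page of text rendered as an image can be covered by far fewer vision tokens than the text tokens it would otherwise consume. The sketch below illustrates that arithmetic only; the character-per-token rate, patch size, and page dimensions are illustrative assumptions, not DeepSeek-OCR's actual figures.

```python
import math

def text_token_count(n_chars: int, chars_per_token: float = 4.0) -> int:
    """Rough text-tokenizer estimate (~4 chars/token for English)."""
    return math.ceil(n_chars / chars_per_token)

def vision_token_count(width_px: int, height_px: int, patch_px: int = 64) -> int:
    """A ViT-style encoder emits one token per image patch."""
    return math.ceil(width_px / patch_px) * math.ceil(height_px / patch_px)

# A dense page of ~3000 characters, rendered once at 1024x1024 pixels.
text_tokens = text_token_count(3000)                # 750 text tokens
vision_tokens = vision_token_count(1024, 1024)      # 256 vision tokens
print(text_tokens, vision_tokens, round(text_tokens / vision_tokens, 2))
```

Under these toy numbers the same content costs roughly 3x fewer tokens as an image, which is the kind of saving that makes the vision-token route attractive for long contexts.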
The article presents EntropyLong, a novel method for training long-context language models that uses predictive uncertainty to verify the quality of long-range dependencies. The approach constructs training samples by combining original documents with semantically relevant contexts, yielding significant improvements on tasks requiring distant information, as measured on the RULER and LongBench v2 benchmarks. The study highlights the effectiveness of entropy-based verification for improving long-context understanding in language models.
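The core idea of entropy-based verification can be sketched as follows: a candidate context is kept only if conditioning on it measurably reduces the model's predictive uncertainty on the target span. This is a minimal illustration with toy distributions; the threshold value and function names are assumptions, not EntropyLong's actual implementation.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def verifies_dependency(p_without, p_with, threshold=0.5):
    """Accept a candidate context only if it lowers predictive
    entropy on the target span by more than `threshold` nats
    (threshold is an illustrative choice)."""
    return entropy(p_without) - entropy(p_with) > threshold

# Toy next-token distributions: without the candidate context the
# model is uncertain; with it, mass concentrates on one token.
p_without = [0.25, 0.25, 0.25, 0.25]   # H = ln 4, about 1.39 nats
p_with = [0.85, 0.05, 0.05, 0.05]      # much lower entropy
print(verifies_dependency(p_without, p_with))
```

A pipeline built this way would retain only contexts that genuinely help the model predict the distant span, filtering out superficially similar but uninformative passages.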