6 min read | Saved February 14, 2026
Do you care about this?
This article explores DeepSeek's Engram architecture, which improves large language models by using a lookup table for common N-gram patterns instead of relying solely on neural computation. This approach reduces computational load, enhances knowledge retrieval, and allows models to focus on more complex reasoning tasks.
If you do, here's more
DeepSeek's Engram architecture represents a significant shift in how large language models (LLMs) handle memory. Traditional LLMs expend vast computational resources reconstructing familiar patterns from training data. Engram changes this approach by storing common N-gram patterns in a lookup table, allowing the model to retrieve these patterns with O(1) complexity. This method leads to improved performance on knowledge benchmarks, with notable gains in MMLU (+3.0), CMMLU (+4.0), and reasoning tasks like BBH (+5.0).
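The core idea of retrieving a stored pattern in O(1) instead of recomputing it can be sketched with a hashed embedding table. Everything below is illustrative: the sizes, the hash function, and the names (`ngram_key`, `lookup`, `TABLE_SIZE`) are assumptions for the sketch, not Engram's actual configuration.

```python
import numpy as np

# Illustrative sizes; the article does not specify Engram's real dimensions.
NGRAM_TABLE_SIZE = 4096   # number of slots in the memory table
DIM = 64                  # embedding width

rng = np.random.default_rng(0)
table = rng.standard_normal((NGRAM_TABLE_SIZE, DIM)).astype(np.float32)

def ngram_key(token_ids):
    """Map an N-gram of token ids to one table slot (toy polynomial hash)."""
    h = 0
    for t in token_ids:
        h = (h * 1_000_003 + t) % NGRAM_TABLE_SIZE
    return h

def lookup(token_ids):
    """O(1) retrieval: one hash plus one row read, regardless of model depth."""
    return table[ngram_key(token_ids)]

vec = lookup([17, 42])  # embedding for the bigram (17, 42); shape (DIM,)
```

The point of the sketch is the cost profile: a familiar bigram costs one hash and one memory read, whereas reconstructing the same association through the transformer stack costs a full forward pass.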
Engram's design is rooted in the historical context of N-gram models, which struggled with sparsity. While neural language models generalized patterns, they lost the efficiency of direct lookups. Engram reintroduces this efficiency by using an embedding table that captures frequently accessed N-grams, allowing the model to conserve computational resources for more complex reasoning tasks. It effectively separates factual associations, which benefit from quick retrieval, from reasoning patterns that require deeper computation.
The architecture includes several innovative components. Tokenizer compression reduces the effective vocabulary size, addressing the challenge of storing embeddings for every possible N-gram. Multi-head hashing mitigates hash collisions by combining lookups under distinct hash functions, making retrieval more reliable. Finally, context-aware gating decides when to draw on memory based on how predictable the sequence is. Together, these components improve efficiency while freeing the network's capacity for complex reasoning.
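Multi-head hashing and gating can be sketched together. This is a minimal toy, not Engram's implementation: the per-head multipliers stand in for "distinct hash functions", and the gate is a fixed random vector standing in for a learned, context-aware gate. All names (`multi_head_lookup`, `gated_memory`, `HEADS`) are assumptions for illustration.

```python
import numpy as np

# Illustrative sizes; Engram's real configuration is not given in the article.
HEADS, TABLE_SIZE, DIM = 4, 4096, 64
rng = np.random.default_rng(1)
tables = rng.standard_normal((HEADS, TABLE_SIZE, DIM)).astype(np.float32)
gate_w = rng.standard_normal(DIM).astype(np.float32)  # stand-in for a learned gate

# One multiplier per head: a collision under one hash is unlikely to
# recur under the others, so averaging across heads dilutes bad slots.
MULTS = (1_000_003, 99_991, 31_337, 65_537)

def head_key(token_ids, head):
    h = 0
    for t in token_ids:
        h = (h * MULTS[head] + t) % TABLE_SIZE
    return h

def multi_head_lookup(token_ids):
    """Average one row per head so a single-head collision is diluted."""
    rows = [tables[h, head_key(token_ids, h)] for h in range(HEADS)]
    return np.mean(rows, axis=0)

def gated_memory(hidden, token_ids):
    """Context-aware gating (sketched): a sigmoid scalar decides how much
    retrieved memory to mix into the hidden state at this position."""
    mem = multi_head_lookup(token_ids)
    g = 1.0 / (1.0 + np.exp(-float(hidden @ gate_w)))  # in (0, 1)
    return hidden + g * mem

out = gated_memory(np.zeros(DIM, dtype=np.float32), [3, 5])
```

In this toy, a zero hidden state yields a gate of 0.5; in the described architecture the gate would instead be computed from context, opening for predictable, memorizable spans and closing when the continuation demands fresh computation.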