Links
HGMem is a framework that improves the ability of large language models to tackle sense-making questions by using hypergraph-based memory structures. It adapts its memory dynamically to the specific question, outperforming traditional retrieval-augmented generation (RAG) methods when no direct answer exists in the source documents.
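The core idea of a hypergraph memory is that one edge can link an arbitrary set of entities, so a single "fact" can connect more than two things at once, unlike a plain knowledge-graph triple. A minimal sketch of that structure (the class and method names here are hypothetical illustrations, not HGMem's actual API):

```python
from collections import defaultdict

class HypergraphMemory:
    """Toy hypergraph memory: hyperedges link arbitrary sets of entities."""

    def __init__(self):
        self.edges = {}                    # edge_id -> (label, frozenset of entities)
        self.incidence = defaultdict(set)  # entity -> set of edge_ids it appears in

    def add(self, edge_id, label, entities):
        members = frozenset(entities)
        self.edges[edge_id] = (label, members)
        for entity in members:
            self.incidence[entity].add(edge_id)

    def related(self, entity):
        # Collect every entity co-occurring with `entity` in some hyperedge.
        out = set()
        for edge_id in self.incidence[entity]:
            out |= self.edges[edge_id][1]
        out.discard(entity)
        return out

mem = HypergraphMemory()
mem.add("e1", "meeting", {"alice", "bob", "carol"})
mem.add("e2", "email", {"bob", "dave"})
print(sorted(mem.related("bob")))  # ['alice', 'carol', 'dave']
```

A question-specific memory could then be built by expanding outward from the entities mentioned in the query via `related`, rather than retrieving flat text chunks as in standard RAG.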
The author details their process of building a domain-specific LLM by training a 1-billion-parameter Llama 3-style model on 8 H100 GPUs. They cover infrastructure setup, memory management, token budgeting, and optimizations such as torch.compile to improve training throughput.
This article critiques the performance of LLM memory systems like Mem0 and Zep, revealing they are significantly less efficient and accurate than traditional methods. The author highlights the architectural flaws that lead to high costs and latency, arguing that these systems are misaligned with their intended use cases.
Agents require effective context management to perform tasks efficiently, which is achieved through context engineering strategies like writing, selecting, compressing, and isolating context. This article explores these strategies, highlighting their importance and how tools like LangGraph support them in managing context for long-running tasks and complex interactions.