Links
This article explains Netflix's Graph Abstraction, which is designed to handle high-throughput operational workloads, achieving nearly 10 million operations per second. It details the architecture, data storage strategies, and caching mechanisms that support real-time graph use cases such as social connections and service topology.
The article argues that new users of large language models (LLMs) need database systems tailored to their specific requirements. Traditional databases may not meet the unique challenges that LLMs pose, so the author advocates exploring alternative database technologies to improve performance and efficiency in LLM applications.
The article discusses the impressive log compression capabilities of ClickHouse, showcasing how its innovative algorithms can achieve a compression ratio of up to 170x. It highlights the significance of efficient data storage and retrieval for handling large datasets in analytics. The advancements in compression not only save storage space but also enhance performance for real-time data processing.
Unique indexes in PostgreSQL cannot hold entries larger than 1/3 of a buffer page (~2.7KB), a limit that mainly affects large text fields. To enforce uniqueness on large values, a common workaround is to hash the data and index the hash instead, which keeps comparisons efficient without exceeding the size limit. The article explains the reasons behind the constraint and offers a practical solution using hash functions.
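The hashing workaround can be illustrated outside the database with a minimal Python sketch. It is not the article's implementation: the `UniqueByDigest` class, the `content_digest` helper, and the choice of SHA-256 are assumptions made for illustration, and the 2,700-byte constant is only a rounding of the "~2.7KB" figure above. The point it shows is that a digest has a fixed, small size no matter how large the original value is, so uniqueness can be checked on the digest alone.

```python
import hashlib

# Rough stand-in for the "~2.7KB" index entry limit mentioned above (assumption).
APPROX_INDEX_ENTRY_LIMIT = 2700  # bytes


def content_digest(text):
    """Return a fixed-size SHA-256 digest (32 bytes) of the given text."""
    return hashlib.sha256(text.encode("utf-8")).digest()


class UniqueByDigest:
    """Toy stand-in for a unique index built on a hash of a large column."""

    def __init__(self):
        # Maps digest -> original value; the digest plays the role of the index key.
        self._seen = {}

    def insert(self, text):
        """Store text if its digest is new; return False for a duplicate."""
        key = content_digest(text)
        if key in self._seen:
            return False
        self._seen[key] = text
        return True


large_value = "x" * 10_000  # well above the approximate entry limit
index = UniqueByDigest()
first = index.insert(large_value)         # new value, accepted
second = index.insert(large_value)        # same value, rejected as duplicate
third = index.insert(large_value + "y")   # different value, accepted
```

In PostgreSQL itself the same idea is usually expressed as a unique index over a hash expression rather than over the raw column. Two different values could in principle share a digest, so a cryptographic hash with a wide output is the safer choice when uniqueness must be trusted.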