Links
This article explores how ClickHouse, created by Alexey Milovidov, addresses real-time analytics needs that other databases fail to meet. It highlights the features that set ClickHouse apart, such as its speed and simplicity, which have made it a popular choice for AI companies and data-intensive applications.
This article explains how PostgreSQL indexes work and their impact on query performance. It covers the types of indexes available, how data is stored, and the trade-offs in using indexes, including costs related to disk space, write operations, and memory usage.
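The core trade-off the article describes can be sketched in a toy model (an illustration of the general idea, not PostgreSQL's actual B-tree code): an index makes point lookups logarithmic instead of linear, but every insert must also maintain the index, which is exactly the extra write and space cost the article discusses.

```python
import bisect

# Toy model of the read/write trade-off behind a B-tree index:
# reads become binary searches, but every insert must also
# maintain the sorted index (extra write work and space).

table = []   # heap: rows in insertion order
index = []   # "index": sorted list of (key, row position)

def insert(key, row):
    table.append((key, row))
    # Index maintenance: the extra cost every write pays.
    bisect.insort(index, (key, len(table) - 1))

def lookup(key):
    # Index lookup: O(log n) instead of scanning the whole table.
    i = bisect.bisect_left(index, (key, 0))
    if i < len(index) and index[i][0] == key:
        return table[index[i][1]][1]
    return None

for k in range(1000):
    insert(k, f"row-{k}")

print(lookup(42))  # row-42
```

The same shape explains why indexes on write-heavy tables can hurt: each extra index multiplies the maintenance work done on every insert or update.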
SQL Arena is a project that provides comparative data on different database vendors to help users choose the right database for their projects. It uses a tool called DBProve to gather performance metrics and offers insights into query execution and database behavior. Contributors can share results and enhance the analysis tools.
This article explains how Floe improves the performance of geo joins by using H3 indexes. Traditional spatial joins can be slow due to their quadratic complexity, but with H3, the process becomes a fast equi-join through a filtering step that reduces the number of candidates. The result is a significant speedup in geospatial queries.
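The shape of the technique can be shown in a small sketch. A real implementation would use H3 cell IDs; here a coarse lat/lng grid stands in for H3 (an assumption for illustration, not Floe's actual code). Both sides are bucketed by cell, the join becomes a hash lookup on the cell key, and an exact distance check filters the few remaining candidates, instead of comparing every pair.

```python
from collections import defaultdict
from math import hypot

# Sketch of "spatial join as equi-join". A real system would use
# H3 cell IDs; a coarse lat/lng grid stands in here (illustrative
# assumption, not the article's implementation).

CELL = 0.5  # grid cell size in degrees

def cell(lat, lng):
    return (int(lat // CELL), int(lng // CELL))

def neighbors(c):
    # A point near a cell edge can match points in adjacent cells,
    # so probe the 3x3 neighborhood around each cell.
    return [(c[0] + di, c[1] + dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)]

def geo_join(left, right, max_dist):
    # Build phase: bucket the right side by cell (the equi-join key).
    buckets = defaultdict(list)
    for p in right:
        buckets[cell(*p)].append(p)
    # Probe phase: each left point only sees candidates in nearby
    # cells; an exact distance check filters those few candidates.
    out = []
    for p in left:
        for c in neighbors(cell(*p)):
            for q in buckets.get(c, ()):
                if hypot(p[0] - q[0], p[1] - q[1]) <= max_dist:
                    out.append((p, q))
    return out

pairs = geo_join([(10.0, 20.0)], [(10.1, 20.1), (50.0, 50.0)], max_dist=0.5)
print(pairs)  # [((10.0, 20.0), (10.1, 20.1))]
```

The filtering step is what kills the quadratic blow-up: candidate pairs are limited to points sharing (or adjoining) a cell, so work scales with the number of true near-matches rather than with left x right.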
The article explores how to redesign relational databases for modern SSD technology and cloud infrastructure. It discusses key considerations like cache sizing, throughput optimization, and durability, arguing for a shift from single-system to distributed durability. The author emphasizes the need to adapt database designs to leverage advancements in hardware and network capabilities.
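One piece of the cache-sizing consideration reduces to simple arithmetic: the effective read latency is a hit-rate-weighted average of cache and storage latency. The figures below are illustrative assumptions (roughly DRAM vs. local NVMe), not numbers from the article.

```python
# Back-of-envelope cache sizing: effective read latency is
#   t_avg = h * t_cache + (1 - h) * t_storage
# Latency figures are illustrative assumptions (~100 ns DRAM,
# ~100 us local NVMe read), not the article's numbers.

T_CACHE_US = 0.1    # DRAM hit
T_SSD_US = 100.0    # local NVMe miss

def avg_latency_us(hit_rate):
    return hit_rate * T_CACHE_US + (1 - hit_rate) * T_SSD_US

for h in (0.90, 0.99, 0.999):
    print(f"hit rate {h:.1%}: {avg_latency_us(h):.2f} us")
```

One way this math can cut: when a miss costs ~100 us on local NVMe rather than ~10 ms on a spinning disk, pushing the hit rate from 99% to 99.9% buys far less than it once did, which changes how much DRAM cache a design should budget for.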
This article discusses the challenges package managers face when using Git as a database. It details how various systems like Cargo, Homebrew, and CocoaPods have struggled with performance issues and have ultimately moved away from Git to more efficient methods. The piece highlights the inherent limitations of Git in handling large-scale data management.
Dolt, the version-controlled SQL database, tested Go's experimental Green Tea garbage collector but found no significant performance improvements. Although the new collector is designed to improve cache locality and throughput, real-world tests showed minimal differences in latency and throughput compared to the classic garbage collector. Consequently, Dolt will not enable Green Tea for production builds.
The article argues that applications built on large language models (LLMs) have storage and retrieval requirements that traditional databases do not serve well, and that these workloads call for database systems tailored to them. The author advocates exploring alternative database technologies to improve the performance and efficiency of LLM applications.
The article discusses recent updates in ClickHouse version 1, focusing on the introduction of purpose-built engines designed to optimize performance for specific use cases. These new engines enhance the efficiency of data processing and querying, addressing the diverse needs of analytics workloads.
The article discusses the importance of standardized benchmarks in evaluating database performance, specifically referencing TPC-C. It critiques the tendency of vendors to misrepresent their adherence to established benchmarks, arguing that clear rules and defined criteria are essential for meaningful competition and performance measurement. The author draws parallels between sports and database benchmarks, emphasizing the need for integrity in reporting results.
The article discusses the innovative database system QuinineHM, which operates without a traditional operating system, thereby enhancing performance and efficiency. It highlights the architecture, benefits, and potential use cases of this technology in modern data management.