Links
This article examines five methods for inserting data into PostgreSQL using Python, focusing on the trade-offs between performance, safety, and convenience. It highlights when to prioritize speed and when clarity is more important, helping you select the best tool for your specific data requirements.
This article explains the impact of excessive indexes on Postgres performance, detailing how they slow down writes and reads, waste disk space, and increase maintenance overhead. It emphasizes the importance of regularly dropping unused and redundant indexes to optimize database efficiency.
This article explains how to use PostgreSQL's template databases to create fast, zero-copy database clones. It covers the new cloning strategies introduced in PostgreSQL 15 and 18, detailing the efficiency of using modern filesystems for cloning without additional storage costs.
This article explores creative database optimization techniques in PostgreSQL, focusing on scenarios that bypass full table scans and reduce index size. It emphasizes using check constraints and function-based indexing to improve query performance without unnecessary overhead.
The article reviews key trends in databases from 2025, highlighting PostgreSQL's continued dominance and significant developments like the rise of distributed PostgreSQL projects. It discusses major acquisitions, new services from tech giants, and the adoption of the Model Context Protocol for better integration with language models.
This article discusses a system built for Wayfair that uses PostgreSQL as a Dead Letter Queue (DLQ) to manage failed event processing. Instead of using Kafka for failed events, the system stores them in a PostgreSQL table, allowing for better visibility and easier reprocessing. It also outlines a retry mechanism with exponential backoff to prevent flooding the DLQ with transient failures.
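A retry schedule like the one described can be sketched in plain Python. This is an illustrative model only, not Wayfair's actual code; the function names and the 30-second base delay are assumptions:

```python
import random
from datetime import datetime, timedelta, timezone

def backoff_delay(attempt, base_delay_s=30.0, max_delay_s=3600.0):
    """Exponential backoff: 30s, 60s, 120s, ... capped at one hour,
    so transient failures don't flood the DLQ with rapid retries."""
    return min(base_delay_s * (2 ** attempt), max_delay_s)

def next_retry_at(attempt):
    """Timestamp to store in the DLQ row; jitter avoids herds of
    simultaneous retries when many events failed at once."""
    delay = backoff_delay(attempt)
    jitter = random.uniform(0, delay * 0.1)
    return datetime.now(timezone.utc) + timedelta(seconds=delay + jitter)
```

A worker would then poll only rows whose stored retry timestamp has passed, which is a plain indexed `WHERE` clause on the DLQ table.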
This article details how OpenAI scaled PostgreSQL to handle the massive traffic from 800 million ChatGPT users. It discusses the challenges faced during high write loads, optimizations made to reduce strain on the primary database, and strategies for maintaining performance under heavy demand.
PostgreSQL 19 introduces a significant optimization for data aggregation, allowing the database to aggregate data before performing joins. This change can greatly enhance performance without requiring any alterations to existing code. However, some complex features, like `GROUP BY CUBE`, may not fully benefit from this improvement.
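The idea can be illustrated with a toy model in Python (a sketch of the planner transformation, not PostgreSQL internals): aggregating per join key first and then joining yields the same totals as joining first, while the join handles far fewer intermediate rows.

```python
from collections import defaultdict

orders = [("c1", 10), ("c1", 20), ("c2", 5)]   # (customer_id, amount)
customers = {"c1": "Alice", "c2": "Bob"}        # customer_id -> name

# Plan A: join first, then aggregate (the traditional plan).
totals_a = defaultdict(int)
for cid, amt in orders:
    totals_a[customers[cid]] += amt             # one join lookup per order row

# Plan B: aggregate before the join (the optimization, modeled).
pre = defaultdict(int)
for cid, amt in orders:
    pre[cid] += amt                             # now one row per customer
totals_b = {customers[cid]: total for cid, total in pre.items()}

assert dict(totals_a) == totals_b == {"Alice": 30, "Bob": 5}
```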
This article explains how PostgreSQL indexes work and their impact on query performance. It covers the types of indexes available, how data is stored, and the trade-offs in using indexes, including costs related to disk space, write operations, and memory usage.
This article details Datadog's approach to creating a managed data replication platform that improves data movement across services. It covers technical challenges faced with a shared Postgres database, the transition to a dedicated search platform, and the use of asynchronous replication to enhance scalability and reliability.
This article details how VectorChord reduced the time to index 100 million vectors in PostgreSQL from 40 hours to just 20 minutes while cutting memory usage sevenfold. It outlines specific optimizations in the clustering, insertion, and compaction phases that made this significant improvement possible.
The article introduces pg_clickhouse, a PostgreSQL extension that allows users to run analytics queries on ClickHouse without modifying their existing PostgreSQL queries. It aims to streamline the migration process for organizations moving from PostgreSQL to ClickHouse, addressing challenges like query rewriting and execution speed.
This article explains how to enhance traditional Retrieval-Augmented Generation (RAG) pipelines by implementing an agentic RAG system. It uses PostgreSQL for data storage and n8n for orchestration, allowing AI to dynamically select tools based on user queries, improving information retrieval accuracy.
Aiven has launched a Developer tier for its PostgreSQL service, starting at $5 per month. This tier offers more storage and keeps services running even when inactive, along with Basic support, making it suitable for testing and personal projects.
This article explains how PostgreSQL manages data recovery through its Write-Ahead Logging (WAL) system. It covers the recovery lifecycle, including crash recovery, point-in-time recovery, and the role of WAL in maintaining data integrity during these processes.
pgFirstAid is an open-source PostgreSQL function that identifies and prioritizes database health issues, offering actionable recommendations. It helps users, regardless of technical expertise, improve database stability and performance quickly.
This article explains the new skip scan feature in PostgreSQL 18, which improves query performance by allowing the database to bypass unnecessary index entries. It details the setup process, how btree indexes work, and provides examples showing significant performance gains.
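The mechanism can be modeled in a few lines of Python (a simplified sketch, not PostgreSQL's btree code): given an index on `(status, user_id)` and a predicate only on `user_id`, a skip scan jumps to each distinct leading value and probes within it, instead of reading every index entry.

```python
import bisect

# A composite btree index on (status, user_id), modeled as a sorted list.
index = sorted([("active", 7), ("active", 42), ("archived", 7),
                ("archived", 99), ("deleted", 7)])

def skip_scan(index, target_b):
    """Find entries whose second column equals target_b using one probe
    per distinct leading value, rather than scanning all entries."""
    results, probes, i = [], 0, 0
    while i < len(index):
        lead = index[i][0]
        # Jump straight to (lead, target_b) within this leading-value group.
        j = bisect.bisect_left(index, (lead, target_b))
        probes += 1
        if j < len(index) and index[j] == (lead, target_b):
            results.append(index[j])
        # Skip past the remaining entries for this leading value.
        i = bisect.bisect_right(index, (lead, float("inf")))
    return results, probes
```

Here three probes replace a scan of all five entries; with millions of entries and few distinct leading values, the saving is what produces the large gains the article measures.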
This article explains the complexities of using arrays in PostgreSQL beyond the basics. It highlights the trade-offs between using arrays and traditional relational database practices, including issues with referential integrity and indexing. The author discusses best practices and common pitfalls when working with arrays.
Crunchy Hardened PostgreSQL support might end around April 2026, prompting organizations in regulated sectors to assess their options. The article highlights the risks of relying on vendor-controlled distributions and suggests Percona as a stable, open-source alternative.
pg_ai_query is a newly released PostgreSQL extension that generates SQL queries from natural language and analyzes query performance. It offers index recommendations and schema-aware intelligence to streamline SQL development. The extension is compatible with PostgreSQL versions 14 and above.
This article explains how Datadog's Database Monitoring now supports automatic collection of PostgreSQL's EXPLAIN ANALYZE plans. It helps users identify performance issues in queries by correlating execution details with application performance monitoring (APM) data. The tool also visualizes data to simplify the analysis of slow queries.
The article introduces pgX, a tool designed to integrate PostgreSQL monitoring with application and infrastructure observability. It emphasizes the need for a unified approach to diagnose performance issues effectively, moving away from isolated database metrics. This shift helps engineers understand the system's behavior as a whole, improving troubleshooting and optimization efforts.
Aiven now offers PostgreSQL 18, which features significant performance improvements and new functionalities like asynchronous I/O, enhanced JOIN and GROUP BY operations, and parallel GIN index creation. This version allows more flexibility in schema evolution and smarter indexing with skip scans. Users can try PostgreSQL 18 with a free trial at Aiven.
PostgreSQL 18 introduces significant improvements to the RETURNING clause, particularly with the addition of OLD and NEW aliases. This allows developers to easily access both previous and current data states during DML operations, streamlining data tracking and simplifying application logic.
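The semantics can be sketched in plain Python (an illustrative model; the actual feature is SQL of the form `UPDATE ... RETURNING old.price, new.price`):

```python
def update_returning(row, changes):
    """Model of UPDATE ... RETURNING old.*, new.* in PostgreSQL 18:
    apply the change and hand back both row versions in one step."""
    old = dict(row)        # state before the UPDATE
    row.update(changes)    # the UPDATE itself
    new = dict(row)        # state after the UPDATE
    return old, new

product = {"id": 1, "price": 100}
old, new = update_returning(product, {"price": 120})

# One round trip yields both versions, e.g. for an audit log entry:
audit = {"price_before": old["price"], "price_after": new["price"]}
```

Before this feature, capturing the prior value typically required a separate SELECT or a trigger; the aliases collapse that into the DML statement itself.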
This article explains checkpointing in message processing, using a gaming analogy to illustrate how it allows for recovering from failures. It details the Outbox pattern in PostgreSQL for storing messages and the importance of managing processor checkpoints to ensure consistent processing.
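The recovery behavior can be modeled in a short Python sketch (illustrative only; in the real pattern the outbox and the checkpoint live in PostgreSQL tables): a processor records the last message id it finished, and after a crash it resumes from that checkpoint instead of from the beginning.

```python
outbox = [(1, "created"), (2, "paid"), (3, "shipped"), (4, "delivered")]

def process_from(checkpoint, fail_after=None):
    """Process outbox messages after `checkpoint`; optionally crash
    partway through. Returns (processed_ids, new_checkpoint)."""
    processed = []
    for msg_id, _event in outbox:
        if msg_id <= checkpoint:
            continue                 # already handled before the crash
        if fail_after is not None and msg_id > fail_after:
            break                    # simulated crash mid-batch
        processed.append(msg_id)
        checkpoint = msg_id          # persist checkpoint after each message
    return processed, checkpoint

first, ckpt = process_from(0, fail_after=2)   # crash after message 2
resumed, ckpt = process_from(ckpt)            # recovery: no gaps, no duplicates
```

The key property the article stresses is visible here: because the checkpoint advances only after a message is fully processed, recovery neither skips nor reprocesses messages.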
PostgreSQL 18 introduces temporal constraints that simplify managing time-related data, allowing developers to maintain referential integrity across temporal relationships with ease. By utilizing GiST indexes and the WITHOUT OVERLAPS constraint, developers can efficiently handle overlapping time periods in applications without complex coding.
AWS's Amazon RDS for PostgreSQL has been found to exhibit Long Fork and G-nonadjacent cycles that violate Snapshot Isolation, indicating a weaker consistency model than standard PostgreSQL. This issue arises from discrepancies in how primary and secondary nodes handle transaction visibility. Users are advised to assess their transaction structures to avoid potential anomalies.
PostgreSQL 18, set for release in September, introduces features aimed at enhancing analytics capabilities and distributed architectures, including a new asynchronous I/O subsystem that significantly boosts performance for analytical workloads. The update also adds version 7 UUIDs to improve database index performance in distributed systems, although some anticipated SQL features will be delayed. Despite its growing popularity among developers, PostgreSQL has traditionally been associated more with online transaction processing than with analytics.
The article compares the performance of ClickHouse and PostgreSQL, highlighting their strengths and weaknesses in handling analytical queries and data processing. It emphasizes ClickHouse's efficiency in large-scale data management and real-time analytics, making it a suitable choice for high-performance applications.
PostgreSQL is increasingly favored for Kubernetes workloads, now powering 36% of such databases. Azure offers two deployment options for PostgreSQL on AKS: local NVMe for high performance and Premium SSD v2 for optimized cost-performance, enhanced by the CloudNativePG operator for high availability. These innovations simplify the management of stateful applications, making Azure a robust platform for data-intensive workloads.
Efficient storage in PostgreSQL can be achieved by understanding data type alignment and padding bytes. By organizing columns in a specific order, one can minimize space waste while maintaining or even enhancing performance during data retrieval.
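The effect can be estimated with a small calculator. The sizes and alignments below are the usual ones for a few fixed-width PostgreSQL types; this is a simplified model that ignores tuple headers, NULL bitmaps, and TOAST:

```python
# (size_bytes, alignment_bytes) for some fixed-width PostgreSQL types.
TYPES = {"bigint": (8, 8), "int": (4, 4), "smallint": (2, 2), "bool": (1, 1)}

def row_size(columns):
    """Sum column sizes plus alignment padding, in declaration order."""
    offset = 0
    for col in columns:
        size, align = TYPES[col]
        offset += (-offset) % align   # pad up to the type's alignment
        offset += size
    return offset

wasteful = ["bool", "bigint", "smallint", "bigint", "int"]
packed = sorted(wasteful, key=lambda c: -TYPES[c][1])  # widest first

print(row_size(wasteful), row_size(packed))  # 36 vs 23 bytes per row
```

Simply declaring the widest columns first removes all the padding in this example, a saving that compounds across millions of rows.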
PostgreSQL 18 introduces significant enhancements for developers, including native UUID v7 support, virtual generated columns, and improved RETURNING clause functionality. These features aim to streamline development processes and improve database performance. Additionally, the EXPLAIN command now provides default buffer usage information, enhancing query analysis.
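Why UUID v7 indexes well is easiest to see from its layout: a 48-bit millisecond timestamp prefix makes freshly generated ids nearly sequential, so btree inserts cluster at the right edge instead of scattering. A minimal stdlib-only generator following the RFC 9562 layout (a sketch for illustration, not PostgreSQL's `uuidv7()` implementation):

```python
import os
import time
import uuid

def uuid7():
    """Minimal UUIDv7: 48-bit Unix-ms timestamp, version and variant
    bits per RFC 9562, remaining bits random."""
    ms = time.time_ns() // 1_000_000
    rand_a = int.from_bytes(os.urandom(2), "big") & 0x0FFF        # 12 bits
    rand_b = int.from_bytes(os.urandom(8), "big") & ((1 << 62) - 1)  # 62 bits
    value = (ms & ((1 << 48) - 1)) << 80   # time-ordered prefix
    value |= 0x7 << 76                     # version nibble = 7
    value |= rand_a << 64
    value |= 0b10 << 62                    # RFC 4122/9562 variant
    value |= rand_b
    return uuid.UUID(int=value)
```

Two ids generated a few milliseconds apart compare in generation order, which is exactly the property that keeps index pages dense.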
PostgreSQL 18 has been released, featuring significant performance improvements through a new asynchronous I/O subsystem, enhanced query execution capabilities, and easier major-version upgrades. The release also introduces new features such as virtual generated columns, OAuth 2.0 authentication support, and improved statistical handling during upgrades, solidifying PostgreSQL's position as a leading open source database solution.
The article delves into the challenges of optimizing a Just-In-Time (JIT) compiler for PostgreSQL, particularly in the context of modern CPU architectures. It explains the importance of out-of-order execution and branch prediction in enhancing performance, and highlights specific optimization techniques that can be applied to the PostgreSQL interpreter to improve execution speed.
PostgreSQL 18 introduces significant improvements to the btree_gist extension, primarily through the implementation of sortsupport, which enhances index building efficiency. These updates enable better performance for use cases such as nearest-neighbour search and exclusion constraints, offering notable gains in query throughput compared to previous versions.
Litestream v0.5.0 introduces several new features and improvements, enhancing the functionality of the tool for continuous SQLite replication. Key updates include support for incremental backups and improved performance, making it easier for developers to manage database backups effectively. The release emphasizes stability and user experience, ensuring that Litestream remains a reliable choice for SQLite users.
The article discusses performance improvements in pgstream, a tool used for taking snapshots of PostgreSQL databases. It highlights the underlying challenges and solutions implemented to enhance the speed and efficiency of database snapshots, ultimately benefiting users with faster data access and reduced operational overhead.
The article explores the concept of the butterfly effect in the context of PostgreSQL, illustrating how small changes in data can lead to significant impacts within systems. It emphasizes PostgreSQL's responsive nature, likening its behavior to living systems that adapt and react to changes in their environment.
PostgreSQL is set to gain on-disk database encryption through an extension developed by Percona, which aims to provide Transparent Data Encryption (TDE) for enhanced security without vendor lock-in. This feature will help organizations comply with regulations like GDPR by ensuring that sensitive data remains secure even if storage is compromised. Percona plans to collaborate with the community to incorporate TDE into the main PostgreSQL distribution in the future.
PostgreSQL's full-text search (FTS) can be significantly faster than often perceived, achieving a ~50x speed improvement with proper optimization techniques such as pre-calculating `tsvector` columns and configuring GIN indexes correctly. Misleading benchmarks may overlook these optimizations, leading to an unfair comparison with other search solutions. For advanced ranking needs, extensions like VectorChord-BM25 can further enhance performance.
The article explores the use of custom ICU collations with PostgreSQL's citext data type, highlighting performance comparisons between equality, range, and pattern matching operations. It concludes that while custom collations are superior for equality and range queries, citext is more practical for pattern matching until better index support for nondeterministic collations is achieved.
SELECT FOR UPDATE in PostgreSQL is often misused, leading to unnecessary row-level locking that can severely impact application concurrency and performance. Developers are encouraged to opt for FOR NO KEY UPDATE instead, as it aligns better with typical update scenarios and prevents deadlocks. Properly managing lock levels according to actual data manipulation needs can significantly enhance system efficiency.
PostgreSQL v18 introduces the ability to preserve optimizer statistics during major upgrades, enhancing performance and reducing downtime. This feature allows users to export optimizer statistics with `pg_dump` and ensures that statistics remain intact when using `pg_upgrade`, streamlining database upgrades.
Fresha successfully executed a zero-downtime upgrade from PostgreSQL 12 to 17 across over 200 databases by developing a tailored upgrade framework that addressed the complexities of maintaining data consistency and availability during the process. The approach involved leveraging logical replication, managing Debezium connectors, and implementing a two-phase switchover to ensure a seamless transition without disrupting production services.
The CNPG Kubectl Plugin enhances kubectl with PostgreSQL-specific commands, simplifying cluster management tasks such as creating backups, promoting instances, and executing commands directly from the terminal. The article details installation methods, command usage, and provides insights into cluster status and backup processes.
Pgline is a high-performance PostgreSQL driver for Node.js, developed in TypeScript, that implements Pipeline Mode, allowing for efficient concurrent queries with reduced CPU usage. Benchmark tests show Pgline outperforms competitors like Bun SQL, Postgresjs, and Node-postgres in terms of speed and resource efficiency. Installation and usage examples are provided to demonstrate its capabilities.
PgDog is a transaction pooler and logical replication manager for PostgreSQL, designed to shard databases efficiently while handling high volumes of connections. Built in Rust, it offers features like automatic query routing, health monitoring, and supports transaction pooling to optimize database performance. PgDog is open source under the AGPL v3 license, allowing for flexible use and modification.
The article discusses the process of performing PostgreSQL migrations using logical replication. It outlines the benefits of logical replication, including minimal downtime and the ability to replicate specific tables and data, making it a flexible option for database migrations. Additionally, it provides practical guidance on setting up and managing logical replication in PostgreSQL environments.
Pipelining in PostgreSQL allows clients to send multiple queries without waiting for the results of previous ones, significantly improving throughput. Introduced in PostgreSQL 18, this feature enhances the efficiency of query processing, especially when dealing with large batches of data across different network types. Performance tests indicate substantial speed gains, underscoring the benefits of utilizing pipelining in SQL operations.
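The win comes almost entirely from eliminating per-query round trips, which a back-of-envelope model makes concrete (the numbers below are assumed for illustration):

```python
def batch_time_ms(n_queries, rtt_ms, server_ms, pipelined):
    """Wall-clock time to run n queries on one connection.
    Without pipelining, each query pays a full network round trip;
    with pipelining, the client streams all queries and pays ~one RTT."""
    if pipelined:
        return rtt_ms + n_queries * server_ms
    return n_queries * (rtt_ms + server_ms)

# 1000 small queries, 5 ms round trip, 1 ms of server work each:
slow = batch_time_ms(1000, rtt_ms=5, server_ms=1, pipelined=False)  # 6000 ms
fast = batch_time_ms(1000, rtt_ms=5, server_ms=1, pipelined=True)   # 1005 ms
```

The model also explains the article's observation that gains vary by network type: the higher the round-trip time relative to server work, the larger the speedup.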
A benchmark compares several PostgreSQL versions on transaction count, latency, and transactions per second (TPS). The data highlights that PostgreSQL 18 achieves the highest transaction count and TPS, while version 17 shows the lowest performance in these areas. Overall, the newer versions generally perform better in terms of latency and transaction efficiency.
PostgreSQL 18 RC 1 has been released as the first release candidate, with a planned general availability date of September 25, 2025. Users upgrading from earlier versions can utilize major version upgrade strategies, and several bug fixes have been applied since the previous beta version.
Squarespace transitioned from PostgreSQL to CockroachDB for better scalability and performance, leveraging Change Data Capture (CDC) to facilitate near real-time data migration with minimal downtime. They developed a systematic migration strategy that addressed schema compatibility, data integrity, and rollback capabilities, ultimately achieving a seamless cutover with high data freshness.
PgHook is a tool for streaming PostgreSQL change events using logical replication via PgOutput2Json, delivering updates to a specified webhook. It can be run as a lightweight Docker container and requires configuration through environment variables for PostgreSQL connection, publication names, and webhook URL. The project includes detailed setup instructions for both PostgreSQL and Docker, enabling easy integration of real-time data changes into applications.
PostgreSQL's Index Only Scan enhances query performance by allowing data retrieval without accessing the table heap, thus eliminating unnecessary delays. It requires specific index types and query conditions to function effectively, and the concept of a covering index, which includes fields in the index, further optimizes this process. Understanding these features is crucial for backend developers working with PostgreSQL databases.
A benchmark is introduced to evaluate the impact of database performance on user experience in LLM chat interactions, comparing OLAP (ClickHouse) and OLTP (PostgreSQL) using various query patterns. Results show ClickHouse significantly outperforms PostgreSQL on larger datasets, with performance tests ranging from 10k to 10m records included in the repository. Users can run tests and simulations using provided scripts to further explore database performance and interaction latencies.
Understanding the fundamentals of PostgreSQL can significantly enhance your workflow by demystifying its operations, which fundamentally revolve around file manipulation. By moving beyond the default package manager installations and engaging with the system manually, users can improve debugging, provisioning, and overall control of their database environment. Embracing this approach allows for a more confident and efficient development experience.
The blog post announces the release of pg_parquet version 0.4, which introduces support for Google Cloud Storage and HTTPS storage. It highlights key improvements and features that enhance usability and performance for users dealing with Parquet data formats in PostgreSQL.
VACUUM is a crucial maintenance task in PostgreSQL, with optimizations like skipping index cleaning to enhance performance. This article explains the implications of the "skip indexes" optimization, particularly how it affects the visibility of dead tuples during database maintenance operations.
Selecting the right storage option in PostgreSQL can significantly affect performance and data management. This article explores various storage methods, including heap and columnar storage, CSV, and Parquet files, highlighting their advantages and use cases for efficient data archiving and retrieval.
pgactive is a PostgreSQL extension designed for active-active database replication, allowing multiple instances within a cluster to accept changes simultaneously. This approach enables various use cases, such as multi-region high availability and reducing write latency, but requires applications to manage complexities like conflicting changes and replication lag. Logical replication, introduced in PostgreSQL 10, is a key component for implementing this topology, while additional features are necessary for full support.
Data types significantly influence the performance and efficiency of indexing in PostgreSQL. The article explores how different data types, such as integers, floating points, and text, affect the time required to create indexes, emphasizing the importance of choosing the right data type for optimal performance.
OpenAI relies heavily on PostgreSQL as the backbone for its services, necessitating effective scalability and reliability measures. The article discusses optimizations implemented by OpenAI, including load management, query optimization, and addressing single points of failure, alongside insights into past incidents and feature requests for PostgreSQL enhancements.
PostgreSQL's performance can significantly benefit from having an adequate amount of RAM, particularly through the management of its shared buffers. Understanding the clock sweep algorithm and how buffer eviction works is crucial for optimizing memory settings, especially in systems with large RAM capacities. Proper sizing of the shared buffer, typically recommended at 25% of total RAM, is essential for achieving optimal database performance.
PostgreSQL 18 introduces asynchronous I/O (AIO), enhancing storage utilization and performance. The article provides tuning advice for two key parameters, `io_method` and `io_workers`, noting that while `io_method` defaults to `worker` for broad compatibility, increasing `io_workers` beyond the default of three can significantly improve performance on larger machines. Trade-offs between the different AIO methods and their impact on performance are also discussed.
Properly collecting optimizer statistics for partitioned tables in PostgreSQL is crucial for accurate query planning and execution performance. The article discusses the significance of these statistics, how they differ from partition statistics, and the role of the `ANALYZE` command in enhancing query efficiency.
Git can serve as an unconventional database alternative for certain projects, offering features like built-in versioning, atomic transactions, and fast data retrieval, although it has notable limitations compared to traditional databases like PostgreSQL. The article explores Git's internal architecture through the creation of a todo application, demonstrating its capabilities and potential use cases. However, for production applications, utilizing established database services is recommended.
The article discusses the importance of testing rules in PostgreSQL to ensure data integrity and performance. It highlights various strategies and best practices for implementing effective testing frameworks within PostgreSQL environments.
Understanding when to rebuild PostgreSQL indexes is crucial for maintaining database performance. The decision depends on index type, bloat levels, and performance metrics, with recommendations to use the `pgstattuple` extension to assess index health before initiating a rebuild. Regular automatic rebuilds are generally unnecessary and can waste resources.
SQL query optimization involves the DBMS determining the most efficient plan to execute a query, with the query optimizer responsible for evaluating different execution plans based on cost. The Plan Explorer tool, implemented for PostgreSQL, visualizes these plans and provides insights into the optimizer's decisions by generating various diagrams. The tool can operate in both standalone and server modes, enabling deeper analysis of query execution and costs.
The Spock extension enables multi-master replication for PostgreSQL versions 15 and later, allowing users to build and manage a cluster with identical databases that support logical decoding. Key steps include configuring PostgreSQL settings, creating nodes and subscriptions, and ensuring proper connectivity between nodes. Documentation and deployment can be managed through tools like MkDocs and containerization options are available for easier implementation.
PostgreSQL 18 Beta 1 has been released, offering a preview of new features aimed at improving performance and usability, including an asynchronous I/O subsystem, enhanced query optimization, and better upgrade tools. The PostgreSQL community is encouraged to test the beta version to help identify bugs and contribute feedback before the final release expected later in 2025. Full details and documentation can be found in the release notes linked in the announcement.
Greenmask is an open-source utility for logical database backup, anonymization, and synthetic data generation that operates without requiring changes to the existing database schema. It offers advanced features like database subset systems, deterministic transformers, and dynamic parameters, while ensuring compatibility with PostgreSQL utilities for seamless integration. The tool is designed to be stateless, cross-platform, and extensible, making it ideal for various data management scenarios.
Flame graphs visually represent where a program consumes processing time, utilizing sampled call stack data collected by a profiler. This blog post discusses the creation and use of flame graphs for diagnosing performance bottlenecks in PostgreSQL, detailing methods for data collection and processing, and highlighting the importance of build types in profiling.
Researchers from Carnegie Mellon University have developed a vector-based automated tuning system called Proto-X for PostgreSQL databases, which can enhance performance by two to ten times. By utilizing a holistic tuning approach and an LLM booster, the system can significantly reduce the time needed for optimization from 12 hours to about 50 minutes, making database management easier for developers with less experience.