#postgresql

32 posts

PostgreSQL Logical Replication Lag: Big Transactions and Reorder Buffer Spills

One huge transaction can pin logical replication for hours. A runbook to detect the blocker, tune decoding safely, and enforce bounded transactions in prod.

January 1, 2026

EXPLAIN Lied to You: The PostgreSQL Prepared Statement Plan Cliff

Your EXPLAIN looks perfect but production melts. The culprit: PostgreSQL silently switched from a custom plan to a generic plan after enough executions, and the generic plan is catastrophically wrong.

December 24, 2025

Works in psql, Flaky in Prod: PgBouncer's Silent Murder of LISTEN/NOTIFY

PostgreSQL LISTEN/NOTIFY works perfectly in local testing but notifications randomly stop arriving in production. The culprit: transaction pooling quietly reassigning your connection to someone else.

December 18, 2025

PostgreSQL XID Wraparound: Emergency Playbook for Vacuum Freeze Under Fire

PostgreSQL can go read-only near XID wraparound. Use this emergency playbook to find the oldest tables, unblock vacuum freeze, and prevent repeat incidents.

December 16, 2025

hot_standby_feedback Bloat Trap: Fixing Replica Conflicts by Slowly Killing the Primary

hot_standby_feedback stops replica query cancellations but can bloat the primary over days. Detect xmin pinning, mitigate safely, and add guardrails.

December 12, 2025

PostgreSQL Checkpoint Spikes: Why p99 Explodes Every N Minutes

A reproducible approach to diagnose and eliminate checkpoint-induced latency spikes using pgbench, pg_stat_bgwriter, and WAL/IO budgeting.

December 8, 2025

Database Connection Pool Exhaustion: The Silent Outage Trigger

App hangs but the database looks healthy. Your pool is exhausted. I show how to detect it, size pools sanely, and prevent connection leaks.

November 30, 2025

pg_waldump WAL Forensics: Reconstructing What Happened to Your Data

Something deleted rows from production but nobody admits to running DELETE. Use pg_waldump to analyze WAL files and reconstruct exactly what happened and when.

November 24, 2025

Connection Pool Sizing with Little's Law: Mathematical Approach to HikariCP and PgBouncer

Pool size 50 because that's how it's always been? I'll show how to use Little's Law to calculate optimal pool size and prove it with load tests.

October 22, 2025

UUIDv4 vs ULID vs TSID: Impact on PostgreSQL B-Tree Indexes After 100M Records

Random UUIDs as primary keys cause index bloat and random I/O. A benchmark with specific numbers: index size, cache hit ratio, and WAL volume after 100M inserts.

October 14, 2025

PostgreSQL HOT Updates + FILLFACTOR: How to Reduce Index Bloat by 60%

Vacuum runs successfully but disk keeps growing and cache hit ratio drops. I'll show how to quantify HOT-update eligibility using pgstattuple and optimize fillfactor.

September 23, 2025

When Prepared Statements Make PostgreSQL 10× Slower: Generic Plan Trap

Same query, same params, but prod is slow and staging works fine. I'll show how to reproduce the generic plan problem with PgBouncer and Java/Go, and how to fix it.

September 15, 2025

Logical Replication Slot WAL Bloat: When Subscribers Go Offline

Disk filling up with WAL files. The cause: a logical replication slot consumer went offline, and PostgreSQL retains all WAL since then because it might be needed.

September 9, 2025

PostgreSQL Autovacuum SLO Tuning: How to Configure Vacuum for 200M Rows and 5k UPSERT/s

Autovacuum is either ignored or cargo-cult tuned. I'll show how to turn it into an SLO-driven system with specific numbers, pg_stat metrics, and reproducible tests.

September 4, 2025

Zero-Downtime PostgreSQL Migrations: Expand/Contract, Backfill and Rollback Strategies

A practical playbook for safe database migrations in production, from the expand/contract pattern through online indexes to monitoring and rollback.

July 29, 2025

Redlock vs PostgreSQL Advisory Locks: When You Don't Need Redis for Distributed Locking

Adding Redis just for distributed locks? PostgreSQL advisory locks might be enough. I compare both with failure scenarios and performance benchmarks.

July 13, 2025

PostgreSQL TOAST Strategy: Why Your JSON Column Kills Query Performance

SELECT * on a table with JSON is 10x slower than expected. I'll show how TOAST storage works and when to change strategies for large columns.

June 24, 2025

PostgreSQL Replication Slot Bloat: How One Inactive Slot Filled 500GB Disk

Disk is 95% full and the WAL directory holds 400GB. I'll show how replication slots prevent WAL cleanup, plus a playbook for prevention and recovery.

June 8, 2025

PostgreSQL Idle in Transaction: Emergency Playbook for Stuck Connections

Autovacuum can't remove dead tuples and table bloat keeps growing, all because of one 'idle in transaction' connection. Here's the detection and kill playbook.

May 20, 2025

Stop Mocking Your Database: Integration Tests in the Testcontainers Era

Why mocks lie and how Testcontainers will change your testing approach. Practical examples, CI setup, and data isolation strategies.

April 24, 2025

GIN Index Pending List Overflow: Fast Writes, Slow Searches

Full-text search was fast, now it's slow. The cause: GIN index pending list grew huge during bulk inserts, and every search must now scan the unsorted pending entries.

April 17, 2025

Kubernetes Rollout Without DB Outage: How to Stop PostgreSQL Connection Storm

A reproducible lab demonstrating connection storms during K8s rollouts. PgBouncer, preStop hooks, and jitter: practical solutions with benchmarks.

April 1, 2025

Transactional Outbox: Solving the Dual Write Problem Without 2PC

Practical Outbox pattern implementation in Node.js/TypeScript with PostgreSQL LISTEN/NOTIFY. Race-condition case study and production-ready solution.

March 27, 2025

The Soft Delete Trap: Why is_deleted Kills Your Database (And What To Do)

A practical analysis of why soft delete destroys database performance over time. Benchmarks, partitioning solution, and migration checklist.

March 23, 2025

ICU Collation Version Drift: When Database Upgrades Break Your Indexes

Queries return wrong results after an OS upgrade. The cause: the ICU library version changed, collation rules shifted, and existing indexes are still sorted by the old rules.

March 15, 2025

PostgreSQL Partial Index: Planner Ignores Your Index

The query scans the full table despite a perfect partial index. The cause: the query's WHERE clause doesn't match the index predicate closely enough for the planner to prove the index applies, or statistics mislead the planner.

March 4, 2025

Double Charges From Idempotency Keys: The Replica Lag Trap

Perfect idempotency logic, but customers still get charged twice. The cause: checking idempotency keys against a read replica that's seconds behind the primary during traffic spikes.

January 29, 2025

PostgreSQL Read Replica Conflicts: Why Your Queries Get Canceled

Queries on read replicas fail with 'canceling statement due to conflict with recovery'. The fix depends on which of the 5 conflict types you have - here's how to diagnose and solve each one.

January 28, 2025

PostgreSQL Serialization Failures: Beyond 'Just Retry'

Getting 'could not serialize access due to concurrent update'? The fix isn't just retry logic - it's understanding when to use which isolation level and how to reduce conflict frequency.

January 15, 2025

PostgreSQL OOM by Design: work_mem × Parallel Workers × Plan Nodes

work_mem looks small at 256MB, but a parallel hash join with 4 workers across 3 plan nodes uses 3GB. Here's how to prevent PostgreSQL from legitimately OOMing your container.

December 28, 2024

The Index That Killed Write Performance: Losing PostgreSQL HOT Updates

Adding an index for performance made writes 10x slower. The counter-intuitive cause: the new index broke HOT updates, turning cheap heap-only updates into updates that must also write to every index, with massive bloat.

December 19, 2024

PostgreSQL 'cached plan must not change result type' During Zero-Downtime Migrations

Rolling deploys fail with cached plan errors after ALTER TABLE. The cause: server-side prepared statements cache query plans that break when the schema changes.

December 11, 2024