PostgreSQL Checkpoint Spikes: Why p99 Explodes Every N Minutes
A reproducible approach to diagnose and eliminate checkpoint-induced latency spikes using pgbench, pg_stat_bgwriter, and WAL/IO budgeting.
4 posts
The app hangs but the database looks healthy: your connection pool is exhausted. I show how to detect it, size pools sanely, and prevent connection leaks.
ReplacingMergeTree doesn't deduplicate at SELECT time. It deduplicates only when background merges run, so your queries can return duplicates until then. Here's how to handle it.
Perfect idempotency logic, but customers still get charged twice. The cause: checking idempotency keys against a read replica that's seconds behind the primary during traffic spikes.