PostgreSQL Checkpoint Spikes: Why p99 Explodes Every N Minutes

This is a classic “mystery graph”:

CPU looks stable
throughput looks steady
but tail latency (especially p99) has periodic spikes — every 5 or 15 minutes
storage latency rises during the spikes

Very often the root cause is: checkpoints.

Not “checkpoints are bad”, but checkpoints can turn “dirty page flushing” into a short IO burst if your configuration and storage throughput don’t match your write rate.

Goal of this post: a methodology to reproduce, measure, and tune checkpoints until they stop being a p99 killer.

Tested on: PostgreSQL 13–16, both local NVMe and network-attached disks (cloud). Examples use Linux tools.

What a checkpoint does (only what matters for performance)

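Operationally, a checkpoint guarantees that every data-file change described by WAL up to a given point has been written and flushed to disk, so crash recovery can start from that point instead of replaying older WAL.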

In practice that is:

writing many dirty buffers,
and syncing them (fsync),
which can become a burst that competes with normal query IO.

If your storage queue fills up, even read-only queries can slow down because they wait behind checkpoint writes.

Method: measure first, tune second

Minimum signals you want

PostgreSQL checkpoint/bgwriter stats
WAL rate (how fast you generate it)
OS disk latency/queue (e.g. iostat)
Workload latency (pgbench or your service SLI)
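One way to get the WAL rate is to sample pg_current_wal_lsn() twice and divide the byte difference by the elapsed time. The sketch below only does the arithmetic on two LSN strings. The function names and the sample LSNs are illustrative, not taken from a real server. In SQL, pg_wal_lsn_diff() performs the same subtraction.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN such as '16/B374D848' to an absolute byte position.

    An LSN is a 64-bit position printed as two hex numbers:
    the high 32 bits, a slash, then the low 32 bits.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_rate_bytes_per_sec(lsn_start: str, lsn_end: str, elapsed_sec: float) -> float:
    """Average WAL generation rate between two samples."""
    if elapsed_sec <= 0:
        raise ValueError("elapsed_sec must be positive")
    return (lsn_to_bytes(lsn_end) - lsn_to_bytes(lsn_start)) / elapsed_sec


if __name__ == "__main__":
    # Hypothetical samples taken 10 seconds apart.
    rate = wal_rate_bytes_per_sec("0/10000000", "0/1A000000", 10.0)
    print(f"{rate / (1024 * 1024):.1f} MiB/s")
```

Sample over several minutes rather than a few seconds, since WAL output is usually bursty and a short window can mislead in either direction.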

SQL: checkpoint and bgwriter statistics

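Start with pg_stat_bgwriter (on PostgreSQL 17 and later, the checkpoint counters moved to pg_stat_checkpointer, so this query applies to the versions tested here):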

SELECT
  checkpoints_timed,
  checkpoints_req,
  checkpoint_write_time,
  checkpoint_sync_time,
  buffers_checkpoint,
  buffers_clean,
  maxwritten_clean,
  buffers_backend,
  buffers_backend_fsync
FROM pg_stat_bgwriter;

Practical interpretation:

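checkpoints_timed vs checkpoints_req: if checkpoints_req grows fast, checkpoints are being requested before the timeout expires, most often because WAL volume reached the max_wal_size trigger (manual CHECKPOINT commands are counted here too).
checkpoint_write_time and checkpoint_sync_time: both are cumulative totals in milliseconds, so compare the difference between two samples; large jumps often correlate with p99 spikes.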
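Because every column in pg_stat_bgwriter is a running total, the useful numbers come from subtracting two samples. Here is a minimal sketch of that subtraction, assuming each sample has been fetched into a dict keyed by column name. The function name, the sample values, and the 8 kB block size (the PostgreSQL default) are assumptions.

```python
MIB = 1024 * 1024


def checkpoint_deltas(before: dict, after: dict, block_size: int = 8192) -> dict:
    """Summarize checkpoint activity between two pg_stat_bgwriter samples."""
    keys = (
        "checkpoints_timed",
        "checkpoints_req",
        "checkpoint_write_time",
        "checkpoint_sync_time",
        "buffers_checkpoint",
    )
    d = {k: after[k] - before[k] for k in keys}
    total = d["checkpoints_timed"] + d["checkpoints_req"]
    if total == 0:
        return {"checkpoints": 0}
    return {
        "checkpoints": total,
        # Share of checkpoints that were requested rather than timer-driven.
        "requested_share": d["checkpoints_req"] / total,
        "write_ms_per_checkpoint": d["checkpoint_write_time"] / total,
        "sync_ms_per_checkpoint": d["checkpoint_sync_time"] / total,
        "mib_per_checkpoint": d["buffers_checkpoint"] * block_size / MIB / total,
    }


if __name__ == "__main__":
    # Hypothetical samples taken a few minutes apart.
    before = {"checkpoints_timed": 100, "checkpoints_req": 20,
              "checkpoint_write_time": 500000, "checkpoint_sync_time": 4000,
              "buffers_checkpoint": 1000000}
    after = {"checkpoints_timed": 102, "checkpoints_req": 26,
             "checkpoint_write_time": 740000, "checkpoint_sync_time": 12000,
             "buffers_checkpoint": 1262144}
    print(checkpoint_deltas(before, after))
```

The per-checkpoint averages are only rough when the sampling window is short, because a checkpoint that is still running at sample time has not been fully accounted for yet.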

If you’re on PostgreSQL 16+, pg_stat_io can add more detail — but you can do a solid diagnosis without it.

OS: disk queue and latency

On the DB node:

iostat -xz 1

Watch for:

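sustained %util near 100% (on NVMe and other devices that serve requests in parallel, this alone does not prove saturation)
await spikes during checkpoint windows (r_await and w_await in current sysstat versions)
queue depth (aqu-sz; avgqu-sz in older sysstat versions)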
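If you capture iostat output to a file, a small parser makes it easier to line the device numbers up with checkpoint windows. The sketch below reads column names from the header row, so it does not depend on a particular sysstat column order. The sample rows are made up and trimmed to a few columns, and the 20 ms write-latency threshold is an arbitrary placeholder rather than a recommendation.

```python
def parse_iostat_block(lines):
    """Map each device to {column: value}, using the header row for names."""
    header = None
    rows = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        if parts[0].startswith("avg-cpu"):
            header = None  # the CPU section follows; ignore its value row
        elif parts[0].rstrip(":") == "Device":
            header = parts[1:]
        elif header and len(parts) == len(header) + 1:
            try:
                rows[parts[0]] = dict(zip(header, map(float, parts[1:])))
            except ValueError:
                continue  # not a numeric device row
    return rows


def flag_slow_writes(rows, w_await_ms=20.0):
    """Return devices whose average write latency exceeds the threshold."""
    return sorted(dev for dev, cols in rows.items()
                  if cols.get("w_await", 0.0) > w_await_ms)


# Hypothetical, shortened iostat -x report (real output has more columns).
SAMPLE = """\
Device r/s w/s r_await w_await aqu-sz %util
nvme0n1 120.0 950.0 0.40 35.20 28.10 99.0
nvme1n1 10.0 15.0 0.30 0.80 0.02 3.0
""".splitlines()

if __name__ == "__main__":
    print(flag_slow_writes(parse_iostat_block(SAMPLE)))
```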

A reproducible lab (on purpose)

Do this on a test DB, not production.

1) Generate a steady workload

pgbench -i -s 50 mydb
pgbench -c 32 -j 32 -T 300 -P 1 mydb

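-P 1 prints throughput and average latency once per second so you can align it with checkpoint stats. It does not report percentiles; add -l to write a per-transaction log if you want to compute p99 yourself.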

2) Make checkpoints painful (lab-only)

The idea is to create conditions that force frequent checkpoints (for example: low max_wal_size or short checkpoint_timeout) and observe:

p99 spikes line up with checkpoint write/sync time
storage latency spikes at the same time

Avoid blindly copying “recommended values”. The point is to learn the shape of the problem with your storage.

3) Correlate p99 with checkpoint signals

During the test:

log pgbench latency
sample pg_stat_bgwriter
watch iostat

If spikes line up with checkpoint_write_time/checkpoint_sync_time and disk latency, you’ve found the culprit.
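To put p99 on the same time axis as the checkpoint and disk samples, you can bucket per-transaction latencies by second. The sketch below assumes pgbench was run with -l and that each log line follows the documented default layout (client_id, transaction_no, time in microseconds, script_no, time_epoch, time_us). The function name and the nearest-rank percentile method are my choices, not something pgbench provides.

```python
import math
from collections import defaultdict


def p99_per_second(log_lines):
    """Return {epoch_second: p99 latency in ms} from pgbench -l log lines."""
    buckets = defaultdict(list)
    for line in log_lines:
        fields = line.split()
        # Skip short lines and lines whose latency field is not a number.
        if len(fields) < 6 or not fields[2].isdigit():
            continue
        latency_ms = int(fields[2]) / 1000.0  # field 3: latency in microseconds
        epoch_sec = int(fields[4])            # field 5: completion time (Unix epoch)
        buckets[epoch_sec].append(latency_ms)

    result = {}
    for sec, values in sorted(buckets.items()):
        values.sort()
        rank = math.ceil(0.99 * len(values))  # nearest-rank percentile
        result[sec] = values[rank - 1]
    return result


if __name__ == "__main__":
    # Hypothetical log: 100 transactions in one second, one in the next.
    lines = [f"0 {i} {i * 1000} 0 1000 0" for i in range(1, 101)]
    lines.append("1 1 5000 0 1001 0")
    for sec, p99 in p99_per_second(lines).items():
        print(sec, p99)
```

Seconds with only a handful of transactions give a noisy p99, so a wider bucket (5 or 10 seconds) may be easier to read on a low-throughput test.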

The checkpoint budget: a reality check

To stop checkpoint bursts, you must align:

your write rate / WAL rate
with your storage throughput

If WAL is generated quickly and max_wal_size is small, checkpoints will be frequent and often forced.
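A rough way to see which limit fires first is to compare the time needed to fill the WAL budget with checkpoint_timeout. The sketch below assumes a WAL-triggered checkpoint starts after roughly max_wal_size / (1 + checkpoint_completion_target) bytes of WAL, which approximates the server's internal heuristic on recent versions. Treat the result as an estimate; the function name and the example numbers are illustrative.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def predict_checkpoint_trigger(wal_bytes_per_sec, max_wal_size_bytes,
                               checkpoint_timeout_sec, completion_target=0.9):
    """Return (kind, seconds_between_checkpoints) under a steady WAL rate.

    kind is "requested" if WAL volume triggers the checkpoint before the
    timeout, otherwise "timed".
    """
    # Assumption: the trigger point is below max_wal_size so that WAL from the
    # in-progress checkpoint still fits within the limit.
    wal_budget = max_wal_size_bytes / (1.0 + completion_target)
    if wal_bytes_per_sec > 0:
        secs_to_fill = wal_budget / wal_bytes_per_sec
    else:
        secs_to_fill = float("inf")
    if secs_to_fill < checkpoint_timeout_sec:
        return ("requested", secs_to_fill)
    return ("timed", float(checkpoint_timeout_sec))


if __name__ == "__main__":
    # Hypothetical workload: 16 MiB/s of WAL, 5-minute timeout.
    print(predict_checkpoint_trigger(16 * MIB, 1 * GIB, 300))
    print(predict_checkpoint_trigger(16 * MIB, 16 * GIB, 300))
```

With these example numbers, a 1 GiB limit leads to a requested checkpoint about every half minute, while a 16 GiB limit lets the 5-minute timer win.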

The tuning goal is not “the fewest checkpoints”. It’s:

predictable, spread out checkpoint work
and storage latency that stays within your p99 budget

Tuning: what to try (and how to verify)

1) Reduce forced checkpoints via WAL sizing

If checkpoints_req dominates, you’re likely hitting the WAL size limit before the timeout.

Direction:

increase max_wal_size (within disk constraints)

Verify:

checkpoints_req slows down relative to checkpoints_timed
disk latency spikes become smaller or less frequent
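For a starting value, you can work backwards from the peak WAL rate you measured. The sketch below assumes the same approximation of the trigger point (max_wal_size divided by 1 + checkpoint_completion_target) and adds a headroom factor. The function name and the 1.25 headroom are placeholders to adjust, and the result still has to fit your disk.

```python
import math

MIB = 1024 * 1024
GIB = 1024 * MIB


def suggest_max_wal_size_gib(peak_wal_bytes_per_sec, checkpoint_timeout_sec,
                             completion_target=0.9, headroom=1.25):
    """Smallest whole-GiB max_wal_size that should let the timer fire first.

    Assumes the peak WAL rate is sustained for a full checkpoint interval.
    """
    wal_per_interval = peak_wal_bytes_per_sec * checkpoint_timeout_sec
    needed = wal_per_interval * (1.0 + completion_target) * headroom
    return math.ceil(needed / GIB)


if __name__ == "__main__":
    # Hypothetical peak of 16 MiB/s with a 5-minute checkpoint_timeout.
    print(suggest_max_wal_size_gib(16 * MIB, 300))
```

After applying a value in this range, go back to the counters: the share of requested checkpoints is the real check, not the formula.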

2) Spread checkpoint IO over time

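checkpoint_completion_target tells the checkpointer what fraction of the checkpoint interval it may use to write dirty buffers, so the work is spread out instead of issued as one burst. The default is 0.9 since PostgreSQL 14 (it was 0.5 before), so on 13 check whether it has been raised.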

Verify:

fewer short IO bursts
smoother await in iostat
reduced p99 spikes
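To see what spreading buys you, estimate the average write rate the checkpointer needs when it paces itself across the allowed window. This is a simplified sketch: it ignores the final fsync phase and OS writeback behavior, and it assumes the default 8 kB block size. The function name and the example numbers are illustrative.

```python
MIB = 1024 * 1024


def paced_write_mib_per_sec(buffers_per_checkpoint, checkpoint_interval_sec,
                            completion_target=0.9, block_size=8192):
    """Average checkpoint write rate if the work is spread over the window."""
    window_sec = checkpoint_interval_sec * completion_target
    if window_sec <= 0:
        raise ValueError("interval and completion_target must be positive")
    return buffers_per_checkpoint * block_size / MIB / window_sec


if __name__ == "__main__":
    # Hypothetical: 262144 buffers (2 GiB) per checkpoint, 5-minute interval.
    for target in (0.5, 0.9):
        rate = paced_write_mib_per_sec(262144, 300, completion_target=target)
        print(f"target={target}: {rate:.1f} MiB/s")
```

Compare the result with what your volume sustains under your normal query load. If even the paced rate is close to the device limit, spreading alone will not remove the spikes.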

3) Storage is sometimes the real limit

Cloud disks often have burst behavior and then throttling.

If your spikes align with storage throttling, DB tuning can only do so much — you may need:

a higher disk tier
different disk layout
or architectural changes (write shaping, batching, buffering)

Common traps

“CPU is fine, so it’s not Postgres”

Checkpoint spikes are primarily IO-driven. CPU can look perfect while tail latency blows up.

“Just increase checkpoint_timeout”

It can help, but if you’re constrained by max_wal_size, checkpoints will still be forced.

“We tuned the queries, but the spikes remain”

If the disk queue is saturated, query tuning won’t remove the spikes. You must fix the IO contention.

What I’d do in production

Build correlations: p99 spikes ↔ checkpoint stats ↔ disk latency
Check checkpoints_timed vs checkpoints_req
If forced checkpoints dominate, address WAL sizing and storage limits
Define a checkpoint budget (IO stability, predictable checkpoints, alerts)
Change one thing at a time and verify with metrics

FAQ

How do I know if checkpoints are forced?

If checkpoints_req grows quickly compared to checkpoints_timed, you’re often forcing checkpoints.

Why does a checkpoint slow down read-only queries?

Because reads also wait on the disk. If checkpoint writes saturate storage, reads queue behind them.

Is increasing max_wal_size always the answer?

Not always. It reduces checkpoint frequency, but if storage can’t sustain the spread-out flushing either, you still need better IO capacity.

Can archiving or replication change the behavior?

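Yes. A lagging archiver or a replication slot keeps old WAL segments from being recycled, so pg_wal grows and adds write and disk-space pressure alongside checkpoint IO. Measure WAL rate and replication/archiving lag too.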

Related reading

/en/blog/postgresql-wal-forensics/ (WAL tooling and what it reveals)
/en/blog/logical-replication-slot-wal-retention/ (WAL retention pressure)
/en/blog/postgresql-autovacuum-slo/ (another periodic performance killer)
