hot_standby_feedback Bloat Trap: Fixing Replica Conflicts by Slowly Killing the Primary
I’ve seen this play out like a slow-motion outage:
- Read replica queries get canceled due to recovery conflicts.
- Someone enables hot_standby_feedback=on to “fix it”.
- Cancellations stop. Everyone celebrates.
- A week later, the primary is slower, disk is growing, autovacuum can’t keep up.
- Eventually you hit a latency incident that looks unrelated… but isn’t.
This article is the runbook for that trap: how to prove hot_standby_feedback is pinning your xmin horizon, why it creates bloat, and what I change in production so replicas remain useful without silently destroying the primary.
Tested on: PostgreSQL 15–17, physical streaming replication, mixed OLTP + “someone ran analytics on the replica” workloads.
Incident narrative (anonymized)
We had a busy primary and one read replica used for heavy dashboard queries. During peak traffic, the replica frequently canceled long queries (recovery conflicts). The “fix” was applied:
hot_standby_feedback = on
The cancellations stopped, but about 10 days later:
- primary disk usage grew steadily
- index bloat increased
- p95 query latency on the primary drifted up
- autovacuum was running constantly but never “winning”
Blast radius: primary performance degraded, and we got paged for latency (not for replication).
Constraint: We couldn’t just disable the replica or stop analytical queries overnight. We needed a plan that kept read workloads functional without turning the primary into a bloat farm.
Timeline
- T-0: “Replica cancels queries” complaint.
- T+1h: hot_standby_feedback=on enabled; cancellations drop.
- T+7d: primary disk trends up; vacuum activity increases.
- T+10d: primary latency incident; autovacuum behind.
- T+11d: we correlate bloat with backend_xmin held back by the replica.
- T+12d: mitigation: disable feedback on the “primary-serving” replica, enforce timeouts, add a dedicated analytics replica with explicit guardrails.
- T+14d: bloat stops growing; primary latency recovers.
Mechanism: why hot_standby_feedback creates bloat
Replica conflicts happen because vacuum wants to clean, replica wants a snapshot
On a standby, queries run against a consistent snapshot while WAL is being replayed. Sometimes WAL replay needs to remove row versions, or take locks, that a running query still depends on. That is a recovery conflict: the standby delays replay for up to max_standby_streaming_delay (30s by default) and then cancels the query.
hot_standby_feedback flips the tradeoff
When hot_standby_feedback is enabled, the standby tells the primary about its xmin horizon. The primary then avoids vacuum cleanup that would break the standby’s snapshot.
That prevents cancellations, but the primary pays:
- dead tuples remain longer
- indexes keep references longer
- autovacuum cannot reclaim space effectively
- bloat accumulates
This is not “a little overhead”. Under long-running queries or high churn, it’s a steady bloat pump.
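To see whether a given standby is actually sending feedback, the relevant settings can be read on the standby itself. This is a minimal sketch, assuming a psql session on the replica. wal_receiver_status_interval is not discussed in this article; I include it because it bounds how often feedback messages are sent.

```sql
-- Run on the standby.
-- Sanity check: returns true on a standby, false on a primary.
SELECT pg_is_in_recovery() AS is_standby;

-- Settings that decide how replica conflicts are traded off.
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN (
    'hot_standby_feedback',          -- on = standby reports its xmin to the primary
    'wal_receiver_status_interval',  -- upper bound on feedback frequency
    'max_standby_streaming_delay',   -- how long replay waits before canceling queries
    'max_standby_archive_delay'
)
ORDER BY name;
```

The source column shows where each value comes from (configuration file, default, and so on), which helps when someone “just turned it on” and nobody remembers where.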
Runbook: detecting xmin pinning and bloat pressure
What to check first
1) Is the standby actually canceling queries (or delaying replay)?
On the replica:
-- Cumulative recovery-conflict counters per database (standby only).
SELECT
    datname,
    confl_snapshot,
    confl_lock,
    confl_bufferpin,
    confl_deadlock
FROM pg_stat_database_conflicts
ORDER BY (confl_snapshot + confl_lock + confl_bufferpin + confl_deadlock) DESC;
If confl_snapshot is increasing, long snapshots are the trigger.
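To find which sessions hold those long snapshots, and whether replay is currently being delayed, something like the following can be run on the replica. It is a sketch: the LIMIT and the 80-character query prefix are arbitrary choices of mine.

```sql
-- Run on the replica: sessions with the oldest open transactions.
SELECT
    pid,
    usename,
    application_name,
    state,
    now() - xact_start  AS xact_age,
    now() - query_start AS query_age,
    left(query, 80)     AS query_head
FROM pg_stat_activity
WHERE backend_type = 'client backend'
  AND xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 10;

-- Approximate replay delay. Caveat: if the primary is idle, this value
-- grows even though the standby is fully caught up.
SELECT now() - pg_last_xact_replay_timestamp() AS approx_replay_delay;
```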
2) Is the primary being held back by the standby?
On the primary:
-- One row per connected standby; backend_xmin is the xmin it reports via feedback.
SELECT
    application_name,
    state,
    sync_state,
    write_lag,
    flush_lag,
    replay_lag,
    backend_xmin
FROM pg_stat_replication
ORDER BY application_name;
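If backend_xmin is non-null and stays at the same value while the primary keeps consuming transaction IDs (that is, age(backend_xmin) keeps growing), that’s a strong signal the replica is pinning cleanup. If the standby streams through a physical replication slot, the feedback xmin is recorded on the slot instead, so also check the xmin column in pg_replication_slots.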
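A standby is only one of several things that can hold the xmin horizon back, so it helps to list every candidate side by side before blaming feedback. The query below is a sketch that uses only standard catalog views; the column aliases are mine.

```sql
-- Run on the primary: everything that can pin the xmin horizon, oldest first.
SELECT 'walsender (standby feedback)' AS source,
       application_name               AS name,
       backend_xmin                   AS held_xmin,
       age(backend_xmin)              AS xmin_age
FROM pg_stat_replication
WHERE backend_xmin IS NOT NULL

UNION ALL
SELECT 'replication slot', slot_name::text, xmin, age(xmin)
FROM pg_replication_slots
WHERE xmin IS NOT NULL

UNION ALL
SELECT 'local backend pid ' || pid::text, application_name,
       backend_xmin, age(backend_xmin)
FROM pg_stat_activity
WHERE backend_type = 'client backend'
  AND backend_xmin IS NOT NULL

UNION ALL
SELECT 'prepared transaction', gid, transaction, age(transaction)
FROM pg_prepared_xacts

ORDER BY xmin_age DESC;
```

If the top row is a walsender or a physical slot belonging to a replica, feedback is the likely culprit. If it is a local backend or a prepared transaction, the problem is on the primary and turning feedback off will not fix it.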
3) Are dead tuples and table growth trending up?
Quick table-level signal:
-- Tables with the most dead tuples; the +1 avoids division by zero on empty tables.
SELECT
    schemaname,
    relname,
    n_live_tup,
    n_dead_tup,
    round(100.0 * n_dead_tup / (n_live_tup + n_dead_tup + 1), 2) AS dead_pct
FROM pg_stat_all_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
And size:
-- Largest ordinary tables, including their indexes and TOAST data.
SELECT
    n.nspname,
    c.relname,
    pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
ORDER BY pg_total_relation_size(c.oid) DESC
LIMIT 20;
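The “autovacuum runs constantly but never wins” symptom can be made visible by putting vacuum recency next to the dead-tuple count, with heap and index size split out. This is a sketch over pg_stat_user_tables; how recent counts as “recent” is a judgment call.

```sql
-- Run on the primary.
SELECT
    schemaname,
    relname,
    n_dead_tup,
    last_autovacuum,
    autovacuum_count,
    pg_size_pretty(pg_table_size(relid))   AS table_size,
    pg_size_pretty(pg_indexes_size(relid)) AS index_size
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
```

A table with a very recent last_autovacuum and a still-high n_dead_tup is consistent with vacuum running but being unable to remove the rows, which is what a pinned xmin looks like from the table's side.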
How to confirm the hypothesis
The “smoking gun” pattern I look for:
- long queries on the replica
- hot_standby_feedback=on
- backend_xmin held back on primary
- dead tuples grow and vacuum doesn’t reduce them
- primary latency drifts upward (especially for index-heavy queries)
If you see all of those, you’re not guessing. You’re paying for feedback.
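For one more direct confirmation, a verbose vacuum of a single suspect table reports how many dead tuples it had to leave behind and which cutoff it was limited by. This is a sketch, assuming public.events (the example table used in this article) is one of the worst offenders. The exact output wording differs between PostgreSQL versions.

```sql
-- Run on the primary, on ONE suspect table, ideally off-peak.
VACUUM (VERBOSE) public.events;

-- In the output (PostgreSQL 15+), look for lines like:
--   tuples: ... removed, ... remain, ... are dead but not yet removable
--   removable cutoff: ..., which was ... XIDs old when operation ended
-- A large "dead but not yet removable" count together with a cutoff that
-- matches the replica's reported xmin points at feedback.
```

This is deliberately scoped to one table; it is a diagnostic, not a cleanup strategy.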
Mitigations: safe vs risky
Safe mitigations
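- Set a timeout for replica queries. If the replica is for dashboards, you rarely need 30-minute queries.
In the replica’s postgresql.conf (example):
statement_timeout = '30s'
A hot standby is read-only, so ALTER DATABASE ... SET cannot be run on the replica itself. Run on the primary, it replicates and applies to that database on every node, primary included.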
- Disable feedback on the replica that must not harm the primary. Turn hot_standby_feedback off for the general-purpose replica.
- Add a dedicated analytics replica (optional). If you truly need long queries, dedicate a replica for it and accept the cost, explicitly and with monitoring.
- Use max_standby_streaming_delay intentionally. Allow some replay delay instead of pinning the primary indefinitely. (This shifts pain to replica staleness, which might be acceptable for analytics.)
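The timeout and replay-delay mitigations above can be sketched in SQL. The role name dashboard_ro and the '5min' value are assumptions for illustration, not values from the incident. Note where each statement runs: role settings have to be created on the primary because a standby is read-only, while ALTER SYSTEM only edits the local postgresql.auto.conf and can be run on the standby.

```sql
-- 1) Per-role timeout. Run on the PRIMARY; the setting replicates to standbys.
--    dashboard_ro is a hypothetical role used only by dashboard connections.
--    It applies wherever that role connects, including the primary.
ALTER ROLE dashboard_ro SET statement_timeout = '30s';

-- 2) Bounded replay delay. Run on the ANALYTICS REPLICA.
--    '5min' is an illustrative value, not a recommendation.
ALTER SYSTEM SET max_standby_streaming_delay = '5min';
SELECT pg_reload_conf();

-- 3) Check the result on that replica.
SHOW max_standby_streaming_delay;
SELECT now() - pg_last_xact_replay_timestamp() AS approx_replay_delay;
```

With a bounded delay the replica can fall up to roughly that far behind while a long query runs. After that, replay wins and the query is canceled.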
Risky mitigations
- Enabling hot_standby_feedback on every standby (bloat farm)
- VACUUM FULL during peak (locking + rewrites)
- “Fix bloat” with aggressive manual vacuum everywhere (can spike IO and latency)
What we changed (concrete)
1) We stopped treating the replica as “free analytics”
We split read replicas into roles:
- Replica A (serving reads, safe for primary): feedback off, strict timeouts
- Replica B (analytics): feedback on, but guarded (timeouts + monitoring + expectations about bloat)
2) Config changes (illustrative)
On Replica A:
hot_standby_feedback = off
statement_timeout = 30s
On Replica B (analytics):
hot_standby_feedback = on
statement_timeout = 2min
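If editing postgresql.conf by hand is inconvenient, the same settings can be applied from a SQL session on each replica. This is a sketch for Replica A. Both parameters take effect on reload, so no restart is needed.

```sql
-- Run on Replica A (serving reads).
ALTER SYSTEM SET hot_standby_feedback = off;
ALTER SYSTEM SET statement_timeout = '30s';  -- applies to every session on this node
SELECT pg_reload_conf();

-- Confirm what the server is actually using.
SELECT name, setting, unit, source, pending_restart
FROM pg_settings
WHERE name IN ('hot_standby_feedback', 'statement_timeout');
```

A server-wide statement_timeout also hits maintenance and superuser sessions on that node. If that is a problem, a per-role setting is the narrower tool.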
3) Autovacuum tuning on the worst tables
We tuned high-churn tables so vacuum runs more aggressively (per-table, not global sledgehammer):
ALTER TABLE public.events SET (
autovacuum_vacuum_scale_factor = 0.02,
autovacuum_vacuum_threshold = 5000
);
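Autovacuum considers a table once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × estimated rows. As an illustrative calculation, with the values above a table of 10,000,000 rows triggers at 5000 + 0.02 × 10,000,000 = 205,000 dead tuples. The query below is a sketch that checks the stored options and computes the same trigger point for public.events. The constants are hard-coded to match the ALTER TABLE above.

```sql
-- Run on the primary.
SELECT
    c.relname,
    c.reloptions,                                   -- per-table overrides, if any
    greatest(c.reltuples, 0)::bigint AS est_rows,   -- reltuples is -1 if never analyzed
    (5000 + 0.02 * greatest(c.reltuples, 0))::bigint AS vacuum_trigger_dead_tuples,
    s.n_dead_tup
FROM pg_class c
JOIN pg_stat_all_tables s ON s.relid = c.oid
WHERE c.oid = 'public.events'::regclass;
```

Triggering vacuum earlier only helps once the xmin horizon is free to advance. While a replica pins it, a more aggressive autovacuum just runs more often without removing more.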
How to verify (measurable)
- backend_xmin stops being pinned. On the primary:
SELECT application_name, backend_xmin FROM pg_stat_replication;
Expected:
- Replica A no longer holds it back
- Replica B may, but you’ll see it explicitly and can budget for it
- Dead tuples trend down (or at least stop exploding). Re-run the pg_stat_all_tables query above over time.
- Primary latency recovers
  - p95 query time improves
  - autovacuum can reclaim space again
- Replica conflicts return, but within policy. If Replica A cancels an occasional long query, that’s acceptable, because the alternative is degrading the primary.
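“Over time” is easier to judge with stored samples than by eyeballing repeated query output. This is a minimal sketch, assuming you are allowed to create a small table on the primary. The table name ops_dead_tuple_samples and the sampling schedule are hypothetical.

```sql
-- Run on the primary. ops_dead_tuple_samples is a hypothetical table name.
CREATE TABLE IF NOT EXISTS ops_dead_tuple_samples (
    sampled_at  timestamptz NOT NULL DEFAULT now(),
    schemaname  name        NOT NULL,
    relname     name        NOT NULL,
    n_dead_tup  bigint      NOT NULL,
    total_bytes bigint      NOT NULL
);

-- Take one sample of the 20 worst tables (schedule this with cron or similar).
INSERT INTO ops_dead_tuple_samples (schemaname, relname, n_dead_tup, total_bytes)
SELECT schemaname, relname, n_dead_tup, pg_total_relation_size(relid)
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;

-- Compare the first and latest sample per table.
SELECT
    schemaname,
    relname,
    min(sampled_at) AS first_sample,
    max(sampled_at) AS last_sample,
    (array_agg(n_dead_tup ORDER BY sampled_at))[1]      AS first_dead,
    (array_agg(n_dead_tup ORDER BY sampled_at DESC))[1] AS last_dead
FROM ops_dead_tuple_samples
GROUP BY schemaname, relname
ORDER BY last_dead DESC;
```

If you already export pg_stat_user_tables to a metrics system, use that instead; the point is only to compare the same tables across time.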
Prevention / guardrails
- Replica role contract
  - serving replica must not pin xmin for long
  - analytics replica must have explicit cost budget and query timeouts
- Bloat budget
  - alert on dead tuple ratio for top tables
  - alert on rapid table/index size growth
- Replica conflict budget
  - cancellations are acceptable up to N/day on the serving replica
- Runbook
  - “If conflicts rise: shorten queries, don’t blindly turn on feedback”
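The budgets above can be turned into alert queries that return rows only when a budget is exceeded. This is a sketch; every threshold in it (20%, 10,000 dead tuples, 1,000,000 XIDs) is an assumption to be replaced with your own numbers.

```sql
-- Bloat budget (run on the primary): tables above an assumed dead-tuple ratio.
SELECT schemaname, relname, n_dead_tup,
       round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
  AND 100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0) > 20
ORDER BY n_dead_tup DESC;

-- Role contract (run on the primary): standbys holding an xmin older than
-- an assumed 1,000,000 transactions. Check pg_replication_slots.xmin the
-- same way if your standbys use replication slots.
SELECT application_name, backend_xmin, age(backend_xmin) AS xmin_age
FROM pg_stat_replication
WHERE age(backend_xmin) > 1000000;

-- Conflict budget (run on the serving replica): counters are cumulative
-- since stats_reset, so alert on the difference between two samples.
SELECT c.datname, c.confl_snapshot, d.stats_reset
FROM pg_stat_database_conflicts c
JOIN pg_stat_database d USING (datid);
```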
Related reading
- PostgreSQL Read Replica Conflicts: Why Your Queries Get Canceled
- PostgreSQL Autovacuum SLO Tuning: How to Configure Vacuum for 200M Rows and 5k UPSERT/s
- PostgreSQL Idle in Transaction: Emergency Playbook for Stuck Connections
- PostgreSQL Replication Slot Bloat: How One Inactive Slot Filled 500GB Disk
- Logical Replication Slot WAL Bloat: When Subscribers Go Offline
- PostgreSQL HOT Updates + FILLFACTOR: How to Reduce Index Bloat by 60%
- PostgreSQL Checkpoint Spikes: Why p99 Explodes Every N Minutes