When Prepared Statements Make PostgreSQL 10× Slower: Generic Plan Trap
Prepared statements are great until they lock you into a bad plan. “Same query, same parameters, but production takes 800ms and staging 15ms.” I heard this from a colleague who spent two days debugging. The staging database was a fresh copy of production. The query was identical. The parameters were identical. But production was consistently 50x slower.
The culprit was PostgreSQL’s generic plan cache—a performance optimization that, with the wrong data distribution, becomes a performance catastrophe. After executing a prepared statement five times, PostgreSQL decides it’s not worth re-planning for each parameter value and switches to a “generic” plan. This generic plan is based on average statistics, not the specific values you’re querying for.
This optimization works beautifully when data is uniformly distributed. If every value of status has roughly the same selectivity, the generic plan is optimal. But when data is skewed—99% completed, 1% pending, 0.01% failed—the generic plan picks a strategy that’s average, which is terrible for the rare values you’re actually querying.
The reason staging worked was timing: my colleague ran his test queries a few times, and the statement never hit the 5-execution threshold that triggers the generic plan switch. Production had been running for weeks, with the problematic query stuck in its slow generic plan forever.
Tested on: PostgreSQL 16.1, pgBouncer 1.21, HikariCP 5.1, Go pgx 5.5
What is Generic vs Custom Plan
PostgreSQL has two types of execution plans for prepared statements, and understanding the difference is key to diagnosing this issue.
When you execute a prepared statement, PostgreSQL has a choice: it can create a new plan for the specific parameter values you provided (custom plan), or it can reuse a cached plan that was created without knowing the specific values (generic plan).
Custom plans are slower to create because the planner runs fresh each time, but they’re optimal because the planner knows exactly what you’re querying for. Generic plans are faster because there’s no planning overhead, but they may be suboptimal because the planner had to guess based on average statistics.
Custom Plan (slower planning, optimal plan)
-- Each call: planner sees concrete values
PREPARE get_user AS SELECT * FROM users WHERE status = $1;
EXECUTE get_user('active'); -- Planner knows: status = 'active', 95% of rows
EXECUTE get_user('banned'); -- Planner knows: status = 'banned', 0.1% of rows
Generic Plan (faster planning, possibly suboptimal)
-- After 5 executions: PostgreSQL switches to generic plan
-- Planner DOESN'T see concrete values, uses statistics
EXECUTE get_user('active'); -- Uses average estimate
EXECUTE get_user('banned'); -- Uses THE SAME plan!
Problem: Skewed Data Distribution
-- Table with 10M rows
CREATE TABLE orders (
id bigserial PRIMARY KEY,
status varchar(20),
customer_id bigint,
created_at timestamp
);
-- Status distribution (skewed!)
-- 'completed': 9,500,000 (95%)
-- 'pending': 450,000 (4.5%)
-- 'processing': 49,000 (0.49%)
-- 'failed': 1,000 (0.01%)
CREATE INDEX idx_orders_status ON orders(status);
Custom Plan for ‘failed’ (0.01%)
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'failed';
-- Index Scan using idx_orders_status
-- Rows: 1,000
-- Time: 2ms
Generic Plan (average estimate)
-- Generic plan estimates: status has 4 distinct values
-- Average estimate: 10M / 4 = 2.5M rows
-- With 2.5M rows, Seq Scan is "better" than Index Scan
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = $1;
-- Seq Scan on orders
-- Rows: 2,500,000 (estimate) vs 1,000 (reality)
-- Time: 2500ms vs 2ms
Reproducible Lab
Setup
-- test_prepared_statements.sql
CREATE TABLE test_skewed (
id bigserial PRIMARY KEY,
category varchar(10),
data text
);
-- Skewed distribution
INSERT INTO test_skewed (category, data)
SELECT
CASE
WHEN random() < 0.99 THEN 'common'
WHEN random() < 0.999 THEN 'rare'
ELSE 'unicorn'
END,
md5(random()::text)
FROM generate_series(1, 10000000);
CREATE INDEX idx_category ON test_skewed(category);
ANALYZE test_skewed;
-- Verify distribution
SELECT category, count(*), round(100.0 * count(*) / sum(count(*)) over(), 2) as pct
FROM test_skewed
GROUP BY category;
-- common: 9,900,000 (99%)
-- rare: 99,000 (0.99%)
-- unicorn: 1,000 (0.01%)
Test Script (Python)
# test_generic_plan.py
import psycopg2
import time
conn = psycopg2.connect("postgresql://localhost/testdb")
cur = conn.cursor()
# Prepared statement
cur.execute("PREPARE test_stmt AS SELECT * FROM test_skewed WHERE category = $1")
results = []
for i in range(20):
for category in ['unicorn', 'common', 'rare']:
start = time.time()
cur.execute("EXECUTE test_stmt(%s)", (category,))
rows = cur.fetchall()
elapsed = (time.time() - start) * 1000
results.append({
'iteration': i,
'category': category,
'rows': len(rows),
'time_ms': round(elapsed, 2)
})
# After 5 iterations switches to generic plan
if i == 5:
print(f"--- GENERIC PLAN KICKS IN ---")
# Output
for r in results:
print(f"Iter {r['iteration']:2d} | {r['category']:8s} | {r['rows']:8d} rows | {r['time_ms']:8.2f}ms")
Results
Iter 0 | unicorn | 1000 rows | 2.34ms # custom plan
Iter 1 | unicorn | 1000 rows | 1.89ms
...
Iter 4 | unicorn | 1000 rows | 2.01ms
--- GENERIC PLAN KICKS IN ---
Iter 5 | unicorn | 1000 rows | 890.45ms # 400× slower!
Iter 6 | unicorn | 1000 rows | 912.23ms
pgBouncer Complication
Transaction Pooling Mode
# pgbouncer.ini
[pgbouncer]
pool_mode = transaction # Problem!
In transaction mode:
- Each transaction can go to a different connection
- Prepared statements are per-connection
- Statement prepared on conn A doesn’t exist on conn B
- Some drivers “simulate” prepared statements
Java HikariCP + pgBouncer
// Problem: HikariCP uses prepared statements by default
HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:postgresql://pgbouncer:6432/mydb");
// Fix 1: Disable server-side prepared statements
config.addDataSourceProperty("prepareThreshold", "0");
// Fix 2: Or use JDBC parameter
// jdbc:postgresql://host/db?prepareThreshold=0
Go pgx
// pgx uses prepared statements by default
config, _ := pgxpool.ParseConfig(connString)
// Fix: Disable prepared statement cache
config.ConnConfig.DefaultQueryExecMode = pgx.QueryExecModeSimpleProtocol
// Or per-query
rows, err := pool.Query(ctx, sql, pgx.QueryExecModeSimpleProtocol, args...)
Solutions
1. Force Custom Plan (PostgreSQL 12+)
-- Session level
SET plan_cache_mode = 'force_custom_plan';
-- Or per-query (PostgreSQL 16+)
SELECT /*+ CustomPlan */ * FROM orders WHERE status = $1;
2. Increase Threshold Before Generic Plan
-- Default is 5, increase to higher value
-- -1 = never use generic plan
ALTER SYSTEM SET plan_cache_mode = 'force_custom_plan';
3. Partial Indexes for Skewed Data
-- Instead of single index
CREATE INDEX idx_orders_status ON orders(status);
-- Create partial indexes for rare values
CREATE INDEX idx_orders_failed ON orders(id) WHERE status = 'failed';
CREATE INDEX idx_orders_processing ON orders(id) WHERE status = 'processing';
-- Generic plan now uses correct index
4. Application-Level Routing
// For critical queries use explicit SQL
public List<Order> getFailedOrders() {
// This query always uses optimal plan
return jdbcTemplate.query(
"SELECT * FROM orders WHERE status = 'failed'", // no parameter!
orderRowMapper
);
}
// For common queries parameter is OK
public List<Order> getOrdersByStatus(String status) {
return jdbcTemplate.query(
"SELECT * FROM orders WHERE status = ?",
orderRowMapper,
status
);
}
Diagnostics
Check if Query Uses Generic Plan
-- pg_stat_statements
SELECT
query,
calls,
mean_exec_time,
stddev_exec_time,
-- High stddev = possible generic plan problem
stddev_exec_time / mean_exec_time as coefficient_of_variation
FROM pg_stat_statements
WHERE query LIKE '%orders%'
ORDER BY coefficient_of_variation DESC;
EXPLAIN with Concrete vs Generic Parameters
-- Custom plan (with values)
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE status = 'failed';
-- Generic plan (without values)
PREPARE test AS SELECT * FROM orders WHERE status = $1;
EXPLAIN (ANALYZE, BUFFERS, GENERIC_PLAN) -- PostgreSQL 16+
EXECUTE test('failed');
Benchmark: Custom vs Generic
| Query | Custom Plan | Generic Plan | Slowdown |
|---|---|---|---|
| status=‘failed’ (0.01%) | 2ms | 890ms | 445× |
| status=‘rare’ (1%) | 45ms | 920ms | 20× |
| status=‘common’ (99%) | 1200ms | 1150ms | 0.96× |
Checklist
## Prepared Statement Diagnostics
### Identification
- [ ] Do you have skewed data distribution?
- [ ] Using pgBouncer in transaction mode?
- [ ] Queries have high stddev in pg_stat_statements?
### Quick Fixes
- [ ] Set plan_cache_mode = 'force_custom_plan' for session
- [ ] Disable prepared statements in connection pool
- [ ] Use partial indexes for rare values
### Long-term Solution
- [ ] Implement application-level query routing
- [ ] Monitor coefficient of variation in pg_stat_statements
- [ ] Regular ANALYZE after bulk operations
Conclusion
This is one of those performance issues that seems impossible until you understand the mechanism. The query is optimal in staging, slow in production, and you can’t find any difference between environments. The answer is that production has been running long enough for prepared statements to hit the 5-execution threshold and switch to generic plans.
The fundamental issue is that PostgreSQL’s generic plan optimization makes the wrong trade-off when data is heavily skewed. It assumes the cost of re-planning each execution outweighs the cost of a suboptimal plan. But with 1000x selectivity differences between parameter values, the suboptimal plan is catastrophically expensive.
The irony is that prepared statements are supposed to make queries faster. And they do, for uniformly distributed data. But for skewed data, they make rare-value queries dramatically slower while barely improving common-value queries.
Key principles:
- Generic plan triggers after 5 executions - fresh connections won’t show the problem
- Generic plan uses average statistics - terrible for skewed distributions
- With 0.01% vs 99% skew - rare value queries can be 100-500x slower
- pgBouncer transaction mode adds complexity by breaking prepared statement persistence
- Coefficient of variation in pg_stat_statements reveals problematic queries
Solution strategies: force custom plans for critical queries, use partial indexes for rare values, or route queries at the application level to avoid parameters for skewed columns.
Related Articles
- PostgreSQL Autovacuum SLO - Vacuum tuning
- Connection Pool Sizing - Proper pool dimensioning
Related posts
Kubernetes CPU Throttling Autopsy: Why p99 Latency Explodes at 40% CPU Usage
CPU looks OK but tail latency is catastrophic. I'll show how to correlate CFS throttling with latency spikes and why removing CPU limits can paradoxically help.
Connection Pool Sizing with Little's Law: Mathematical Approach to HikariCP and PgBouncer
Pool size 50 because that's how it's always been? I'll show how to use Little's Law to calculate optimal pool size and prove it with load tests.
Redlock vs PostgreSQL Advisory Locks: When You Don't Need Redis for Distributed Locking
Adding Redis just for distributed locks? PostgreSQL advisory locks might be enough. I compare both with failure scenarios and performance benchmarks.
EXPLAIN Lied to You: The PostgreSQL Prepared Statement Plan Cliff
Your EXPLAIN looks perfect but production melts. The culprit: PostgreSQL silently switched from a custom plan to a generic plan after enough executions, and the generic plan is catastrophically wrong.
Cite this article
If you reference this post, please link to the original URL and credit the author.