Back to blog

When Prepared Statements Make PostgreSQL 10× Slower: Generic Plan Trap

|
| postgresql, performance, prepared-statements, pgbouncer, java, go

Prepared statements are great until they lock you into a bad plan. “Same query, same parameters, but production takes 800ms and staging 15ms.” I heard this from a colleague who spent two days debugging. The staging database was a fresh copy of production. The query was identical. The parameters were identical. But production was consistently 50x slower.

The culprit was PostgreSQL’s generic plan cache—a performance optimization that, with the wrong data distribution, becomes a performance catastrophe. After executing a prepared statement five times, PostgreSQL decides it’s not worth re-planning for each parameter value and switches to a “generic” plan. This generic plan is based on average statistics, not the specific values you’re querying for.

This optimization works beautifully when data is uniformly distributed. If every value of status has roughly the same selectivity, the generic plan is optimal. But when data is skewed—99% completed, 1% pending, 0.01% failed—the generic plan picks a strategy that’s average, which is terrible for the rare values you’re actually querying.

The reason staging worked was timing: my colleague ran his test queries a few times, and the statement never hit the 5-execution threshold that triggers the generic plan switch. Production had been running for weeks, with the problematic query stuck in its slow generic plan forever.

Tested on: PostgreSQL 16.1, pgBouncer 1.21, HikariCP 5.1, Go pgx 5.5

What is Generic vs Custom Plan

PostgreSQL has two types of execution plans for prepared statements, and understanding the difference is key to diagnosing this issue.

When you execute a prepared statement, PostgreSQL has a choice: it can create a new plan for the specific parameter values you provided (custom plan), or it can reuse a cached plan that was created without knowing the specific values (generic plan).

Custom plans are slower to create because the planner runs fresh each time, but they’re optimal because the planner knows exactly what you’re querying for. Generic plans are faster because there’s no planning overhead, but they may be suboptimal because the planner had to guess based on average statistics.

Custom Plan (slower planning, optimal plan)

-- Each call: planner sees concrete values
PREPARE get_user AS SELECT * FROM users WHERE status = $1;
EXECUTE get_user('active');  -- Planner knows: status = 'active', 95% of rows
EXECUTE get_user('banned');  -- Planner knows: status = 'banned', 0.1% of rows

Generic Plan (faster planning, possibly suboptimal)

-- After 5 executions: PostgreSQL switches to generic plan
-- Planner DOESN'T see concrete values, uses statistics
EXECUTE get_user('active');  -- Uses average estimate
EXECUTE get_user('banned');  -- Uses THE SAME plan!

Problem: Skewed Data Distribution

-- Table with 10M rows
CREATE TABLE orders (
    id bigserial PRIMARY KEY,
    status varchar(20),
    customer_id bigint,
    created_at timestamp
);

-- Status distribution (skewed!)
-- 'completed': 9,500,000 (95%)
-- 'pending':     450,000 (4.5%)
-- 'processing':   49,000 (0.49%)
-- 'failed':        1,000 (0.01%)

CREATE INDEX idx_orders_status ON orders(status);

Custom Plan for ‘failed’ (0.01%)

EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'failed';
-- Index Scan using idx_orders_status
-- Rows: 1,000
-- Time: 2ms

Generic Plan (average estimate)

-- Generic plan estimates: status has 4 distinct values
-- Average estimate: 10M / 4 = 2.5M rows
-- With 2.5M rows, Seq Scan is "better" than Index Scan

EXPLAIN ANALYZE SELECT * FROM orders WHERE status = $1;
-- Seq Scan on orders
-- Rows: 2,500,000 (estimate) vs 1,000 (reality)
-- Time: 2500ms vs 2ms

Reproducible Lab

Setup

-- test_prepared_statements.sql
CREATE TABLE test_skewed (
    id bigserial PRIMARY KEY,
    category varchar(10),
    data text
);

-- Skewed distribution
INSERT INTO test_skewed (category, data)
SELECT
    CASE
        WHEN random() < 0.99 THEN 'common'
        WHEN random() < 0.999 THEN 'rare'
        ELSE 'unicorn'
    END,
    md5(random()::text)
FROM generate_series(1, 10000000);

CREATE INDEX idx_category ON test_skewed(category);
ANALYZE test_skewed;

-- Verify distribution
SELECT category, count(*), round(100.0 * count(*) / sum(count(*)) over(), 2) as pct
FROM test_skewed
GROUP BY category;
-- common:  9,900,000 (99%)
-- rare:       99,000 (0.99%)
-- unicorn:     1,000 (0.01%)

Test Script (Python)

# test_generic_plan.py
import psycopg2
import time

conn = psycopg2.connect("postgresql://localhost/testdb")
cur = conn.cursor()

# Prepared statement
cur.execute("PREPARE test_stmt AS SELECT * FROM test_skewed WHERE category = $1")

results = []

for i in range(20):
    for category in ['unicorn', 'common', 'rare']:
        start = time.time()
        cur.execute("EXECUTE test_stmt(%s)", (category,))
        rows = cur.fetchall()
        elapsed = (time.time() - start) * 1000

        results.append({
            'iteration': i,
            'category': category,
            'rows': len(rows),
            'time_ms': round(elapsed, 2)
        })

        # After 5 iterations switches to generic plan
        if i == 5:
            print(f"--- GENERIC PLAN KICKS IN ---")

# Output
for r in results:
    print(f"Iter {r['iteration']:2d} | {r['category']:8s} | {r['rows']:8d} rows | {r['time_ms']:8.2f}ms")

Results

Iter  0 | unicorn  |     1000 rows |     2.34ms  # custom plan
Iter  1 | unicorn  |     1000 rows |     1.89ms
...
Iter  4 | unicorn  |     1000 rows |     2.01ms
--- GENERIC PLAN KICKS IN ---
Iter  5 | unicorn  |     1000 rows |   890.45ms  # 400× slower!
Iter  6 | unicorn  |     1000 rows |   912.23ms

pgBouncer Complication

Transaction Pooling Mode

# pgbouncer.ini
[pgbouncer]
pool_mode = transaction  # Problem!

In transaction mode:

  1. Each transaction can go to a different connection
  2. Prepared statements are per-connection
  3. Statement prepared on conn A doesn’t exist on conn B
  4. Some drivers “simulate” prepared statements

Java HikariCP + pgBouncer

// Problem: HikariCP uses prepared statements by default
HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:postgresql://pgbouncer:6432/mydb");

// Fix 1: Disable server-side prepared statements
config.addDataSourceProperty("prepareThreshold", "0");

// Fix 2: Or use JDBC parameter
// jdbc:postgresql://host/db?prepareThreshold=0

Go pgx

// pgx uses prepared statements by default
config, _ := pgxpool.ParseConfig(connString)

// Fix: Disable prepared statement cache
config.ConnConfig.DefaultQueryExecMode = pgx.QueryExecModeSimpleProtocol

// Or per-query
rows, err := pool.Query(ctx, sql, pgx.QueryExecModeSimpleProtocol, args...)

Solutions

1. Force Custom Plan (PostgreSQL 12+)

-- Session level
SET plan_cache_mode = 'force_custom_plan';

-- Or per-query (PostgreSQL 16+)
SELECT /*+ CustomPlan */ * FROM orders WHERE status = $1;

2. Increase Threshold Before Generic Plan

-- Default is 5, increase to higher value
-- -1 = never use generic plan
ALTER SYSTEM SET plan_cache_mode = 'force_custom_plan';

3. Partial Indexes for Skewed Data

-- Instead of single index
CREATE INDEX idx_orders_status ON orders(status);

-- Create partial indexes for rare values
CREATE INDEX idx_orders_failed ON orders(id) WHERE status = 'failed';
CREATE INDEX idx_orders_processing ON orders(id) WHERE status = 'processing';

-- Generic plan now uses correct index

4. Application-Level Routing

// For critical queries use explicit SQL
public List<Order> getFailedOrders() {
    // This query always uses optimal plan
    return jdbcTemplate.query(
        "SELECT * FROM orders WHERE status = 'failed'",  // no parameter!
        orderRowMapper
    );
}

// For common queries parameter is OK
public List<Order> getOrdersByStatus(String status) {
    return jdbcTemplate.query(
        "SELECT * FROM orders WHERE status = ?",
        orderRowMapper,
        status
    );
}

Diagnostics

Check if Query Uses Generic Plan

-- pg_stat_statements
SELECT
    query,
    calls,
    mean_exec_time,
    stddev_exec_time,
    -- High stddev = possible generic plan problem
    stddev_exec_time / mean_exec_time as coefficient_of_variation
FROM pg_stat_statements
WHERE query LIKE '%orders%'
ORDER BY coefficient_of_variation DESC;

EXPLAIN with Concrete vs Generic Parameters

-- Custom plan (with values)
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE status = 'failed';

-- Generic plan (without values)
PREPARE test AS SELECT * FROM orders WHERE status = $1;
EXPLAIN (ANALYZE, BUFFERS, GENERIC_PLAN)  -- PostgreSQL 16+
EXECUTE test('failed');

Benchmark: Custom vs Generic

QueryCustom PlanGeneric PlanSlowdown
status=‘failed’ (0.01%)2ms890ms445×
status=‘rare’ (1%)45ms920ms20×
status=‘common’ (99%)1200ms1150ms0.96×

Checklist

## Prepared Statement Diagnostics

### Identification
- [ ] Do you have skewed data distribution?
- [ ] Using pgBouncer in transaction mode?
- [ ] Queries have high stddev in pg_stat_statements?

### Quick Fixes
- [ ] Set plan_cache_mode = 'force_custom_plan' for session
- [ ] Disable prepared statements in connection pool
- [ ] Use partial indexes for rare values

### Long-term Solution
- [ ] Implement application-level query routing
- [ ] Monitor coefficient of variation in pg_stat_statements
- [ ] Regular ANALYZE after bulk operations

Conclusion

This is one of those performance issues that seems impossible until you understand the mechanism. The query is optimal in staging, slow in production, and you can’t find any difference between environments. The answer is that production has been running long enough for prepared statements to hit the 5-execution threshold and switch to generic plans.

The fundamental issue is that PostgreSQL’s generic plan optimization makes the wrong trade-off when data is heavily skewed. It assumes the cost of re-planning each execution outweighs the cost of a suboptimal plan. But with 1000x selectivity differences between parameter values, the suboptimal plan is catastrophically expensive.

The irony is that prepared statements are supposed to make queries faster. And they do, for uniformly distributed data. But for skewed data, they make rare-value queries dramatically slower while barely improving common-value queries.

Key principles:

  1. Generic plan triggers after 5 executions - fresh connections won’t show the problem
  2. Generic plan uses average statistics - terrible for skewed distributions
  3. With 0.01% vs 99% skew - rare value queries can be 100-500x slower
  4. pgBouncer transaction mode adds complexity by breaking prepared statement persistence
  5. Coefficient of variation in pg_stat_statements reveals problematic queries

Solution strategies: force custom plans for critical queries, use partial indexes for rare values, or route queries at the application level to avoid parameters for skewed columns.


Related posts

Cite this article

If you reference this post, please link to the original URL and credit the author.

Michal Drozd. "When Prepared Statements Make PostgreSQL 10× Slower: Generic Plan Trap". https://www.michal-drozd.com/en/blog/postgresql-prepared-statements-trap/ (Published September 15, 2025).