Back to blog

PostgreSQL 'cached plan must not change result type' During Zero-Downtime Migrations

|
| postgresql, debugging, migrations, jdbc, zero-downtime

Cached plans bit us right after a schema change. “Deploy went fine, then 5 minutes later: ERROR: cached plan must not change result type.” The migration had completed successfully. The new application code was deployed. Everything looked perfect. Then, seemingly at random, connections started failing with this cryptic error. Some requests worked fine; others crashed. The pattern made no sense.

The cause was one of those subtle interactions between PostgreSQL and JDBC that nobody warns you about: server-side prepared statements. When JDBC executes a query enough times, it promotes it to a server-side prepared statement, caching the query plan on the PostgreSQL server. This is great for performance—no more parsing and planning the same query over and over. But when you change the schema, those cached plans become invalid. The plan says “this table has 14 columns” while the table now has 15.

What makes this especially insidious is the delay. The error doesn’t happen immediately after the schema change. It happens minutes or hours later, when a connection that was idle during the migration gets reused. That connection still has the old cached plan, and boom—your application explodes.

Environment: PostgreSQL 12+, JDBC with HikariCP, server-side prepared statements (prepareThreshold), zero-downtime DDL migrations

The Problem

The Delayed Explosion

The timeline of a typical incident looks like this:

Migration timeline:

T+0:00   ALTER TABLE orders ADD COLUMN discount_code VARCHAR(50);
         Migration succeeds, no errors

T+0:30   Rolling deploy starts, new app version uses discount_code

T+2:00   Old app instances still running, cached prepared statements

T+5:00   Connection returns to pool, reused by request
         ERROR: cached plan must not change result type

T+5:01   Exceptions cascade across connection pool
         Application partially down despite "successful" migration

The frustrating part is that everything seems to work initially. The migration succeeds. The first few requests work. Then, seemingly at random, errors start appearing. Not on every request—just some. The pattern correlates with which database connections happen to have cached plans from before the schema change.

The Error

The error message is PostgreSQL telling you that it’s confused:

-- Application runs this query
SELECT * FROM orders WHERE id = $1;

-- PostgreSQL returns
ERROR:  cached plan must not change result type
DETAIL:  Expected type character varying for column 15, but got nothing.

-- Or worse, silent wrong data if columns reordered!

PostgreSQL cached a plan that expected 14 columns. Now the table has 15. The cached plan’s output format doesn’t match reality. Rather than return potentially wrong data, PostgreSQL throws an error.

This is actually the good case. The dangerous case is when columns are reordered or types change in compatible ways. PostgreSQL might not notice the mismatch and return data in the wrong order. Your application reads column 3 thinking it’s the email, but it’s actually the phone number. Data integrity silently corrupts.

Root Cause

Server-Side Prepared Statements

To understand the problem, you need to understand how JDBC interacts with PostgreSQL prepared statements:

How JDBC prepared statements work with PostgreSQL:

First N executions (N = prepareThreshold, default 5):
┌─────────────────────────────────────────────────────────────┐
│ Client: SELECT * FROM orders WHERE id = ?                   │
│ Server: Parse → Plan → Execute → Return                     │
│         (plan created fresh each time)                      │
└─────────────────────────────────────────────────────────────┘

After threshold crossed:
┌─────────────────────────────────────────────────────────────┐
│ Client: PREPARE stmt AS SELECT * FROM orders WHERE id = $1  │
│ Server: Stores parsed statement + cached plan               │
│                                                             │
│ Subsequent calls:                                           │
│ Client: EXECUTE stmt(123)                                   │
│ Server: Uses CACHED plan (no re-parse, no re-plan!)        │
└─────────────────────────────────────────────────────────────┘

Schema change happens:
┌─────────────────────────────────────────────────────────────┐
│ ALTER TABLE orders ADD COLUMN discount_code VARCHAR(50);    │
│                                                             │
│ Cached plan: expects 14 columns                            │
│ Actual table: now has 15 columns                           │
│                                                             │
│ Next EXECUTE → cached plan must not change result type!    │
└─────────────────────────────────────────────────────────────┘

The first five times you execute a query, JDBC sends it as a simple query and PostgreSQL parses and plans it fresh. This is safe—any schema changes are immediately reflected.

After the fifth execution, JDBC decides this query is “hot” and promotes it to a server-side prepared statement. It sends a PREPARE command to PostgreSQL, which parses the query and caches the plan. Future executions just run the cached plan—no parsing, no planning, much faster.

The problem is that the cached plan encodes assumptions about the table structure. When you change the schema, those assumptions become false, but the cached plan doesn’t know. It still expects the old column count, the old column types, the old column order.

Why It’s Delayed

Connection pools make this worse by keeping connections alive between requests:

// Connection pool lifecycle makes this worse

// T+0: Connection C1 prepares statement (5+ uses)
// T+1: C1 returns to pool
// T+2: Schema changes
// T+3: C1 still in pool with stale cached plan
// T+5: C1 checked out for new request → BOOM

// The delay depends on:
// - Connection pool idle timeout
// - Connection max lifetime
// - Traffic patterns
// - When threshold was crossed

A connection that’s been idle in the pool doesn’t know the schema changed. When it gets checked out for a new request, it tries to execute its cached plan against the new schema. Only then does the error appear.

The randomness of which connection gets checked out makes the error seem sporadic. One request works (fresh connection), the next fails (connection with stale plan), the next works (different connection). This makes debugging extremely frustrating until you understand the underlying cause.

Diagnosis

Identify Problematic Queries

You can see which prepared statements exist on a connection:

-- Find prepared statements on current connection
SELECT name, statement, prepare_time
FROM pg_prepared_statements;

-- Check if catalog changed since prepare
SELECT relname, relfilenode
FROM pg_class
WHERE relname = 'orders';

-- After schema change, relfilenode changes
-- But cached plans still reference old structure

The pg_prepared_statements view shows what’s cached on the current connection. If you see statements that were prepared before your schema change, those are your culprits.

Check JDBC Configuration

Understanding your application’s JDBC settings helps predict behavior:

// Look for these settings in your datasource
Properties props = new Properties();
props.getProperty("prepareThreshold");  // Default: 5
props.getProperty("preparedStatementCacheQueries");  // Default: 256
props.getProperty("preparedStatementCacheSizeMiB");  // Default: 5

// If prepareThreshold > 0, you're using server-side prepared statements

The prepareThreshold setting controls how many executions before a statement is promoted to server-side. Lower values mean more caching (and more potential for stale plans). Higher values reduce caching but also reduce the performance benefit.

Find the Offending Queries

SELECT * queries are the most vulnerable because they break on any column change:

-- In pg_stat_statements, look for SELECT * patterns
SELECT query, calls, mean_exec_time
FROM pg_stat_statements
WHERE query LIKE 'SELECT *%'
ORDER BY calls DESC;

-- These are most likely to break on schema changes

Any query that uses SELECT * will break when columns are added, removed, or reordered. Queries that explicitly list columns are safer—they only break if those specific columns change.

The Fix

Option 1: Avoid SELECT * in Application Code

The safest long-term fix is eliminating SELECT * from your codebase:

// DON'T do this - breaks on any column change
@Query("SELECT * FROM orders WHERE id = :id")
Order findById(Long id);

// DO this instead - explicit columns
@Query("SELECT id, customer_id, total, created_at FROM orders WHERE id = :id")
Order findById(Long id);

// Or use column mapping that handles extras gracefully
@Entity
@Table(name = "orders")
public class Order {
    // JPA/Hibernate maps only declared fields
    // Extra columns are ignored
}

With explicit column lists, adding a new column doesn’t affect existing queries. Only queries that need the new column need updating. This makes schema evolution much safer.

JPA/Hibernate entity mapping is also safe—it only fetches columns that map to declared fields. Extra columns in the table are ignored. This is why ORM-based applications often survive schema changes better than raw SQL applications.

Option 2: Connection Pool Recycling on Deploy

If you can’t eliminate SELECT *, ensure the connection pool refreshes after schema changes:

# HikariCP - force connection refresh
spring:
  datasource:
    hikari:
      max-lifetime: 1800000        # 30 minutes max
      idle-timeout: 600000         # 10 minutes idle
      connection-timeout: 30000

      # During migrations, reduce these temporarily:
      # max-lifetime: 60000        # 1 minute - force quick refresh

Shorter max-lifetime values force connections to be recycled more frequently. After a schema change, connections with stale plans are eventually replaced. The trade-off is more connection churn during normal operation.

For more immediate recycling, you can programmatically evict connections:

// Programmatic pool refresh after migration
@EventListener(MigrationCompletedEvent.class)
public void refreshPool() {
    HikariDataSource ds = (HikariDataSource) dataSource;
    HikariPoolMXBean pool = ds.getHikariPoolMXBean();

    // Soft evict all connections (finish current work, then close)
    pool.softEvictConnections();

    log.info("Evicted {} connections after migration",
             pool.getTotalConnections());
}

The softEvictConnections() method marks all connections for eviction after their current work completes. New connections are created without stale plans.

Option 3: Disable Server-Side Prepared Statements

The nuclear option is disabling server-side prepared statements entirely:

# application.properties - nuclear option
spring.datasource.hikari.data-source-properties.prepareThreshold=0

# Or per-connection
spring.datasource.url=jdbc:postgresql://host/db?prepareThreshold=0

With prepareThreshold=0, JDBC never promotes statements to server-side. Every execution parses and plans fresh. This eliminates the cached plan problem entirely at the cost of some performance.

// Trade-off: Slight performance hit for safety
// Each query re-parses on server
// Usually acceptable for OLTP workloads

For most OLTP applications, the performance impact is negligible. Parse time is typically microseconds, dwarfed by actual query execution time. The safety benefit of never having stale plans often outweighs the performance cost.

Option 4: Use DEALLOCATE After Schema Changes

You can explicitly invalidate prepared statements without restarting the application:

-- Run after migration, before traffic resumes
-- Invalidates all prepared statements on current connection
DEALLOCATE ALL;

-- Or target specific statement
DEALLOCATE stmt_name;

The challenge is that this only affects the current connection. You’d need to run it on every connection in the pool:

// In migration script or post-migration hook
@Transactional
public void invalidatePreparedStatements() {
    entityManager.createNativeQuery("DEALLOCATE ALL").executeUpdate();
}

This is tricky to implement correctly because you need to ensure every pooled connection executes the DEALLOCATE. Soft-evicting the pool is usually simpler.

Option 5: Phased Migration Approach

The safest approach is phasing your schema changes:

-- Phase 1: Add column as nullable (safe)
ALTER TABLE orders ADD COLUMN discount_code VARCHAR(50);

-- Phase 2: Deploy new code that handles NULL discount_code
-- Wait for connection pool to cycle (max-lifetime)

-- Phase 3: Backfill data
UPDATE orders SET discount_code = 'NONE' WHERE discount_code IS NULL;

-- Phase 4: Add NOT NULL constraint (if needed)
ALTER TABLE orders ALTER COLUMN discount_code SET NOT NULL;

By adding columns as nullable first, existing code continues working even with stale plans. The SELECT * returns NULL for the new column, which the application ignores. Only after the connection pool has cycled (and new code is deployed) do you make the column required.

Monitoring

Set up alerts to catch cached plan errors before they cascade:

groups:
  - name: postgresql-prepared-statements
    rules:
      - alert: CachedPlanErrors
        expr: |
          rate(postgresql_errors_total{error="cached_plan_must_not_change"}[5m]) > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Cached plan errors after schema change"

      - alert: PreparedStatementCount
        expr: |
          pg_prepared_statements_count > 1000
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High prepared statement count - memory pressure risk"

The first alert catches the actual error. The second catches excessive prepared statement accumulation, which can indicate a leak or misconfiguration.

Checklist

## PostgreSQL Cached Plan Schema Change

### Before Migration
- [ ] Review queries for SELECT * patterns
- [ ] Check prepareThreshold setting
- [ ] Plan connection pool refresh strategy
- [ ] Test migration in staging with traffic

### During Migration
- [ ] Run ALTER TABLE during low traffic
- [ ] Monitor for cached plan errors
- [ ] Be ready to soft-evict connections

### After Migration
- [ ] Verify no cached plan errors in logs
- [ ] Confirm connection pool has cycled
- [ ] Monitor query performance (re-planning overhead)

### Long-term Fixes
- [ ] Eliminate SELECT * from application code
- [ ] Consider prepareThreshold=0 for safer deploys
- [ ] Add migration hooks for DEALLOCATE ALL

Conclusion

Server-side prepared statements are a performance optimization that becomes a liability during schema changes. The cached query plan doesn’t know your table structure changed, and the delay between schema change and error makes debugging difficult.

Key principles to remember:

  1. **Avoid SELECT *** - explicit columns survive schema changes gracefully
  2. Cycle connections after DDL - soft-evict the pool or reduce max-lifetime
  3. Consider prepareThreshold=0 - the safety of never having stale plans often outweighs the performance cost
  4. Phase migrations - add nullable columns first, wait for connection refresh, then tighten constraints

The next time you plan a schema migration, consider not just the migration itself, but the prepared statements that might be cached in your connection pool.


Related posts

Cite this article

If you reference this post, please link to the original URL and credit the author.

Michal Drozd. "PostgreSQL 'cached plan must not change result type' During Zero-Downtime Migrations". https://www.michal-drozd.com/en/blog/postgresql-cached-plan-schema-change/ (Published December 11, 2024).