Back to blog

PostgreSQL TOAST Strategy: Why Your JSON Column Kills Query Performance

|
| postgresql, toast, performance, json, optimization, storage

TOAST was invisible to me until storage exploded. “Our API response is slow even though the query takes 5ms.” We were profiling an API endpoint that was supposed to be fast—a simple query fetching a list of orders. The database query showed 5ms execution time. But the API response took 850ms. Where was the missing 845ms?

The answer was TOAST. Each order had a JSONB column storing the full order payload—line items, shipping details, customer preferences—averaging 50KB per document. When we ran SELECT *, PostgreSQL dutifully fetched every one of those 50KB documents from TOAST storage, decompressed them, and sent them to the application. For 100 orders, that was 5MB of JSONB data being read from disk, decompressed, and transferred—even though our API only needed the order ID and status.

TOAST (The Oversized-Attribute Storage Technique) is PostgreSQL’s solution to a fundamental problem: how do you store values larger than a page (8KB) when pages are your basic unit of storage? The answer is to store large values separately, in a special TOAST table, and keep only a pointer in the main row. This works great for storage efficiency but creates a performance trap: every time you read a row, PostgreSQL must make additional I/O to fetch the TOAST data.

The insidious part is that the query planner time doesn’t reflect TOAST fetches. Your EXPLAIN ANALYZE shows 5ms because that’s how long the index lookup and heap access took. The TOAST fetches happen during result streaming and don’t appear in execution time. Your query looks fast, but your actual data transfer is slow.

Tested on: PostgreSQL 16.1, tables with 100K+ rows, JSONB columns 10-100KB

How TOAST Works

Storage Threshold

Default page size: 8KB
TOAST threshold: ~2KB per column value

If value > 2KB:
1. Compress value (if compressible)
2. If still > 2KB: store in separate TOAST table
3. Main table stores pointer to TOAST data

TOAST Table Structure

-- Main table
CREATE TABLE orders (
    id bigserial PRIMARY KEY,
    customer_id int,
    order_data jsonb  -- Large JSON stored in TOAST
);

-- PostgreSQL automatically creates:
-- pg_toast.pg_toast_<oid> - TOAST data table
-- pg_toast.pg_toast_<oid>_index - TOAST index

-- Check TOAST table
SELECT relname, reltoastrelid::regclass
FROM pg_class
WHERE relname = 'orders';

TOAST Strategies

-- EXTENDED (default for varlena types)
-- Allows compression and out-of-line storage
ALTER TABLE orders ALTER COLUMN order_data SET STORAGE EXTENDED;

-- EXTERNAL
-- No compression, out-of-line storage
-- Faster for pre-compressed data (already compressed JSON)
ALTER TABLE orders ALTER COLUMN order_data SET STORAGE EXTERNAL;

-- MAIN
-- Compression, avoid out-of-line if possible
-- Good for data you always need
ALTER TABLE orders ALTER COLUMN order_data SET STORAGE MAIN;

-- PLAIN
-- No compression, no out-of-line (will error if too large)
-- For data that must fit in main tuple
ALTER TABLE orders ALTER COLUMN small_data SET STORAGE PLAIN;

The Performance Problem

SELECT * with TOAST

-- Each row fetches from TOAST table
SELECT * FROM orders WHERE customer_id = 123;

-- Behind the scenes:
-- 1. Find row in main table: 0.1ms
-- 2. Fetch TOAST data for order_data: 2-10ms per row!
-- 3. Decompress if needed: 0.5ms

-- 100 rows × 5ms = 500ms just for TOAST fetches!

Benchmark: TOAST Impact

-- Create test table
CREATE TABLE orders_test (
    id serial PRIMARY KEY,
    customer_id int,
    status text,
    order_data jsonb  -- 50KB average
);

-- Insert 100K rows with large JSON
INSERT INTO orders_test (customer_id, status, order_data)
SELECT
    random() * 1000,
    'active',
    jsonb_build_object(
        'items', (SELECT jsonb_agg(jsonb_build_object(
            'product_id', i,
            'name', repeat('x', 100),
            'price', random() * 100
        )) FROM generate_series(1, 100) i)
    )
FROM generate_series(1, 100000);

-- Query with TOAST fetch
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders_test WHERE customer_id = 42;
-- Execution Time: 850ms
-- Buffers: shared hit=12000 (lots of TOAST reads)

-- Query without TOAST column
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, customer_id, status FROM orders_test WHERE customer_id = 42;
-- Execution Time: 8ms
-- Buffers: shared hit=120

Solutions

1. Don’t SELECT *

-- BAD: Fetches all TOAST data
SELECT * FROM orders WHERE customer_id = 123;

-- GOOD: Only fetch what you need
SELECT id, customer_id, status, created_at
FROM orders WHERE customer_id = 123;

-- Fetch large column only when needed
SELECT order_data FROM orders WHERE id = 456;

2. Use EXTERNAL for Pre-compressed Data

-- If storing already compressed data (gzipped JSON, images)
ALTER TABLE documents
ALTER COLUMN compressed_data SET STORAGE EXTERNAL;

-- Skips PostgreSQL compression (faster insert/read)
-- Still stores out-of-line if large

3. Consider Separate Table

-- Instead of one wide table
CREATE TABLE orders (
    id bigserial PRIMARY KEY,
    customer_id int,
    status text,
    order_data jsonb  -- Large!
);

-- Split into two tables
CREATE TABLE orders (
    id bigserial PRIMARY KEY,
    customer_id int,
    status text
);

CREATE TABLE order_details (
    order_id bigint PRIMARY KEY REFERENCES orders(id),
    order_data jsonb
);

-- Common queries don't touch large data
SELECT id, customer_id, status FROM orders WHERE customer_id = 123;

-- Large data only when needed
SELECT o.*, od.order_data
FROM orders o
JOIN order_details od ON od.order_id = o.id
WHERE o.id = 456;

4. Use JSONB Partial Indexing

-- Index specific JSON paths
CREATE INDEX idx_orders_status ON orders ((order_data->>'status'));

-- Query uses index, doesn't need full TOAST fetch
SELECT id FROM orders WHERE order_data->>'status' = 'shipped';

5. Normalize Hot Fields

-- Instead of querying JSON for common filters
SELECT * FROM orders WHERE order_data->>'status' = 'active';

-- Add regular column for hot fields
ALTER TABLE orders ADD COLUMN status text;
UPDATE orders SET status = order_data->>'status';
CREATE INDEX idx_orders_status ON orders(status);

-- Now queries are fast
SELECT id, status FROM orders WHERE status = 'active';

Monitoring TOAST

Check TOAST Size

-- TOAST table size per table
SELECT
    c.relname AS table_name,
    pg_size_pretty(pg_relation_size(c.oid)) AS table_size,
    pg_size_pretty(pg_relation_size(c.reltoastrelid)) AS toast_size,
    pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
WHERE c.relkind = 'r'
AND c.reltoastrelid != 0
ORDER BY pg_relation_size(c.reltoastrelid) DESC
LIMIT 10;

TOAST Statistics

-- TOAST access statistics
SELECT
    schemaname,
    relname,
    seq_scan,
    seq_tup_read,
    idx_scan,
    n_tup_ins,
    n_tup_upd
FROM pg_stat_all_tables
WHERE relname LIKE 'pg_toast_%'
ORDER BY seq_tup_read DESC
LIMIT 10;

Column Storage Strategy

-- Check current storage strategy
SELECT
    a.attname,
    t.typname,
    CASE a.attstorage
        WHEN 'p' THEN 'PLAIN'
        WHEN 'e' THEN 'EXTERNAL'
        WHEN 'm' THEN 'MAIN'
        WHEN 'x' THEN 'EXTENDED'
    END AS storage
FROM pg_attribute a
JOIN pg_type t ON a.atttypid = t.oid
WHERE a.attrelid = 'orders'::regclass
AND a.attnum > 0;

TOAST and Indexes

Index-Only Scans Blocked

-- This won't use index-only scan if order_data is TOAST'd
CREATE INDEX idx_orders_customer ON orders(customer_id) INCLUDE (order_data);

-- PostgreSQL can't return TOAST'd data from index alone
-- Must visit heap for detoasting

Covering Index for Small Columns

-- Works for non-TOAST columns
CREATE INDEX idx_orders_customer_covering
ON orders(customer_id)
INCLUDE (status, created_at);  -- Small columns only

-- Can use index-only scan
EXPLAIN SELECT status, created_at FROM orders WHERE customer_id = 123;
-- -> Index Only Scan

TOAST Compression

Check Compression Ratio

-- Actual vs stored size
SELECT
    id,
    pg_column_size(order_data) AS stored_size,
    length(order_data::text) AS raw_size,
    round(100.0 * pg_column_size(order_data) / length(order_data::text), 1) AS compression_pct
FROM orders
LIMIT 10;

When to Disable Compression

-- Already compressed data (images, pre-compressed JSON)
ALTER TABLE media ALTER COLUMN image_data SET STORAGE EXTERNAL;

-- Random/encrypted data (won't compress anyway)
ALTER TABLE secrets ALTER COLUMN encrypted_data SET STORAGE EXTERNAL;

Checklist

## TOAST Optimization

### Analysis
- [ ] Identify tables with large TOAST sizes
- [ ] Find queries doing SELECT * on TOAST tables
- [ ] Check TOAST access frequency in pg_stat_all_tables

### Query Optimization
- [ ] Replace SELECT * with specific columns
- [ ] Only fetch large columns when needed
- [ ] Add regular columns for hot JSON fields

### Storage Strategy
- [ ] EXTENDED: compressible data (default)
- [ ] EXTERNAL: pre-compressed data
- [ ] MAIN: always-needed medium data
- [ ] Consider table splitting for very large columns

### Monitoring
- [ ] Track TOAST table sizes
- [ ] Monitor TOAST sequential scans
- [ ] Alert on excessive TOAST growth

Conclusion

TOAST is one of those PostgreSQL internals that works perfectly until it doesn’t. For tables with small rows, you never think about it—large values are transparently stored out-of-line and transparently fetched when needed. But as your JSONB columns grow, that transparency becomes a performance liability. The query looks fast because the planner doesn’t account for TOAST fetches; the actual response is slow because of all the hidden I/O.

The fundamental fix is simple: don’t fetch data you don’t need. Replace SELECT * with explicit column lists that exclude large TOAST’d columns unless you actually need them. This single change can be the difference between a 5ms response and an 850ms response—a 170× improvement for zero application logic changes.

When you do need the large data, consider architectural changes: split the large columns into a separate table that’s joined only when needed, extract hot fields from JSON into regular columns, or use EXTERNAL storage for pre-compressed data to skip PostgreSQL’s compression overhead.

Key principles:

  1. **Never SELECT *** on tables with large columns—TOAST fetches are hidden but costly
  2. Use EXTERNAL storage for pre-compressed data to skip redundant compression
  3. Split tables when large columns are rarely accessed in the hot path
  4. Normalize hot fields out of JSON so you can query efficiently without TOAST
  5. Monitor TOAST table sizes—they reveal where your hidden I/O costs live

Your 50KB JSON column in a listing query is your problem. The query planner won’t tell you; only the end-to-end response time will.


Related posts

Cite this article

If you reference this post, please link to the original URL and credit the author.

Michal Drozd. "PostgreSQL TOAST Strategy: Why Your JSON Column Kills Query Performance". https://www.michal-drozd.com/en/blog/postgresql-toast-optimization/ (Published June 24, 2025).