Back to blog

pg_waldump WAL Forensics: Reconstructing What Happened to Your Data

|
| postgresql, debugging, wal, forensics, data-recovery

When backups are gone, WAL becomes your crime scene. “50,000 rows disappeared from the orders table but there’s no record of who did it.” The product manager was panicked. Customer orders were missing, and the support team was getting flooded with complaints. Nobody on the engineering team admitted to running a DELETE. The application logs showed nothing suspicious. The monitoring looked normal.

This is when I learned about pg_waldump—PostgreSQL’s black box recorder. The Write-Ahead Log (WAL) records every single change to the database: every INSERT, UPDATE, DELETE, and transaction commit. If data changed, WAL has a record of it. And pg_waldump can read those records to tell you exactly what happened, when, and which transaction did it.

That investigation taught me something important: WAL isn’t just for replication and crash recovery. It’s forensic evidence. When something goes wrong and nobody remembers what happened, the WAL files contain the truth. But you need to know how to read them.

Environment: PostgreSQL 14+, access to WAL files (archived or live), forensic investigation of data changes

The Problem

The Missing Data Mystery

The symptoms are always alarming. Data that existed yesterday is gone today:

-- Morning: Everything fine
SELECT count(*) FROM orders WHERE created_at > '2024-01-01';
-- count: 150,000

-- Afternoon: Data gone!
SELECT count(*) FROM orders WHERE created_at > '2024-01-01';
-- count: 100,000

-- Nobody admits to running DELETE
-- Application logs show nothing suspicious
-- Need to find out what happened

50,000 rows vanished. The first instinct is to blame a bug in the application or a bad deployment. But the logs are clean. No DELETE queries appear in any application log. If someone ran a DELETE directly in psql, there’s no record of that either—unless you have query logging enabled.

This is where WAL analysis becomes essential. Even if nobody logged the query, the database itself recorded the operation.

What WAL Contains

The Write-Ahead Log is PostgreSQL’s guarantee of durability. Before any change is visible, it’s first written to WAL. This means WAL contains a complete history of every modification:

Write-Ahead Log (WAL) records EVERY change:

┌─────────────────────────────────────────────────────────────┐
│ Transaction 12345 at 2024-01-15 14:23:45                   │
│ - INSERT into orders (id=1001, amount=50.00)               │
│ - UPDATE orders SET status='paid' WHERE id=1001            │
│ - DELETE from orders WHERE id < 1000                       │
│ COMMIT                                                      │
│                                                             │
│ Transaction 12346 at 2024-01-15 14:23:47                   │
│ - INSERT into users (id=500, name='john')                  │
│ COMMIT                                                      │
└─────────────────────────────────────────────────────────────┘

pg_waldump reads these records and shows you the history!

Each WAL record contains the operation type (INSERT, UPDATE, DELETE), the affected table, the page and offset within the table, and—critically—the transaction ID and commit timestamp. This is the forensic evidence you need.

The key insight is that WAL exists regardless of whether you have query logging enabled. Even if you never configured any logging, WAL was being written. The question is only whether the WAL files are still available (not yet recycled or archived).

Using pg_waldump

Basic Usage

pg_waldump is a command-line tool that decodes WAL files into human-readable format:

# Find WAL files containing the time range
ls -la /var/lib/postgresql/14/main/pg_wal/

# Dump a specific WAL file
pg_waldump /var/lib/postgresql/14/main/pg_wal/0000000100000ABC00000042

# Output:
# rmgr: Heap        len (rec/tot):     54/    54, tx:    12345, lsn: ABC/42001234
#   blkref #0: rel 1663/16384/16385 blk 100
#   DELETE off 5

# Filter by relation (table) OID
pg_waldump --relation=1663/16384/16385 /path/to/wal/files

The output looks cryptic at first, but it’s structured. rmgr: Heap tells you this is a heap (table) operation. tx: 12345 is the transaction ID. DELETE off 5 means a DELETE operation on tuple offset 5 within the page. The rel 1663/16384/16385 is the relation identifier—we’ll decode that next.

Find Table’s Relation ID

To filter WAL records for a specific table, you need its relation identifier:

-- Get the relation identifier for your table
SELECT
    pg_relation_filenode('orders') AS relfilenode,
    (SELECT oid FROM pg_database WHERE datname = current_database()) AS dboid,
    (SELECT dattablespace FROM pg_database WHERE datname = current_database()) AS spcoid;

-- Output: relfilenode=16385, dboid=16384, spcoid=1663
-- Format for pg_waldump: spcoid/dboid/relfilenode = 1663/16384/16385

The format is tablespace_oid/database_oid/relation_filenode. For most tables, the tablespace is 1663 (pg_default). The database OID and relation filenode you get from the query above.

With this identifier, you can filter WAL analysis to just your table, ignoring the noise from other tables:

pg_waldump --relation=1663/16384/16385 /var/lib/postgresql/14/main/pg_wal/*

Filter by Time Range

WAL files are named by their position in the log sequence. To focus on a specific time range:

# Convert timestamp to LSN (Log Sequence Number)
psql -c "SELECT pg_current_wal_lsn();"
# Example: ABC/42123456

# Check LSN at specific time from pg_stat_replication or checkpoint records
# Or use the WAL file naming convention:
# 0000000100000ABC00000042 = timeline 1, segment ABC/42000000

# Dump WAL between LSNs
pg_waldump --start=ABC/40000000 --end=ABC/42FFFFFF /path/to/wal/*

# Filter output for specific operations
pg_waldump /path/to/wal/* | grep DELETE

If you have point-in-time recovery configured, you can correlate LSNs with timestamps from your checkpoint history or recovery testing.

Forensic Analysis

Find Large Deletes

The first step in investigation is identifying which transactions made mass changes:

#!/bin/bash
# find_mass_deletes.sh

WAL_DIR="/var/lib/postgresql/14/main/pg_wal"
TARGET_REL="1663/16384/16385"  # Your table's relation

pg_waldump --relation=$TARGET_REL $WAL_DIR/* 2>/dev/null | \
    grep -E "DELETE|MULTI_INSERT|UPDATE" | \
    awk '{
        if (/tx:/) {
            match($0, /tx: *([0-9]+)/, arr);
            tx=arr[1];
            count[tx]++;
        }
    }
    END {
        for (t in count) if (count[t] > 1000)
            print "Transaction " t " had " count[t] " modifications"
    }'

# Output:
# Transaction 12345 had 50234 modifications  <- Found the culprit!

This script counts how many WAL records each transaction generated for your table. A transaction with 50,000+ DELETEs is almost certainly your culprit. Normal application transactions rarely modify more than a few dozen rows at once.

Correlate with Transaction Time

Once you have the transaction ID, find when it happened:

-- If you have pg_stat_statements or query logging
SELECT
    xact_start,
    query,
    usename,
    client_addr
FROM pg_stat_activity
WHERE xact_start > '2024-01-15 14:00:00'
AND xact_start < '2024-01-15 15:00:00';

-- Or check application connection logs
-- Match transaction ID with application session

Note that pg_stat_activity only shows currently running transactions. For historical data, you need either query logging enabled in PostgreSQL or connection logs from your application.

Detailed Transaction Analysis

With the transaction ID, you can see every operation it performed:

# Dump specific transaction
pg_waldump --xid=12345 /path/to/wal/*

# Output shows every operation in that transaction:
# rmgr: Heap        tx: 12345, lsn: ABC/42001234, DELETE off 5
# rmgr: Heap        tx: 12345, lsn: ABC/42001290, DELETE off 6
# rmgr: Heap        tx: 12345, lsn: ABC/42001346, DELETE off 7
# ... 50,000 more DELETE records
# rmgr: Transaction tx: 12345, lsn: ABC/42999999, COMMIT 2024-01-15 14:23:45

The COMMIT record at the end tells you exactly when the transaction committed. This timestamp is your anchor for correlating with other logs.

Reconstruct Deleted Data

If wal_level is set to ‘replica’ or ‘logical’, WAL contains enough information to potentially reconstruct deleted data:

# For each DELETE, WAL contains the old tuple data (if wal_level = replica or logical)

# Check wal_level
psql -c "SHOW wal_level;"
# Must be 'replica' or 'logical' to have full tuple data

# Dump with full records
pg_waldump --bkp-details /path/to/wal/* | grep -A5 "DELETE"

# Parse tuple data (requires understanding of tuple format)
# Or use tools like pg_filedump on the backup blocks

Reconstructing the actual deleted values is complex and usually requires specialized tools or deep understanding of PostgreSQL’s tuple format. For most investigations, knowing what was deleted and when is sufficient.

Advanced Techniques

Find Who Did It

WAL doesn’t directly record the username that ran a query. You need to correlate through other logs:

# WAL doesn't directly record username, but you can correlate:

# 1. Get transaction commit time
pg_waldump --xid=12345 /path/to/wal/* | grep COMMIT
# COMMIT 2024-01-15 14:23:45.123

# 2. Check pg_log for queries at that time
grep "14:23:45" /var/log/postgresql/postgresql-14-main.log

# 3. If using log_statement = 'all' or pgaudit:
grep "DELETE.*orders" /var/log/postgresql/postgresql-14-main.log
# 2024-01-15 14:23:45 user=admin DELETE FROM orders WHERE created_at < '2024-01-01'

This is why query logging (log_statement) or pgaudit is so important. WAL tells you what happened; query logs tell you who did it. Without query logging, you can only narrow down to “a transaction from this IP address at this time.”

Create WAL Analysis Summary

For systematic investigation, create a script that summarizes WAL activity:

#!/bin/bash
# wal_summary.sh - Analyze WAL activity

WAL_FILE=$1

echo "=== WAL File Analysis: $WAL_FILE ==="

echo -e "\n=== Transaction Summary ==="
pg_waldump $WAL_FILE 2>/dev/null | \
    grep "tx:" | \
    sed 's/.*tx: *\([0-9]*\).*/\1/' | \
    sort | uniq -c | sort -rn | head -20

echo -e "\n=== Operations by Type ==="
pg_waldump $WAL_FILE 2>/dev/null | \
    grep "rmgr:" | \
    awk '{print $2}' | sort | uniq -c | sort -rn

echo -e "\n=== Largest Transactions ==="
pg_waldump $WAL_FILE 2>/dev/null | \
    awk '/tx:/ {
        match($0, /tx: *([0-9]+)/, arr);
        match($0, /len \(rec\/tot\): *[0-9]+\/ *([0-9]+)/, len);
        size[arr[1]] += len[1];
    }
    END {
        for (t in size) print size[t], t
    }' | sort -rn | head -10

This script gives you a quick overview: which transactions were most active, what types of operations occurred, and which transactions generated the most WAL data.

Point-in-Time Recovery Preparation

If you need to recover the deleted data, WAL analysis helps you target the recovery:

-- If you need to recover to a point before the bad operation

-- 1. Find the LSN just before the DELETE
pg_waldump /path/to/wal/* | grep "tx: 12345" | head -1
-- lsn: ABC/41999999

-- 2. Use recovery_target_lsn in recovery
-- postgresql.conf on recovery server:
-- restore_command = 'cp /path/to/archive/%f %p'
-- recovery_target_lsn = 'ABC/41999990'
-- recovery_target_action = 'pause'

-- 3. After recovery, extract the data you need
-- Then resume normal operations

The LSN from pg_waldump tells you exactly where to stop recovery to get a database state just before the damaging operation.

Monitoring for Future

Prevent future incidents by monitoring for suspicious operations:

groups:
  - name: postgresql-wal
    rules:
      - alert: LargeDELETEDetected
        expr: |
          rate(pg_stat_statements_rows{query=~".*DELETE.*"}[5m]) > 1000
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Large DELETE operation detected"

      - alert: UnusualWALGeneration
        expr: |
          rate(pg_wal_lsn_diff(pg_current_wal_lsn(), '0/0')[5m]) > 100000000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Unusual WAL generation rate"

The WAL generation rate alert catches mass operations in progress—before they complete and before data is irretrievably deleted.

Checklist

## pg_waldump WAL Forensics

### Preparation
- [ ] Ensure wal_level = replica or logical
- [ ] Configure WAL archiving for historical access
- [ ] Enable query logging (log_statement)
- [ ] Consider pgaudit for detailed audit trails

### Investigation Steps
- [ ] Identify affected table's relation ID
- [ ] Find WAL files for relevant time range
- [ ] Filter pg_waldump output for the relation
- [ ] Identify transaction with mass changes
- [ ] Correlate transaction time with query logs
- [ ] Identify user/application that ran the query

### Recovery
- [ ] Determine LSN before bad operation
- [ ] Set up PITR to that LSN
- [ ] Extract needed data from recovered instance
- [ ] Apply data to production if appropriate

Conclusion

WAL is your black box recorder. When something goes wrong and nobody remembers what happened, pg_waldump can reconstruct the truth. The challenge is that you need proper infrastructure configured before the incident:

  1. wal_level = replica or logical - lower levels don’t retain enough data for forensics
  2. WAL archiving - so you have access to historical WAL files
  3. Query logging - to correlate transaction IDs with actual queries and users
  4. pgaudit - for comprehensive audit trails including the user who ran each statement

The next time you set up a PostgreSQL database, configure these from day one. The investigation that takes hours with proper logging is impossible without it.

Key tools to remember:

  • pg_waldump - Read and analyze WAL records
  • log_statement - Track queries in PostgreSQL logs
  • pgaudit - Detailed audit logging with user attribution
  • WAL archiving - Historical access to WAL files for forensics

Related posts

Cite this article

If you reference this post, please link to the original URL and credit the author.

Michal Drozd. "pg_waldump WAL Forensics: Reconstructing What Happened to Your Data". https://www.michal-drozd.com/en/blog/postgresql-wal-forensics/ (Published November 24, 2025).