Back to blog

Transactional Outbox: Solving the Dual Write Problem Without 2PC

The first outbox I built felt like overkill, until it saved a deploy. “Save the order and send an email.” One sentence that caused me three months of debugging. Customers complained they didn’t receive order confirmation, but the order existed in the system. Or the opposite - they got the email, but the order was nowhere to be found.

The problem? Dual Write - two independent systems (database + email/message queue) without a shared transaction. The solution? Transactional Outbox pattern.

Tested on: PostgreSQL 14+, Node.js 20+, TypeScript 5.x. Production-proven on systems handling 50k+ messages daily.

What is the Dual Write Problem

Look at this code:

async function createOrder(order: Order): Promise<void> {
  // Step 1: Save to DB
  await db.orders.insert(order);

  // Step 2: Send event to Kafka/RabbitMQ
  await messageQueue.publish('order.created', order);

  // Step 3: Send notification
  await emailService.send(order.customerEmail, 'Order confirmed');
}

What can go wrong?

  1. DB OK, Kafka FAIL - order exists, but no other system knows about it
  2. DB OK, Kafka OK, Email FAIL - customer doesn’t receive confirmation
  3. DB timeout after Kafka - message sent, but transaction rolled back

Race Condition in Practice

Even worse - even when everything succeeds:

// Request 1 (slow network)
await db.orders.insert(order);  // t=0ms
// ... network delay ...
await kafka.publish(event);      // t=500ms

// Request 2 (fast)
await db.orders.update(order);  // t=100ms
await kafka.publish(updateEvent); // t=150ms

// Kafka receives: updateEvent (t=150ms), createEvent (t=500ms)
// Consumer sees UPDATE before CREATE!

Why 2PC is Not the Answer

Two-Phase Commit (2PC) theoretically solves distributed transactions:

Coordinator: "Prepare?"
  DB: "Ready"
  Kafka: "Ready"
Coordinator: "Commit!"
  DB: "Committed"
  Kafka: "Committed"

Problems:

  1. Kafka doesn’t support 2PC - most message brokers don’t
  2. Performance - blocking on the slowest system
  3. Availability - if one system goes down, everything stops
  4. Complexity - coordinator failures, timeouts, recovery

Transactional Outbox Pattern

Instead of writing to two systems, write ONLY to the database:

┌─────────────────────────────────────┐
│           PostgreSQL                │
│  ┌─────────────┐  ┌──────────────┐  │
│  │   orders    │  │   outbox     │  │
│  │             │  │              │  │
│  │ id, data... │  │ id, payload  │  │
│  │             │  │ processed_at │  │
│  └─────────────┘  └──────────────┘  │
└─────────────────────────────────────┘

         ▼ (async worker)
    ┌─────────┐     ┌─────────┐
    │  Kafka  │     │  Email  │
    └─────────┘     └─────────┘

Principle:

  1. In a single DB transaction, save order + outbox event
  2. Separate worker reads outbox and sends to external systems
  3. Worker marks event as processed

TypeScript Implementation

Outbox Table

CREATE TABLE outbox (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    aggregate_type VARCHAR(100) NOT NULL,
    aggregate_id VARCHAR(100) NOT NULL,
    event_type VARCHAR(100) NOT NULL,
    payload JSONB NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    processed_at TIMESTAMP WITH TIME ZONE,
    retry_count INT DEFAULT 0,
    last_error TEXT
);

-- Index for worker
CREATE INDEX idx_outbox_unprocessed
ON outbox(created_at)
WHERE processed_at IS NULL;

-- Index for cleanup
CREATE INDEX idx_outbox_processed
ON outbox(processed_at)
WHERE processed_at IS NOT NULL;

Business Logic with Outbox

interface OutboxEvent {
  aggregateType: string;
  aggregateId: string;
  eventType: string;
  payload: unknown;
}

class OrderService {
  constructor(
    private db: Pool,
    private outbox: OutboxRepository
  ) {}

  async createOrder(order: CreateOrderInput): Promise<Order> {
    return await this.db.transaction(async (tx) => {
      // 1. Save order
      const savedOrder = await tx.orders.insert(order);

      // 2. Save event to outbox - in the same transaction!
      await this.outbox.insertInTransaction(tx, {
        aggregateType: 'Order',
        aggregateId: savedOrder.id,
        eventType: 'order.created',
        payload: {
          orderId: savedOrder.id,
          customerId: savedOrder.customerId,
          items: savedOrder.items,
          total: savedOrder.total,
        }
      });

      // 3. Another event for email
      await this.outbox.insertInTransaction(tx, {
        aggregateType: 'Order',
        aggregateId: savedOrder.id,
        eventType: 'notification.send',
        payload: {
          type: 'email',
          to: order.customerEmail,
          template: 'order-confirmation',
          data: { orderId: savedOrder.id }
        }
      });

      return savedOrder;
    });
  }
}

Outbox Worker with LISTEN/NOTIFY

PostgreSQL LISTEN/NOTIFY enables real-time notifications without polling:

-- Trigger for new outbox events
CREATE OR REPLACE FUNCTION notify_outbox_insert()
RETURNS TRIGGER AS $$
BEGIN
    PERFORM pg_notify('outbox_events', NEW.id::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tr_outbox_notify
    AFTER INSERT ON outbox
    FOR EACH ROW
    EXECUTE FUNCTION notify_outbox_insert();

Worker in TypeScript:

import { Pool, Client } from 'pg';

class OutboxWorker {
  private listenerClient: Client;
  private processingPool: Pool;
  private isRunning = false;

  constructor(
    private handlers: Map<string, EventHandler>,
    connectionString: string
  ) {
    this.listenerClient = new Client(connectionString);
    this.processingPool = new Pool({
      connectionString,
      max: 5  // Parallel processing
    });
  }

  async start(): Promise<void> {
    this.isRunning = true;

    await this.listenerClient.connect();
    await this.listenerClient.query('LISTEN outbox_events');

    this.listenerClient.on('notification', async (msg) => {
      if (msg.channel === 'outbox_events' && msg.payload) {
        await this.processEvent(msg.payload);
      }
    });

    // Process existing unprocessed events on startup
    await this.processBacklog();

    console.log('Outbox worker started');
  }

  private async processEvent(eventId: string): Promise<void> {
    const client = await this.processingPool.connect();

    try {
      // Lock row for exclusive processing
      const result = await client.query(`
        SELECT * FROM outbox
        WHERE id = $1
        AND processed_at IS NULL
        FOR UPDATE SKIP LOCKED
      `, [eventId]);

      if (result.rows.length === 0) {
        return; // Already processed by another worker
      }

      const event = result.rows[0];
      const handler = this.handlers.get(event.event_type);

      if (!handler) {
        console.error(`No handler for event type: ${event.event_type}`);
        return;
      }

      try {
        await handler.handle(event.payload);

        // Mark as processed
        await client.query(`
          UPDATE outbox
          SET processed_at = NOW()
          WHERE id = $1
        `, [eventId]);

      } catch (error) {
        // Retry logic
        await client.query(`
          UPDATE outbox
          SET retry_count = retry_count + 1,
              last_error = $2
          WHERE id = $1
        `, [eventId, error.message]);

        // Dead letter after X attempts
        if (event.retry_count >= 5) {
          await this.moveToDeadLetter(client, event);
        }
      }
    } finally {
      client.release();
    }
  }

  private async processBacklog(): Promise<void> {
    const result = await this.processingPool.query(`
      SELECT id FROM outbox
      WHERE processed_at IS NULL
      ORDER BY created_at
      LIMIT 100
    `);

    for (const row of result.rows) {
      await this.processEvent(row.id);
    }
  }

  async stop(): Promise<void> {
    this.isRunning = false;
    await this.listenerClient.query('UNLISTEN outbox_events');
    await this.listenerClient.end();
    await this.processingPool.end();
  }
}

Event Handlers

interface EventHandler {
  handle(payload: unknown): Promise<void>;
}

class KafkaEventHandler implements EventHandler {
  constructor(private kafka: KafkaProducer) {}

  async handle(payload: OrderCreatedPayload): Promise<void> {
    await this.kafka.send({
      topic: 'orders',
      messages: [{ value: JSON.stringify(payload) }]
    });
  }
}

class EmailEventHandler implements EventHandler {
  constructor(private emailService: EmailService) {}

  async handle(payload: NotificationPayload): Promise<void> {
    await this.emailService.sendTemplate(
      payload.to,
      payload.template,
      payload.data
    );
  }
}

// Register handlers
const handlers = new Map<string, EventHandler>();
handlers.set('order.created', new KafkaEventHandler(kafka));
handlers.set('notification.send', new EmailEventHandler(emailService));

const worker = new OutboxWorker(handlers, DATABASE_URL);
await worker.start();

Guaranteed Event Ordering

Outbox pattern guarantees ordering for a single aggregate:

// In transaction - ordering guaranteed
await this.outbox.insert(tx, { eventType: 'order.created', ... });
await this.outbox.insert(tx, { eventType: 'order.confirmed', ... });
await this.outbox.insert(tx, { eventType: 'order.shipped', ... });

Worker processes in created_at order:

SELECT * FROM outbox
WHERE aggregate_id = $1
AND processed_at IS NULL
ORDER BY created_at
FOR UPDATE SKIP LOCKED

Idempotency and At-Least-Once Delivery

Outbox guarantees at-least-once delivery (event will be delivered at least once). Consumers must be idempotent:

class OrderCreatedConsumer {
  async handle(event: OrderCreatedEvent): Promise<void> {
    // Idempotency check
    const existing = await this.db.processedEvents.findOne({
      eventId: event.id
    });

    if (existing) {
      console.log(`Event ${event.id} already processed, skipping`);
      return;
    }

    // Process event
    await this.db.transaction(async (tx) => {
      await this.processOrder(tx, event);

      // Mark as processed
      await tx.processedEvents.insert({ eventId: event.id });
    });
  }
}

Production Checklist

## Outbox Pattern Checklist

### Database
- [ ] Outbox table with worker indexes
- [ ] LISTEN/NOTIFY trigger
- [ ] Monitoring outbox table size
- [ ] Cleanup job for old processed events

### Worker
- [ ] Graceful shutdown (completes in-progress work)
- [ ] Health check endpoint
- [ ] Retry with exponential backoff
- [ ] Dead letter queue for failed events
- [ ] Metrics: processing time, queue depth, error rate

### Monitoring
- [ ] Alert: outbox queue > N unprocessed
- [ ] Alert: event processing time > X ms
- [ ] Alert: retry_count > threshold
- [ ] Dashboard: throughput, latency, errors

### High Availability
- [ ] Multiple worker instances (SKIP LOCKED)
- [ ] Worker restart policy (K8s, systemd)
- [ ] Database connection pool sizing

When Not to Use Outbox

  1. You don’t need guaranteed delivery - fire-and-forget is OK
  2. Single service - you don’t need distributed transactions
  3. You have an event store - Event Sourcing already solves this
  4. Real-time requirements < 100ms - LISTEN/NOTIFY has latency

Alternatives

Change Data Capture (CDC)

Debezium reads WAL log and generates events:

PostgreSQL WAL → Debezium → Kafka

Pros: No code changes, no outbox table Cons: Additional infrastructure, events are DB-level (not domain events)

Transaction Log Tailing

Similar to CDC, but custom implementation.

Conclusion

Transactional Outbox is a battle-tested pattern for reliable messaging. Key principles:

  1. Single write - everything to database in one transaction
  2. Async processing - worker reads and sends
  3. Idempotent consumers - at-least-once delivery requires deduplication
  4. Monitoring - queue depth is your best problem indicator

For Node.js projects, I recommend PostgreSQL with LISTEN/NOTIFY - minimal latency without polling.

FAQ

What if the worker crashes during processing?

Event remains processed_at = NULL, worker will process it again after restart. That’s why consumers must be idempotent.

How to scale workers?

FOR UPDATE SKIP LOCKED enables parallel processing. Each worker locks and processes different events. Watch out for ordering - if you need strict ordering, use one worker per aggregate.

How many events to keep in outbox?

Delete processed events after 7-30 days (depending on debugging needs). Unprocessed should not exceed hundreds - if there are thousands, you have a problem.

Why not Kafka Transactions?

Kafka Transactions solve exactly-once within Kafka, not between DB and Kafka. Outbox pattern solves DB → anywhere.


Related posts

Cite this article

If you reference this post, please link to the original URL and credit the author.

Michal Drozd. "Transactional Outbox: Solving the Dual Write Problem Without 2PC". https://www.michal-drozd.com/en/blog/transactional-outbox/ (Published March 27, 2025).