TCP TIME_WAIT Port Exhaustion: When Connection Pooling Isn't Enough

The day we ran out of ports, I finally respected TIME_WAIT. “Suddenly can’t connect to database: cannot assign requested address.” The error made no sense. The database was healthy. The network was fine. Our connection pool was configured correctly; we checked three times. Yet our service was throwing connection errors during a traffic spike, claiming it couldn’t assign a local address for new outbound connections.

The answer was hiding in ss -tan state time-wait: 27,000 sockets. Nearly all of our ephemeral port range was consumed by TCP TIME_WAIT sockets—connections that had closed but were being held open by the kernel for the mandated 60-second safety period. Each socket blocked a source-port combination from being reused. When we ran out of ports, we couldn’t create new connections to the database.

What made this particularly frustrating was that we had a connection pool. We’d done everything right—or so we thought. But somewhere in the codebase, a developer had created a new database client for each request instead of using the shared pool. The pool was configured correctly; it just wasn’t being used. Every request opened a new connection, used it once, and closed it. The connection was gone from the application’s perspective, but the kernel kept the socket in TIME_WAIT for 60 seconds.

TIME_WAIT is a fascinating example of TCP’s conservative design. It exists to prevent old packets from a closed connection from corrupting a new connection that reuses the same 4-tuple (source IP, source port, destination IP, destination port). It’s a safety feature. But at high throughput, it becomes a resource exhaustion vector. The math is simple: if you close 500 connections per second to the same destination, and each connection holds a port for 60 seconds, you need 30,000 ports—more than the default ephemeral range provides.

Environment: High-throughput services, microservices with many outbound connections, connection pool misconfigurations, short-lived HTTP connections

The Problem

The Mysterious Connection Failures

Timeline of port exhaustion:

T+0:00   Service running normally
         Outbound connections to DB, cache, APIs
         Ephemeral port range: 32768-60999 (28,231 ports)

T+0:10   Traffic spike - 1000 req/sec
         Each request makes 3 outbound calls
         3000 connections/sec opened and closed

T+0:30   TIME_WAIT sockets accumulating
         Each stays for 60 seconds
         3000 × 60 = 180,000 sockets needed!

T+0:35   Error: cannot assign requested address
         All ephemeral ports to DB IP:port exhausted
         New connections impossible
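
The timeline can be reduced to a small calculation. The sketch below assumes, as a simplification the timeline does not state, that the three outbound calls per request go to three different destinations, so each destination sees 1,000 new connections per second. The function name is illustrative.

```python
from typing import Optional


def seconds_until_exhaustion(
    conn_per_sec: float, ports: int, time_wait_s: float = 60.0
) -> Optional[float]:
    """Seconds until the ports toward ONE destination run out, or None if sustainable."""
    if conn_per_sec <= 0:
        return None
    # In steady state every closed connection pins a port for time_wait_s seconds.
    steady_state_sockets = conn_per_sec * time_wait_s
    if steady_state_sockets <= ports:
        return None  # ports are released at least as fast as they are consumed
    # Exhaustion happens before the first TIME_WAIT expires, so usage grows linearly.
    return ports / conn_per_sec


if __name__ == "__main__":
    # 1,000 connections/sec toward one destination with the default range
    print(seconds_until_exhaustion(1000, 28231))
    # 400 connections/sec stays under the ceiling
    print(seconds_until_exhaustion(400, 28231))
```

Under these assumptions the range toward a single destination fills in roughly 28 seconds, which is in line with the gap between the spike and the first errors.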

What TIME_WAIT Actually Is

TCP Connection Lifecycle:

Client                              Server
   |                                   |
   |-------- SYN ------------------>  |
   |<------- SYN-ACK --------------- |
   |-------- ACK ------------------>  |
   |           ESTABLISHED             |
   |<======= DATA ==================>  |
   |                                   |
   |-------- FIN ------------------>  |  Client initiates close
   |<------- ACK ------------------- |
   |<------- FIN ------------------- |
   |-------- ACK ------------------>  |
   |                                   |
   |  TIME_WAIT (2 × MSL = 60s)       |  ← Socket unusable!
   |                                   |
   ↓  Finally closed                   |

Why TIME_WAIT exists:
1. Ensure final ACK reaches server
2. Let duplicate packets expire
3. Prevent old packets corrupting new connection

Problem: High-throughput = many TIME_WAITs

Root Cause

The 4-Tuple Problem

TCP socket identified by 4-tuple:
(source_ip, source_port, dest_ip, dest_port)

Your service:    10.0.1.50
Database:        10.0.2.100:5432

Available combinations:
10.0.1.50:32768 → 10.0.2.100:5432
10.0.1.50:32769 → 10.0.2.100:5432
...
10.0.1.50:60999 → 10.0.2.100:5432

Only 28,231 unique 4-tuples possible!

At 500 connections/sec with 60s TIME_WAIT:
500 × 60 = 30,000 sockets needed
30,000 > 28,231 available → EXHAUSTION

Check Your Socket State

# Count sockets by state
ss -tan | awk '{print $1}' | sort | uniq -c | sort -rn

# Output:
# 24567 TIME-WAIT
#  2341 ESTABLISHED
#   234 LISTEN
#    45 FIN-WAIT-2

# TIME_WAIT to specific destination
ss -tan state time-wait | grep "10.0.2.100:5432" | wc -l
# 23456 ← Almost all ports used!

# Check ephemeral port range
cat /proc/sys/net/ipv4/ip_local_port_range
# 32768   60999
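
If `ss` is not available (minimal containers often lack it), the same per-state counts can be read from `/proc/net/tcp`. This is a minimal sketch for Linux and IPv4 only; the function name is illustrative, and the state codes are the kernel's hexadecimal values for that file.

```python
from collections import Counter

# Hex state codes used in the "st" column of /proc/net/tcp
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV", "04": "FIN_WAIT1",
    "05": "FIN_WAIT2", "06": "TIME_WAIT", "07": "CLOSE", "08": "CLOSE_WAIT",
    "09": "LAST_ACK", "0A": "LISTEN", "0B": "CLOSING",
}


def count_states(proc_net_tcp: str) -> Counter:
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields: slot, local_address, rem_address, state, ...
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts
```

A typical call would be `count_states(open("/proc/net/tcp").read())`. IPv6 sockets live in `/proc/net/tcp6` and would need a second pass.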

The Real Math

# Calculate maximum sustainable rate
# Ephemeral ports: 60999 - 32768 = 28,231
# TIME_WAIT duration: 60 seconds
# Max new connections/sec: 28,231 / 60 = 470/sec per destination

# If you have 10 unique destinations:
# Max total: 4,700 new connections/sec

# But if all go to ONE destination (your DB):
# Max: 470 connections/sec sustained

# Reality-check your workload (-H omits the header row so the count is exact):
ss -Htan state time-wait dst 10.0.2.100:5432 | wc -l
# If near 28,231 → you're at the limit
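
The same arithmetic can be driven from the kernel's actual setting rather than the default. A minimal sketch, assuming Linux and the 60-second TIME_WAIT used throughout this article; the function names are illustrative. It counts ports inclusively, so it reports one more port than the subtraction above, which does not change the conclusion.

```python
from typing import Tuple


def parse_port_range(text: str) -> Tuple[int, int]:
    """Parse the contents of /proc/sys/net/ipv4/ip_local_port_range."""
    low, high = text.split()
    return int(low), int(high)


def max_sustainable_rate(low: int, high: int, time_wait_s: float = 60.0) -> float:
    """Upper bound on new connections/sec to ONE destination from ONE source IP."""
    ports = high - low + 1  # both ends of the range are usable
    return ports / time_wait_s


if __name__ == "__main__":
    low, high = parse_port_range("32768\t60999\n")  # the default shown above
    print(int(max_sustainable_rate(low, high)))
```

On a real host the input would come from `open("/proc/sys/net/ipv4/ip_local_port_range").read()`.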

Diagnosis

Identify the Bottleneck

# Find top TIME_WAIT destinations
ss -tan state time-wait | awk '{print $4}' | sort | uniq -c | sort -rn | head
#  23456 10.0.2.100:5432    ← Database
#   3421 10.0.3.50:6379     ← Redis
#   1234 10.0.4.100:80      ← API service

# If one destination dominates → that's your bottleneck

# Check for connection pool bypass
netstat -an | grep "10.0.2.100:5432" | grep -c ESTABLISHED
# Should be stable (pool size), not fluctuating
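
To turn the per-destination counts into a fraction of the port range, the `ss` output can be post-processed. This is a sketch that assumes the column layout `ss` prints when a state filter is used (Recv-Q, Send-Q, local address, peer address) and the default range size from this article; the function names are illustrative.

```python
from collections import Counter
from typing import Dict


def time_wait_by_destination(ss_output: str) -> Counter:
    """Count rows per peer address in `ss -tan state time-wait` output."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Data rows start with the numeric Recv-Q column; the header does not.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        counts[fields[3]] += 1  # fourth column is the peer address:port
    return counts


def utilization(counts: Counter, ports: int = 28231) -> Dict[str, float]:
    """Fraction of the ephemeral range held in TIME_WAIT, per destination."""
    return {dest: n / ports for dest, n in counts.most_common()}
```

A destination whose fraction approaches 1.0 is the one that will fail first, regardless of how many sockets the host has in total.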

Application-Level Diagnosis

# Trace outbound connect() calls made by the running process (Linux; adds overhead)
strace -f -e trace=connect -p <pid>

# Check HikariCP pool stats
curl localhost:8080/actuator/metrics/hikaricp.connections.active
curl localhost:8080/actuator/metrics/hikaricp.connections.idle

# If connections created > pool size
# → Pool is bypassed or misconfigured

The Fix

Option 1: Reuse Connections (Best)

# HikariCP - proper pool sizing
spring:
  datasource:
    hikari:
      maximum-pool-size: 20
      minimum-idle: 10
      connection-timeout: 30000
      idle-timeout: 600000
      max-lifetime: 1800000
      # KEY: Don't create new connections for every query!

// Go - configure connection pool (create db once and share it)
db, err := sql.Open("postgres", connStr)
if err != nil {
    log.Fatal(err)
}
db.SetMaxOpenConns(20)
db.SetMaxIdleConns(10)
db.SetConnMaxLifetime(30 * time.Minute)
db.SetConnMaxIdleTime(10 * time.Minute)

# Python - SQLAlchemy pool
engine = create_engine(
    "postgresql://...",
    pool_size=20,
    max_overflow=10,
    pool_pre_ping=True,
    pool_recycle=1800
)
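
The difference between a configured pool and a pool that is actually used can be shown without a database. The sketch below uses a stand-in `FakeConnection` class and a deliberately tiny `Pool`; both are hypothetical and only count how many connections get opened. A real driver would open a TCP socket at that point.

```python
import queue
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for a DB connection; counts how many were ever opened."""
    opened = 0

    def __init__(self):
        FakeConnection.opened += 1  # a real driver would open a socket here

    def query(self, sql):
        return "ok"

    def close(self):
        pass  # a real socket would enter TIME_WAIT on the closing side


class Pool:
    """Fixed-size pool: connections are borrowed and returned, never closed."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(FakeConnection())

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back for the next request


def handle_request_bypassing_pool():
    conn = FakeConnection()  # the bug: one new connection per request
    try:
        return conn.query("SELECT 1")
    finally:
        conn.close()


def handle_request_with_pool(pool):
    with pool.connection() as conn:
        return conn.query("SELECT 1")
```

Running 1,000 requests through the first handler opens 1,000 connections; running them through the second with `Pool(20)` opens 20, no matter how many requests follow.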

Option 2: HTTP Keep-Alive

// Go HTTP client - reuse connections
client := &http.Client{
    Transport: &http.Transport{
        MaxIdleConns:        100,
        MaxIdleConnsPerHost: 100,
        IdleConnTimeout:     90 * time.Second,
        // Reuse TCP connections for multiple requests
    },
}

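// WRONG: Creating a new Transport per request
// (a bare &http.Client{} shares http.DefaultTransport and does pool)
func badHandler(w http.ResponseWriter, r *http.Request) {
    client := &http.Client{Transport: &http.Transport{}} // New Transport = new connections!
    resp, err := client.Get("http://api/endpoint")
    if err != nil {
        http.Error(w, err.Error(), http.StatusBadGateway)
        return
    }
    defer resp.Body.Close()
}

// CORRECT: Reuse client
var httpClient = &http.Client{...}  // Package-level, configured as above

func goodHandler(w http.ResponseWriter, r *http.Request) {
    resp, err := httpClient.Get("http://api/endpoint")
    if err != nil {
        http.Error(w, err.Error(), http.StatusBadGateway)
        return
    }
    defer resp.Body.Close()
    io.Copy(io.Discard, resp.Body) // Drain the body so the connection can be reused
}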
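
The effect of keep-alive is easy to observe on loopback. The sketch below uses only the Python standard library: a local HTTP/1.1 server records the client source port of every request, and two client functions either reuse one connection or open a new one each time. The handler and function names are illustrative.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

seen_client_ports = []  # source port of each request, as seen by the server


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections open

    def do_GET(self):
        seen_client_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def start_server():
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_with_new_connection_each_time(port, n):
    for _ in range(n):
        conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("GET", "/")
        conn.getresponse().read()
        conn.close()  # the client closes first, so it holds the TIME_WAIT


def fetch_with_one_connection(port, n):
    conn = http.client.HTTPConnection("127.0.0.1", port)
    for _ in range(n):
        conn.request("GET", "/")
        conn.getresponse().read()  # drain the body before the next request
    conn.close()
```

With one shared connection the server sees the same source port for every request; with a new connection per request it sees a different port each time, and each of those ports then sits in TIME_WAIT on the client.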

Option 3: TCP Tuning (Careful!)

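# Expand ephemeral port range
echo "1024 65535" > /proc/sys/net/ipv4/ip_local_port_range
# Now 64,511 ports instead of 28,231
# Caution: outbound connections can now take ports your own services listen on;
# protect those with net.ipv4.ip_local_reserved_ports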

# Enable TIME_WAIT reuse (requires timestamps)
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse
# Allows reusing TIME_WAIT sockets for new outbound connections
# SAFE for client-side connections

# WARNING: tcp_tw_recycle is DANGEROUS and removed in Linux 4.12
# It breaks connections through NAT
# NEVER use tcp_tw_recycle

# Reduce TIME_WAIT duration (not recommended)
# Linux doesn't support changing this directly
# Would require kernel recompilation

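# Kubernetes sysctl tuning
# ip_local_port_range is in Kubernetes' default safe sysctl set.
# tcp_tw_reuse generally is not, so the kubelet has to allow it
# (--allowed-unsafe-sysctls) or the pod will be rejected; check your version.
apiVersion: v1
kind: Pod
spec:
  securityContext:
    sysctls:
    - name: net.ipv4.ip_local_port_range
      value: "1024 65535"
    - name: net.ipv4.tcp_tw_reuse
      value: "1"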
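
Because `tcp_tw_reuse` depends on TCP timestamps, it is worth checking both settings together before relying on it. A minimal sketch, assuming the Linux sysctl semantics where `tcp_tw_reuse` is 0 (off), 1 (on) or 2 (loopback only) and `tcp_timestamps` is 0 when disabled; the function name and the returned strings are illustrative.

```python
def tw_reuse_status(tcp_tw_reuse: str, tcp_timestamps: str) -> str:
    """Interpret the raw contents of the two sysctl files."""
    reuse = int(tcp_tw_reuse.strip())
    timestamps = int(tcp_timestamps.strip())
    if reuse == 0:
        return "disabled"
    if timestamps == 0:
        # Reuse relies on timestamps to tell old segments from new ones.
        return "ineffective: tcp_timestamps is off"
    if reuse == 2:
        return "loopback only"
    return "enabled"


def read_sysctl(name: str) -> str:
    """Read a net.ipv4 sysctl such as 'tcp_tw_reuse' from /proc (Linux only)."""
    with open("/proc/sys/net/ipv4/" + name) as f:
        return f.read()
```

On a host this would be called as `tw_reuse_status(read_sysctl("tcp_tw_reuse"), read_sysctl("tcp_timestamps"))`.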

Option 4: Multiple Source IPs

# If connecting to single destination, add source IPs
# Each source IP gets its own port range

ip addr add 10.0.1.51/24 dev eth0
ip addr add 10.0.1.52/24 dev eth0

# Configure application to rotate source IPs
# Effective port range multiplied by number of IPs

// Go - bind to specific source IP
dialer := &net.Dialer{
    LocalAddr: &net.TCPAddr{
        IP: net.ParseIP("10.0.1.51"),
    },
}

transport := &http.Transport{
    DialContext: dialer.DialContext,
}
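
Rotating across the added source IPs can be sketched as a small round-robin helper. The class name is hypothetical and the addresses are the example ones from this section; `socket.create_connection` accepts a `source_address` tuple, and port 0 lets the kernel pick the ephemeral port.

```python
import itertools
import socket


class SourceAddressRotator:
    """Round-robin over local source IPs so each gets its own port range."""

    def __init__(self, source_ips):
        if not source_ips:
            raise ValueError("at least one source IP is required")
        self._cycle = itertools.cycle(source_ips)

    def next_source(self):
        # Port 0 tells the kernel to choose an ephemeral port for this IP.
        return (next(self._cycle), 0)

    def connect(self, dest, timeout=5.0):
        """Open a TCP connection to dest=(host, port) from the next source IP."""
        return socket.create_connection(
            dest, timeout=timeout, source_address=self.next_source()
        )
```

For example, `SourceAddressRotator(["10.0.1.50", "10.0.1.51", "10.0.1.52"]).connect(("10.0.2.100", 5432))` would spread connections over three port ranges, provided those addresses are configured on the host. A multi-threaded service would want a lock around `next_source`.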

Option 5: SO_LINGER (Last Resort)

// Force immediate socket close (dangerous!)
conn, _ := net.Dial("tcp", "10.0.2.100:5432")
tcpConn := conn.(*net.TCPConn)

// Set linger to 0 = send RST instead of FIN
// Avoids TIME_WAIT but can lose data!
tcpConn.SetLinger(0)
tcpConn.Close()

// WARNING: This can cause:
// - Lost data if send buffer not empty
// - Server receives RST (connection reset)
// - Only use for read-only connections or non-critical
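
What the peer experiences with a zero linger can be reproduced on loopback. This sketch assumes Linux, where `struct linger` is two C ints; the function names are illustrative. The client closes with linger set to zero and the server side reports whether it saw a clean end-of-stream or a reset.

```python
import socket
import struct


def close_with_rst(sock):
    """Abortive close: l_onoff=1, l_linger=0 makes close() send RST, not FIN."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


def observe_abortive_close():
    """Return what the server side sees after the client aborts the connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free loopback port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(5.0)
    close_with_rst(client)
    try:
        data = server_side.recv(1)
        return "eof" if data == b"" else "data"
    except ConnectionResetError:
        return "reset"  # the peer sees an error, not an orderly shutdown
    finally:
        server_side.close()
        listener.close()
```

The server-side read fails with a connection reset instead of returning a clean end-of-stream, which is why this option is only reasonable when the peer can tolerate that.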

Monitoring

groups:
  - name: tcp-exhaustion
    rules:
      - alert: TimeWaitSocketsHigh
        expr: |
          node_sockstat_TCP_tw > 20000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "{{ $value }} sockets in TIME_WAIT"

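      # Rough host-wide proxy: ports are exhausted per destination, not per host.
      # 28231 is the default ephemeral range size; change it if you expanded the range.
      - alert: EphemeralPortsLow
        expr: |
          (node_sockstat_TCP_tw + node_sockstat_TCP_alloc) / 28231 > 0.8
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Ephemeral port exhaustion imminent"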

      - alert: ConnectionPoolBypass
        expr: |
          rate(hikaricp_connections_creation_seconds_count[5m]) > 10
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High connection creation rate - pool may be bypassed"

# Quick monitoring script
watch -n 5 'echo "TIME_WAIT:"; ss -tan state time-wait | wc -l;
echo "ESTABLISHED:"; ss -tan state established | wc -l;
echo "Top destinations:"; ss -tan state time-wait | awk "{print \$4}" | sort | uniq -c | sort -rn | head -5'
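
The `node_sockstat_TCP_tw` metric used in the alerts is read from `/proc/net/sockstat`, and the same file can be checked directly where no exporter is running. A minimal sketch for Linux; the function name is illustrative.

```python
from typing import Dict


def parse_sockstat(text: str) -> Dict[str, Dict[str, int]]:
    """Parse /proc/net/sockstat into {protocol: {counter: value}}."""
    stats: Dict[str, Dict[str, int]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        proto, _, rest = line.partition(":")
        tokens = rest.split()
        # Tokens alternate name, value: "inuse 12 orphan 0 tw 24567 ..."
        stats[proto.strip()] = {
            name: int(value) for name, value in zip(tokens[::2], tokens[1::2])
        }
    return stats
```

`parse_sockstat(open("/proc/net/sockstat").read())["TCP"]["tw"]` gives the host-wide TIME_WAIT count, which can be compared against the ephemeral range size in a cron job or health check.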

Checklist

## TCP TIME_WAIT Port Exhaustion

### Diagnosis
- [ ] Count TIME_WAIT sockets: ss -tan state time-wait | wc -l
- [ ] Identify top destinations by TIME_WAIT count
- [ ] Check ephemeral port range: cat /proc/sys/net/ipv4/ip_local_port_range
- [ ] Calculate max sustainable rate per destination

### Application Fixes (Do First)
- [ ] Verify connection pool is configured
- [ ] Check pool is actually being used (not bypassed)
- [ ] Enable HTTP keep-alive for API calls
- [ ] Reuse HTTP clients across requests

### System Tuning (If Needed)
- [ ] Expand ephemeral port range: 1024-65535
- [ ] Enable tcp_tw_reuse (safe for clients)
- [ ] Consider multiple source IPs for single destination
- [ ] NEVER use tcp_tw_recycle

### Monitoring
- [ ] Alert on TIME_WAIT socket count
- [ ] Alert on connection creation rate
- [ ] Monitor port range utilization

Conclusion

TCP TIME_WAIT is one of those features that’s invisible until it breaks you. It exists for good reasons—preventing packet corruption across connection reuse—but at high throughput, those 60 seconds of socket hold time accumulate into resource exhaustion. The symptom is “cannot assign requested address,” which doesn’t obviously point to TIME_WAIT. You have to know to check socket states.

The fundamental fix is almost always connection reuse. If you’re pooling connections properly, you don’t create and destroy thousands of sockets per second. The pool maintains persistent connections, reuses them for multiple requests, and TIME_WAIT never accumulates. The problem only emerges when pooling is misconfigured or bypassed—and it’s remarkably easy to bypass a pool accidentally in code.

The system tuning options—expanding port range, enabling tcp_tw_reuse—are legitimate but secondary. They raise the ceiling but don’t fix the underlying issue. If your application is creating connections faster than it can reuse ports, expanding the port range just delays the exhaustion. Fix the application first; tune the system if you still need headroom.

Key principles:

- Connection pooling is the fix: reuse connections, don’t create new ones for each request
- TIME_WAIT exists for safety: don’t try to eliminate it, work around it with reuse
- tcp_tw_reuse is safe for client-side connections with timestamps enabled
- tcp_tw_recycle is dangerous: removed from modern kernels because it breaks NAT
- Monitor TIME_WAIT counts: they’re your early warning for connection misuse

Check your TIME_WAIT count now. If it’s in the thousands and climbing, your connection pools might not be doing what you think they’re doing.

