
Redis Memory Fragmentation: When maxmemory Isn't Enough

Redis said 4GB. The kernel said 6GB. The kernel won. That’s how I learned that maxmemory is a data limit, not an RSS limit.

The gap is memory fragmentation, and it is one of the most misunderstood parts of Redis in production. maxmemory controls how much data Redis stores. RSS (Resident Set Size) is how much physical memory the process uses. When allocations and frees are uneven, the allocator (jemalloc) leaves holes that count toward RSS even if Redis thinks it is under its limit.

What makes this particularly insidious is the pattern that causes it: variable-sized values with TTLs. You set a 1KB session with a 5-minute TTL. You set a 10KB cache entry with a 1-minute TTL. They expire at different times, leaving holes scattered across the memory allocator’s arenas. jemalloc groups allocations by size class, so a hole left by a 10KB value cannot be reused for 1KB values, and new allocations that don’t match an existing hole get fresh memory. RSS grows while used_memory stays constant.

The tragic irony is that Redis is often praised for its memory efficiency. And it is—for the data it stores. But the memory allocator’s overhead is separate, and under fragmentation-inducing workloads, that overhead can be 50% or more of your actual data size.

Tested on: Redis 7.2, jemalloc 5.3, Kubernetes with 8GB memory limit

The Hidden Memory

What Redis Reports vs Reality

# Redis INFO memory output
redis-cli INFO memory

used_memory:4294967296          # 4GB - what Redis tracks
used_memory_rss:6442450944      # 6GB - what OS sees!
mem_fragmentation_ratio:1.50    # 50% overhead

# This gap = fragmentation
# OOM killer sees RSS, not used_memory
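To make the arithmetic behind that gap concrete, here is a minimal sketch that parses raw INFO memory text and derives both the ratio and the number of bytes the OS sees beyond what Redis tracks. The helper names parse_info and memory_gap are hypothetical, not part of redis-cli or redis-py.

```python
def parse_info(text: str) -> dict:
    """Parse 'key:value' lines from INFO output, skipping blanks and '# Section' headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def memory_gap(info: dict) -> tuple:
    """Return (rss / used_memory, rss - used_memory in bytes)."""
    used = int(info["used_memory"])
    rss = int(info["used_memory_rss"])
    return rss / used, rss - used


sample = """# Memory
used_memory:4294967296
used_memory_rss:6442450944
"""
ratio, gap = memory_gap(parse_info(sample))
print(f"ratio={ratio:.2f}, gap={gap / 2**30:.1f} GiB")  # ratio=1.50, gap=2.0 GiB
```

The 2 GiB gap is the part that counts against a container limit without showing up in used_memory.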

Why Fragmentation Happens

Memory allocation pattern matters:

Scenario: Variable-sized keys with TTL

1. Allocate 1KB value (jemalloc places it in a slab of the 1KB size class)
2. Allocate 500B value (rounded up to the 512B size class)
3. Value 1 expires, leaving a hole in the 1KB slab
4. New 600B value maps to the 640B size class, so it cannot reuse the 1KB hole or a 512B slot
5. 1KB slab stays resident as long as any other value on it is still live

Over time:
┌─────────────────────────────────────────┐
│ [used][    hole    ][used][  hole  ]    │  ← jemalloc arena
│ [used][used][        hole          ]    │
│ [    hole    ][used][used][  hole  ]    │
└─────────────────────────────────────────┘

RSS = all allocated slabs
used_memory = actual data
Fragmentation = RSS / used_memory
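The scenario above comes down to jemalloc rounding each request up to a size class. The following is an approximate model of that rounding, assuming four classes per power-of-two doubling and a configurable quantum (16 bytes here; builds differ, and Redis adds its own header bytes to each value, so real class boundaries will not line up exactly with value sizes). size_class is a hypothetical helper, not a jemalloc or Redis API.

```python
def size_class(n: int, quantum: int = 16) -> int:
    """Approximate the jemalloc size class a request of n bytes is rounded up to."""
    if n <= 8:
        return 8
    # Largest power of two strictly below n marks the start of its group
    group_base = 1 << ((n - 1).bit_length() - 1)
    # Each group is split into four equally spaced classes, never finer than the quantum
    step = max(quantum, group_base // 4)
    return -(-n // step) * step  # round n up to a multiple of step


freed = size_class(1024)    # hole left by the expired 1KB value
incoming = size_class(600)  # class needed by the new 600B value
print(freed, incoming, "reusable" if freed == incoming else "not reusable")
# 1024 640 not reusable
```

A freed region only helps a later allocation that lands in the same class, which is why a mix of sizes and TTLs leaves so many unusable holes.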

Reproducing the Problem

Test Script

# fragment_redis.py
import redis
import random
import string
import time

r = redis.Redis(host='localhost', port=6379)

def random_value(min_size, max_size):
    size = random.randint(min_size, max_size)
    return ''.join(random.choices(string.ascii_letters, k=size))

# Phase 1: Create variable-sized keys with TTL
print("Phase 1: Creating 1M variable-sized keys...")
for i in range(1_000_000):
    key = f"key:{i}"
    value = random_value(100, 10000)  # 100B to 10KB
    ttl = random.randint(60, 300)      # 1-5 min TTL
    r.setex(key, ttl, value)

    if (i + 1) % 100_000 == 0:  # report after every 100k keys written
        info = r.info('memory')
        ratio = info['mem_fragmentation_ratio']
        print(f"Keys: {i + 1}, Fragmentation: {ratio:.2f}")

# Phase 2: Wait for TTLs and observe fragmentation
print("\nPhase 2: Waiting for expiration...")
for minute in range(1, 11):
    time.sleep(60)
    info = r.info('memory')
    print(f"Minute {minute}: used_memory: {info['used_memory_human']}, "
          f"RSS: {info['used_memory_rss_human']}, "
          f"fragmentation: {info['mem_fragmentation_ratio']:.2f}")

Results

Phase 1: Creating 1M variable-sized keys...
Keys: 100000, Fragmentation: 1.05
Keys: 500000, Fragmentation: 1.12
Keys: 1000000, Fragmentation: 1.18

Phase 2: Waiting for expiration...
Minute 1: used_memory: 850MB, RSS: 1.2GB, fragmentation: 1.41
Minute 3: used_memory: 620MB, RSS: 1.1GB, fragmentation: 1.77
Minute 5: used_memory: 380MB, RSS: 980MB, fragmentation: 2.58  # Critical!

# Data shrunk but RSS barely moved
# jemalloc holds onto fragmented memory

Solutions

1. Active Defragmentation (Redis 4.0+)

# redis.conf
activedefrag yes

# Start defrag only when BOTH conditions hold:
#   at least 100mb of fragmentation waste, and fragmentation >= 10%
active-defrag-ignore-bytes 100mb
active-defrag-threshold-lower 10

# Fragmentation % at which defrag runs at maximum effort (not a stop condition)
active-defrag-threshold-upper 100

# CPU effort, as a percentage (1-100)
active-defrag-cycle-min 1    # Effort used at threshold-lower
active-defrag-cycle-max 25   # Effort used at threshold-upper

# Max set/hash/zset/list fields processed per main dictionary scan step
active-defrag-max-scan-fields 1000
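How these settings interact is easier to see as a function. The sketch below is a simplified model, not Redis's actual code: it assumes defrag stays idle until both the ignore-bytes and threshold-lower conditions are met, then scales CPU effort linearly from cycle-min at threshold-lower to cycle-max at threshold-upper. defrag_cpu_effort is a hypothetical name, and frag_pct stands for the fragmentation percentage Redis derives from allocator statistics.

```python
def defrag_cpu_effort(frag_pct, frag_bytes,
                      ignore_bytes=100 * 1024 * 1024,
                      threshold_lower=10, threshold_upper=100,
                      cycle_min=1, cycle_max=25):
    """Return the approximate CPU % active defrag would target, or 0 if it stays idle."""
    # Both conditions must hold before defrag starts at all
    if frag_bytes < ignore_bytes or frag_pct < threshold_lower:
        return 0.0
    # Linear interpolation between the two thresholds
    fraction = (frag_pct - threshold_lower) / (threshold_upper - threshold_lower)
    effort = cycle_min + fraction * (cycle_max - cycle_min)
    # Clamp so fragmentation above threshold-upper still caps at cycle-max
    return max(cycle_min, min(cycle_max, effort))


MB = 1024 * 1024
for pct in (5, 10, 55, 100, 150):
    print(pct, defrag_cpu_effort(pct, 500 * MB))
```

Under this model, raising active-defrag-cycle-max is the lever for faster compaction, and lowering active-defrag-threshold-upper makes defrag reach that maximum sooner.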

How Active Defrag Works

Before defrag:
┌─────────────────────────────────────────┐
│ [A][  hole  ][B][hole][C][   hole    ]  │  Arena 1
│ [D][hole][E][     hole      ][F]        │  Arena 2
└─────────────────────────────────────────┘

Defrag process:
1. Scan for fragmented values
2. Allocate new memory for value
3. Copy data to new location
4. Update pointer atomically
5. Free old memory

After defrag:
┌─────────────────────────────────────────┐
│ [A][B][C][D][E][F]                      │  Arena 1 (compacted)
│              (returned to OS)           │  Arena 2 (freed)
└─────────────────────────────────────────┘
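The before/after picture can be reproduced with a toy model. This is only an illustration of the compaction idea under simplifying assumptions (fixed slots per slab, a single size class); real active defrag works on live Redis pointers and relies on the allocator to decide which allocations are worth moving. The compact function and its slab representation are hypothetical.

```python
def compact(slabs, slots_per_slab):
    """Move live values out of sparse slabs into free slots of fuller ones.

    slabs: list of lists, each holding the live values on one slab.
    Returns the slabs that still hold data; emptied slabs are dropped,
    standing in for memory that could be returned to the OS.
    """
    # Fullest slabs first: they act as destinations, sparse ones as sources
    work = sorted((list(s) for s in slabs if s), key=len, reverse=True)
    dst, src = 0, len(work) - 1
    while dst < src:
        if len(work[dst]) == slots_per_slab:
            dst += 1                           # destination is full, try the next one
        elif not work[src]:
            src -= 1                           # source is drained, take the next sparsest
        else:
            work[dst].append(work[src].pop())  # copy to new location, free the old slot
    return [s for s in work if s]


before = [["A"], ["B", "C"], ["D"], ["E", "F", "G", "H"]]
after = compact(before, slots_per_slab=4)
print(len(before), "slabs ->", len(after), "slabs")  # 4 slabs -> 2 slabs
```

The data is unchanged; only its placement is denser, which is what lets whole slabs be released.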

2. Memory Allocator Tuning

# jemalloc background thread for memory return
# Set in redis.conf or environment

# Option 1: Enable jemalloc background threads
redis-server --jemalloc-bg-thread yes

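# Option 2: Force memory return to OS
# MEMORY PURGE command (Redis 4.0+, jemalloc only)
# Releases dirty pages; it does not compact live data
redis-cli MEMORY PURGE

# Option 3: Tune jemalloc decay time
# Lower = faster memory return, higher CPU
# Note: a jemalloc built with a symbol prefix reads <PREFIX>MALLOC_CONF instead.
# Redis's bundled jemalloc is built with the je_ prefix, so JE_MALLOC_CONF
# may be the variable to set; verify against your build
export MALLOC_CONF="background_thread:true,dirty_decay_ms:1000,muzzy_decay_ms:1000"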
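To check whether a purge actually returns memory on your workload, a before/after RSS comparison is enough. The sketch below assumes a redis-py style client exposing info() and memory_purge(); purge_and_report is a hypothetical helper. RSS in INFO is sampled periodically by the server, so the second reading may lag the purge slightly.

```python
def purge_and_report(client):
    """Run MEMORY PURGE and return (rss_before, rss_after) in bytes."""
    before = client.info("memory")["used_memory_rss"]
    client.memory_purge()  # issues MEMORY PURGE
    after = client.info("memory")["used_memory_rss"]
    return before, after
```

With redis-py this would be called as purge_and_report(redis.Redis(host='localhost', port=6379)). If the two numbers barely differ, the memory is pinned by live data spread across pages, and purging alone will not help.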

3. Uniform Value Sizes

import json

# Bad: Variable sizes cause fragmentation
r.set("user:1", json.dumps(small_user))      # 200B
r.set("user:2", json.dumps(large_user))      # 50KB

# Better: Pad to power-of-2 sizes
def pad_value(value, target_size=None):
    # Encode first so sizes are measured in bytes, not characters
    data = json.dumps(value).encode("utf-8")
    if target_size is None:
        # Round up to nearest power of 2
        size = len(data)
        target_size = 1 << (size - 1).bit_length()
    # Readers must strip the padding: raw.rstrip(b'\0')
    return data.ljust(target_size, b'\0')

# Or use separate Redis instances for different size classes
# small_redis: values < 1KB
# large_redis: values > 1KB

4. Kubernetes Memory Configuration

# Don't set memory limit = maxmemory!
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: redis
      resources:
        requests:
          memory: "4Gi"
        limits:
          memory: "6Gi"  # 50% headroom for fragmentation!
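      env:
        # Only effective if your image's entrypoint maps this to maxmemory;
        # redis-server itself does not read it
        - name: REDIS_MAXMEMORY
          value: "4gb"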
---
# Redis config
apiVersion: v1
kind: ConfigMap
data:
  redis.conf: |
    maxmemory 4gb
    maxmemory-policy allkeys-lru
    activedefrag yes
    active-defrag-threshold-lower 10
    active-defrag-cycle-max 25
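The headroom rule is simple arithmetic, sketched here so the limit can be derived from maxmemory rather than picked by hand. parse_size and container_limit are hypothetical helpers; the 1.5 default is the fragmentation ratio assumed in this post, and the result leaves out other RSS consumers such as fork-time copy-on-write during persistence.

```python
def parse_size(text: str) -> int:
    """Convert a redis.conf-style size such as '4gb' or '100mb' to bytes."""
    units = {"kb": 1024, "mb": 1024**2, "gb": 1024**3}
    text = text.strip().lower()
    for suffix, factor in units.items():
        if text.endswith(suffix):
            return int(text[:-2]) * factor
    return int(text)  # plain byte count


def container_limit(maxmemory: str, expected_ratio: float = 1.5) -> int:
    """Memory limit in bytes that leaves room for the expected fragmentation ratio."""
    return int(parse_size(maxmemory) * expected_ratio)


print(container_limit("4gb") / 2**30, "GiB")  # 6.0 GiB
```

If your observed mem_fragmentation_ratio runs higher than 1.5, pass that value instead of the default.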

Monitoring

Prometheus Metrics

# Redis exporter metrics (names vary by exporter and version; verify against your /metrics output)
- alert: RedisHighFragmentation
  expr: |
    redis_memory_fragmentation_ratio > 1.5
  for: 30m
  labels:
    severity: warning
  annotations:
    summary: "Redis fragmentation ratio {{ $value }}"
    description: "Consider enabling activedefrag"

- alert: RedisFragmentationCritical
  expr: |
    redis_memory_fragmentation_ratio > 2.0
  for: 10m
  labels:
    severity: critical
  annotations:
    summary: "Redis fragmentation critical: {{ $value }}"
    description: "OOM risk - RSS much higher than used_memory"

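- alert: RedisRSSNearLimit
  # The join assumes both series share an `instance` label; adjust on(...)
  # to whatever labels your exporter and cAdvisor have in common
  expr: |
    redis_memory_used_rss_bytes / on(instance)
    (container_spec_memory_limit_bytes) > 0.85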
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "Redis RSS at {{ $value | humanizePercentage }} of limit"
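For ad-hoc checks outside Prometheus, the same thresholds can be applied to a single INFO reading. This is a rough sketch that mirrors the three rules above but ignores their "for:" durations; classify is a hypothetical helper.

```python
def classify(ratio: float, rss_bytes: int, limit_bytes: int) -> str:
    """Map one memory reading to the severity the alert rules would assign."""
    if ratio > 2.0 or rss_bytes / limit_bytes > 0.85:
        return "critical"
    if ratio > 1.5:
        return "warning"
    return "ok"


GiB = 2**30
print(classify(1.2, 4 * GiB, 8 * GiB))   # ok
print(classify(1.6, 4 * GiB, 8 * GiB))   # warning
print(classify(1.2, 7 * GiB, 8 * GiB))   # critical
```

The last case shows why the RSS rule matters on its own: a modest ratio can still sit dangerously close to the container limit.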

Grafana Dashboard Queries

# Fragmentation ratio over time
redis_memory_fragmentation_ratio

# Memory breakdown
redis_memory_used_bytes
redis_memory_used_rss_bytes
redis_memory_used_peak_bytes

# Active defrag stats
redis_active_defrag_running
redis_active_defrag_hits
redis_active_defrag_misses
redis_active_defrag_key_hits

Debugging Commands

# Check current fragmentation
redis-cli INFO memory | grep -E "(used_memory|fragmentation)"

# Memory doctor (Redis 4.0+)
redis-cli MEMORY DOCTOR

# Illustrative output (paraphrased; the real wording differs):
# "Sam, I have a few reports for you:
#  * Peak memory: 6.2GB, RSS: 8.1GB, Fragmentation: 1.31
#  * High fragmentation: Consider enabling activedefrag"

# Check allocator stats
redis-cli MEMORY MALLOC-STATS
redis-cli MEMORY STATS

# Turn on active defrag at runtime and watch its counters
redis-cli CONFIG SET activedefrag yes
redis-cli INFO stats | grep active_defrag

# Force memory return (careful in production)
redis-cli MEMORY PURGE

Prevention Strategies

Key Design

# Avoid mixing tiny and huge values
# Bad
r.set("config", "true")                    # 4 bytes
r.set("user:session:123", huge_json)       # 100KB

# Better: Use appropriate data structures
r.hset("config", "feature_x", "true")      # Hash for small values
r.set("session:123", compressed_data)      # Compress large values

# Use consistent TTLs per key type
SESSION_TTL = 3600      # 1 hour for all sessions
CACHE_TTL = 300         # 5 min for all cache entries
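The compressed_data value above is left undefined; one possible way to produce it is sketched here using zlib from the standard library. pack and unpack are hypothetical helper names, and the sample session is made up for illustration.

```python
import json
import zlib


def pack(obj) -> bytes:
    """Serialize to JSON and compress; the result can be passed to SET as-is."""
    return zlib.compress(json.dumps(obj).encode("utf-8"))


def unpack(blob: bytes):
    """Reverse of pack(): decompress and parse the JSON."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


session = {"user_id": 123, "roles": ["admin"] * 500}
blob = pack(session)
print(len(json.dumps(session)), "->", len(blob), "bytes")
```

Compression narrows the spread between the largest and smallest values, which helps with fragmentation as well as raw memory use, at the cost of CPU on every read and write.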

Architecture Patterns

Pattern: Size-based sharding

┌─────────────────────────────────────────┐
│               Application               │
└───────────────┬─────────────────────────┘

    ┌───────────┼───────────┐
    ▼           ▼           ▼
┌───────┐  ┌───────┐  ┌───────┐
│ Small │  │ Medium│  │ Large │
│ <1KB  │  │ 1-10KB│  │ >10KB │
│ Redis │  │ Redis │  │ Redis │
└───────┘  └───────┘  └───────┘

Each instance has uniform value sizes = minimal fragmentation
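A minimal sketch of the routing decision, using the boundaries from the diagram. pick_shard and the shard names are hypothetical; in practice each name would map to a separate Redis client.

```python
def pick_shard(value: bytes) -> str:
    """Choose a shard name from the value's size, following the diagram's boundaries."""
    size = len(value)
    if size < 1024:
        return "small"
    if size <= 10 * 1024:
        return "medium"
    return "large"


print(pick_shard(b"x" * 100))     # small
print(pick_shard(b"x" * 4096))    # medium
print(pick_shard(b"x" * 50_000))  # large
```

One design caveat: reads do not know the value's size in advance, so the shard has to be recoverable from the key alone, for example by fixing a shard per key type or by trying shards in order.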

Checklist

## Redis Memory Fragmentation Prevention

### Configuration
- [ ] Enable activedefrag in redis.conf
- [ ] Set threshold-lower to 10%
- [ ] Set cycle-max to 25% (adjust based on CPU budget)
- [ ] Configure maxmemory with 30-50% headroom to container limit

### Monitoring
- [ ] Alert on fragmentation_ratio > 1.5
- [ ] Alert on RSS approaching container limit
- [ ] Dashboard showing used_memory vs RSS

### Key Design
- [ ] Use consistent value sizes where possible
- [ ] Compress large values before storing
- [ ] Use appropriate data structures (hashes for small values)

### Operations
- [ ] Schedule MEMORY PURGE during low traffic (if not using activedefrag)
- [ ] Monitor activedefrag hits/misses
- [ ] Consider size-based sharding for extreme cases

Conclusion

Redis memory management is a perfect example of how abstractions can be misleading. maxmemory sounds like it controls how much memory Redis uses. It doesn’t. It controls how much data Redis stores. The actual memory usage—what the OOM killer sees—includes the data plus all the overhead from the memory allocator’s internal bookkeeping, fragmentation, and reserved-but-unused space.

The core insight is that mem_fragmentation_ratio is the metric that reveals the truth. A ratio of 1.0 means RSS equals used_memory—perfect efficiency, rarely achieved. A ratio of 1.5 means you’re using 50% more memory than your data size. A ratio of 2.0 or higher is critical—you’re approaching OOM territory even though Redis thinks it has plenty of room.

Active defragmentation is the solution Redis provides, but it has costs. Defragmentation consumes CPU while it copies data to compact the memory layout. For latency-sensitive workloads, you might prefer to simply size your container with enough headroom that fragmentation doesn’t cause OOM. The choice depends on whether you’re optimizing for cost (smaller containers with defrag) or latency (larger containers without).

Key principles:

  1. Fragmentation ratio reveals real memory overhead—RSS / used_memory tells you the actual cost
  2. Variable-sized keys with TTL cause the worst fragmentation—different-sized holes left at different times
  3. Active defragmentation compacts memory automatically—enable it unless you’re latency-sensitive
  4. Set container limits 30-50% above maxmemory—leave room for the overhead you can’t avoid
  5. Monitor mem_fragmentation_ratio continuously—it’s your leading indicator of OOM risk

Your next OOM might be hiding in the gap between used_memory and RSS. Check the fragmentation ratio now.


Cite this article

If you reference this post, please link to the original URL and credit the author.

Michal Drozd. "Redis Memory Fragmentation: When maxmemory Isn't Enough". https://www.michal-drozd.com/en/blog/redis-memory-fragmentation/ (Published May 22, 2025).