JVM Native Memory in Kubernetes: Why Your Pod Gets OOMKilled with 50% Heap
We kept tuning the heap and kept getting OOMKilled. “Pod OOMKilled at 2GB memory limit, but heap was only 1GB and 50% used.” Where did the other 1GB go?
The JVM uses more than just the heap. Native memory includes Metaspace, thread stacks, NIO direct buffers, JIT-compiled code, and more. The container limit applies to all of it, so a pod can be OOMKilled while the heap is half empty.
Tested on: Java 21, Kubernetes 1.28, Spring Boot 3.2, container limit 2GB
JVM Memory Anatomy
Total Memory = Heap + Non-Heap
Container Memory (2GB limit)
├── Heap (-Xmx)                 ~1GB
│   ├── Young Generation
│   ├── Old Generation
│   └── [Controlled by -Xmx]
│
├── Metaspace                   ~100-300MB
│   ├── Class metadata
│   ├── Method metadata
│   └── [Controlled by -XX:MaxMetaspaceSize]
│
├── Thread Stacks               ~200-500MB
│   ├── Each thread: ~1MB default
│   ├── 200 threads = 200MB
│   └── [Controlled by -Xss]
│
├── Code Cache                  ~50-240MB
│   ├── JIT compiled code
│   └── [Controlled by -XX:ReservedCodeCacheSize]
│
├── Direct Buffers (NIO)        ~variable
│   ├── Off-heap for I/O
│   └── [Controlled by -XX:MaxDirectMemorySize]
│
├── Native Libraries            ~variable
│   ├── JNI allocations
│   └── Native code
│
└── Other                       ~50-100MB
    ├── GC structures
    ├── Symbol tables
    └── Internal structures
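To see which of these regions the JVM reports about itself, the standard `java.lang.management` API can list the non-heap memory pools. The sketch below is a minimal illustration, assuming a HotSpot JVM; the class name `MemoryPools` is made up for this example. On HotSpot the pools typically cover Metaspace and the code cache only, so thread stacks, GC structures, and direct buffers will not appear here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

final class MemoryPools {
    /** Returns committed bytes per non-heap pool as reported by the JVM itself. */
    static Map<String, Long> nonHeapCommitted() {
        Map<String, Long> result = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (pool.getType() == MemoryType.NON_HEAP && usage != null) {
                result.put(pool.getName(), usage.getCommitted());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        nonHeapCommitted().forEach((name, bytes) ->
            System.out.printf("%-40s %8d KB%n", name, bytes / 1024));
    }
}
```

Comparing the sum of these pools with the container's memory usage gives a rough idea of how much memory is invisible to JVM-level metrics.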
Diagnosing the Problem
Enable Native Memory Tracking
# Start JVM with NMT
java -XX:NativeMemoryTracking=summary \
-jar app.jar
# Check memory breakdown
jcmd <pid> VM.native_memory summary
# Or detailed breakdown
jcmd <pid> VM.native_memory detail
NMT Output Example
Native Memory Tracking:
Total: reserved=3145728KB, committed=2097152KB ← Compare committed (not reserved) with the limit; reserved is only address space
- Java Heap (reserved=1048576KB, committed=1048576KB)
(mmap: reserved=1048576KB, committed=1048576KB)
- Class (reserved=312456KB, committed=289345KB)
(classes #25678)
( instance classes #24567, array classes #1111)
- Thread (reserved=215678KB, committed=215678KB)
(thread #210)
(stack: reserved=214567KB, committed=214567KB)
- Code (reserved=253456KB, committed=178934KB)
(mmap: reserved=253456KB, committed=178934KB)
- GC (reserved=89567KB, committed=89567KB)
- Internal (reserved=56789KB, committed=56789KB)
- Symbol (reserved=23456KB, committed=23456KB)
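To compare categories over time, the summary can be reduced to one committed value per category. The following is a rough sketch that parses lines in the format shown above; `NmtSummary` and its method are hypothetical names, and the regular expression assumes the default KB scale.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NmtSummary {
    // Matches lines such as "- Java Heap (reserved=1048576KB, committed=1048576KB)"
    private static final Pattern CATEGORY = Pattern.compile(
        "^-\\s*(.+?)\\s*\\(reserved=(\\d+)KB, committed=(\\d+)KB");

    /** Maps each NMT category name to its committed size in KB. */
    static Map<String, Long> committedKbByCategory(String nmtOutput) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String line : nmtOutput.split("\\R")) {
            Matcher m = CATEGORY.matcher(line.trim());
            if (m.find()) {
                result.put(m.group(1), Long.parseLong(m.group(3)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
            "Total: reserved=3145728KB, committed=2097152KB",
            "- Java Heap (reserved=1048576KB, committed=1048576KB)",
            "- Class (reserved=312456KB, committed=289345KB)",
            "- Thread (reserved=215678KB, committed=215678KB)");
        Map<String, Long> committed = committedKbByCategory(sample);
        long nonHeapKb = committed.entrySet().stream()
            .filter(e -> !e.getKey().equals("Java Heap"))
            .mapToLong(Map.Entry::getValue)
            .sum();
        System.out.println(committed);
        System.out.println("Non-heap committed: " + nonHeapKb + " KB");
    }
}
```

Feeding it the output of two `jcmd` runs taken some minutes apart makes it easy to spot which category is growing.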
Problem: Math Doesn’t Add Up
Container limit: 2GB
Heap (-Xmx): 1GB
Expected overhead: ~500MB
Total expected: ~1.5GB
Actual usage:
Heap: 1GB
Metaspace: 280MB (25k classes)
Threads: 210MB (210 threads)
Code Cache: 180MB
Direct Buffers: 256MB (Netty)
Other: 150MB
Total: 2.08GB → OOMKilled!
Solutions
1. Calculate Container Memory Properly
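# Kubernetes Deployment (pod template excerpt)
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: app
          resources:
            limits:
              memory: "2Gi"
          env:
            # JAVA_OPTS only takes effect if the image's entrypoint passes it to java;
            # the JVM itself reads JAVA_TOOL_OPTIONS.
            - name: JAVA_OPTS
              value: >-
                -Xmx1g
                -Xms1g
                -XX:MaxMetaspaceSize=256m
                -XX:MaxDirectMemorySize=256m
                -Xss512k
                -XX:ReservedCodeCacheSize=128m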
Memory Budget Calculation
Container Limit: 2Gi = 2048MB
Allocations:
Heap (-Xmx1g):                1024MB
Metaspace:                     256MB
Thread stacks (200 × 512KB):   100MB
Code Cache:                    128MB
Direct Memory:                 256MB
GC + Internal:                 150MB
Safety buffer:                 134MB
                              -------
Total:                        2048MB ✓
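The same arithmetic can be kept next to the deployment config as a small check. This is a minimal sketch, not a sizing tool: `MemoryBudget` and its parameters are illustrative names, and it works in binary units (so `-Xmx1g` is 1024 MiB and a `2Gi` limit is 2048 MiB).

```java
final class MemoryBudget {
    /** Sums the per-region budgets (MiB) and returns the headroom left under the limit. */
    static long headroomMiB(long limitMiB, long heapMiB, long metaspaceMiB,
                            int threads, long stackKiB, long codeCacheMiB,
                            long directMiB, long gcAndInternalMiB) {
        long threadStacksMiB = threads * stackKiB / 1024;
        long total = heapMiB + metaspaceMiB + threadStacksMiB
                   + codeCacheMiB + directMiB + gcAndInternalMiB;
        return limitMiB - total;
    }

    public static void main(String[] args) {
        // Values mirror the flags above: 2Gi limit, -Xmx1g, 200 threads at -Xss512k.
        long headroom = headroomMiB(2048, 1024, 256, 200, 512, 128, 256, 150);
        System.out.println("Headroom: " + headroom + " MiB");
        if (headroom < 0) {
            System.out.println("Budget exceeds the container limit");
        }
    }
}
```

A negative result means the configured caps alone can exceed the limit, before any untracked native allocation is counted.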
2. Use Container-Aware JVM Settings
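# Java 10+ (and 8u191+): container detection is on by default
env:
  - name: JAVA_OPTS
    value: >-
      -XX:MaxRAMPercentage=75.0
      -XX:InitialRAMPercentage=75.0
      -XX:MaxMetaspaceSize=256m
      -Xss512k
# This sets the max heap to 75% of the container limit automatically.
# That leaves 25% (512MB of 2GB) for non-heap, less than the ~900MB budgeted above,
# so use 50-60% when Metaspace, threads, and direct buffers are large.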
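To check what a percentage setting means in absolute terms, the expected heap can be computed from the limit and compared with what the running JVM reports. This is a sketch under the assumption of a 2Gi limit; `HeapSizing` is a made-up name, and `Runtime.maxMemory()` may differ slightly from the computed value because of alignment and GC-specific accounting.

```java
final class HeapSizing {
    /** Heap size implied by a container limit and a -XX:MaxRAMPercentage value. */
    static long expectedMaxHeapBytes(long containerLimitBytes, double maxRamPercentage) {
        return (long) (containerLimitBytes * maxRamPercentage / 100.0);
    }

    public static void main(String[] args) {
        long limit = 2L * 1024 * 1024 * 1024; // assumed 2Gi container limit
        long expected = expectedMaxHeapBytes(limit, 75.0);
        long actual = Runtime.getRuntime().maxMemory();
        System.out.printf("Expected heap at 75%% of 2Gi: %d MiB%n", expected >> 20);
        System.out.printf("Max heap of this JVM:        %d MiB%n", actual >> 20);
        System.out.printf("Left for non-heap:           %d MiB%n", (limit - expected) >> 20);
    }
}
```

Running this inside the container at startup is a cheap way to confirm that the flags actually reached the JVM.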
3. Limit Thread Count
# Spring Boot: limit Tomcat threads
# application.yml
server:
  tomcat:
    threads:
      max: 100 # Down from the default of 200
      min-spare: 10
# Or for async workers
# (max-size is only reached once queue-capacity, unbounded by default, fills up)
spring:
  task:
    execution:
      pool:
        max-size: 50
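To judge whether a thread cap is worth the effort, it helps to turn the thread count into a stack budget. The sketch below uses the standard `ThreadMXBean`; `ThreadStackEstimate` is an illustrative name. It counts Java threads only, so treat the result as an estimate of reserved stack space rather than a measurement.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

final class ThreadStackEstimate {
    /** Upper bound on stack memory: every thread reserves one stack of stackKiB. */
    static long reservedStackMiB(int threadCount, long stackKiB) {
        return threadCount * stackKiB / 1024;
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        int live = threads.getThreadCount();
        int peak = threads.getPeakThreadCount();
        System.out.println("Live threads: " + live + ", peak: " + peak);
        System.out.println("Reserved at 1 MiB stacks:   " + reservedStackMiB(peak, 1024) + " MiB");
        System.out.println("Reserved at 512 KiB stacks: " + reservedStackMiB(peak, 512) + " MiB");
    }
}
```

Using the peak rather than the live count gives the more conservative budget.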
4. Control Direct Memory
// Check direct memory usage
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

List<BufferPoolMXBean> pools = ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
for (BufferPoolMXBean pool : pools) {
    System.out.println(pool.getName() + ": " + pool.getMemoryUsed() / 1024 / 1024 + "MB");
}
// Example output: direct: 256MB
# Limit direct memory (if unset, the JDK default is roughly the max heap size)
-XX:MaxDirectMemorySize=256m
# Netty specific
-Dio.netty.maxDirectMemory=0 # Netty defers to the JDK limit (-XX:MaxDirectMemorySize)
Monitoring Native Memory
Prometheus Metrics
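// Micrometer gauge for process resident memory (RSS), read from /proc/self/status.
// RSS covers heap plus native memory; NMT gives the per-category split if enabled.
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.binder.MeterBinder;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import org.springframework.context.annotation.Bean;

@Bean
MeterBinder jvmNativeMemory() {
    return registry -> Gauge.builder("jvm.memory.native.used", () -> parseVmRSS())
        .baseUnit("bytes")
        .register(registry);
}

private long parseVmRSS() {
    try {
        for (String line : Files.readAllLines(Path.of("/proc/self/status"))) {
            if (line.startsWith("VmRSS:")) {
                // Line format: "VmRSS:    123456 kB"
                String[] parts = line.trim().split("\\s+");
                return Long.parseLong(parts[1]) * 1024;
            }
        }
    } catch (IOException | NumberFormatException e) {
        // Not on Linux or unexpected format: fall through and report 0
    }
    return 0;
}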
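VmRSS is the process's own view; the kernel enforces the limit on the cgroup's accounting. A minimal sketch for reading that directly is shown below, assuming cgroup v2 paths (`memory.max`, `memory.current`); the class `CgroupMemory` is a hypothetical helper, and cgroup v1 hosts expose `memory/memory.limit_in_bytes` and `memory/memory.usage_in_bytes` instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalLong;

final class CgroupMemory {
    /** Parses a cgroup memory value; cgroup v2 writes "max" when no limit is set. */
    static OptionalLong parseLimit(String raw) {
        String value = raw.trim();
        if (value.isEmpty() || value.equals("max")) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(value));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    /** Reads a single-value cgroup file, returning empty if it is missing or unreadable. */
    static OptionalLong read(Path file) {
        try {
            return parseLimit(Files.readString(file));
        } catch (IOException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        OptionalLong limit = read(Path.of("/sys/fs/cgroup/memory.max"));
        OptionalLong usage = read(Path.of("/sys/fs/cgroup/memory.current"));
        if (limit.isPresent() && usage.isPresent()) {
            System.out.printf("Container usage: %.1f%% of limit%n",
                100.0 * usage.getAsLong() / limit.getAsLong());
        } else {
            System.out.println("No cgroup v2 memory limit visible from this process");
        }
    }
}
```

Exposing the ratio as a second gauge puts the number the OOM killer acts on next to the JVM's own metrics.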
Container Memory vs JVM Memory
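# Container memory (from cAdvisor, includes page cache)
container_memory_usage_bytes{pod="myapp-xyz"}
# JVM heap used (one series per heap pool)
jvm_memory_used_bytes{area="heap"}
# Non-heap = container - heap; sum both sides so the series match one-to-one on pod
# (assumes the JVM metrics carry a matching pod label)
sum by (pod) (container_memory_usage_bytes{container!=""})
  - sum by (pod) (jvm_memory_used_bytes{area="heap"})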
Grafana Dashboard
{
"panels": [
{
"title": "Memory Breakdown",
"targets": [
{"expr": "jvm_memory_used_bytes{area='heap'}", "legendFormat": "Heap"},
{"expr": "jvm_memory_used_bytes{area='nonheap'}", "legendFormat": "Non-Heap (partial)"},
{"expr": "container_memory_usage_bytes", "legendFormat": "Container Total"}
]
}
]
}
Common Memory Leaks
1. Class Loader Leaks (Metaspace)
# Problem: Dynamic class generation
# Groovy scripts, JAXB, reflection proxies
# Detection
jcmd <pid> VM.native_memory detail | grep -A5 "Class"
# Fix: Limit class generation, reuse classloaders
-XX:MaxMetaspaceSize=256m # Hard limit: fails with OutOfMemoryError: Metaspace instead of an OOMKill
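A class loader leak usually shows up as a loaded-class count that keeps climbing while the unloaded count stays flat. The sketch below reads both from the standard `ClassLoadingMXBean`; `ClassLoadingStats` is an illustrative name, and how often to sample is left to the reader.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

final class ClassLoadingStats {
    /** One-line snapshot of class loading counters for logging or comparison. */
    static String snapshot() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return String.format("loaded=%d totalLoaded=%d unloaded=%d",
            bean.getLoadedClassCount(),
            bean.getTotalLoadedClassCount(),
            bean.getUnloadedClassCount());
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

Logging this snapshot periodically gives a trend without the overhead of NMT detail mode.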
2. Thread Leaks
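// Problem: a cached pool creates a new thread whenever none is idle
ExecutorService unbounded = Executors.newCachedThreadPool();
// Thread count grows without bound under load (idle threads only exit after 60s)
// Detection (shell): the dump has one "java.lang.Thread.State" line per Java thread
jcmd <pid> Thread.print | grep -c "java.lang.Thread.State"
// Fix: Use bounded pools
ExecutorService bounded = Executors.newFixedThreadPool(50);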
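A fixed pool bounds the threads but not the task queue, so a bounded queue with back-pressure is a common companion. The following is a sketch of one such setup using `ThreadPoolExecutor` directly; `BoundedPool`, the sizes, and the choice of `CallerRunsPolicy` are illustrative assumptions, not the article's tested configuration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPool {
    /** Fixed upper bound on threads and queued tasks; overflow runs on the caller. */
    static ThreadPoolExecutor create(int maxThreads, int queueCapacity) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
            maxThreads, maxThreads,
            60, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(queueCapacity),
            new ThreadPoolExecutor.CallerRunsPolicy());
        executor.allowCoreThreadTimeOut(true); // idle threads exit and release their stacks
        return executor;
    }

    /** Submits a burst of short tasks and returns the largest pool size observed. */
    static int runBurst(int maxThreads, int queueCapacity, int tasks) {
        ThreadPoolExecutor executor = create(maxThreads, queueCapacity);
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            executor.execute(done::countDown);
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        executor.shutdown();
        return executor.getLargestPoolSize();
    }

    public static void main(String[] args) {
        System.out.println("Largest pool size: " + runBurst(4, 16, 1000));
    }
}
```

With this policy a saturated pool slows the submitting thread down instead of creating more threads, which keeps the stack budget predictable.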
3. Direct Buffer Leaks
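// Problem: NIO buffers not released
ByteBuffer buffer = ByteBuffer.allocateDirect(1024 * 1024);
// Native memory is freed only when GC collects the ByteBuffer object
// Detection (shell): on JDK 11+ NMT counts direct buffers under "Other"
jcmd <pid> VM.native_memory summary | grep -A2 "Other"
// Fix: Manual cleanup or System.gc() hint
// The cast below uses an internal API (sun.nio.ch.DirectBuffer) that is encapsulated
// since JDK 9 and requires --add-exports for java.base internals
((DirectBuffer) buffer).cleaner().clean();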
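One way to avoid depending on GC timing or internal APIs is to allocate a fixed set of direct buffers once and reuse them. The sketch below is a deliberately small illustration of that idea; `DirectBufferPool` is a hypothetical class, not something from the JDK or Netty, and it does not guard against callers using a buffer after releasing it.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class DirectBufferPool {
    private final BlockingQueue<ByteBuffer> free;

    /** Allocates all buffers up front so direct memory use is fixed at count * bufferSize. */
    DirectBufferPool(int count, int bufferSize) {
        free = new ArrayBlockingQueue<>(count);
        for (int i = 0; i < count; i++) {
            free.add(ByteBuffer.allocateDirect(bufferSize));
        }
    }

    /** Returns a cleared buffer, or null if all buffers are in use. */
    ByteBuffer acquire() {
        ByteBuffer buffer = free.poll();
        if (buffer != null) {
            buffer.clear();
        }
        return buffer;
    }

    /** Hands a buffer back; callers must not touch it afterwards. */
    void release(ByteBuffer buffer) {
        free.offer(buffer);
    }

    public static void main(String[] args) {
        DirectBufferPool pool = new DirectBufferPool(2, 1024 * 1024);
        ByteBuffer first = pool.acquire();
        pool.release(first);
        System.out.println("Buffers available: " + pool.free.size());
    }
}
```

Because nothing is allocated after construction, the pool's footprint can be entered into the memory budget as a constant.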
4. JNI Memory Leaks
// Problem: Native code allocates, never frees
// Common with image processing libraries
// Detection
// NMT does not track malloc calls made by native libraries:
// container RSS keeps growing while the NMT total stays flat
// Fix: Check native library documentation
// Use try-with-resources for native resources
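The try-with-resources advice can be sketched with an `AutoCloseable` wrapper around the native handle. Everything in this example is hypothetical: `NativeImageHandle`, `nativeAlloc`, and `nativeFree` stand in for whatever a real library's JNI binding exposes, and the placeholders do not allocate anything.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical wrapper; nativeAlloc/nativeFree stand in for a real library's JNI calls. */
final class NativeImageHandle implements AutoCloseable {
    private final long address;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    NativeImageHandle(int bytes) {
        this.address = nativeAlloc(bytes);
    }

    boolean isClosed() {
        return closed.get();
    }

    @Override
    public void close() {
        // compareAndSet makes close() idempotent, so a double close cannot double-free.
        if (closed.compareAndSet(false, true)) {
            nativeFree(address);
        }
    }

    // Placeholders: a real binding would declare these as `native` methods.
    private static long nativeAlloc(int bytes) { return 0x1000L; }
    private static void nativeFree(long address) { /* free(address) in native code */ }

    public static void main(String[] args) {
        NativeImageHandle seen;
        try (NativeImageHandle image = new NativeImageHandle(4096)) {
            seen = image;
            System.out.println("Closed inside block: " + image.isClosed());
        }
        System.out.println("Closed after block:  " + seen.isClosed());
    }
}
```

The wrapper ties the native allocation's lifetime to a lexical scope, so an exception in the processing code still releases the memory.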
Production Configuration
Recommended JVM Flags
# For 2GB container
# Heap: 50% of container (-Xmx/-Xms)
# Metaspace, code cache, direct memory: bounded
# Threads: smaller stacks (-Xss)
# Container awareness: -XX:+UseContainerSupport is already the default since JDK 10
# NMT for debugging (slight overhead)
# GC logging to /logs/gc.log
# (comments cannot sit between the continuation lines, so they are listed here)
java \
  -Xmx1g \
  -Xms1g \
  -XX:MaxMetaspaceSize=256m \
  -Xss512k \
  -XX:ReservedCodeCacheSize=128m \
  -XX:MaxDirectMemorySize=256m \
  -XX:+UseContainerSupport \
  -XX:NativeMemoryTracking=summary \
  -Xlog:gc*:file=/logs/gc.log:time \
  -jar app.jar
Alert Rules
groups:
  - name: jvm_memory
    rules:
      - alert: ContainerMemoryNearLimit
        expr: |
          container_memory_usage_bytes / container_spec_memory_limit_bytes > 0.85
        for: 5m
        annotations:
          summary: "Container {{ $labels.pod }} at >85% memory"
      - alert: NonHeapMemoryHigh
        # Sum both sides by pod so the cAdvisor and JVM series match one-to-one
        expr: |
          (
            sum by (pod) (container_memory_usage_bytes{container!=""})
              - sum by (pod) (jvm_memory_used_bytes{area="heap"})
          )
            / sum by (pod) (container_spec_memory_limit_bytes{container!=""}) > 0.4
        for: 10m
        annotations:
          summary: "Non-heap memory >40% of container limit"
Checklist
## JVM Container Memory Sizing
### Calculation
- [ ] Determine container memory limit
- [ ] Set heap to 50-60% of limit
- [ ] Budget Metaspace (check class count)
- [ ] Budget threads (count × stack size)
- [ ] Budget direct memory (Netty, NIO)
- [ ] Leave 10-15% safety buffer
### JVM Flags
- [ ] Set -Xmx and -Xms explicitly
- [ ] Set -XX:MaxMetaspaceSize
- [ ] Set -Xss (reduce from 1MB default)
- [ ] Set -XX:MaxDirectMemorySize
- [ ] Keep -XX:+UseContainerSupport enabled (default since JDK 10)
### Monitoring
- [ ] Track container_memory_usage_bytes
- [ ] Track jvm_memory_used_bytes (heap + non-heap)
- [ ] Enable NMT in staging for debugging
- [ ] Alert on >85% container memory
### Testing
- [ ] Load test with production-like load
- [ ] Monitor memory over 24+ hours
- [ ] Verify no OOMKilled under stress
Conclusion
Running the JVM in containers requires explicit memory management:
- Heap is not total memory - Non-heap can be 500MB+
- Budget all components - Metaspace, threads, direct buffers
- Set limits explicitly - Don’t rely on defaults
- Monitor container memory - Not just JVM heap
Rule of thumb: container limit minus at least a 25% non-heap budget = maximum safe -Xmx. The 2GB example in this post needed closer to 50%.