Java Profiling in Hardened Kubernetes: When Security Blocks Your Debugger
Profiling in hardened clusters feels like debugging through a keyhole. The thought that kept coming back was: “I need to profile this production issue but nothing works.” We had a memory leak in production, clear from the metrics and obvious in its symptoms, but every profiling tool I tried failed. async-profiler couldn’t open perf events. JFR native profiling was blocked. Even attaching jdb to grab thread state was rejected.
The problem was our security-hardened Kubernetes environment. seccomp blocked perf_event_open and ptrace syscalls. Containers ran with all capabilities dropped. Filesystems were read-only. This is exactly what security best practices recommend, and it made debugging nearly impossible.
This experience taught me that observability and security are often in tension, and you need to plan for debugging before you deploy. The tools that work in a development environment—attach a profiler, dump threads, run a debugger—may be completely blocked in production. You need alternative approaches that work within security constraints.
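The good news is that Java has built-in profiling that doesn’t need elevated privileges. JFR (Java Flight Recorder) samples inside the JVM, so it needs neither perf_event_open nor ptrace. Thread dumps are produced by the JVM itself. JMX remote access provides real-time monitoring. The catch is that jcmd attaches through a socket file under /tmp and recordings need somewhere writable to land, so all of this has to be configured at deployment time, not during an incident.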
Environment: Java 17+, Kubernetes with Pod Security Standards, seccomp profiles, read-only root filesystem
The Problem
Everything Is Blocked
# Attempt 1: async-profiler
java -agentpath:/profiler/libasyncProfiler.so ...
# Error: perf_event_open failed: Operation not permitted
# Attempt 2: JFR with the profile settings
jcmd <pid> JFR.start settings=profile
# Warning: Native profiling not available (requires elevated privileges)
# Attempt 3: Attach debugger
jdb -attach 5005
# Error: ptrace operation not permitted
# Attempt 4: eBPF-based profiling
./profile -p <pid>
# Error: bpf() syscall blocked by seccomp
# What's blocking everything?
The Security Layers
# Layer 1: seccomp profile blocks syscalls
apiVersion: v1
kind: Pod
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault  # Typically blocks perf_event_open and bpf; ptrace handling varies by runtime and kernel
  # Layer 2: Capabilities dropped
  containers:
    - name: app
      securityContext:
        capabilities:
          drop: ["ALL"]
          # CAP_SYS_PTRACE, CAP_PERFMON, CAP_BPF all dropped
        # Layer 3: Read-only filesystem
        readOnlyRootFilesystem: true
        # Can't write profiler output or temp files
        # Layer 4: Non-root user
        runAsNonRoot: true
        runAsUser: 1000
        # Many profilers assume root access
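To see which of these layers is actually in effect from inside the container, the process can read its own /proc/self/status. The following is a minimal sketch, not part of the original setup: the class name `HardeningProbe` and its output keys are made up for illustration, and it only looks at the `Seccomp` and `CapEff` fields (capability bit numbers are taken from linux/capability.h).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: reports the seccomp mode and a few effective capabilities.
public class HardeningProbe {
    // Capability bit numbers from linux/capability.h
    static final int CAP_SYS_PTRACE = 19;
    static final int CAP_SYS_ADMIN = 21;
    static final int CAP_PERFMON = 38;
    static final int CAP_BPF = 39;

    /** Summarizes the Seccomp and CapEff fields of a /proc/<pid>/status listing. */
    public static Map<String, String> summarize(List<String> statusLines) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String line : statusLines) {
            String[] kv = line.split(":", 2);
            if (kv.length < 2) {
                continue;
            }
            String key = kv[0].trim();
            String value = kv[1].trim();
            if (key.equals("Seccomp")) {
                out.put("seccomp", seccompMode(value));
            } else if (key.equals("CapEff")) {
                // CapEff is a hexadecimal bitmask of effective capabilities
                long caps = Long.parseUnsignedLong(value, 16);
                out.put("CAP_SYS_PTRACE", has(caps, CAP_SYS_PTRACE));
                out.put("CAP_SYS_ADMIN", has(caps, CAP_SYS_ADMIN));
                out.put("CAP_PERFMON", has(caps, CAP_PERFMON));
                out.put("CAP_BPF", has(caps, CAP_BPF));
            }
        }
        return out;
    }

    private static String seccompMode(String value) {
        // 0 = disabled, 1 = strict, 2 = filter (a seccomp profile is loaded)
        if (value.equals("0")) return "disabled";
        if (value.equals("1")) return "strict";
        if (value.equals("2")) return "filter";
        return "unknown(" + value + ")";
    }

    private static String has(long caps, int bit) {
        return ((caps >>> bit) & 1L) == 1L ? "present" : "dropped";
    }

    public static void main(String[] args) throws IOException {
        Path status = Path.of("/proc/self/status");
        if (!Files.isReadable(status)) {
            System.out.println("/proc/self/status not readable (not Linux?)");
            return;
        }
        summarize(Files.readAllLines(status))
                .forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```

Running `main` inside the pod prints one line per field. A `seccomp: filter` result together with every capability reported as `dropped` matches the configuration shown above.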
What Still Works
JFR Without Native Profiling
# JFR works even in hardened environments!
# It uses JVM-internal sampling, not perf_event_open
# Start recording (via jcmd or JMX)
jcmd <pid> JFR.start duration=60s filename=/tmp/recording.jfr
# Or via JVM flags at startup
java -XX:StartFlightRecording=duration=60s,filename=/tmp/recording.jfr ...
# Key insight: JFR's method sampling runs inside the JVM and never calls perf_event_open
# "profile" records more events at higher overhead; "default" is the low-overhead choice for production
jcmd <pid> JFR.start settings=default filename=/tmp/recording.jfr
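If jcmd cannot attach at all, the same recording can be started from inside the application through the `jdk.jfr` API. This is a minimal sketch under a few assumptions: the class name `InProcessJfr` is invented for illustration, something in the application (an admin endpoint, a scheduled task) would have to call `record`, and the output path and JFR's chunk repository (by default under `java.io.tmpdir`) must both be writable.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.text.ParseException;
import java.time.Duration;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

// Hypothetical helper: runs a JFR recording from inside the application itself.
public class InProcessJfr {
    /** Records for the given duration with the built-in "default" settings and writes a .jfr file. */
    public static Path record(Duration duration, Path output)
            throws IOException, ParseException, InterruptedException {
        // "default" is one of the predefined configurations shipped with the JDK
        Configuration settings = Configuration.getConfiguration("default");
        try (Recording recording = new Recording(settings)) {
            recording.setName("in-process");
            recording.start();
            Thread.sleep(duration.toMillis());
            recording.stop();
            // dump() is allowed once the recording has been started, stopped or not
            recording.dump(output);
        }
        return output;
    }

    public static void main(String[] args) throws Exception {
        // /profiler-data is assumed to be a writable volume mount
        Path out = Path.of(args.length > 0 ? args[0] : "/profiler-data/recording.jfr");
        System.out.println("wrote " + record(Duration.ofSeconds(60), out));
    }
}
```

The resulting file opens in JDK Mission Control like any recording produced by `jcmd JFR.start`.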
Getting JFR Data Out of Container
# Problem: readOnlyRootFilesystem blocks /tmp writes
# Solution 1: Write to emptyDir volume
volumes:
  - name: profiler-output
    emptyDir: {}
volumeMounts:
  - name: profiler-output
    mountPath: /profiler-data
# Then:
jcmd <pid> JFR.start filename=/profiler-data/recording.jfr
# Copy out:
kubectl cp pod-name:/profiler-data/recording.jfr ./recording.jfr
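# Solution 2: Stream the file through kubectl exec (kubectl cp needs tar in the image)
# Note: filename= is resolved inside the target JVM, so filename=/dev/stdout
# would write to the application's stdout, not to jcmd's
kubectl exec pod-name -- cat /profiler-data/recording.jfr > recording.jfr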
JVM Built-in Sampling
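// ThreadMXBean sampling - no special permissions needed
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SimpleSampler {
    public static void sample(int durationSeconds, int intervalMs) throws InterruptedException {
        ThreadMXBean tmx = ManagementFactory.getThreadMXBean();
        Map<String, Integer> stackCounts = new HashMap<>();
        long end = System.currentTimeMillis() + (durationSeconds * 1000L);
        while (System.currentTimeMillis() < end) {
            for (ThreadInfo ti : tmx.dumpAllThreads(false, false)) {
                StackTraceElement[] frames = ti.getStackTrace();
                if (frames.length == 0) {
                    continue; // no Java frames to record for this thread
                }
                // Keep the 10 innermost frames, then order them root-first,
                // which is what collapsed-stack flame graph tools expect
                List<String> top = new ArrayList<>();
                for (int i = 0; i < Math.min(10, frames.length); i++) {
                    top.add(frames[i].toString());
                }
                Collections.reverse(top);
                stackCounts.merge(String.join(";", top), 1, Integer::sum);
            }
            Thread.sleep(intervalMs);
        }
        // Output in collapsed-stack format: frame;frame;frame count
        stackCounts.forEach((stack, count) -> System.out.println(stack + " " + count));
    }
}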
The Fixes
Option 1: Sidecar Profiler with Elevated Permissions
# Add a sidecar with extra capabilities just for profiling
# Only deployed when needed, removed after
apiVersion: v1
kind: Pod
spec:
  shareProcessNamespace: true  # Sidecar can see main container's processes
  containers:
    - name: app
      image: my-app:latest
      securityContext:
        runAsNonRoot: true
        readOnlyRootFilesystem: true
    - name: profiler
      image: async-profiler:latest
      securityContext:
        capabilities:
          add: ["SYS_PTRACE", "PERFMON"]  # Just what's needed
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: profiler-output
          mountPath: /output
  volumes:
    - name: profiler-output
      emptyDir: {}
# Profile from sidecar (async-profiler 3.x ships asprof in place of profiler.sh):
# kubectl exec -it pod-name -c profiler -- \
#   /profiler/profiler.sh -d 30 -f /output/flamegraph.html <pid>
Option 2: Ephemeral Debug Container
# Kubernetes 1.23+ supports ephemeral containers
kubectl debug pod-name -it --image=async-profiler:latest \
--target=app \
--profile=sysadmin # Adds necessary capabilities
# Inside debug container:
/profiler/profiler.sh -d 30 -f /tmp/flamegraph.html 1
Option 3: Pre-Configured JFR at Startup
# Configure JFR in deployment - no runtime attachment needed
# Keep the option on one line: a folded YAML scalar (>-) joins lines with
# spaces, which would split the flag into separate, invalid arguments
containers:
  - name: app
    image: my-app:latest
    env:
      - name: JAVA_TOOL_OPTIONS
        value: "-XX:StartFlightRecording=disk=true,dumponexit=true,filename=/profiler-data/recording.jfr,maxsize=100m,settings=default"
    volumeMounts:
      - name: profiler-output
        mountPath: /profiler-data
Option 4: JMX Remote Access
# Enable JMX for remote profiling tools
containers:
  - name: app
    env:
      - name: JAVA_TOOL_OPTIONS
        value: >-
          -Dcom.sun.management.jmxremote=true
          -Dcom.sun.management.jmxremote.port=9010
          -Dcom.sun.management.jmxremote.rmi.port=9010
          -Dcom.sun.management.jmxremote.authenticate=false
          -Dcom.sun.management.jmxremote.ssl=false
          -Djava.rmi.server.hostname=127.0.0.1
    ports:
      - containerPort: 9010
        name: jmx
# Port forward and connect with VisualVM/JMC:
# kubectl port-forward pod-name 9010:9010
# jmc # Connect to localhost:9010
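The same JMX connection can also drive JFR and pull the recording bytes back over the wire, which avoids writing anything inside the container's output volume. The sketch below uses the JDK's `FlightRecorderMXBean`; the class name `JmxJfrPull`, the 30-second duration, and the local file name are assumptions for illustration, and `main` expects the port-forward from above to be running.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;
import jdk.management.jfr.FlightRecorderMXBean;

// Hypothetical client: records via JMX and streams the result to a local file.
public class JmxJfrPull {
    /** Runs a recording through the given MXBean and copies its bytes to localFile. */
    public static long pull(FlightRecorderMXBean jfr, long millis, Path localFile)
            throws IOException, InterruptedException {
        long id = jfr.newRecording();
        jfr.setPredefinedConfiguration(id, "default");
        jfr.startRecording(id);
        Thread.sleep(millis);
        jfr.stopRecording(id); // stop before opening a stream for this recording

        long total = 0;
        long stream = jfr.openStream(id, Map.of()); // empty map: default stream options
        try (OutputStream out = Files.newOutputStream(localFile)) {
            byte[] chunk;
            // readStream returns null once all data has been delivered
            while ((chunk = jfr.readStream(stream)) != null) {
                out.write(chunk);
                total += chunk.length;
            }
        } finally {
            jfr.closeStream(stream);
            jfr.closeRecording(id);
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        // Assumes `kubectl port-forward pod-name 9010:9010` is active
        JMXServiceURL url =
                new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:9010/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            FlightRecorderMXBean jfr = ManagementFactory.newPlatformMXBeanProxy(
                    conn, FlightRecorderMXBean.MXBEAN_NAME, FlightRecorderMXBean.class);
            long bytes = pull(jfr, 30_000, Path.of("recording.jfr"));
            System.out.println(bytes + " bytes written to recording.jfr");
        }
    }
}
```

Taking the MXBean as a parameter keeps `pull` usable both remotely and in-process, since `ManagementFactory.getPlatformMXBean(FlightRecorderMXBean.class)` returns the local bean.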
Option 5: Continuous Profiling Service
# Use Pyroscope or similar for always-on profiling
# The agent runs in-process (it bundles async-profiler and emits JFR-format data);
# with timer-based sampling it should not need elevated permissions
containers:
  - name: app
    env:
      - name: JAVA_TOOL_OPTIONS
        value: >-
          -javaagent:/pyroscope/pyroscope.jar
          -Dpyroscope.serverAddress=http://pyroscope.monitoring:4040
          -Dpyroscope.applicationName=my-app
          -Dpyroscope.profilingInterval=10ms
Analyzing Without Full Profiler
Thread Dump Analysis
# Thread dumps always work
jcmd <pid> Thread.print > threads.txt
# Or via kill signal
kill -3 <pid> # Dump is written to the JVM's stdout (the container log)
# Multiple thread dumps + analysis
for i in {1..10}; do
jcmd <pid> Thread.print >> threads.txt
sleep 1
done
# Analyze with fastthread.io or similar
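When neither jcmd nor signals are convenient, the application can produce the dump itself through `ThreadMXBean`. This is a sketch only: `InProcessThreadDump` is a made-up name, and how the method gets triggered (an admin endpoint, a log hook) is left to the application. It formats frames by hand because `ThreadInfo.toString()` truncates each stack to eight frames.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: builds a thread dump without attach or signals.
public class InProcessThreadDump {
    /** Formats every live thread with its state, lock (if any), and full stack. */
    public static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Only request lock details when the JVM supports them
        boolean monitors = threads.isObjectMonitorUsageSupported();
        boolean synchronizers = threads.isSynchronizerUsageSupported();
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(monitors, synchronizers)) {
            sb.append('"').append(info.getThreadName()).append('"')
              .append(" id=").append(info.getThreadId())
              .append(' ').append(info.getThreadState());
            if (info.getLockName() != null) {
                sb.append(" waiting on ").append(info.getLockName());
            }
            sb.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Calling `dump()` several times a second apart gives the same multi-dump view as the shell loop above.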
Heap Dump for Memory Issues
# Heap dumps work without special permissions
jcmd <pid> GC.heap_dump /profiler-data/heap.hprof
# Analyze with Eclipse MAT or VisualVM (jhat was removed in JDK 9)
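The heap dump can likewise be triggered from inside the JVM via `HotSpotDiagnosticMXBean`, which is useful when the attach socket is unavailable. A minimal sketch, assuming a HotSpot JVM and a writable target directory such as the /profiler-data mount; the class name `InProcessHeapDump` and the file naming scheme are illustrative.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;

// Hypothetical helper: writes an .hprof file without using jcmd.
public class InProcessHeapDump {
    /**
     * Dumps the heap into the given directory and returns the file written.
     * With liveOnly=true only reachable objects are included.
     */
    public static Path dump(Path directory, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean diagnostics =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap refuses to overwrite, so use a fresh name; it must end in .hprof
        Path file = directory.resolve("heap-" + System.currentTimeMillis() + ".hprof");
        diagnostics.dumpHeap(file.toString(), liveOnly);
        return file;
    }

    public static void main(String[] args) throws IOException {
        // /profiler-data is assumed to be a writable volume mount
        Path dir = Path.of(args.length > 0 ? args[0] : "/profiler-data");
        System.out.println("heap dump written to " + dump(dir, true));
    }
}
```

The dump is roughly the size of the live heap, so the volume needs enough free space to hold it.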
GC Log Analysis
# Enable detailed GC logging at startup
env:
  - name: JAVA_TOOL_OPTIONS
    value: >-
      -Xlog:gc*=info:file=/profiler-data/gc.log:time,uptime,level,tags
# Analyze with GCViewer or GCeasy (gceasy.io)
Checklist
## Java Profiling in Hardened K8s
### Before Deployment
- [ ] Enable JFR at startup with -XX:StartFlightRecording
- [ ] Configure JMX remote access
- [ ] Add emptyDir volume for profiler output
- [ ] Include profiling agent in image (Pyroscope, etc.)
### During Incident
- [ ] Try jcmd JFR.start (works without privileges)
- [ ] Collect thread dumps (always works)
- [ ] Use kubectl debug for ephemeral container
- [ ] Port-forward JMX and use remote tools
### If Native Profiling Needed
- [ ] Deploy profiler sidecar with SYS_PTRACE
- [ ] Use shareProcessNamespace: true
- [ ] Remove sidecar after profiling complete
Conclusion
This is fundamentally a planning problem, not a security problem. Security hardening is correct—you should drop capabilities, use seccomp, and run with read-only filesystems. The mistake is not planning for debugging within those constraints.
The key insight is that Java has excellent built-in observability that doesn’t require elevated privileges. JFR is remarkable—it provides CPU profiling, memory allocation tracking, GC analysis, and thread monitoring using pure JVM mechanisms. Thread dumps work on any JVM. JMX provides real-time access to JVM internals. None of these need ptrace or perf_event_open.
The failure mode is assuming you can attach tools at runtime. In a hardened environment, you can’t. Everything needs to be configured at deployment time: JFR recording enabled via JVM flags, JMX ports exposed, volumes mounted for output files. If you didn’t configure it before the incident, it’s probably too late.
For cases where you absolutely need native profiling (CPU sampling at system level, off-CPU analysis), use targeted privilege escalation. A sidecar container with CAP_SYS_PTRACE can profile the main container via shareProcessNamespace. Ephemeral debug containers in Kubernetes 1.23+ provide similar capabilities. These are temporary, targeted, and don’t compromise the security of your main application.
Key principles:
- Enable JFR at deployment time with -XX:StartFlightRecording in JAVA_TOOL_OPTIONS
- Configure JMX for remote access to profiling tools
- Mount emptyDir volumes for profiler output in read-only filesystem environments
- Use sidecar containers for native profiling when needed
- Deploy continuous profiling (Pyroscope, Datadog) that runs as an in-process agent without elevated privileges
Security and observability can coexist. You just need to plan for observability before you deploy, not during an incident.
Related Articles
- Java Native Memory OOMKilled - Memory debugging
- eBPF Run-Queue Latency - System-level profiling