Kubernetes Graceful Shutdown as a Contract: Zero 502s During Rollouts (HTTP + gRPC)
A reproducible way to eliminate rollout 502/ECONNRESET: readiness-driven draining, preStop, SIGTERM handling, and a measurable drain budget.
5 posts
A reproducible way to eliminate rollout 502/ECONNRESET: readiness-driven draining, preStop, SIGTERM handling, and a measurable drain budget.
The frontend gives up after 5s, but the backend keeps working for 30s. Without deadline propagation, you waste resources on doomed requests. I show how to implement it in Go.
Why one pod receives 90% of gRPC traffic. A reproducible lab, solutions ranging from client-side load balancing to a service mesh, and a production checklist.
How to safely evolve Protobuf schemas in event-driven systems. Rules for .proto files, the upcaster pattern, and backward compatibility.
gRPC connections randomly close with 'transport is closing'. The cause: mismatched client and server keepalive settings, which lead the server to terminate idle connections.