gRPC on Kubernetes: Why Service Round-Robin Lies
gRPC and Kubernetes looked simple until load balancing started lying. "Why do you have 10 replicas and only 1 of them gets 90% of the traffic?" That is what the dashboard showed after we deployed a gRPC service. Kubernetes Service round-robin works, doesn't it? Not with gRPC.
The problem: gRPC runs over HTTP/2, which keeps long-lived connections and multiplexes requests over them. Kubernetes Service load balancing works at the connection level, not the request level: a backend pod is chosen when the connection is opened, and every request on that connection goes to the same pod. One connection = one pod.
Tested on: Kubernetes 1.28+, gRPC-Go 1.60+, Istio 1.20+. Reproduced on GKE, EKS, and bare metal.
Why Service Round-Robin Doesn't Work
HTTP/1.1 (works)
Client → K8s Service → Pod A (request 1)
Client → K8s Service → Pod B (request 2)
Client → K8s Service → Pod C (request 3)
Each request = new connection = new pod (assuming the client does not reuse connections).
gRPC/HTTP/2 (doesn't work)
Client → K8s Service → Pod A (connection established)
                       Pod A (request 1, 2, 3, 4, 5...)
                       Pod A (all requests)
One connection = multiplexed requests = one pod.
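The difference between the two diagrams can be reproduced without a cluster. The following is a minimal simulation, not a model of kube-proxy itself: the `roundRobin` type, the pod names, and the request counts are all illustrative assumptions. It only shows that an evenly spread balancing decision still produces a skewed load when the decision is made once per connection.

```go
package main

import "fmt"

// roundRobin hands out backends in order. It stands in for any balancer
// that spreads its decisions evenly (hypothetical helper, not a real API).
type roundRobin struct {
	backends []string
	next     int
}

func (r *roundRobin) pick() string {
	b := r.backends[r.next%len(r.backends)]
	r.next++
	return b
}

// perConnection picks a backend once per connection; every request on that
// connection then lands on the same backend (connection-level balancing).
func perConnection(backends []string, connections, requestsPerConn int) map[string]int {
	lb := &roundRobin{backends: backends}
	counts := make(map[string]int)
	for c := 0; c < connections; c++ {
		pod := lb.pick() // decided once, at connect time
		counts[pod] += requestsPerConn
	}
	return counts
}

// perRequest picks a backend for every single request (request-level balancing).
func perRequest(backends []string, requests int) map[string]int {
	lb := &roundRobin{backends: backends}
	counts := make(map[string]int)
	for i := 0; i < requests; i++ {
		counts[lb.pick()]++
	}
	return counts
}

func main() {
	pods := []string{"pod-a", "pod-b", "pod-c", "pod-d", "pod-e"}
	fmt.Println("per connection:", perConnection(pods, 1, 1000))
	fmt.Println("per request:", perRequest(pods, 1000))
}
```

With a single connection carrying 1000 requests, all 1000 land on `pod-a`; with per-request picking, each of the five pods gets 200.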
Reproducible Lab
Server
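// server/main.go
package main

import (
	"context"
	"log"
	"net"
	"os"

	pb "example/grpc/proto"

	"google.golang.org/grpc"
)

// server implements the Greeter service and remembers which pod it runs on.
type server struct {
	pb.UnimplementedGreeterServer
	podName string
}

// SayHello logs the serving pod and returns its name in the reply.
func (s *server) SayHello(ctx context.Context, in *pb.HelloRequest) (*pb.HelloReply, error) {
	log.Printf("Received request on pod: %s", s.podName)
	return &pb.HelloReply{Message: "Hello from " + s.podName}, nil
}

func main() {
	podName := os.Getenv("POD_NAME") // injected via the Downward API in deployment.yaml

	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("Failed to listen: %v", err)
	}

	s := grpc.NewServer()
	pb.RegisterGreeterServer(s, &server{podName: podName})
	log.Printf("Server started on pod: %s", podName)
	if err := s.Serve(lis); err != nil {
		log.Fatalf("Failed to serve: %v", err)
	}
}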
Kubernetes Manifests
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpc-server
spec:
  replicas: 5
  selector:
    matchLabels:
      app: grpc-server
  template:
    metadata:
      labels:
        app: grpc-server
    spec:
      containers:
      - name: server
        image: grpc-server:latest
        ports:
        - containerPort: 50051
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
---
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: grpc-server
spec:
  selector:
    app: grpc-server
  ports:
  - port: 50051
    targetPort: 50051
Load Test
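# ghz - gRPC benchmarking tool
# Note: without --proto or --protoset, ghz relies on server reflection,
# which the lab server above does not register.
ghz --insecure \
  --call helloworld.Greeter/SayHello \
  --total 10000 \
  --concurrency 50 \
  --data '{"name":"test"}' \
  grpc-server:50051
# Result: 90%+ of requests land on a single pod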
Solution 1: Headless Service + Client-Side LB
Headless Service
apiVersion: v1
kind: Service
metadata:
  name: grpc-server-headless
spec:
  clusterIP: None # Headless!
  selector:
    app: grpc-server
  ports:
  - port: 50051
Client with DNS Resolver
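// client/main.go
package main

import (
	"context"
	"log"
	"time"

	pb "example/grpc/proto"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// DNS resolver + round-robin balancer
	conn, err := grpc.Dial(
		"dns:///grpc-server-headless:50051",
		grpc.WithDefaultServiceConfig(`{"loadBalancingPolicy":"round_robin"}`),
		// Replaces the deprecated grpc.WithInsecure().
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	)
	if err != nil {
		log.Fatalf("Failed to dial: %v", err)
	}
	defer conn.Close()

	client := pb.NewGreeterClient(conn)

	// Requests now go to different pods
	for i := 0; i < 10; i++ {
		ctx, cancel := context.WithTimeout(context.Background(), time.Second)
		reply, err := client.SayHello(ctx, &pb.HelloRequest{Name: "test"})
		cancel()
		if err != nil {
			log.Fatalf("SayHello failed: %v", err)
		}
		log.Print(reply.GetMessage())
	}
}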
Results
| Metric | ClusterIP Service | Headless + Client LB |
|---|---|---|
| Request distribution across pods (%) | 90/5/5/0/0 | 20/20/20/20/20 |
| P99 latency | 45ms | 12ms |
| Throughput | 5k RPS | 25k RPS |
Solution 2: Service Mesh (Istio)
Istio DestinationRule
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: grpc-server
spec:
  host: grpc-server
  trafficPolicy:
    loadBalancer:
      simple: ROUND_ROBIN
    connectionPool:
      http:
        h2UpgradePolicy: UPGRADE
Advantages of Istio
- No code changes
- Automatic mTLS
- Observability (tracing, metrics)
- Traffic management (canary, circuit breaker)
Disadvantages
- Overhead (sidecar)
- Operational complexity
- Latency (+1-3ms)
Solution 3: Linkerd
# Linkerd annotation (goes on the pod template, not on the Deployment's own metadata)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpc-server
spec:
  template:
    metadata:
      annotations:
        linkerd.io/inject: enabled
  # ...
Linkerd automatically detects gRPC traffic and applies per-request load balancing.
Solution 4: Envoy as a Sidecar
# envoy-sidecar.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: envoy-config
data:
  envoy.yaml: |
    static_resources:
      listeners:
      - address:
          socket_address:
            address: 0.0.0.0
            port_value: 8080
        filter_chains:
        - filters:
          - name: envoy.filters.network.http_connection_manager
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
              codec_type: AUTO
              stat_prefix: ingress_http
              route_config:
                virtual_hosts:
                - name: backend
                  domains: ["*"]
                  routes:
                  - match: { prefix: "/" }
                    route:
                      cluster: grpc_backend
              http_filters:
              - name: envoy.filters.http.router
                # Current Envoy releases require an explicit typed_config for each filter.
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
      clusters:
      - name: grpc_backend
        type: STRICT_DNS
        lb_policy: ROUND_ROBIN
        http2_protocol_options: {}
        load_assignment:
          cluster_name: grpc_backend
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: grpc-server-headless
                    port_value: 50051
Monitoring gRPC Distribution
Prometheus Metrics
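# Requests per pod
sum(rate(grpc_server_handled_total[5m])) by (pod)
# Distribution in %
# group_left is needed because many per-pod series are divided by a single total.
100 * sum(rate(grpc_server_handled_total[5m])) by (pod)
  / ignoring(pod) group_left
sum(rate(grpc_server_handled_total[5m]))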
Expected vs. Actual
5 pods, even load:
- Expected: 20% / 20% / 20% / 20% / 20%
- Without client LB: 85% / 5% / 5% / 3% / 2%
- With client LB: 19% / 21% / 20% / 20% / 20%
Production Checklist
## gRPC Load Balancing Checklist
### Basics
- [ ] Headless Service for gRPC
- [ ] Client-side load balancing or a mesh
- [ ] Connection pooling with max lifetime
- [ ] Keepalive settings
### Client Config
- [ ] `loadBalancingPolicy: round_robin`
- [ ] DNS resolver (`dns:///`)
- [ ] Keepalive: 30s interval, 10s timeout
- [ ] Max connection age: 5m (in gRPC-Go this is enforced by the server's `MaxConnectionAge`)
### Server Config
- [ ] MaxConnectionAge: 5m
- [ ] MaxConnectionAgeGrace: 10s
- [ ] Keepalive enforcement
### Monitoring
- [ ] Per-pod request distribution
- [ ] Connection count per pod
- [ ] Latency per pod
- [ ] Alert: uneven distribution
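The keepalive and connection-age items in the checklist could map to gRPC-Go options roughly as follows. This is a configuration sketch, not the article's production setup: the durations come from the checklist, while `MinTime: 20 * time.Second` and both `PermitWithoutStream` settings are assumptions added so that the server does not reject the client's 30s pings. The helper names `serverOptions` and `clientOptions` are hypothetical.

```go
package main

import (
	"log"
	"net"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

// serverOptions mirrors the "Server Config" checklist items.
func serverOptions() []grpc.ServerOption {
	return []grpc.ServerOption{
		grpc.KeepaliveParams(keepalive.ServerParameters{
			MaxConnectionAge:      5 * time.Minute,  // close connections gracefully after about 5m
			MaxConnectionAgeGrace: 10 * time.Second, // time for in-flight RPCs to finish
			Time:                  30 * time.Second, // ping clients after 30s without activity
			Timeout:               10 * time.Second, // close if the ping is not acknowledged
		}),
		grpc.KeepaliveEnforcementPolicy(keepalive.EnforcementPolicy{
			MinTime:             20 * time.Second, // assumption: must not exceed the client's ping interval
			PermitWithoutStream: true,             // assumption: allow pings on idle connections
		}),
	}
}

// clientOptions mirrors the keepalive item of the "Client Config" checklist.
// Pass these to grpc.Dial alongside the round_robin service config.
func clientOptions() []grpc.DialOption {
	return []grpc.DialOption{
		grpc.WithKeepaliveParams(keepalive.ClientParameters{
			Time:                30 * time.Second,
			Timeout:             10 * time.Second,
			PermitWithoutStream: true, // assumption, matches the server policy above
		}),
	}
}

func main() {
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("Failed to listen: %v", err)
	}
	s := grpc.NewServer(serverOptions()...)
	// Register services here before serving.
	if err := s.Serve(lis); err != nil {
		log.Fatalf("Failed to serve: %v", err)
	}
}
```

The connection age limit matters for client-side balancing because a closed connection gives the client a reason to resolve DNS again, which is how newly started pods get picked up.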
Conclusion
gRPC on Kubernetes needs special attention when it comes to load balancing:
- A plain Kubernetes Service does not round-robin gRPC requests
- Headless Service + client LB is the simplest solution
- Service mesh (Istio/Linkerd) for more complex scenarios
- Monitoring the distribution is critical
FAQ
What if I can't change the client?
Use a service mesh (Istio/Linkerd) or Envoy as a sidecar proxy.
Is client-side LB safe?
Yes, but the client has to refresh DNS regularly (max connection age).
How many connections per pod?
Typically 1-5 for modern gRPC clients. More = overhead without benefit.
Related Articles
- K8s Connection Storm - connection management during rollouts
- CI/CD for monorepos - testing gRPC services