Span Contracts: Trace-Driven API Contract Testing with OpenTelemetry
I wanted CI to catch breaking API changes without maintaining another spec.
Green unit tests. Green lint. Green CI. And yet: some clients start failing or UI flows silently break.
The most common silent killer is not a logic bug. It is an accidental breaking change in your API:
- change id from number to string
- rename user.email to user.primaryEmail
- stop returning items for some statuses
- or remove a field that was “optional” but is not, in practice
OpenAPI? Great, when it stays current. Pact/CDC? Great, when someone keeps contracts updated. Reality: contracts often rot because the maintenance cost is high.
This post shows a workflow that combines observability and testing:
Span Contracts = contracts derived from OpenTelemetry traces that you can use as a CI gate for breaking changes.
Trace-based testing exists (for example, Malabi). Here we focus on a very specific, very common problem: automatically detect breaking changes in request/response payload shapes without storing sensitive data.
Tested on: Node.js/TypeScript (Express/Fastify), OpenTelemetry SDK (server spans), CI: GitHub Actions/GitLab CI. Optional OTel Collector.
Why OpenTelemetry?
You already use OpenTelemetry for latency, errors, and debugging. And OTel has a property most tests do not:
It is the ground truth of what actually happens in production-like traffic.
HTTP server spans include standard attributes like http.route that identify endpoints.
And trace context is interoperable across services (traceparent / tracestate).
What is a “Span Contract”?
A Span Contract is a definition you commit to Git (as a golden file), for example:
GET /api/users/{id} -> 200
POST /api/orders -> 201
GET /api/orders/{id} -> 404
For each one, you store:
- the JSON shape (structure) - no values
- types for each JSON path
- (optional) required vs optional inferred from samples
That makes the contract:
- cheap to maintain (generated)
- anchored in reality (derived from traces/test runs)
- safer (no PII, just shape)
The key trick: fingerprint JSON shape without values
We need two properties:
- stable (independent of key order)
- anonymous (no values, only types + paths)
Shape normalization
A representation we can hash might look like:
$.id:string
$.name:string
$.profile:object
$.profile.age:number
$.tags:array
$.tags[]:string
TypeScript implementation
// contracts/shape.ts
import crypto from "node:crypto";

type ShapeType =
  | "null"
  | "string"
  | "number"
  | "boolean"
  | "object"
  | "array"
  | "unknown";

function valueType(v: unknown): ShapeType {
  if (v === null) return "null";
  if (Array.isArray(v)) return "array";
  switch (typeof v) {
    case "string":
      return "string";
    case "number":
      return "number";
    case "boolean":
      return "boolean";
    case "object":
      return "object";
    default:
      return "unknown";
  }
}

function collectShape(v: unknown, path: string, out: Set<string>) {
  const t = valueType(v);
  out.add(`${path}:${t}`);
  if (t === "array") {
    const arr = v as unknown[];
    if (arr.length === 0) {
      out.add(`${path}[]:empty`);
      return;
    }
    for (const el of arr) collectShape(el, `${path}[]`, out);
    return;
  }
  if (t === "object") {
    const obj = v as Record<string, unknown>;
    for (const key of Object.keys(obj).sort()) {
      collectShape(obj[key], `${path}.${key}`, out);
    }
  }
}

export function shapePaths(v: unknown): string[] {
  const out = new Set<string>();
  collectShape(v, "$", out);
  return Array.from(out).sort();
}

export function shapeHash(v: unknown): string {
  const norm = shapePaths(v).join("|");
  return crypto.createHash("sha256").update(norm).digest("hex");
}
Keep it simple. In practice you can add:
- integer vs float
- date normalization (still no values)
- a max depth limit
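As a rough illustration of those three refinements, here is a minimal sketch. The names richValueType and richShapePaths, the "date-string" label, the ISO 8601 regex, and the default depth of 8 are all assumptions made for this example, not part of the code above.

```typescript
// Hypothetical extension of the shape collector: finer number types,
// date-like strings, and a depth cap. All names here are illustrative.
type RichType =
  | "null" | "string" | "date-string" | "integer" | "float"
  | "boolean" | "object" | "array" | "unknown";

// Assumption: only ISO 8601-style dates/timestamps are treated as dates.
const ISO_DATE =
  /^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?)?$/;

export function richValueType(v: unknown): RichType {
  if (v === null) return "null";
  if (Array.isArray(v)) return "array";
  switch (typeof v) {
    case "string":
      // Only the format is inspected; the value itself is never stored.
      return ISO_DATE.test(v) ? "date-string" : "string";
    case "number":
      return Number.isInteger(v) ? "integer" : "float";
    case "boolean":
      return "boolean";
    case "object":
      return "object";
    default:
      return "unknown";
  }
}

export function richShapePaths(v: unknown, maxDepth = 8): string[] {
  const out = new Set<string>();
  const walk = (node: unknown, path: string, depth: number): void => {
    const t = richValueType(node);
    out.add(`${path}:${t}`);
    if (depth >= maxDepth) return; // keep the container type, stop descending
    if (t === "array") {
      for (const el of node as unknown[]) walk(el, `${path}[]`, depth + 1);
    } else if (t === "object") {
      for (const [k, child] of Object.entries(node as Record<string, unknown>)) {
        walk(child, `${path}.${k}`, depth + 1);
      }
    }
  };
  walk(v, "$", 0);
  return Array.from(out).sort();
}
```

One caveat with integer vs float: JSON has a single number type, so a price of 10.0 arrives as 10 and is reported as integer. If you adopt this split, it is probably safer to treat integer -> float as a widening rather than a breaking change.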
Step 1: add shape hash to server spans
We do not want to store raw bodies in traces (PII, cost). We want a fingerprint and a small summary.
Express middleware (example)
// contracts/otel-contract-middleware.ts
import { context, trace } from "@opentelemetry/api";
import { shapeHash } from "./shape";

export function spanContractMiddleware() {
  return (req: any, res: any, next: any) => {
    const span = trace.getSpan(context.active());
    if (!span) return next();

    const originalJson = res.json.bind(res);
    res.json = (body: unknown) => {
      try {
        span.setAttribute("api.contract.response_shape_hash", shapeHash(body));
        span.setAttribute("api.contract.status_code", res.statusCode);
        span.setAttribute(
          "api.contract.response_bytes",
          Buffer.byteLength(JSON.stringify(body))
        );
      } catch (e) {
        span.recordException(e as Error);
      }
      return originalJson(body);
    };
    next();
  };
}
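Use http.method + http.route as the endpoint key. http.route is the route template (for example /api/users/:id), not the concrete URL, so all requests to one endpoint share a key.
Both attributes come from the HTTP semantic conventions for spans. Newer versions of the conventions name the method attribute http.request.method, so check which one your instrumentation emits.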
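To make the key concrete, here is a small sketch of a helper that builds a METHOD + ROUTE + STATUS key from a span's attribute bag. The function name endpointKey, the fallback order between attribute names, and the rewrite of Express-style :id parameters to {id} are assumptions for illustration.

```typescript
// Hypothetical helper: derive a stable contract key from span attributes.
type SpanAttributes = Record<string, unknown>;

export function endpointKey(attrs: SpanAttributes): string | undefined {
  // Accept both the newer and the older semantic-convention names.
  const method = attrs["http.request.method"] ?? attrs["http.method"];
  const route = attrs["http.route"];
  // Prefer the attribute set by the middleware, then fall back to semconv.
  const status =
    attrs["api.contract.status_code"] ??
    attrs["http.response.status_code"] ??
    attrs["http.status_code"];

  if (typeof method !== "string" || typeof route !== "string") return undefined;
  if (typeof status !== "number") return undefined;

  // Assumption: normalize ":id" style parameters to "{id}" so keys look the
  // same regardless of the web framework's route syntax.
  const template = route.replace(/:([A-Za-z_]\w*)/g, "{$1}");
  return `${method.toUpperCase()} ${template} ${status}`;
}
```

Spans without a route (for example unmatched 404s) return undefined here and can simply be skipped when building contracts.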
Step 2: Collect Span Contracts - two modes
Mode A: CI (simple and practical)
In CI, run your integration/API tests and collect spans. At the end:
- extract http.route, http.method, api.contract.*
- generate contracts.json (baseline) or a diff against the repo
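The extraction step can be sketched roughly as follows. The code only assumes that each finished span exposes an attributes object (which is how the in-memory span exporter in the OpenTelemetry JS SDK presents finished spans, as far as I know); SpanLike, collectObserved, and endpointsWithNewShapes are hypothetical names.

```typescript
// Minimal structural view of a finished span; an assumption for this sketch.
interface SpanLike {
  attributes: Record<string, unknown>;
}

// Endpoint key -> sorted list of distinct response shape hashes seen.
type ObservedHashes = Record<string, string[]>;

export function collectObserved(spans: SpanLike[]): ObservedHashes {
  const byEndpoint = new Map<string, Set<string>>();
  for (const { attributes: a } of spans) {
    const hash = a["api.contract.response_shape_hash"];
    const method = a["http.method"] ?? a["http.request.method"];
    const route = a["http.route"];
    const status = a["api.contract.status_code"];
    if (
      typeof hash !== "string" ||
      typeof method !== "string" ||
      typeof route !== "string" ||
      typeof status !== "number"
    ) {
      continue; // span was not annotated by the contract middleware
    }
    const key = `${method} ${route} ${status}`;
    const hashes = byEndpoint.get(key) ?? new Set<string>();
    hashes.add(hash);
    byEndpoint.set(key, hashes);
  }
  const out: ObservedHashes = {};
  for (const key of Array.from(byEndpoint.keys()).sort()) {
    out[key] = Array.from(byEndpoint.get(key) ?? []).sort();
  }
  return out;
}

// Endpoints that produced at least one hash the baseline has never seen.
export function endpointsWithNewShapes(
  baseline: ObservedHashes,
  observed: ObservedHashes
): string[] {
  return Object.entries(observed)
    .filter(([key, hashes]) => {
      const known = new Set(baseline[key] ?? []);
      return hashes.some((h) => !known.has(h));
    })
    .map(([key]) => key);
}
```

Serializing the result with JSON.stringify(observed, null, 2) gives a deterministic contracts.json, because both keys and hashes are sorted. A hash mismatch only says that something changed; deciding whether the change is breaking needs the path:type comparison described in Step 3.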
Mode B: Staging/Prod (where required vs optional comes from)
Over time you can refine contracts from staging/prod samples:
- which fields are always present (required)
- which are only present sometimes (optional)
- whether there are role-based variants (admin vs user)
If you want to pull contracts from an OTel pipeline, the Collector file exporter is fine for a demo.
Minimal Collector config (file exporter)
# otel-collector.yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:
processors:
  batch:
exporters:
  file:
    path: ./traces.json
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [file]
Note: JSON export field names are not guaranteed stable across versions. In CI, an in-memory exporter or a trace backend query is often more reliable.
Step 3: A contract is not just a hash - define breaking logic
A hash is great for detecting change, but it is too strict:
- add a field -> hash changes, but likely non-breaking
- remove a field -> often breaking
- change a type -> almost always breaking
So compare the path:type set and apply rules:
Practical breaking rules
Breaking:
- a required path disappears
- a type changes (number -> string, object -> array, …)
Non-breaking:
- a new path appears (if clients tolerate extra fields)
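These rules translate into a small comparison function. The sketch below compares the shape of one observed response (a path -> type map) against a baseline with required and optional paths; the type names and the diffEndpoint function are assumptions for illustration, and it assumes a single type per path.

```typescript
// Hypothetical types for this sketch.
type PathTypes = Record<string, string>; // e.g. { "$.id": "string" }

interface EndpointContract {
  required: PathTypes;
  optional: PathTypes;
}

interface Finding {
  severity: "BREAKING" | "NON-BREAKING";
  message: string;
}

export function diffEndpoint(
  baseline: EndpointContract,
  observed: PathTypes
): Finding[] {
  const findings: Finding[] = [];
  // Every path the baseline knows about, required or optional.
  const known: PathTypes = { ...baseline.optional, ...baseline.required };

  // Rule 1: a required path disappears -> breaking.
  for (const path of Object.keys(baseline.required)) {
    if (!(path in observed)) {
      findings.push({
        severity: "BREAKING",
        message: `required field removed: ${path}`,
      });
    }
  }

  for (const [path, type] of Object.entries(observed)) {
    const expected = known[path];
    if (expected === undefined) {
      // Rule 3: a new path appears -> non-breaking.
      findings.push({
        severity: "NON-BREAKING",
        message: `new field: ${path} ${type}`,
      });
    } else if (expected !== type) {
      // Rule 2: a type changes -> breaking.
      findings.push({
        severity: "BREAKING",
        message: `type changed: ${path} ${expected} -> ${type}`,
      });
    }
  }
  return findings;
}
```

A missing optional path produces no finding at all, which is the point of separating required from optional in the baseline.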
Step 4: Infer required vs optional from real traces
With enough samples per endpoint, you can infer:
- required fields = present in almost every response
- optional fields = present only sometimes
Algorithm
For endpoint signature METHOD + ROUTE + STATUS:
- take N samples
- count present_count per JSON path
- presence_rate = present_count / N
- if presence_rate >= 0.99 -> required
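The steps above can be sketched as a single function. It takes N samples, each a list of "path:type" strings as produced by shapePaths, and returns required and optional maps. The name inferContract and the choice to join conflicting types with "|" are assumptions for this example.

```typescript
// Hypothetical types for this sketch.
type PathTypes = Record<string, string>;

interface InferredContract {
  samples: number;
  required: PathTypes;
  optional: PathTypes;
}

export function inferContract(
  samples: string[][],
  threshold = 0.99
): InferredContract {
  const presentCount = new Map<string, number>();
  const typesByPath = new Map<string, Set<string>>();

  for (const sample of samples) {
    const seen = new Set<string>(); // count each path once per sample
    for (const entry of sample) {
      // Split on the last ":" because the type never contains one.
      const cut = entry.lastIndexOf(":");
      if (cut < 0) continue;
      const path = entry.slice(0, cut);
      const type = entry.slice(cut + 1);

      const types = typesByPath.get(path) ?? new Set<string>();
      types.add(type);
      typesByPath.set(path, types);

      if (!seen.has(path)) {
        seen.add(path);
        presentCount.set(path, (presentCount.get(path) ?? 0) + 1);
      }
    }
  }

  const result: InferredContract = {
    samples: samples.length,
    required: {},
    optional: {},
  };
  for (const path of Array.from(presentCount.keys()).sort()) {
    const presenceRate = (presentCount.get(path) ?? 0) / samples.length;
    // Assumption: a path seen with several types is recorded as "a|b".
    const type = Array.from(typesByPath.get(path) ?? []).sort().join("|");
    (presenceRate >= threshold ? result.required : result.optional)[path] = type;
  }
  return result;
}
```

Note what the threshold means for small N: with 10 samples, 9/10 = 0.9 is below 0.99, so only paths present in all 10 responses count as required. The 1% tolerance only starts to matter from N = 100 upward.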
Example contract JSON
{
  "GET /api/users/{id} 200": {
    "samples": 1432,
    "required": {
      "$.id": "string",
      "$.name": "string",
      "$.profile": "object"
    },
    "optional": {
      "$.profile.avatarUrl": "string",
      "$.profile.bio": "string"
    }
  }
}
Step 5: CI gate (fail only on breaking changes)
Your CI job:
- runs integration tests
- builds an observed contract
- compares against baseline
- fails on breaking
Example diff output
BREAKING: GET /api/users/{id} 200
- required field removed: $.profile
BREAKING: POST /api/orders 201
- type changed: $.total number -> string
NON-BREAKING: GET /api/users/{id} 200
+ new optional field: $.profile.avatarUrl string
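A gate that produces output like this can be quite small. The sketch below turns a list of findings into a report and an exit code; the Finding shape and the gate function are hypothetical names, and the code returns the exit code rather than exiting so it stays easy to test.

```typescript
// Hypothetical finding shape for this sketch.
interface Finding {
  severity: "BREAKING" | "NON-BREAKING";
  endpoint: string; // e.g. "GET /api/users/{id} 200"
  detail: string;   // e.g. "- required field removed: $.profile"
}

export function gate(findings: Finding[]): { report: string; exitCode: number } {
  // List breaking findings first so they are visible at the top of the CI log.
  const ordered = [...findings].sort(
    (a, b) =>
      Number(b.severity === "BREAKING") - Number(a.severity === "BREAKING")
  );
  const report = ordered
    .map((f) => `${f.severity}: ${f.endpoint}\n${f.detail}`)
    .join("\n");
  const hasBreaking = findings.some((f) => f.severity === "BREAKING");
  // Non-breaking findings are reported but never fail the job.
  return { report, exitCode: hasBreaking ? 1 : 0 };
}
```

In a CI script you would print the report and assign the result to process.exitCode, so the job fails only when at least one breaking finding exists.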
Security: PII, GDPR, and “observability is not a dumping ground”
If you use OTel pipelines for contracts, stick to three rules:
- never put raw request/response bodies into spans
- store only shape (paths + types) or a hash + small metrics
- secure Collector config and exports (follow OTel security best practices)
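One subtlety with "shape only": object keys are part of the shape, and keys can carry values. A response that is a map keyed by email address or user ID would leak those into the stored paths. A possible mitigation, sketched here under the assumption that a few regex heuristics are good enough for your payloads, is to collapse dynamic-looking keys into a wildcard. The names safeKey and safeShapePaths and the three patterns are illustrative.

```typescript
// Assumption: keys matching any of these patterns are data, not schema.
const DYNAMIC_KEY_PATTERNS: RegExp[] = [
  /^\d+$/, // numeric IDs
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i, // UUIDs
  /@/, // email-like keys
];

export function safeKey(key: string): string {
  return DYNAMIC_KEY_PATTERNS.some((p) => p.test(key)) ? "*" : key;
}

// Same idea as shapePaths, but dynamic keys are replaced by "*".
export function safeShapePaths(
  v: unknown,
  path = "$",
  out: Set<string> = new Set<string>()
): string[] {
  if (Array.isArray(v)) {
    out.add(`${path}:array`);
    for (const el of v) safeShapePaths(el, `${path}[]`, out);
  } else if (v !== null && typeof v === "object") {
    out.add(`${path}:object`);
    for (const [k, child] of Object.entries(v as Record<string, unknown>)) {
      safeShapePaths(child, `${path}.${safeKey(k)}`, out);
    }
  } else {
    const t = typeof v;
    const primitive =
      v === null
        ? "null"
        : t === "string" || t === "number" || t === "boolean"
          ? t
          : "unknown";
    out.add(`${path}:${primitive}`);
  }
  return Array.from(out).sort();
}
```

A side benefit is stability: without the wildcard, every new user ID in such a map would produce a new path and therefore a new hash.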
Production checklist
Implementation
- compute JSON shape + hash (no values)
- add api.contract.response_shape_hash to server spans
- use http.route + http.method as the contract key
CI
- generate contracts from integration tests
- compare required paths (removal/type change = breaking)
- allow additive changes as non-breaking
Operations
- if using staging/prod, watch sampling bias
- handle role/variant responses (split contracts by audience)
FAQ
Is this just another snapshot test?
No. Snapshot tests store full payloads (including values), which is expensive and often unsafe. Span Contracts store only shape.
Why not just use OpenAPI?
OpenAPI is great. Span Contracts solve drift: when the spec is not reality. Ideally, use both.
Won’t this be expensive for trace storage?
Not necessarily. Hash + a few attributes are cheap. Sampling strategy still matters.
Conclusion
The biggest win with Span Contracts is not “another test”. It is a mindset shift:
A contract is not a document. A contract is what the system actually sends.
OpenTelemetry is the easiest way to measure that reality and make it a CI gate.
References
- Trace-based testing with OpenTelemetry (Malabi): https://www.cncf.io/blog/2021/08/11/trace-based-testing-with-opentelemetry-meet-open-source-malabi/
- OpenTelemetry Collector configuration: https://opentelemetry.io/docs/collector/configuration/
- Semantic conventions for HTTP spans: https://opentelemetry.io/docs/specs/semconv/http/http-spans/
- W3C Trace Context: https://www.w3.org/TR/trace-context-2/
- OTel Collector file exporter: https://aws-otel.github.io/docs/components/misc-exporters
- OTel security best practices: https://opentelemetry.io/docs/security/config-best-practices/