Feature Flags Without Tech Debt: Automatic Stale Flag Detection
A forgotten flag is quiet right up to the moment it bites. “Leave that flag there, we might need it later.” Three years later, the codebase has 200+ feature flags: nobody remembers what half of them do, and the other half have been permanently enabled in production for 18 months.
Feature flags are an excellent tool for progressive rollouts and A/B tests. But without discipline, they become parasites that multiply code complexity and obscure logic.
Tested on: TypeScript/Node.js codebases, LaunchDarkly and Unleash. Principles apply to any flag system.
Why Feature Flags Grow Uncontrollably
- Easy to add - 5 minutes of work
- Painful to remove - code review, testing, coordination
- Fear of rollback - “what if we need it again?”
- Missing ownership - who’s responsible for cleanup?
Symptoms of the Problem
// Code after 3 years
if (featureFlags.isEnabled('new_checkout_flow')) { // "new" from 2021
  if (featureFlags.isEnabled('checkout_v2_improvements')) { // override?
    if (featureFlags.isEnabled('payment_retry_logic')) { // bug fix?
      // Which combination is actually the production state?
    }
  }
}
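To make the cost of that nesting concrete: three boolean flags already allow 2^3 = 8 distinct on/off states, and every additional flag doubles the number. A minimal sketch that enumerates them; the helper name `enumerateFlagStates` is hypothetical and not part of any flag SDK:

```typescript
// Hypothetical helper: lists every on/off combination for a set of boolean flags.
type FlagState = Record<string, boolean>;

function enumerateFlagStates(keys: string[]): FlagState[] {
  const total = 2 ** keys.length;
  const states: FlagState[] = [];
  for (let mask = 0; mask < total; mask++) {
    const state: FlagState = {};
    keys.forEach((key, i) => {
      // Bit i of the mask decides whether flag i is on in this combination
      state[key] = (mask & (1 << i)) !== 0;
    });
    states.push(state);
  }
  return states;
}

const checkoutStates = enumerateFlagStates([
  'new_checkout_flow',
  'checkout_v2_improvements',
  'payment_retry_logic',
]);
console.log(checkoutStates.length); // 8
```

Feeding such a list into a parameterized test is one way to find out which combinations still behave sensibly before any flag is removed.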
Framework for Flag Lifecycle Management
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   CREATE    │────▶│   ACTIVE    │────▶│    STALE    │
│  + owner    │     │  + metrics  │     │  + warning  │
│  + expiry   │     │  + usage    │     │  + removal  │
└─────────────┘     └─────────────┘     └─────────────┘
                                               │
                                               ▼
                                        ┌─────────────┐
                                        │   REMOVED   │
                                        │  + cleanup  │
                                        │  + PR       │
                                        └─────────────┘
Step 1: Mandatory Metadata at Creation
Flag Schema
interface FeatureFlag {
  key: string;
  description: string;
  owner: string; // Slack handle or team
  createdAt: Date;
  expiresAt: Date; // Required!
  type: 'release' | 'experiment' | 'ops' | 'permission';
  jiraTicket?: string; // Link to feature
  removalPR?: string; // Auto-populated
}
CI Gate for New Flags
# .github/workflows/feature-flag-check.yml
name: Feature Flag Validation
on:
  pull_request:
    paths:
      - 'src/flags/**'
      - '**/feature-flags.json'
jobs:
  validate-flags:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate flag metadata
        run: |
          node scripts/validate-flags.js
// scripts/validate-flags.js
const flags = require('../src/flags/feature-flags.json');
const errors = [];
const now = new Date();
for (const [key, flag] of Object.entries(flags)) {
  if (!flag.owner) {
    errors.push(`${key}: missing owner`);
  }
  if (!flag.expiresAt) {
    errors.push(`${key}: missing expiresAt`);
  } else {
    const expiresAt = new Date(flag.expiresAt);
    if (Number.isNaN(expiresAt.getTime())) {
      // An unparseable date compares as "not expired", so reject it explicitly
      errors.push(`${key}: expiresAt is not a valid date`);
    } else if (expiresAt < now) {
      errors.push(`${key}: already expired, should be removed`);
    }
  }
  if (!flag.description || flag.description.length < 10) {
    errors.push(`${key}: description too short`);
  }
}
if (errors.length > 0) {
  console.error('Flag validation failed:');
  errors.forEach(e => console.error(` - ${e}`));
  process.exit(1);
}
console.log('All flags valid');
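The validator above only checks that an expiry exists; it does not check how far in the future it is. The checklist later in this post caps release flags at 90 days, so a lifetime check per flag type is a natural extension. A minimal sketch under stated assumptions: only the 90-day release cap comes from this post, the other limits and the names `MAX_LIFETIME_DAYS` and `lifetimeError` are hypothetical.

```typescript
type FlagType = 'release' | 'experiment' | 'ops' | 'permission';

// Only the release cap (90 days) comes from the checklist; the rest are assumed values.
const MAX_LIFETIME_DAYS: Record<FlagType, number | null> = {
  release: 90,
  experiment: 90,
  ops: 90,
  permission: null, // may be permanent, reviewed manually
};

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns an error message, or null when the planned lifetime is acceptable.
function lifetimeError(
  key: string,
  type: FlagType,
  createdAt: string,
  expiresAt: string,
): string | null {
  const maxDays = MAX_LIFETIME_DAYS[type];
  if (maxDays === null) return null;
  const days = (new Date(expiresAt).getTime() - new Date(createdAt).getTime()) / DAY_MS;
  if (Number.isNaN(days)) return `${key}: createdAt or expiresAt is not a valid date`;
  return days > maxDays
    ? `${key}: ${type} flag lives ${Math.ceil(days)} days, max is ${maxDays}`
    : null;
}

console.log(lifetimeError('new_checkout_flow', 'release', '2024-01-01', '2024-06-01'));
```

Pushing the returned message onto the same `errors` array would make an over-long lifetime fail the CI gate like any other metadata problem.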
Step 2: Runtime Metrics
Tracking Flag Evaluations
import { Counter, Gauge } from 'prom-client';
// FlagClient and Context are the types of your flag SDK wrapper (not shown here)
const flagEvaluations = new Counter({
  name: 'feature_flag_evaluations_total',
  help: 'Total number of flag evaluations',
  labelNames: ['flag_key', 'variation']
});
const flagLastEvaluation = new Gauge({
  name: 'feature_flag_last_evaluation_timestamp',
  help: 'Unix timestamp (seconds) of last evaluation',
  labelNames: ['flag_key']
});
class InstrumentedFlagClient {
  constructor(private client: FlagClient) {}
  isEnabled(key: string, context?: Context): boolean {
    const result = this.client.isEnabled(key, context);
    flagEvaluations.inc({
      flag_key: key,
      variation: String(result)
    });
    // Prometheus time() returns seconds, so store seconds, not Date.now() milliseconds
    flagLastEvaluation.set({ flag_key: key }, Date.now() / 1000);
    return result;
  }
}
Prometheus Alert for Unused Flags
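# prometheus/rules/feature-flags.yml
groups:
  - name: feature-flags
    rules:
      - alert: StaleFeatureFlag
        # max by (flag_key) collapses per-instance series into one alert per flag.
        # The gauge must be exported in seconds for the comparison with time() to hold.
        expr: |
          (time() - max by (flag_key) (feature_flag_last_evaluation_timestamp)) > 2592000 # 30 days
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Feature flag {{ $labels.flag_key }} not evaluated for 30+ days"
          description: "Consider removing this flag from codebase"
      - alert: FlagAlwaysSameVariation
        # Count variations that were actually evaluated in the window. A counter series
        # keeps being scraped even when it no longer increases, so counting samples
        # alone would not tell the variations apart.
        expr: |
          count by (flag_key) (
            sum by (flag_key, variation) (increase(feature_flag_evaluations_total[30d])) > 0
          ) == 1
        for: 1h
        labels:
          severity: info
        annotations:
          summary: "Feature flag {{ $labels.flag_key }} always returns same value"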
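The two conditions behind these alerts are easy to get wrong by a unit (seconds versus milliseconds), so it can help to restate them as plain functions that are unit-testable without a Prometheus server. A minimal sketch; `FlagSample`, `isStale` and `singleVariation` are hypothetical names, and the 30-day threshold mirrors the rule above.

```typescript
interface FlagSample {
  lastEvaluationSeconds: number; // Unix seconds, same unit as Prometheus time()
  evaluationsByVariation: Record<string, number>; // evaluation counts within the window
}

const THIRTY_DAYS_SECONDS = 30 * 24 * 60 * 60; // 2592000, as in the alert rule

// Mirrors StaleFeatureFlag: no evaluation for more than 30 days.
function isStale(sample: FlagSample, nowSeconds: number): boolean {
  return nowSeconds - sample.lastEvaluationSeconds > THIRTY_DAYS_SECONDS;
}

// Mirrors FlagAlwaysSameVariation: returns the only variation seen, or null.
function singleVariation(sample: FlagSample): string | null {
  const seen = Object.entries(sample.evaluationsByVariation)
    .filter(([, count]) => count > 0)
    .map(([variation]) => variation);
  const [only] = seen;
  return seen.length === 1 && only !== undefined ? only : null;
}

const now = 1_700_000_000;
const sample: FlagSample = {
  lastEvaluationSeconds: now - 40 * 24 * 60 * 60,
  evaluationsByVariation: { true: 1250, false: 0 },
};
console.log(isStale(sample, now), singleVariation(sample)); // true 'true'
```

The value returned by `singleVariation` is also the natural candidate for the final value a removal codemod should hardcode.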
Step 3: Static Code Analysis
Scanner: Flags in Code vs Registry
// scripts/scan-flags.ts
import * as ts from 'typescript';
import * as glob from 'glob';
import * as fs from 'fs';
interface FlagUsage {
  flagKey: string;
  file: string;
  line: number;
}
function findFlagsInCode(sourceFile: ts.SourceFile): FlagUsage[] {
  const usages: FlagUsage[] = [];
  function visit(node: ts.Node) {
    // Match: featureFlags.isEnabled('flag_key')
    // Only string-literal keys are detected; dynamically built keys are missed.
    if (
      ts.isCallExpression(node) &&
      ts.isPropertyAccessExpression(node.expression) &&
      node.expression.name.text === 'isEnabled' &&
      node.arguments.length > 0 &&
      ts.isStringLiteral(node.arguments[0])
    ) {
      const flagKey = node.arguments[0].text;
      const { line } = sourceFile.getLineAndCharacterOfPosition(node.getStart());
      usages.push({
        flagKey,
        file: sourceFile.fileName,
        line: line + 1
      });
    }
    ts.forEachChild(node, visit);
  }
  visit(sourceFile);
  return usages;
}
async function main() {
  // 1. Scan all TS/JS files
  const files = glob.sync('src/**/*.{ts,tsx,js,jsx}');
  const allUsages: FlagUsage[] = [];
  for (const file of files) {
    const content = fs.readFileSync(file, 'utf-8');
    const sourceFile = ts.createSourceFile(
      file,
      content,
      ts.ScriptTarget.Latest,
      true
    );
    allUsages.push(...findFlagsInCode(sourceFile));
  }
  // 2. Load registry (LaunchDarkly API / local config)
  // fetchFlagRegistry is not shown here; it should resolve to an object keyed by flag key
  const registry = await fetchFlagRegistry();
  // 3. Compare
  const usedKeys = new Set(allUsages.map(u => u.flagKey));
  const registeredKeys = new Set(Object.keys(registry));
  const inCodeNotRegistry = [...usedKeys].filter(k => !registeredKeys.has(k));
  const inRegistryNotCode = [...registeredKeys].filter(k => !usedKeys.has(k));
  console.log('Flags in code but not in registry:', inCodeNotRegistry);
  console.log('Flags in registry but not in code:', inRegistryNotCode);
  // 4. Output for CI
  if (inCodeNotRegistry.length > 0) {
    console.error('ERROR: Undefined flags used in code');
    process.exit(1);
  }
  return { usages: allUsages, orphaned: inRegistryNotCode };
}
main().catch(err => {
  console.error(err);
  process.exit(1);
});
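The scanner returns a flat list of usages, while a reviewer usually wants to know how many places a single flag touches. A small sketch that groups the `FlagUsage` records by key; the helper name `groupUsagesByFlag` and the sample file paths are hypothetical.

```typescript
interface FlagUsage {
  flagKey: string;
  file: string;
  line: number;
}

// Groups usages by flag key as "file:line" strings, preserving scan order.
function groupUsagesByFlag(usages: FlagUsage[]): Map<string, string[]> {
  const byFlag = new Map<string, string[]>();
  for (const { flagKey, file, line } of usages) {
    const locations = byFlag.get(flagKey) ?? [];
    locations.push(`${file}:${line}`);
    byFlag.set(flagKey, locations);
  }
  return byFlag;
}

// Illustrative data only
const grouped = groupUsagesByFlag([
  { flagKey: 'new_checkout_flow', file: 'src/checkout/index.ts', line: 12 },
  { flagKey: 'payment_retry_logic', file: 'src/payments/retry.ts', line: 40 },
  { flagKey: 'new_checkout_flow', file: 'src/cart/summary.tsx', line: 7 },
]);
for (const [key, locations] of grouped) {
  console.log(`${key}: ${locations.length} usage(s) -> ${locations.join(', ')}`);
}
```

A flag with a single usage is a cheap removal candidate; one with dozens of call sites probably deserves a manual review before any codemod runs.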
Step 4: Automatic Removal PR
Codemod Script
// scripts/remove-flag.ts
import * as ts from 'typescript';
import * as fs from 'fs';
import { execFileSync } from 'child_process';
interface RemovalConfig {
  flagKey: string;
  finalValue: boolean; // Value to replace with
  dryRun?: boolean;
}
function findFilesMentioning(flagKey: string): string[] {
  try {
    // execFileSync passes the key as a plain argument (no shell interpolation);
    // -F matches it as a fixed string rather than a regex
    return execFileSync('grep', ['-rlF', '--', flagKey, 'src/'], { encoding: 'utf-8' })
      .split('\n')
      .filter(Boolean);
  } catch (err: any) {
    if (err.status === 1) return []; // grep exits with 1 when nothing matches
    throw err;
  }
}
function removeFlag(config: RemovalConfig) {
  const { flagKey, finalValue, dryRun } = config;
  // Find all occurrences
  const files = findFilesMentioning(flagKey);
  for (const file of files) {
    const content = fs.readFileSync(file, 'utf-8');
    const sourceFile = ts.createSourceFile(
      file,
      content,
      ts.ScriptTarget.Latest,
      true
    );
    const transformer: ts.TransformerFactory<ts.SourceFile> = (context) => {
      return (rootNode) => {
        function visit(node: ts.Node): ts.Node {
          // Replace featureFlags.isEnabled('key') → true/false
          if (
            ts.isCallExpression(node) &&
            ts.isPropertyAccessExpression(node.expression) &&
            node.expression.name.text === 'isEnabled' &&
            node.arguments.length > 0 &&
            ts.isStringLiteral(node.arguments[0]) &&
            node.arguments[0].text === flagKey
          ) {
            return finalValue
              ? ts.factory.createTrue()
              : ts.factory.createFalse();
          }
          return ts.visitEachChild(node, visit, context);
        }
        return ts.visitNode(rootNode, visit) as ts.SourceFile;
      };
    };
    const result = ts.transform(sourceFile, [transformer]);
    // The printer re-emits the whole file, so original formatting and blank lines
    // are not preserved; run your formatter afterwards
    const printer = ts.createPrinter();
    const newContent = printer.printFile(result.transformed[0]);
    if (dryRun) {
      console.log(`Would update ${file}`);
    } else {
      fs.writeFileSync(file, newContent);
    }
    result.dispose();
  }
}
// Cleanup after replacement
function removeDeadCode(files: string[]) {
  // After flag replacement the code contains:
  //   if (true) { ... }   → should become ...
  //   if (false) { ... }  → should be removed
  // ESLint can report these (no-constant-condition), but --fix does not unwrap them.
  // This step only applies the autofixable rules; collapsing the branches needs
  // a further transformer pass or manual cleanup in the removal PR.
  for (const file of files) {
    execFileSync('npx', ['eslint', '--fix', file], { stdio: 'inherit' });
  }
}
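The codemod above defines `removeFlag` but has no entry point, while a CI job would typically call it with `--key` and `--value` arguments. A minimal sketch of the argument parsing, assuming flag keys consist of letters, digits, `_`, `.` and `-`; the function name `parseRemovalArgs` and the `--dry-run` switch are hypothetical.

```typescript
interface RemovalConfig {
  flagKey: string;
  finalValue: boolean;
  dryRun?: boolean;
}

// Assumed key format; rejecting anything else keeps odd input out of later tooling.
const SAFE_KEY = /^[A-Za-z0-9_.-]+$/;

function parseRemovalArgs(argv: string[]): RemovalConfig {
  const args = new Map<string, string>();
  for (const arg of argv) {
    // Accepts --name=value, or a bare --name which is treated as "true"
    const match = /^--([^=]+)(?:=(.*))?$/.exec(arg);
    if (!match) continue;
    const [, name, value] = match;
    if (name !== undefined) args.set(name, value ?? 'true');
  }
  const flagKey = args.get('key');
  if (flagKey === undefined || !SAFE_KEY.test(flagKey)) {
    throw new Error('--key is required and must match ' + SAFE_KEY.source);
  }
  const rawValue = args.get('value');
  if (rawValue !== 'true' && rawValue !== 'false') {
    throw new Error('--value must be "true" or "false"');
  }
  return {
    flagKey,
    finalValue: rawValue === 'true',
    dryRun: args.get('dry-run') === 'true',
  };
}

console.log(parseRemovalArgs(['--key=new_checkout_flow', '--value=true', '--dry-run']));
```

In the script itself this would be wired up as `removeFlag(parseRemovalArgs(process.argv.slice(2)))`. Requiring an explicit `--value` is deliberate: guessing the final value is exactly the kind of mistake a removal tool should not make silently.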
GitHub Action for Auto-Removal
# .github/workflows/stale-flag-removal.yml
name: Stale Flag Removal
on:
  schedule:
    - cron: '0 9 * * 1' # Every Monday 9:00 UTC
# The job pushes a branch and opens a PR, so the token needs write access
permissions:
  contents: write
  pull-requests: write
jobs:
  find-stale-flags:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: npm ci
      - name: Find expired flags
        id: stale
        run: |
          # Assumes the scripts are available as compiled JS
          node scripts/find-stale-flags.js > stale-flags.json
          # Take the first stale flag only: one flag per PR keeps reviews small
          echo "flags=$(jq -r '.[0].key // empty' stale-flags.json)" >> "$GITHUB_OUTPUT"
      - name: Create removal PR
        if: steps.stale.outputs.flags != ''
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FLAG_KEY: ${{ steps.stale.outputs.flags }}
        run: |
          BRANCH="remove-flag-${FLAG_KEY}"
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git checkout -b "$BRANCH"
          # Run codemod
          node scripts/remove-flag.js --key="$FLAG_KEY" --value=true
          # Commit and push (only src/, so stale-flags.json is not committed)
          git add src/
          git commit -m "chore: remove stale feature flag ${FLAG_KEY}"
          git push origin "$BRANCH"
          # Create PR
          gh pr create \
            --title "Remove stale feature flag: ${FLAG_KEY}" \
            --body "$(cat <<EOF
          ## Automatic Stale Flag Removal
          Flag \`${FLAG_KEY}\` has been identified as stale:
          - Last evaluation: > 30 days ago
          - Expiry date: passed
          - Always returns: true
          This PR removes the flag and replaces all usages with \`true\`.
          **Review checklist:**
          - [ ] Dead code was correctly removed
          - [ ] No side effects from removal
          - [ ] Tests pass
          EOF
          )"
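The workflow calls `scripts/find-stale-flags.js`, which is not shown in this post. A minimal sketch of its core logic, assuming a registry object keyed by flag key with `owner` and `expiresAt` fields as in the flag schema; the names `findExpiredFlags`, `RegistryEntry` and `StaleFlag` are hypothetical. The output is an array of objects with a `key` field, which is the shape the `jq` expression in the workflow reads.

```typescript
interface RegistryEntry {
  owner: string;
  expiresAt: string; // ISO date string, as stored in the JSON registry
}

interface StaleFlag {
  key: string;
  owner: string;
  expiredDaysAgo: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns flags whose expiry has passed, longest-overdue first.
function findExpiredFlags(
  registry: Record<string, RegistryEntry>,
  now: Date,
): StaleFlag[] {
  return Object.entries(registry)
    .map(([key, flag]) => ({
      key,
      owner: flag.owner,
      overdueMs: now.getTime() - new Date(flag.expiresAt).getTime(),
    }))
    // Invalid dates produce NaN, which fails this comparison and is skipped
    .filter(flag => flag.overdueMs > 0)
    .sort((a, b) => b.overdueMs - a.overdueMs)
    .map(({ key, owner, overdueMs }) => ({
      key,
      owner,
      expiredDaysAgo: Math.floor(overdueMs / DAY_MS),
    }));
}

// Illustrative registry only
const expired = findExpiredFlags(
  {
    new_checkout_flow: { owner: '@checkout-team', expiresAt: '2024-01-01' },
    payment_retry_logic: { owner: '@payments', expiresAt: '2024-03-01' },
    search_v3: { owner: '@search', expiresAt: '2025-01-01' },
  },
  new Date('2024-04-01T00:00:00Z'),
);
console.log(JSON.stringify(expired, null, 2));
```

Sorting by how long a flag has been overdue means the weekly job always works on the oldest offender first. Expiry alone does not say which value the flag should collapse to, so the hardcoded `--value=true` in the workflow still needs a human check or a lookup against the evaluation metrics.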
Dashboard: Flag Health Overview
// app/api/flag-health/route.ts
// (an exported GET handler is the Next.js App Router convention; path assumed from the handler shape)
import { prisma } from '@/lib/prisma';
const DAY_MS = 86400000;
export async function GET() {
  const flags = await prisma.featureFlag.findMany({
    // Newest first, so evaluations[0] really is the latest evaluation
    include: { evaluations: { orderBy: { timestamp: 'desc' } } }
  });
  const health = flags.map(flag => {
    const lastEval = flag.evaluations[0]?.timestamp;
    const daysSinceEval = lastEval
      ? Math.floor((Date.now() - lastEval.getTime()) / DAY_MS)
      : null;
    const daysUntilExpiry = Math.floor(
      (new Date(flag.expiresAt).getTime() - Date.now()) / DAY_MS
    );
    // Count only the last 30 days; the relation itself holds the full history
    const recentEvaluations = flag.evaluations.filter(
      e => Date.now() - e.timestamp.getTime() <= 30 * DAY_MS
    );
    return {
      key: flag.key,
      owner: flag.owner,
      status: getStatus(daysSinceEval, daysUntilExpiry),
      lastEvaluation: lastEval,
      expiresAt: flag.expiresAt,
      evaluationCount30d: recentEvaluations.length,
      alwaysSameValue: new Set(flag.evaluations.map(e => e.result)).size === 1
    };
  });
  return Response.json(health);
}
// Order matters: an expired flag is reported as expired even if it is also stale
function getStatus(daysSinceEval: number | null, daysUntilExpiry: number) {
  if (daysUntilExpiry < 0) return 'expired';
  if (daysSinceEval === null) return 'never-used';
  if (daysSinceEval > 30) return 'stale';
  if (daysUntilExpiry < 14) return 'expiring-soon';
  return 'healthy';
}
Production Checklist
## Feature Flag Hygiene Checklist
### When creating a flag
- [ ] Owner is defined (team or individual)
- [ ] Expiry date is set (max 90 days for release flags)
- [ ] Description explains WHY the flag exists
- [ ] Link to JIRA/Linear ticket
### Weekly review
- [ ] Dashboard check: no expired flags
- [ ] Metrics check: no unused flags (30d)
- [ ] Ownership check: no orphaned flags
### When removing
- [ ] Codemod replaces flag with correct value
- [ ] Dead code is removed
- [ ] Tests pass
- [ ] Flag is removed from registry
- [ ] PR is reviewed
### Monitoring
- [ ] Alert on expired flags
- [ ] Alert on unused flags (30d)
- [ ] Metric: count of active flags
- [ ] Metric: average age of flags
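The last two monitoring items (count of active flags, average age) can be derived directly from the `createdAt` field of the registry. A minimal sketch; the name `flagAgeMetrics` is hypothetical, and how the numbers are exported (Prometheus gauge, dashboard field) is left open.

```typescript
interface FlagAgeMetrics {
  activeFlags: number;
  averageAgeDays: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Computes the flag count and the mean age in days from creation timestamps.
function flagAgeMetrics(createdAt: Date[], now: Date): FlagAgeMetrics {
  if (createdAt.length === 0) {
    return { activeFlags: 0, averageAgeDays: 0 };
  }
  const totalDays = createdAt.reduce(
    (sum, created) => sum + (now.getTime() - created.getTime()) / DAY_MS,
    0,
  );
  return {
    activeFlags: createdAt.length,
    averageAgeDays: totalDays / createdAt.length,
  };
}

const metrics = flagAgeMetrics(
  [new Date('2024-03-21T00:00:00Z'), new Date('2024-03-01T00:00:00Z')],
  new Date('2024-03-31T00:00:00Z'),
);
console.log(metrics); // { activeFlags: 2, averageAgeDays: 20 }
```

A rising average age is an early sign that flags are being added faster than they are removed, even while the absolute count still looks manageable.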
Conclusion
Feature flags are a powerful tool, but they require discipline. Key principles:
- Ownership - every flag has an owner
- Expiry - every flag has an expiration date
- Metrics - track runtime usage
- Automation - automatic removal PR for stale flags
- Visibility - dashboard for the whole team
Without these measures, feature flags become technical debt. With them, they’re a powerful tool for safe delivery.
FAQ
What is the ideal feature flag lifespan?
- Release flags: 2-4 weeks (remove after successful rollout)
- Experiment flags: based on A/B test duration + buffer
- Ops flags: as needed, but review every 3 months
- Permission flags: can be permanent (but review)
What if I need the flag longer?
Extend the expiry with a clear justification. If a flag lives longer than 6 months, it is probably configuration or a permission rather than a release toggle, and should be reclassified accordingly.
How to handle flags in tests?
Mock the flag client in unit tests. In integration tests, use test-specific flag values. Never hardcode production flag values in tests.
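One way to mock the client without a mocking library is a small in-memory fake. A minimal sketch, assuming the code under test receives the flag client as a dependency; the `FlagClient` interface here is a simplified stand-in (no context argument), and `FakeFlagClient` and `checkoutVariant` are hypothetical names.

```typescript
// Simplified stand-in for the real flag client interface
interface FlagClient {
  isEnabled(key: string): boolean;
}

class FakeFlagClient implements FlagClient {
  private values = new Map<string, boolean>();

  constructor(initial: Record<string, boolean> = {}) {
    for (const [key, value] of Object.entries(initial)) {
      this.values.set(key, value);
    }
  }

  set(key: string, value: boolean): this {
    this.values.set(key, value);
    return this;
  }

  isEnabled(key: string): boolean {
    const value = this.values.get(key);
    if (value === undefined) {
      // Failing loudly forces every test to declare the flags it depends on
      throw new Error(`Flag "${key}" is not configured in this test`);
    }
    return value;
  }
}

// Example code under test: the client is injected rather than imported globally
function checkoutVariant(flags: FlagClient): 'new' | 'legacy' {
  return flags.isEnabled('new_checkout_flow') ? 'new' : 'legacy';
}

const flags = new FakeFlagClient({ new_checkout_flow: true });
console.log(checkoutVariant(flags)); // 'new'
console.log(checkoutVariant(flags.set('new_checkout_flow', false))); // 'legacy'
```

Throwing on an unconfigured flag is a design choice: a default of `false` would let tests pass without ever exercising the flagged path.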
Related Articles
- CI/CD for Monorepo - Integrating flag checks into pipeline
- Architecture as Code - ADR for feature flag policies