Feature Flag Guide Skill

Produce a complete feature flag management guide for a service or team — covering how flags are named and categorised, how to create and roll out a flag safely, what to monitor during rollout, when and how to clean up flags, and who is responsible for each stage. Feature flags without discipline become permanent technical debt. This guide gives the team a repeatable process so flags are created intentionally, rolled out safely, and removed when done.

Required Inputs

Ask for these if not already provided:

Service or team name — scope of the guide
Feature flag platform — LaunchDarkly, Split, Unleash, Flagsmith, Flipt, or a custom/in-house solution
Flag being documented (if writing a per-flag guide) or "general guide" (if writing team-wide policy)
Rollout constraints — any compliance, data privacy, or contractual constraints on who can see a feature (e.g. HIPAA, EU-only, enterprise customers only)

Output Format

Feature Flag Management Guide: [Service / Team Name]

Team: [Team name] | Platform: [LaunchDarkly / Split / Unleash / Custom]
Document owner: [Name] | Last updated: [Date]
Review cycle: Quarterly, and whenever the flag platform changes

1. Flag Taxonomy

Every flag belongs to exactly one category. The category determines default behaviour, who can enable it in production, and when it must be cleaned up.

Type	Purpose	Default state	Production gate	Max lifetime
Release flag	Controls rollout of a new feature — decouples deploy from release	Off	Tech lead approval	90 days from feature launch
Experiment flag	A/B or multivariate test — measures impact of a change	Off (control group)	Product + tech lead	Duration of experiment + 30 days
Ops flag	Operational control — circuit breaker, kill switch, throttle	On (normal behaviour)	On-call engineer can toggle	Indefinite (review annually)
Permission flag	Gates access by user segment, tier, or region	Off (restricted)	Product + Account owner	Indefinite (review annually)

When in doubt: If the flag is temporary (tied to a specific feature launch), it is a Release flag. If it will exist forever as a control knob, it is an Ops flag.
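The taxonomy can also be kept as data, so tooling can look up a flag's default state and compute its cleanup deadline. The following is a minimal sketch, not part of any flag platform: the names `FlagCategory`, `TAXONOMY`, and `cleanup_deadline` are hypothetical, and the values are transcribed from the table above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class FlagCategory:
    prefix: str                        # the `type` segment used in flag names
    default_on: bool                   # default state from the taxonomy table
    max_lifetime_days: Optional[int]   # None means indefinite, reviewed annually


# Hypothetical encoding of the taxonomy table.
TAXONOMY = {
    "release": FlagCategory("release", default_on=False, max_lifetime_days=90),
    "experiment": FlagCategory("exp", default_on=False, max_lifetime_days=30),
    "ops": FlagCategory("ops", default_on=True, max_lifetime_days=None),
    "permission": FlagCategory("perm", default_on=False, max_lifetime_days=None),
}


def cleanup_deadline(category: str, anchor: date) -> Optional[date]:
    """Return the latest cleanup date, or None for indefinite flags.

    `anchor` is the feature launch date for release flags and the
    experiment end date for experiment flags.
    """
    lifetime = TAXONOMY[category].max_lifetime_days
    return None if lifetime is None else anchor + timedelta(days=lifetime)
```

For example, `cleanup_deadline("release", launch_date)` gives the date 90 days after launch, while Ops and Permission flags return `None` and fall under the annual review instead.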

2. Flag Naming Convention

All flags must follow this naming scheme:

[type]-[service]-[feature-description]

Segment	Values	Example
type	`release`, `exp`, `ops`, `perm`	`release`
service	Short service identifier, lowercase, hyphenated	`payments`
feature-description	Kebab-case description, max 5 words	`new-checkout-flow`

Full examples:

release-payments-new-checkout-flow — release flag for a new checkout feature in the payments service
exp-search-personalized-ranking — experiment on personalized search ranking
ops-api-rate-limit-override — operational flag to override API rate limits
perm-dashboard-beta-users-only — permission flag gating dashboard for beta users

Do not:

Use ticket numbers in flag names (release-JIRA-1234 → not searchable or self-describing)
Use dates in flag names (release-dark-mode-jan-2024 → flags outlive their dates)
Use vague names (release-new-thing → not useful when you have 50 flags)
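The naming rules above can be checked automatically, for example in a pre-merge script. This is a rough sketch under stated assumptions: `validate_flag_name` is a hypothetical helper, the caller supplies the set of known service identifiers (needed because service names may themselves be hyphenated), and the date check only catches four-digit years. Vague names such as `release-new-thing` still need human review.

```python
import re

FLAG_TYPES = ("release", "exp", "ops", "perm")
TOKEN = re.compile(r"^[a-z][a-z0-9]*$")   # lowercase word starting with a letter
YEAR = re.compile(r"^(19|20)\d{2}$")      # crude check for dates in names


def validate_flag_name(name: str, services: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the name is acceptable."""
    flag_type, _, rest = name.partition("-")
    if flag_type not in FLAG_TYPES:
        return [f"unknown type prefix {flag_type!r}"]
    # Prefer the longest matching service so hyphenated services resolve correctly.
    service = next(
        (s for s in sorted(services, key=len, reverse=True)
         if rest.startswith(s + "-")),
        None,
    )
    if service is None:
        return ["service segment is not a known service"]
    words = rest[len(service) + 1:].split("-")
    problems = []
    if len(words) > 5:
        problems.append("description must be at most 5 words")
    for word in words:
        if YEAR.match(word):
            problems.append(f"date-like word {word!r}")
        elif not TOKEN.match(word):
            problems.append(f"word {word!r} is not lowercase kebab-case")
    return problems
```

With `services = {"payments", "search"}`, the name `release-payments-new-checkout-flow` passes, while `release-JIRA-1234` is rejected because `JIRA` is not a known service.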

3. Flag Creation Checklist

Complete every item before creating a flag in the production environment.

Before creating the flag:

Flag type determined from taxonomy (Section 1)
Flag name follows naming convention (Section 2)
Flag owner assigned — one named engineer responsible for cleanup
Cleanup date set in the flag description field (for Release and Experiment flags)
Rollout strategy defined — see Section 4
Monitoring plan defined — see Section 5
Code review approved with flag guard in place

Flag description field (required):

Type: [Release / Experiment / Ops / Permission]
Owner: [Name]
Linked ticket: [JIRA-XXXX or GitHub issue URL]
Purpose: [One sentence — what this flag controls]
Cleanup by: [Date — required for Release and Experiment flags; "Annual review" for Ops/Permission]
Rollout plan: [Link to this document or inline summary]
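If the platform exposes the description field through an API or export, the required fields can be checked mechanically. The sketch below assumes the description is stored as plain `Key: value` lines exactly as in the template above; `missing_fields` is a hypothetical helper, not a platform feature.

```python
REQUIRED_FIELDS = (
    "Type", "Owner", "Linked ticket", "Purpose", "Cleanup by", "Rollout plan",
)


def parse_description(text: str) -> dict[str, str]:
    """Parse 'Key: value' lines; later duplicates overwrite earlier ones."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def missing_fields(text: str) -> list[str]:
    """Return required fields that are absent, empty, or still a [placeholder]."""
    fields = parse_description(text)
    return [
        name for name in REQUIRED_FIELDS
        if not fields.get(name) or fields[name].startswith("[")
    ]
```

A flag whose description returns a non-empty list from `missing_fields` is not ready to be created in production.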

Code requirements:

# Good — behaviour is clear when flag is off, and cleanup is obvious
if flag_client.is_enabled("release-[service]-[feature]", user_context):
    return new_feature_handler(request)
else:
    return existing_handler(request)

# Bad — nested flags, ternaries, and implicit defaults make cleanup error-prone
result = new_handler() if (f1 and not f2) or f3 else old_handler()

4. Rollout Strategy

Decision Tree

Use this decision tree to pick the right rollout strategy for a Release or Experiment flag:

Is the change reversible without a deploy?
├── No → Use an Ops flag with manual enable, not a percentage rollout
└── Yes → Continue

Is there a user-level identifier available (user ID, session ID)?
├── No → Use server-side percentage (stateless, but inconsistent per user)
└── Yes → Use user-based percentage (consistent experience per user) ← preferred

Is the change risky (touches payments, auth, or data writes)?
├── Yes → Start at 1% → 5% → 25% → 50% → 100%, with 24-hour holds
└── No → Start at 10% → 50% → 100%, with 4-hour holds

Does the change affect specific customer tiers or geographies?
├── Yes → Use segment-based targeting, not percentage rollout
└── No → Use percentage rollout
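The decision tree can be expressed as a small function, which makes the outcome easy to record in the flag's rollout plan. This is one possible reading of the tree, not a definitive one: it assumes the questions are answered in order, that segment-based targeting replaces the percentage ladder, and that hold durations still follow the risk answer. `RolloutPlan` and `choose_rollout` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutPlan:
    strategy: str
    stages: tuple[int, ...]   # percentage ladder; empty when not percentage-based
    hold_hours: int           # hold between stages; 0 when there are no stages


def choose_rollout(
    reversible_without_deploy: bool,
    has_user_id: bool,
    risky: bool,
    segment_restricted: bool,
) -> RolloutPlan:
    # Irreversible changes bypass percentage rollouts entirely.
    if not reversible_without_deploy:
        return RolloutPlan("ops flag, manual enable", (), 0)

    hold_hours = 24 if risky else 4
    # Assumption: segment targeting replaces the percentage ladder.
    if segment_restricted:
        return RolloutPlan("segment-based targeting", (), hold_hours)

    strategy = "user-based percentage" if has_user_id else "server-side percentage"
    stages = (1, 5, 25, 50, 100) if risky else (10, 50, 100)
    return RolloutPlan(strategy, stages, hold_hours)
```

For a risky change with a user ID and no segment restriction, `choose_rollout(True, True, True, False)` yields the 1% → 5% → 25% → 50% → 100% ladder with 24-hour holds.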

Rollout Stages

Stage	Percentage	Hold duration	Pass criteria before advancing
Canary	1%	24 hours	Error rate within SLO, no P1 incidents
Early rollout	5–10%	24 hours	Error rate and latency match control group
Partial rollout	25–50%	24–48 hours	Business metrics not degraded vs. control
Majority	75%	24 hours	Final check — no regressions
Full rollout	100%	48 hours	Stable — schedule cleanup

Do not skip stages for Release flags in production. A faster rollout is not worth a production incident.
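User-based percentage rollout only gives a consistent experience if the same user lands in the same bucket at every stage. Hosted platforms handle this internally. For a custom or in-house solution, one common approach is to hash the flag key and user ID into a stable bucket. The sketch below illustrates that idea and is not the algorithm of any named platform; `bucket` and `in_rollout` are hypothetical names.

```python
import hashlib


def bucket(flag_key: str, user_id: str) -> float:
    """Map a (flag, user) pair to a stable value in [0, 100)."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then reduce to 0.00 through 99.99.
    return int.from_bytes(digest[:8], "big") % 10_000 / 100.0


def in_rollout(flag_key: str, user_id: str, percentage: float) -> bool:
    """True if the user falls inside the current rollout percentage."""
    return bucket(flag_key, user_id) < percentage
```

Because the bucket depends only on the flag key and user ID, a user who is enabled at 5% stays enabled at 25% and beyond. Including the flag key in the hash keeps the same users from always being first in line for every flag.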

Segment-Based Targeting

Use segment targeting when the rollout must be restricted:

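# Illustrative targeting rules in a LaunchDarkly-like shape, not literal platform syntax; adapt for your platform
# The two rules are separate, so a user matching either one is served "on"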
targeting_rules:
  - clause:
      attribute: "subscription_tier"
      operator: "in"
      values: ["enterprise", "team"]
    serve: "on"
  - clause:
      attribute: "country"
      operator: "in"
      values: ["US", "CA", "GB"]
    serve: "on"
  default: "off"
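To make the semantics of the rules above concrete, here is a minimal sketch of how such rules are typically evaluated: in order, first match wins, with a fallback to the default. It assumes the same attributes and values as the example and is not any platform's SDK; `RULES` and `evaluate` are hypothetical names.

```python
# Hypothetical in-memory form of the targeting rules shown above.
RULES = [
    {"attribute": "subscription_tier", "values": {"enterprise", "team"}, "serve": "on"},
    {"attribute": "country", "values": {"US", "CA", "GB"}, "serve": "on"},
]
DEFAULT = "off"


def evaluate(user: dict, rules: list = RULES, default: str = DEFAULT) -> str:
    """Return the variation for a user: first matching rule wins, else default."""
    for rule in rules:
        # A missing attribute yields None, which never matches a value set.
        if user.get(rule["attribute"]) in rule["values"]:
            return rule["serve"]
    return default
```

Under this reading, a free-tier user in the US is served "on" because the country rule matches. If the rollout must require both conditions (for example, enterprise customers in specific countries only), the clauses need to be combined into a single rule instead.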

5. Monitoring Requirements

Every flag that is not at 0% or 100% rollout requires active monitoring. Do not roll out a flag and walk away.

Required Metrics Per Flag

Metric	What to compare	Alert threshold
Error rate	Flag-on cohort vs. flag-off cohort	>2× baseline error rate in flag-on group
p99 latency	Flag-on vs. flag-off	>20% higher latency in flag-on group
[Primary business metric]	Flag-on vs. flag-off	>5% degradation in flag-on group
[Conversion / completion rate]	Flag-on vs. flag-off	>2% drop in flag-on group
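The alert thresholds in the table can be sketched as a comparison between the two cohorts. The code below makes several assumptions that should be checked against your own definitions: the flag-off cohort is treated as the baseline, the business and conversion thresholds are read as relative drops, and higher values are better for both. `CohortMetrics` and `breached_thresholds` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortMetrics:
    error_rate: float        # fraction of requests that failed
    p99_latency_ms: float
    business_metric: float   # assumed: higher is better
    conversion_rate: float   # assumed: higher is better


def breached_thresholds(on: CohortMetrics, off: CohortMetrics) -> list[str]:
    """Return the names of metrics where the flag-on cohort crosses a threshold."""
    alerts = []
    if on.error_rate > 2 * off.error_rate:              # more than 2x baseline
        alerts.append("error_rate")
    if on.p99_latency_ms > 1.20 * off.p99_latency_ms:   # more than 20% slower
        alerts.append("p99_latency")
    if on.business_metric < 0.95 * off.business_metric: # more than 5% degradation
        alerts.append("business_metric")
    if on.conversion_rate < 0.98 * off.conversion_rate: # more than 2% relative drop
        alerts.append("conversion_rate")
    return alerts
```

An empty list means no alert threshold is crossed. At small rollout percentages the flag-on cohort may be too small for these ratios to be meaningful, so treat early results with caution.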

Setting up split metric monitoring in [LaunchDarkly / Split / Datadog]:

1. Navigate to the flag → Metrics tab
2. Add metric: [primary business metric]
3. Add metric: error_rate (service-level)
4. Add metric: p99_latency (endpoint-level)
5. Set alert: notify [flag owner] in Slack #[team-channel] if metric degrades by [threshold]
6. Set experiment duration: [N days] if this is an Experiment flag

Guardrail Metrics

These metrics must never degrade, regardless of what the primary metric shows. If a guardrail is breached, roll back immediately — do not wait for investigation.

Error rate exceeds SLO threshold ([X]%)
p99 latency exceeds SLO threshold ([Y] ms)
[Service-specific guardrail — e.g. payment failure rate, auth failure rate]
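Unlike the cohort comparisons, guardrails are absolute limits, so the check is a direct comparison against the SLO thresholds. The sketch below assumes metrics and limits arrive as plain dictionaries keyed by metric name. It treats a missing metric as a breach, which is a design choice (no data means the guardrail cannot be confirmed) rather than something the guide prescribes. `guardrail_breaches` and `next_action` are hypothetical names.

```python
def guardrail_breaches(
    observed: dict[str, float], limits: dict[str, float]
) -> list[str]:
    """Return guardrails whose observed value exceeds its limit."""
    return [
        name for name, limit in limits.items()
        # Missing data counts as a breach: infinity exceeds any limit.
        if observed.get(name, float("inf")) > limit
    ]


def next_action(observed: dict[str, float], limits: dict[str, float]) -> str:
    """Roll back on any breach, regardless of how the primary metric looks."""
    return "rollback" if guardrail_breaches(observed, limits) else "continue"
```

The limits would hold the [X]% and [Y] ms values from the list above, plus any service-specific guardrail.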

Immediate rollback command if guardrail is breached:

# [LaunchDarkly CLI]
ld-cli flag update [project-key] [flag-key] --default-variat
