Load Testing Plan Skill

Produce a complete load and performance testing plan for a service — covering test objectives, scenario definitions, tooling configuration, success thresholds, and CI integration. A good load testing plan eliminates ambiguity about what "performance is acceptable" means, so engineers can run tests and get a pass/fail answer without having to interpret raw numbers themselves.

Required Inputs

Ask for these if not already provided:

Service name and key endpoints — which endpoints are under test (path, method, typical request/response shape)
Current traffic baseline — current requests/sec, p50/p99 latency, error rate under normal load
Peak traffic expectations — expected peak RPS (e.g. 10× baseline for flash sales, or seasonality peak)
SLO targets — latency SLOs (p99 < X ms), error rate SLO (< Y%), availability target
Preferred testing tool — k6, Locust, JMeter, Gatling, or no preference
Test environment availability — dedicated load test environment, staging, or production (with traffic shaping)

Output Format

Load Testing Plan: [Service Name]

Author: [Name] | Team: [Team name]
Date: [Date] | Review cycle: Before each major release and quarterly
Testing tool: [k6 / Locust / JMeter / Gatling]
Test environment: [Environment name and URL]

1. Objectives and Scope

What we are testing: [Service name] handles [describe function — e.g. "user authentication requests from the mobile and web clients"]. This plan validates that the service meets its SLOs under expected and elevated traffic conditions.

In scope:

[Endpoint 1: METHOD /path — description]
[Endpoint 2: METHOD /path — description]
[Endpoint 3: METHOD /path — description]

Out of scope:

[Any endpoints explicitly excluded and why — e.g. "admin APIs — low traffic, excluded from load test"]
[Third-party integrations that cannot be load-tested — mock them instead]

2. Performance Targets (Success Criteria)

Every scenario has explicit pass/fail thresholds. A test run FAILS if any threshold is breached.

Metric	Baseline scenario	Stress scenario	Spike scenario	Soak scenario
p50 latency	< [X] ms	< [X × 1.5] ms	< [X × 2] ms	< [X] ms
p95 latency	< [Y] ms	< [Y × 1.5] ms	< [Y × 2] ms	< [Y] ms
p99 latency	< [Z] ms	< [Z × 2] ms	< [Z × 3] ms	< [Z] ms
Error rate	< [0.1]%	< [1]%	< [2]%	< [0.1]%
Throughput	≥ [N] RPS	≥ [N × 3] RPS	N/A	≥ [N] RPS
Failed requests	0 (5xx)	< [threshold]	< [threshold]	0 (5xx)

SLO reference: These thresholds are derived from the service SLOs — p99 < [Z ms], error rate < [0.1]%, availability [99.9]%.
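
The multipliers in the targets table can be applied mechanically once the baseline values X, Y, and Z are known. The sketch below is one way to do that, assuming the thresholds are consumed as k6-style expression strings. The function name `latencyThresholds` and the millisecond values in the example are hypothetical and used only for illustration.

```javascript
// Latency multipliers copied from the targets table above.
const LATENCY_MULTIPLIERS = {
  baseline: { p50: 1, p95: 1, p99: 1 },
  stress: { p50: 1.5, p95: 1.5, p99: 2 },
  spike: { p50: 2, p95: 2, p99: 3 },
  soak: { p50: 1, p95: 1, p99: 1 },
};

// Build threshold expressions for one scenario from the baseline targets.
// `baselineMs` holds the X (p50), Y (p95), and Z (p99) values in milliseconds.
function latencyThresholds(scenario, baselineMs) {
  const m = LATENCY_MULTIPLIERS[scenario];
  if (!m) throw new Error(`unknown scenario: ${scenario}`);
  return [
    `p(50)<${baselineMs.p50 * m.p50}`,
    `p(95)<${baselineMs.p95 * m.p95}`,
    `p(99)<${baselineMs.p99 * m.p99}`,
  ];
}

// Hypothetical baseline targets, for illustration only.
console.log(latencyThresholds('stress', { p50: 100, p95: 200, p99: 400 }));
```

The returned array could be assigned to the `http_req_duration` entry of a k6 `thresholds` object, so each scenario fails automatically when its column of the table is breached.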

3. Traffic Model

Baseline traffic (current production):

Average RPS: [N] req/sec
Peak RPS (observed): [N] req/sec
Request distribution by endpoint:
- [Endpoint 1]: [X]% of traffic
- [Endpoint 2]: [Y]% of traffic
- [Endpoint 3]: [Z]% of traffic

Simulated user behaviour:

Think time between requests: [X–Y] seconds (randomised)
Session duration: [N] minutes average
Authenticated vs anonymous ratio: [X]%/[Y]%
Geographic distribution: [Region 1 X]%, [Region 2 Y]%
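
The traffic model above could be turned into test behaviour roughly as follows. This is a minimal sketch, not part of any tool's API: `pickEndpoint` and `thinkTime` are hypothetical helpers, and the endpoint names and percentages are placeholders. The random source is injectable so the selection can be checked deterministically.

```javascript
// Pick an endpoint according to its share of traffic.
// `mix` maps endpoint name -> weight (for example, a percentage).
function pickEndpoint(mix, rand = Math.random) {
  const entries = Object.entries(mix);
  const total = entries.reduce((sum, [, weight]) => sum + weight, 0);
  let r = rand() * total;
  for (const [name, weight] of entries) {
    if (r < weight) return name;
    r -= weight;
  }
  // Guard against floating-point rounding at the upper edge.
  return entries[entries.length - 1][0];
}

// Randomised think time in seconds, uniform between min and max.
function thinkTime(minSec, maxSec, rand = Math.random) {
  return minSec + rand() * (maxSec - minSec);
}

// Hypothetical request mix and think-time range.
const mix = { 'GET /items': 70, 'GET /items/{id}': 25, 'POST /orders': 5 };
console.log(pickEndpoint(mix), thinkTime(1, 3).toFixed(2));
```

In a k6 script, the result of `thinkTime` would typically be passed to `sleep()` at the end of each iteration.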

4. Test Scenarios

Scenario 1: Baseline (Steady-State)

Purpose: Confirm the service performs acceptably under normal production load.
Duration: 10 minutes
Load profile: Ramp to [N] RPS over 2 minutes, hold for 8 minutes.
Concurrency: [N] virtual users

Pass criteria: All thresholds in the Baseline column of the targets table above.
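
The virtual-user count needed to reach a target RPS can be estimated with Little's Law: concurrency is roughly the arrival rate multiplied by the time each request cycle takes. The sketch below assumes each VU iteration issues one request and then pauses for the think time. `estimateVUs` is a hypothetical helper and the numbers are illustrative.

```javascript
// Estimate the virtual users needed to sustain `targetRps`,
// assuming one request per iteration followed by a think-time pause.
function estimateVUs(targetRps, avgResponseSec, thinkTimeSec) {
  return Math.ceil(targetRps * (avgResponseSec + thinkTimeSec));
}

// Hypothetical inputs: 200 RPS, 250 ms average response, 1.75 s think time.
console.log(estimateVUs(200, 0.25, 1.75)); // 200 * 2.0 = 400
```

Treat the result as a starting point. If responses slow down under load, the same number of VUs will produce fewer requests per second, so verify the achieved RPS in the test output.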

Scenario 2: Stress Test

Purpose: Find the breaking point: how much load can the service handle before SLOs are breached?
Duration: 20–30 minutes
Load profile: Step from [N] RPS (baseline) up to [N × 5] RPS, holding each step for 5 minutes. Stop at the first SLO breach.
Concurrency: Scales with RPS target

What to record:

RPS at which p99 latency first exceeds SLO
RPS at which error rate first exceeds SLO
Whether the service recovers when load drops back to baseline
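
The stepped load profile and the two breaking-point figures above can be sketched as follows. This assumes steps at whole multiples of the baseline and a short ramp between steps. The step size, the 30-second ramp, the helper names, and the sample results are all assumptions made for illustration.

```javascript
// Build stepped stages: a short ramp to each level, then a hold.
// Each stage uses the { target, duration } shape that k6 stages use.
function stressStages(baselineRps, maxMultiplier, holdDuration = '5m', rampDuration = '30s') {
  const stages = [];
  for (let m = 1; m <= maxMultiplier; m++) {
    stages.push({ target: baselineRps * m, duration: rampDuration });
    stages.push({ target: baselineRps * m, duration: holdDuration });
  }
  return stages;
}

// Return the RPS of the first step where `metric` exceeds `limit`, or null.
function firstBreachRps(stepResults, metric, limit) {
  const step = stepResults.find((s) => s[metric] > limit);
  return step ? step.rps : null;
}

// Hypothetical per-step results collected from a stress run.
const results = [
  { rps: 100, p99Ms: 200, errorRate: 0.0 },
  { rps: 200, p99Ms: 450, errorRate: 0.001 },
  { rps: 300, p99Ms: 900, errorRate: 0.02 },
];
console.log(stressStages(100, 5).length);            // 10 stages
console.log(firstBreachRps(results, 'p99Ms', 400));  // 200
console.log(firstBreachRps(results, 'errorRate', 0.01)); // 300
```

With five steps, the defaults add up to 27.5 minutes, which fits the 20–30 minute window. In k6 the stages would most naturally feed an arrival-rate executor such as `ramping-arrival-rate`, since the targets here are request rates rather than VU counts.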

Scenario 3: Spike Test

Purpose: Simulate a sudden traffic surge (flash sale, viral event, bot attack).
Duration: 15 minutes
Load profile: Hold at [N] RPS (baseline) for 3 minutes, spike to [N × 10] RPS instantly, hold for 5 minutes, drop back to baseline for 7 minutes.

What to record:

Latency during spike and recovery
Whether the service sheds load gracefully (rate limiting, queue depth)
Time to recover to baseline latency after spike ends
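
Recovery time is easier to compare between runs if it is computed the same way each time. The sketch below uses one possible definition: the time from the end of the spike until p99 latency drops to the baseline target and stays there for the rest of the observation window. The function name, the sample shape, and the numbers are hypothetical.

```javascript
// samples: [{ tSec, p99Ms }] ordered by time since test start.
// spikeEndSec: the moment load dropped back to baseline.
// Returns seconds until latency stays at or below the baseline target,
// or null if it never settles within the observed samples.
function recoveryTimeSec(samples, spikeEndSec, baselineP99Ms) {
  const after = samples.filter((s) => s.tSec >= spikeEndSec);
  // Index of the last sample still above the baseline target.
  let lastBad = -1;
  after.forEach((s, i) => {
    if (s.p99Ms > baselineP99Ms) lastBad = i;
  });
  if (lastBad === after.length - 1) return null; // no samples, or never recovered
  return after[lastBad + 1].tSec - spikeEndSec;
}

// Hypothetical samples: the spike ends at 480 s (3 min hold + 5 min spike).
const samples = [
  { tSec: 470, p99Ms: 950 },
  { tSec: 480, p99Ms: 900 },
  { tSec: 490, p99Ms: 700 },
  { tSec: 500, p99Ms: 350 },
  { tSec: 510, p99Ms: 300 },
];
console.log(recoveryTimeSec(samples, 480, 400)); // 20
```

Requiring latency to stay below the target, rather than to dip below it once, avoids reporting an early recovery when latency oscillates after the spike.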

Scenario 4: Soak / Endurance Test

Purpose: Detect memory leaks, connection pool exhaustion, and slow degradation over time.
Duration: 4–8 hours (run overnight)
Load profile: Steady [N × 1.5] RPS (50% above baseline) for the entire duration.

What to watch:

Memory usage trend over time (should not grow unboundedly)
Error rate trend (should be flat, not creeping up)
GC pause frequency (JVM/Go services)
Database connection pool utilisation
p99 latency trend (should not creep up over hours)
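
"Should not creep up" becomes a pass/fail check once the trend is quantified. One simple approach, sketched below, is to fit a least-squares line through periodic samples and compare its slope with an allowed growth rate. The helper names, the sample values, and the 5 MB/hour limit are assumptions for illustration; the same function could be applied to error rate or p99 latency samples.

```javascript
// Least-squares slope of y over x for points shaped { x, y }.
function slope(points) {
  const n = points.length;
  if (n < 2) throw new Error('need at least two samples');
  const meanX = points.reduce((sum, p) => sum + p.x, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.y, 0) / n;
  let num = 0;
  let den = 0;
  for (const p of points) {
    num += (p.x - meanX) * (p.y - meanY);
    den += (p.x - meanX) ** 2;
  }
  if (den === 0) throw new Error('samples must span more than one timestamp');
  return num / den;
}

// True when the fitted growth rate exceeds the allowed rate.
function isCreepingUp(points, maxSlope) {
  return slope(points) > maxSlope;
}

// Hypothetical hourly memory samples: x in hours, y in MB.
const memory = [
  { x: 0, y: 512 },
  { x: 1, y: 530 },
  { x: 2, y: 548 },
  { x: 3, y: 566 },
  { x: 4, y: 584 },
];
console.log(slope(memory));           // 18 (MB per hour)
console.log(isCreepingUp(memory, 5)); // true
```

A linear fit will not catch every pattern, for example step changes after a cache fills, so it complements a visual check of the dashboards during the soak run.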

5. Test Environment Requirements

Infrastructure

Component	Requirement	Notes
Service under test	Isolated from production	[N] replicas, matching prod resource limits
Database	Separate instance with production-scale data	Seed script in section 7
Cache (Redis/Memcached)	Empty at test start	Ensures cold-start conditions are tested
Load generator	Separate from service under test	[N] vCPUs, [N] GB RAM minimum
Network	Low-latency path to service	Do not run generator on same host

Data Seeding

Before every test run, ensure the environment has:

# Seed test users (needed for authenticated endpoint tests)
[seed command or script path — e.g. python scripts/seed_load_test_users.py --count 10000]

# Seed test data for read endpoints
[seed command — e.g. ./scripts/seed_products.sh --count 50000]

# Verify seed completed
[verification command — e.g. psql $DB_URL -c "SELECT COUNT(*) FROM users WHERE load_test=true"]

Test data rules:

Never use real production user data in load tests
Tag all test-generated records with load_test=true for easy cleanup
Run cleanup after each test: [cleanup command]
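
As an illustration of the tagging rule, a seed script might build its user records along these lines. This is a sketch only: `buildTestUsers` and the `email` field name are hypothetical, and the address pattern follows the `load_test_user_N@example.com` convention used by the k6 script in section 6.

```javascript
// Build synthetic user records, each tagged for later cleanup.
function buildTestUsers(count) {
  return Array.from({ length: count }, (_, i) => ({
    email: `load_test_user_${i}@example.com`, // synthetic address, never real user data
    load_test: true, // tag used to find and delete test records
  }));
}

const users = buildTestUsers(3);
console.log(users.map((u) => u.email));
```

Seeding indices 0 to 9999 would cover every address the k6 login helper can generate with `Math.floor(Math.random() * 10000)`.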

6. Tooling Setup

k6 Script Skeleton

import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';

// Custom metrics
const errorRate = new Rate('error_rate');
const endpointLatency = new Trend('endpoint_latency', true);

// Test configuration — override per scenario
export const options = {
  scenarios: {
    baseline: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '2m', target: [BASELINE_VUS] },
        { duration: '8m', target: [BASELINE_VUS] },
        { duration: '1m', target: 0 },
      ],
    },
  },
  thresholds: {
    http_req_duration: [
      'p(95)<[Y_MS]',
      'p(99)<[Z_MS]',
    ],
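    // 0.001 = 0.1%, matching the Baseline error-rate target in section 2;
    // loosen these per scenario (for example 0.01 for the stress test)
    error_rate: ['rate<0.001'],
    http_req_failed: ['rate<0.001'],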
  },
};

// Auth helper: setup() runs once per test run, and the returned token is shared by all VUs
export function setup() {
  const loginRes = http.post('[BASE_URL]/auth/login', JSON.stringify({
    username: `load_test_user_${Math.floor(Math.random() * 10000)}@example.com`,
    password: '[LOAD_TEST_PASSWORD]',
  }), { headers: { 'Content-Type': 'application/json' } });

  check(loginRes, { 'login ok': (r) => r.status === 200 });
  return { token: loginRes.json('access_token') };
}

export default function (data) {
  const headers = {
    Authorization: `Bearer ${data.token}`,
    'Content-Type': 'application/json',
  };

  // Endpoint 1: [Description]
  const res1 = http.get('[BASE_URL]/[endpoint-1]', { headers });
  check(res1, {
    '[endpoint-1] status 200': (r) => r.status === 200,
    '[endpoint-1] latency < [X]ms': (r) => r.timings.duration < [X],
  });
  errorRate.add(res1.status >= 400);
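  endpointLatency.add(res1.timings.duration);

  // Think time between iterations; 1 second is a placeholder for the range in section 3
  sleep(1);
}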
