Paths: File paths (references/, ../ln-*) are relative to this skill directory.
MANDATORY READ: Load references/ci_tool_detection.md — compact output flags, pipefail, and failure-artifact policy for bash/curl/Puppeteer scripts.
Inputs
| Input | Required | Source | Description |
|---|---|---|---|
| storyId | Yes | args, git branch, kanban, user | Story to process |
Resolution: Story Resolution Chain. Status filter: To Review
Manual Tester
Type: L3 Worker
Manually verifies Story AC on running code and reports structured results for the quality gate.
Purpose & Scope
- Create executable test scripts in the tests/manual/ folder of the target project.
- Run AC-driven checks via bash/curl (API) or puppeteer (UI).
- Save scripts permanently for regression testing (not temp files).
- Document results via the configured tracker provider (addComment) with pass/fail per AC and script path.
- No status changes or task creation.
When to Use
- Use when a Story needs hands-on acceptance-criteria verification before automated planning
- Research comment "## Test Research:" exists on Story (from ln-521)
- All implementation tasks in Story status = Done
Test Design Principles
1. Fail-Fast - No Silent Failures
CRITICAL: Tests MUST return 1 (fail) immediately when any criterion is not met.
Never use: print_status "WARN" + return 0 for validation failures, graceful degradation without explicit flags, silent fallbacks that hide errors.
Exceptions (WARN is OK): Informational warnings that don't affect correctness, optional features (with clear justification in comments), infrastructure issues (e.g., missing Nginx in dev environment).
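A minimal sketch of the fail-fast rule. The print_status name comes from the rule above, but its body here is an illustrative assumption, and check_status is a hypothetical AC check rather than part of any template:

```bash
#!/usr/bin/env bash
# Illustrative helper; a real config.sh may define print_status differently.
print_status() {
    printf '[%s] %s\n' "$1" "$2"
}

# Hypothetical AC check: compares an actual HTTP status with the expected one.
check_status() {
    local expected="$1" actual="$2"
    if [ "$actual" != "$expected" ]; then
        print_status "FAIL" "expected HTTP $expected, got $actual"
        return 1   # fail immediately; never downgrade to WARN + return 0
    fi
    print_status "PASS" "HTTP $actual"
    return 0
}
```

A caller can then stop the suite on the first failed criterion, for example with `check_status 200 "$code" || exit 1`.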
2. Expected-Based Testing - The Golden Standard
CRITICAL: Tests MUST compare actual results against expected reference files, not apply heuristics or algorithmic checks.
Directory structure:
tests/manual/NN-feature/
├── samples/ # Input files
├── expected/ # Expected output files (REQUIRED!)
│ └── {base_name}_{source_lang}-{target_lang}.{ext}
└── test-*.sh
Heuristics acceptable ONLY for: dynamic/non-deterministic data (timestamps, UUIDs, tokens - normalize before comparison; JSON with unordered keys - use jq --sort-keys).
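The comparison described above could be sketched as follows. The function names, placeholder tokens, and regular expressions are assumptions for illustration; they cover UUIDs and ISO 8601 timestamps only, and JSON with unordered keys would additionally need to pass through jq --sort-keys before the diff:

```bash
#!/usr/bin/env bash
# Replace volatile values with fixed placeholders so diff only sees real changes.
normalize() {
    sed -E \
        -e 's/[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}/<UUID>/g' \
        -e 's/[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?(Z|[+-][0-9]{2}:[0-9]{2})?/<TIMESTAMP>/g' \
        "$1"
}

# Returns 0 only when the normalized actual file matches the normalized expected file.
compare_with_expected() {
    local actual="$1" expected="$2" tmp_dir status
    if [ ! -f "$expected" ]; then
        echo "FAIL: expected file missing: $expected"
        return 1
    fi
    tmp_dir=$(mktemp -d)
    normalize "$expected" > "$tmp_dir/expected"
    normalize "$actual" > "$tmp_dir/actual"
    status=0
    diff "$tmp_dir/expected" "$tmp_dir/actual" || status=$?
    rm -rf "$tmp_dir"
    return "$status"
}
```

A missing expected file is treated as a failure rather than skipped, in line with the fail-fast principle.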
3. Results Storage
Test results saved to tests/manual/results/ (persistent, in .gitignore). Named: result_{ac_name}.{ext} or response_{ac_name}.json. Inspectable after test completion for debugging.
4. Expected File Generation
To create expected files:
- Run test with current implementation
- Review output in results/ folder
- If correct: copy to expected/ folder with proper naming
- If incorrect: fix implementation first, then copy
IMPORTANT: Never blindly copy results to expected. Always validate correctness first.
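One way to make the review step hard to skip is a small guard around the copy. This is only a sketch: promote_to_expected and the REVIEWED flag are hypothetical names, not part of the skill's templates.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copies a result into expected/ only after a human
# has reviewed it and set REVIEWED=1 explicitly.
promote_to_expected() {
    local result_file="$1" expected_file="$2"
    if [ "${REVIEWED:-0}" != "1" ]; then
        echo "Refusing to copy: review $result_file first, then rerun with REVIEWED=1"
        return 1
    fi
    mkdir -p "$(dirname "$expected_file")"
    cp "$result_file" "$expected_file"
}
```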
Workflow
Phase 0: Resolve Inputs
MANDATORY READ: Load references/input_resolution_pattern.md
- Resolve storyId: Run Story Resolution Chain per guide (status filter: [To Review]).
Phase 1: Setup tests/manual structure
- Read docs/project/infrastructure.md: get port allocation, service endpoints, base URLs.
- Read docs/project/runbook.md: get Docker commands, test prerequisites, environment setup.
- Check if tests/manual/ folder exists in project root.
- If missing, create structure:
  - tests/manual/config.sh: shared configuration (BASE_URL, helpers, colors)
  - tests/manual/README.md: folder documentation (see README.md template below)
  - tests/manual/test-all.sh: master script to run all test suites (see test-all.sh template below)
  - tests/manual/results/: folder for test outputs (add to .gitignore)
- Add tests/manual/results/ to project .gitignore if not present.
- If exists, read existing config.sh to reuse settings (BASE_URL, tokens).
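The folder and .gitignore steps above can be sketched as a rerunnable function. setup_manual_tests is a hypothetical name, and the sketch deliberately leaves config.sh, README.md, and test-all.sh to their templates:

```bash
#!/usr/bin/env bash
# Creates the tests/manual skeleton under the given project root; safe to rerun.
setup_manual_tests() {
    local root="$1"
    local ignore_entry="tests/manual/results/"
    mkdir -p "$root/tests/manual/results"
    touch "$root/.gitignore"
    # Append the ignore entry only if that exact line is not already present.
    if ! grep -qxF "$ignore_entry" "$root/.gitignore"; then
        # Add a newline first when the file does not end with one.
        if [ -n "$(tail -c 1 "$root/.gitignore")" ]; then
            echo >> "$root/.gitignore"
        fi
        echo "$ignore_entry" >> "$root/.gitignore"
    fi
}
```

Matching the whole line with grep -x avoids a false positive from a longer path that merely contains the entry.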
Phase 2: Create Story test script
- Fetch Story, parse AC into Given/When/Then list (3-5 expected)
- Check for research comment (from ln-521-test-researcher) — incorporate findings into test cases
- Detect API vs UI (API → curl, UI → puppeteer). IF UI: MANDATORY READ: Load references/puppeteer_patterns.md
- Create test folder structure:
  - tests/manual/{NN}-{story-slug}/samples/: input files (if needed)
  - tests/manual/{NN}-{story-slug}/expected/: expected output files (REQUIRED for deterministic tests)
- Generate test script: tests/manual/{NN}-{story-slug}/test-{story-slug}.sh
  - Use appropriate template: TEMPLATE-api-endpoint.sh (direct calls) or TEMPLATE-document-format.sh (async jobs)
  - Header: Story ID, AC list, prerequisites
  - Test function per AC + edge/error cases
  - diff-based validation against expected files (PRIMARY)
  - Results saved to tests/manual/results/
  - Summary table with timing
- Make script executable (chmod +x)
Phase 3: Update Documentation
- Update tests/manual/README.md:
  - Add new test to "Available Test Suites" table
  - Include Story ID, AC covered, run command
- Update tests/manual/test-all.sh:
  - Add call to new script in SUITES array
  - Maintain execution order (00-setup first, then numbered suites)
Phase 4: Execute and report
MANDATORY READ: Load references/test_result_format_v1.md
- Rebuild Docker containers (no cache), ensure healthy
- Run generated script, capture output
- Parse results (pass/fail counts)
- Post tracker comment (addComment, per test_result_format_v1.md) with:
  - AC matrix (pass/fail per AC)
  - Script path: tests/manual/{NN}-{story-slug}/test-{story-slug}.sh
  - Rerun command: cd tests/manual && ./{NN}-{story-slug}/test-{story-slug}.sh
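Parsing pass/fail counts from captured output might look like the sketch below. It assumes each result line starts with a [PASS] or [FAIL] tag; that format and both function names are assumptions, since the actual layout is defined by the test script templates and test_result_format_v1.md.

```bash
#!/usr/bin/env bash
# Counts lines that start with a given status tag, e.g. "[PASS]" (assumed format).
count_status() {
    local tag="$1" log_file="$2"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    grep -c "^\[$tag]" "$log_file" || true
}

# Prints the counts and returns non-zero when any AC failed or the log is missing.
summarize_run() {
    local log_file="$1" passed failed
    if [ ! -f "$log_file" ]; then
        echo "FAIL: log not found: $log_file"
        return 1
    fi
    passed=$(count_status PASS "$log_file")
    failed=$(count_status FAIL "$log_file")
    echo "passed=$passed failed=$failed"
    [ "$failed" -eq 0 ]
}
```

The non-zero return lets the caller derive the verdict from the same function that produces the counts for the tracker comment.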
Critical Rules
- Scripts saved to project tests/manual/, NOT temp files.
- Rebuild Docker before testing; fail if rebuild/run unhealthy.
- Keep language of Story (EN/RU) in script comments and tracker comment.
- No fixes or status changes; only evidence and verdict.
- Script must be idempotent (can rerun anytime).
Runtime Summary Artifact
MANDATORY READ: Load references/test_planning_summary_contract.md, references/test_planning_worker_runtime_contract.md
Runtime profile:
- family: test-planning-worker
- worker: ln-522
- summary kind: test-planning-worker
- payload fields used by coordinators: worker, status, warnings, manual_result_path
Invocation rules:
- standalone: omit runId and summaryArtifactPath
- managed: pass both runId and exact summaryArtifactPath
- always write the validated summary before terminal outcome
Test scripts always go to tests/manual/, never to the project root.
Monitor Integration (Claude Code 2.1.98+)
MANDATORY READ: Load references/monitor_integration_pattern.md
When running test scripts expected to take >30 seconds:
Monitor(command="bash tests/manual/{suite}/test-{slug}.sh 2>&1", timeout_ms=300000, description="manual test: {slug}")
Fallback: if Monitor is unavailable (Bedrock/Vertex), use Bash(run_in_background=true).
Definition of Done
- tests/manual/ structure exists (config.sh, README.md, test-all.sh, results/ created if missing).
- tests/manual/results/ added to project .gitignore.
- Test script created at tests/manual/{NN}-{story-slug}/test-{story-slug}.sh.
- expected/ folder created with at least 1 expected file per deterministic AC.
- Script uses diff-based validation against expected files (not heuristics).
- Script saves results to tests/manual/results/ for debugging.
- Script is executable and idempotent.
- README.md updated with new test suite in "Available Test Suites" table.
- test-all.sh updated with call to new script in SUITES array.
- App rebuilt and running; tests executed.
- Verdict and tracker comment posted with script path and rerun command.
Script Templates
README.md (created once per project)
# Manual Testing Scripts
> **SCOPE: