Working with Buildkite Builds
Overview
This skill provides workflows and tools for working with Buildkite CI builds. It covers checking status, investigating failures, and reproducing issues locally; it does not cover creating or configuring pipelines. Use this skill when working with Buildkite builds, especially for PR workflows, post-push monitoring, failure investigation, and local reproduction.
Why bktide snapshot?
One command, one URL, gets you everything: build metadata, annotations, and logs for failed steps, all saved to local files you can grep and re-read without burning API calls. The other tools require you to piece together multiple calls and keep track of job UUIDs vs step IDs.
When to Use This Skill
Use this skill when:
- Checking CI status for the current branch or PR
- Investigating why a build failed
- Monitoring builds after a git push
- Waiting for builds to complete
- Checking build status across multiple repos/PRs
- Understanding what "broken" or other Buildkite states mean
Tool Hierarchy and Selection
CRITICAL: Always use Buildkite-native tools. Never fall back to GitHub tools (gh pr view, GitHub API, etc.) - they only show summaries and lose critical information (annotations, logs, real-time updates, state distinctions).
Use tools in this priority order:
Primary: bktide snapshot (Use This First)
Command: npx bktide@latest snapshot <buildkite-url>
The snapshot command is the preferred approach for investigating builds. It:
- Parses any Buildkite URL automatically (build URL, step URL, etc.)
- Downloads build metadata, annotations, and logs for failed steps
- Saves everything to structured files for easy analysis
- Provides actionable next-step commands
npx bktide@latest snapshot https://buildkite.com/org/pipeline/builds/123
Run bktide snapshot --help for all options, or bktide prime for detailed LLM-friendly usage guidance.
Output structure:
./tmp/bktide/snapshots/<org>/<pipeline>/<build>/
├── manifest.json        # Step index with states and exit codes
├── build.json           # Full build metadata
├── annotations.json     # Build annotations
└── steps/
    ├── 01-step-name/
    │   ├── log.txt      # Full log output
    │   └── step.json    # Step metadata
    └── 02-another-step/
        └── ...
Useful follow-up commands (shown by snapshot):
# List failures
jq -r '.steps[] | select(.state == "failed") | "\(.id): \(.label)"' ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/manifest.json
# View a log
cat ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/steps/<step-id>/log.txt
# Search for errors across all logs
grep -r "Error\|Failed\|Exception" ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/steps/
Secondary: Other bktide Commands
For quick queries without full snapshot:
npx bktide@latest pipelines <org> # List pipelines
npx bktide@latest builds <org>/<pipeline> # List builds
npx bktide@latest build <org>/<pipeline>#<build> # Get build details
npx bktide@latest annotations <org>/<pipeline>#<build> # Show annotations
Tertiary: MCP Tools (Fallback)
When: bktide is unavailable, or you need programmatic access (wait_for_build, unblock)
Available MCP tools:
- buildkite:get_build - Get detailed build information
- buildkite:list_builds - List builds for a pipeline
- buildkite:list_annotations - Get annotations for a build
- buildkite:get_pipeline - Get pipeline configuration
- buildkite:list_pipelines - List all pipelines in an org
- buildkite:wait_for_build - Wait for a build to complete (useful for monitoring)
- buildkite:get_logs - Retrieve job logs
- buildkite:get_logs_info - Get log metadata
- buildkite:list_artifacts - List build artifacts
Tool Capability Matrix
| Capability | bktide snapshot | bktide CLI | MCP Tools |
|---|---|---|---|
| Parse any BK URL | ✅ | ❌ | ❌ |
| Get build details | ✅ | ✅ | ✅ |
| Get annotations | ✅ | ✅ | ✅ |
| Retrieve logs | ✅ | ❌ | ✅ |
| Save to files | ✅ | ❌ | ❌ |
| Wait for build | ❌ | ❌ | ✅ |
| Unblock jobs | ❌ | ❌ | ✅ |
This tool preference order can be overridden via ~/.config/pickled-claude-plugins/buildkite.yml. A PreToolUse hook enforces your preference by intercepting bk CLI commands that overlap with bktide capabilities.
When Tools Fail: Fallback Hierarchy
If bktide fails:
- ✅ Use equivalent MCP tool
- ❌ Do NOT fall back to GitHub tools
If MCP tools fail:
- ✅ Check MCP server connection status
- ✅ Restart MCP connection
- ✅ Report the MCP failure to your human partner
- ❌ Do NOT fall back to GitHub tools
Critical: One tool failing does NOT mean the entire skill is invalid. Move to fallback tools, don't abandon Buildkite tools.
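The fallback rule above can be sketched as a small wrapper. This is an illustration only: `snapshot_or_fallback` and the `BKTIDE_CMD` override are hypothetical names introduced here (the override exists so the sketch can be exercised without network access). MCP tools are not shell commands, so the sketch only prints the next step instead of calling them.

```shell
# Hypothetical wrapper: try bktide first, then point at the MCP tier on failure.
# BKTIDE_CMD is an assumed override, not a bktide feature.
snapshot_or_fallback() {
  url=$1
  # Intentionally unquoted so "npx bktide@latest" splits into command + argument.
  if ${BKTIDE_CMD:-npx bktide@latest} snapshot "$url"; then
    return 0
  fi
  # bktide failed: stay inside Buildkite-native tooling.
  echo "bktide snapshot failed for $url" >&2
  echo "next: use the equivalent MCP tool (buildkite:get_build, buildkite:get_logs)" >&2
  echo "do not fall back to GitHub tools" >&2
  return 1
}
```

A non-zero return is the signal to move to the MCP tier, not to abandon Buildkite tools.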
Core Workflows
1. Investigating a Build from URL (Most Common)
When a user provides a Buildkite URL for a failing build, use bktide snapshot to gather all context.
Step 1: Capture the build snapshot
npx bktide@latest snapshot <buildkite-url>
This works with any Buildkite URL format:
- Build URL: https://buildkite.com/org/pipeline/builds/12345
- Step URL: https://buildkite.com/org/pipeline/builds/12345/steps/canvas?sid=019a5f...
The snapshot command will:
- Parse the URL automatically
- Download build metadata, annotations, and logs
- Save everything to ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/
- Show a summary and helpful next-step commands
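Because the snapshot location follows from the URL, a script can predict the directory before or after running the command. The helper below is a minimal sketch, not part of bktide: `snapshot_dir` is a hypothetical name, and it assumes that URLs look like the two formats listed above and that the `<build>` path segment is the build number.

```shell
# Hypothetical helper: map a Buildkite build or step URL to the snapshot directory.
# Uses only POSIX parameter expansion; variables are global because POSIX sh has no `local`.
snapshot_dir() {
  url=${1#https://buildkite.com/}          # strip scheme and host
  if [ "$url" = "$1" ]; then
    echo "not a buildkite.com URL: $1" >&2
    return 1
  fi
  org=${url%%/*};       rest=${url#*/}     # first path segment
  pipeline=${rest%%/*}; rest=${rest#*/}    # second path segment
  case $rest in
    builds/*) ;;
    *) echo "no /builds/<number> in URL: $1" >&2; return 1 ;;
  esac
  build=${rest#builds/}
  build=${build%%[!0-9]*}                  # drop anything after the build number
  [ -n "$build" ] || return 1
  printf './tmp/bktide/snapshots/%s/%s/%s\n' "$org" "$pipeline" "$build"
}
```

For example, `snapshot_dir https://buildkite.com/org/pipeline/builds/12345` prints `./tmp/bktide/snapshots/org/pipeline/12345`, which can then be checked for an existing `manifest.json` before re-running the snapshot.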
Step 2: Review the summary
The command output shows:
- Build state (passed/failed/running)
- Step counts (how many passed/failed/broken)
- Snapshot location
Step 3: Filter to root failures
Most builds have 1-3 root failures and dozens-to-hundreds of dependent BROKEN steps. Counting BROKEN/RUNNING steps as failures sends you down rabbit holes investigating noise.
First, see the state distribution:
jq -r '.steps[].state' ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/manifest.json | sort | uniq -c | sort -rn
Only these are real failures:
- Steps with state FINISHED (or FAILED) AND non-zero exit_status
These are NOT failures — do not count or investigate them:
- BROKEN — Downstream dependencies that never ran. Auto-resolve when the root failure is fixed.
- RUNNING — Still in progress.
- WAITING / SCHEDULED — Haven't started yet.
- CANCELED — Manually or automatically canceled.
Find actual root failures:
jq -r '.steps[] | select((.state == "FINISHED" or .state == "FAILED") and .exit_status != null and .exit_status != 0) | "\(.id): \(.label) (exit \(.exit_status))"' ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/manifest.json
Example: A summary reads "466 steps: 43 passed, 397 failed, 361 running" — filtering reveals 1 actual failure (e.g. codeownership validation) and 396 BROKEN dependents. Fix the root and everything else passes.
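To make the root-versus-dependent split visible at a glance, the filter above can be wrapped so it also reports counts. This is a sketch under assumptions: `summarize_failures` is a hypothetical name, the state spellings follow the Step 3 filter, and the manifest is assumed to expose `.steps[]` entries with `state`, `exit_status`, `id`, and `label` as used elsewhere in this document.

```shell
# Hypothetical helper: count root failures vs. BROKEN dependents, then list the roots.
# Requires jq, which the commands in this document already use.
summarize_failures() {
  manifest=$1
  jq -r '
    # A root failure finished (or failed) with a non-zero exit status.
    def root_failure:
      (.state == "FINISHED" or .state == "FAILED")
      and .exit_status != null and .exit_status != 0;
    ([.steps[] | select(root_failure)] | length) as $roots
    | ([.steps[] | select(.state == "BROKEN")] | length) as $broken
    | "root failures: \($roots)",
      "broken dependents (ignore): \($broken)",
      (.steps[] | select(root_failure)
        | "  \(.id): \(.label) (exit \(.exit_status))")
  ' "$manifest"
}
```

Call it with the path to the snapshot's `manifest.json`. If the root count is zero while the build is still reported as failed, check the state distribution first, since the state names in a given manifest may differ from the ones assumed here.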
Step 4: Read the failing step's log
# View a specific log
cat ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/steps/<step-id>/log.txt
# Search for errors across all logs
grep -r "Error\|Failed\|Exception" ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/steps/
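When several logs were downloaded, it can help to see which ones contain the most matching lines before reading any of them. The following is a minimal sketch: `rank_logs_by_matches` is a hypothetical name, and the pattern is the same one used in the grep command above.

```shell
# Hypothetical helper: rank log.txt files by number of matching lines (highest first).
rank_logs_by_matches() {
  steps_dir=$1
  # /dev/null forces grep to print "file:count" even when only one log exists.
  find "$steps_dir" -name log.txt -exec grep -cE 'Error|Failed|Exception' /dev/null {} + \
    | awk -F: '$NF > 0 { n = $NF; sub(/:[0-9]+$/, ""); print n "\t" $0 }' \
    | sort -rn
}
```

Each output line is a count, a tab, and the log path. Logs with no matching lines are omitted. The count is matching lines, not individual matches, so a noisy log can outrank the step that actually caused the failure.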
Step 5: Analyze error output
Look for:
- Stack traces
- Test failure messages
- Exit codes and error messages
- File paths and line numbers
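For the last item, file paths with line numbers can be pulled out of a log mechanically. This is a rough sketch: `extract_locations` is a hypothetical name, and the regular expression assumes references shaped like `path/to/file.ext:123`, which will miss other formats and may pick up unrelated text.

```shell
# Hypothetical helper: list the most frequent "file.ext:line" references in a log.
# $1 = path to log.txt, $2 = max rows to show (defaults to 10).
extract_locations() {
  log=$1
  grep -oE '[A-Za-z0-9_./-]+\.[A-Za-z0-9]+:[0-9]+' "$log" \
    | sort | uniq -c | sort -rn | head -n "${2:-10}"
}
```

Locations that appear repeatedly across a stack trace or several test failures tend to be a good first place to look.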
Step 6: Reproduce locally
Follow the "Reproducing Build Failures Locally" workflow below to:
- Extract the exact command CI ran (visible in the log)
- Translate it to a local equivalent
- Triage if local reproduction isn't feasible
2. Retrieving Job Logs
Preferred: Use bktide snapshot (see workflow 1)
The snapshot command automatically downloads logs for failed/broke