Working with Buildkite Builds

Overview

This skill provides workflows and tools for working with Buildkite CI builds. It covers checking status, investigating failures, and reproducing issues locally; it does not cover creating or configuring pipelines. Use it for PR workflows, post-push monitoring, failure investigation, and local reproduction.

Why bktide snapshot?

One command, one URL, gets you everything: build metadata, annotations, and logs for failed steps, all saved to local files you can grep and re-read without burning API calls. The other tools require you to piece together multiple calls and keep track of job UUIDs vs step IDs.

When to Use This Skill

Use this skill when:

Checking CI status for the current branch or PR
Investigating why a build failed
Monitoring builds after a git push
Waiting for builds to complete
Checking build status across multiple repos/PRs
Understanding what "broken" or other Buildkite states mean

Tool Hierarchy and Selection

CRITICAL: Always use Buildkite-native tools. Never fall back to GitHub tools (gh pr view, GitHub API, etc.) - they only show summaries and lose critical information (annotations, logs, real-time updates, state distinctions).

Use tools in this priority order:

Primary: bktide snapshot (Use This First)

Command: npx bktide@latest snapshot <buildkite-url>

The snapshot command is the preferred approach for investigating builds. It:

Parses any Buildkite URL automatically (build URL, step URL, etc.)
Downloads build metadata, annotations, and logs for failed steps
Saves everything to structured files for easy analysis
Provides actionable next-step commands

npx bktide@latest snapshot https://buildkite.com/org/pipeline/builds/123

Run bktide snapshot --help for all options, or bktide prime for detailed LLM-friendly usage guidance.

Output structure:

./tmp/bktide/snapshots/<org>/<pipeline>/<build>/
├── manifest.json      # Step index with states and exit codes
├── build.json         # Full build metadata
├── annotations.json   # Build annotations
└── steps/
    ├── 01-step-name/
    │   ├── log.txt    # Full log output
    │   └── step.json  # Step metadata
    └── 02-another-step/
        └── ...
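
Because the snapshot path mirrors the URL, the directory can be worked out from the URL alone. The following is a minimal sketch, not part of bktide: snapshot_dir is a hypothetical helper, and it assumes the <build> path component is the build number that appears in the URL.

```shell
# Hypothetical helper: map a Buildkite build or step URL to the snapshot
# directory layout shown above. Assumes <build> is the build number.
snapshot_dir() (
  orig=$1
  url=${orig%%\?*}                      # drop any ?sid=... query string
  path=${url#https://buildkite.com/}    # leaves org/pipeline/builds/N[/...]
  set -f                                # no globbing while splitting
  IFS=/
  set -- $path                          # $1=org $2=pipeline $3=builds $4=N
  if [ "$3" != "builds" ] || [ -z "$4" ]; then
    echo "not a Buildkite build URL: $orig" >&2
    return 1
  fi
  printf './tmp/bktide/snapshots/%s/%s/%s\n' "$1" "$2" "$4"
)

# Usage (after running the snapshot command for the same URL):
#   ls "$(snapshot_dir https://buildkite.com/org/pipeline/builds/123)"
```

The function body runs in a subshell, so the IFS and set -f changes do not leak into the calling shell.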

Useful follow-up commands (shown by snapshot):

# List failures
jq -r '.steps[] | select(.state == "failed") | "\(.id): \(.label)"' ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/manifest.json

# View a log
cat ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/steps/<step-id>/log.txt

# Search for errors across all logs
grep -r "Error\|Failed\|Exception" ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/steps/

Secondary: Other bktide Commands

For quick queries without full snapshot:

npx bktide@latest pipelines <org>                    # List pipelines
npx bktide@latest builds <org>/<pipeline>            # List builds
npx bktide@latest build <org>/<pipeline>#<build>     # Get build details
npx bktide@latest annotations <org>/<pipeline>#<build>  # Show annotations

Tertiary: MCP Tools (Fallback)

When: bktide is unavailable, or you need programmatic access (wait_for_build, unblock)

Available MCP tools:

buildkite:get_build - Get detailed build information
buildkite:list_builds - List builds for a pipeline
buildkite:list_annotations - Get annotations for a build
buildkite:get_pipeline - Get pipeline configuration
buildkite:list_pipelines - List all pipelines in an org
buildkite:wait_for_build - Wait for a build to complete (useful for monitoring)
buildkite:get_logs - Retrieve job logs
buildkite:get_logs_info - Get log metadata
buildkite:list_artifacts - List build artifacts

Tool Capability Matrix

Capability	bktide snapshot	bktide CLI	MCP Tools
Parse any BK URL	✅	❌	❌
Get build details	✅	✅	✅
Get annotations	✅	✅	✅
Retrieve logs	✅	❌	✅
Save to files	✅	❌	❌
Wait for build	❌	❌	✅
Unblock jobs	❌	❌	✅

This tool preference order can be overridden via ~/.config/pickled-claude-plugins/buildkite.yml. A PreToolUse hook enforces your preference by intercepting bk CLI commands that overlap with bktide capabilities.

When Tools Fail: Fallback Hierarchy

If bktide fails:

✅ Use equivalent MCP tool
❌ Do NOT fall back to GitHub tools

If MCP tools fail:

✅ Check MCP server connection status
✅ Restart MCP connection
✅ Report the MCP failure to your human partner
❌ Do NOT fall back to GitHub tools

Critical: One tool failing does NOT mean the entire skill is invalid. Move to the fallback tools; don't abandon Buildkite tools.

Core Workflows

1. Investigating a Build from URL (Most Common)

When a user provides a Buildkite URL for a failing build, use bktide snapshot to gather all context.

Step 1: Capture the build snapshot

npx bktide@latest snapshot <buildkite-url>

This works with any Buildkite URL format:

Build URL: https://buildkite.com/org/pipeline/builds/12345
Step URL: https://buildkite.com/org/pipeline/builds/12345/steps/canvas?sid=019a5f...

The snapshot command will:

Parse the URL automatically
Download build metadata, annotations, and logs
Save everything to ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/
Show a summary and helpful next-step commands

Step 2: Review the summary

The command output shows:

Build state (passed/failed/running)
Step counts (how many passed/failed/broken)
Snapshot location

Step 3: Filter to root failures

Most builds have 1-3 root failures and dozens to hundreds of dependent BROKEN steps. Counting BROKEN or RUNNING steps as failures sends you down rabbit holes investigating noise.

First, see the state distribution:

jq -r '.steps[].state' ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/manifest.json | sort | uniq -c | sort -rn

Only these are real failures:

Steps with state FINISHED (or FAILED) AND non-zero exit_status

These are NOT failures — do not count or investigate them:

BROKEN — Downstream dependencies that never ran. Auto-resolve when the root failure is fixed.
RUNNING — Still in progress.
WAITING / SCHEDULED — Haven't started yet.
CANCELED — Manually or automatically canceled.

Find actual root failures:

jq -r '.steps[] | select((.state == "FINISHED" or .state == "FAILED") and .exit_status != null and .exit_status != 0) | "\(.id): \(.label) (exit \(.exit_status))"' ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/manifest.json

Example: A summary reads "466 steps: 43 passed, 397 failed, 361 running" — filtering reveals 1 actual failure (e.g. codeownership validation) and 396 BROKEN dependents. Fix the root and everything else passes.
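
To see the split between root failures and dependents at a glance, the classification above can be folded into one jq call. This is a sketch under assumptions: triage_counts is a hypothetical helper, the manifest fields (.steps[].state, .exit_status) are taken from the filters in this section, and the "other" bucket lumps together passed, running, waiting, and canceled steps.

```shell
# Hypothetical helper: bucket every step in a manifest as root / broken / other.
# Requires jq. Field names follow the filters used elsewhere in this section.
triage_counts() {
  jq -r '
    [ .steps[]
      | if (.state == "FINISHED" or .state == "FAILED")
           and (.exit_status // 0) != 0 then "root"      # real failure
        elif .state == "BROKEN" then "broken"            # never ran
        else "other"                                     # passed or not done
        end
    ]
    | group_by(.)
    | map("\(.[0])=\(length)")
    | join(" ")
  ' "$1"
}

# Usage:
#   triage_counts ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/manifest.json
```

A small "root" count next to a large "broken" count is the signal that only a handful of steps need investigation.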

Step 4: Read the failing step's log

# View a specific log
cat ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/steps/<step-id>/log.txt

# Search for errors across all logs
grep -r "Error\|Failed\|Exception" ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/steps/
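
When there is more than one root failure, it can help to print the end of each failing log in one pass. The sketch below reuses the root-failure filter from Step 3. tail_root_failure_logs is a hypothetical helper, and it assumes each step's directory under steps/ is named after the step's .id in manifest.json.

```shell
# Hypothetical helper: print the last N lines (default 40) of the log for
# every root failure. Assumes steps/<id>/log.txt matches manifest .id values.
tail_root_failure_logs() (
  snap=$1
  lines=${2:-40}
  jq -r '.steps[]
    | select((.state == "FINISHED" or .state == "FAILED")
             and .exit_status != null and .exit_status != 0)
    | .id' "$snap/manifest.json" |
  while IFS= read -r id; do
    log="$snap/steps/$id/log.txt"
    echo "== $id =="
    if [ -f "$log" ]; then
      tail -n "$lines" "$log"
    else
      echo "(no log at $log)"
    fi
  done
)

# Usage:
#   tail_root_failure_logs ./tmp/bktide/snapshots/<org>/<pipeline>/<build> 60
```

The end of a log is usually where the failing command's output lands, but that is a heuristic; fall back to cat or grep on the full log when the tail is not enough.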

Step 5: Analyze error output

Look for:

Stack traces
Test failure messages
Exit codes and error messages
File paths and line numbers
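
For the last item, a rough way to surface file paths and line numbers is to pull every path:line token out of a log and rank them by frequency. This is a heuristic sketch, not a bktide feature: file_refs is a hypothetical helper, and the pattern will also match unrelated tokens such as host:port pairs.

```shell
# Hypothetical helper: list "path.ext:line" references in a log, most
# frequent first. Uses grep -o, which GNU and BSD grep both support.
file_refs() {
  grep -oE '[A-Za-z0-9_./-]+\.[A-Za-z0-9]+:[0-9]+' "$1" | sort | uniq -c | sort -rn
}

# Usage:
#   file_refs ./tmp/bktide/snapshots/<org>/<pipeline>/<build>/steps/<step-id>/log.txt
```

References that repeat across a stack trace and a test failure message tend to point at the code worth opening first.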

Step 6: Reproduce locally

Follow the "Reproducing Build Failures Locally" workflow below to:

Extract the exact command CI ran (visible in the log)
Translate it to a local equivalent
Triage if local reproduction isn't feasible

2. Retrieving Job Logs

Preferred: Use bktide snapshot (see workflow 1)

The snapshot command automatically downloads logs for failed/broken steps.
