Docker & Containers: Production Infrastructure
Write, review, and architect Dockerfiles, Compose stacks, and container workflows - from single-service dev setups to multi-arch production pipelines with image signing and compliance gates. The goal is minimal, secure, reproducible images that a team can maintain and a QSA can audit.
Target versions: May 2026 snapshot. Read references/target-versions.md before
pinning Docker, Compose, BuildKit, containerd, Podman, Buildah, or runc.
This skill covers Dockerfiles, Compose, container hardening, supply chain, registry/CI patterns, and runtime migration across Docker, Podman, Buildah, Skopeo, and containerd.
When to use
- Writing or reviewing Dockerfiles (single or multi-stage)
- Setting up Docker Compose stacks (dev, staging, production)
- Optimizing image size, build speed, or layer caching
- Hardening containers for production or compliance
- Setting up image signing, SBOM generation, or vulnerability scanning
- Containerizing AI/ML workloads (Model Runner, GPU passthrough, model serving)
- Migrating from Docker to Podman or building with Buildah
- Reviewing container security posture for PCI-DSS 4.0 or SOC 2
- Troubleshooting container networking, volume, or build issues
- Using docker init to scaffold a new project
When NOT to use
- Kubernetes manifests, Helm charts, cluster architecture (use kubernetes)
- CI/CD pipeline design (use ci-cd)
- Security audits of application code (use security-audit)
- Infrastructure provisioning with Terraform (use terraform)
AI Self-Check
AI tools consistently produce the same Docker mistakes. Before returning any generated Dockerfile or Compose file, verify against this list:
- Multi-stage build used when the app has a build step (TypeScript, Go, Rust, Java, C/C++)
- Dependencies copied and installed BEFORE source code (layer caching)
- Final image is slim/distroless/scratch - no build tools, no package caches
- USER directive present - container does NOT run as root
- No secrets in ENV, ARG, or COPY - use --mount=type=secret or runtime injection
- Base image pinned to specific version or SHA256 digest (never :latest except Chainguard free tier, never bare :22)
- HEALTHCHECK present for production images
- .dockerignore exists and excludes .git, node_modules, .env, __pycache__, etc.
- No ADD for local files (use COPY - ADD auto-extracts and fetches URLs)
- Compose: no version: field (deprecated since Compose v2, removed in spec v5)
- Compose: depends_on uses condition: service_healthy, not bare ordering
- Compose: resource limits set on production services
- Package caches cleaned in same layer: --no-cache (apk), rm -rf /var/lib/apt/lists/* (apt). For pip: use --mount=type=cache OR --no-cache-dir, not both.
- CMD uses exec form (JSON array), not shell form: CMD ["node", "app.js"] not CMD node app.js
- HEALTHCHECK uses available tools: probe command uses a binary present in the final image (wget in Alpine, curl in Debian, none in scratch/distroless - use the app's own health endpoint)
- Current source checked: dated versions, CLI flags, API names, and support windows are verified against primary docs before repeating them
- Hidden state identified: local config, credentials, caches, contexts, branches, cluster targets, or previous runs are made explicit before acting
- Verification is real: final checks exercise the actual runtime, parser, service, or integration point instead of only linting prose or happy paths
- Routing overlap checked: overlapping skills, trigger terms, and "When NOT to use" boundaries are checked before returning guidance
- Spec claims verified: claims about tool behavior, output contracts, or repo conventions are checked against current docs, scripts, or skill files
- Engine/Compose syntax checked: Dockerfile, Compose, BuildKit, and runtime flags match the installed versions
- Image provenance considered: base images, registries, tags, SBOMs, and signatures are handled where risk warrants
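As a rough illustration, several of the checklist items above can be combined in one Dockerfile. This is a minimal sketch, assuming a plain JavaScript app with no build step (so a single stage is enough). The image tag, the secret id npmrc, port 3000, and the /healthz path are all assumptions made for the example and are not part of this skill.

```dockerfile
# syntax=docker/dockerfile:1
# Illustrative pin only; replace with the tag or digest your team has verified.
FROM node:22.12.0-slim
WORKDIR /app

# Dependency manifests first, so this layer stays cached until they change.
COPY package.json package-lock.json ./

# The secret is mounted only for this RUN step and is not written to a layer.
# "npmrc" is a hypothetical secret id supplied at build time.
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev

COPY --chown=node:node . .

# The official node image ships a non-root "node" user.
USER node

# Probe with the runtime itself so the check does not depend on curl or wget.
# Port 3000 and /healthz are assumptions about the app.
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD ["node", "-e", "fetch('http://127.0.0.1:3000/healthz').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"]

# Exec form: node runs as PID 1 and receives signals directly.
CMD ["node", "app.js"]
```

A build along the lines of docker build --secret id=npmrc,src=$HOME/.npmrc -t test-build . would supply the secret without it appearing in ENV, ARG, or any copied file.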
Performance
- Order Dockerfile layers from stable to volatile and use cache mounts for package-manager caches where BuildKit is available.
- Keep build contexts small with .dockerignore; accidental monorepo contexts dominate build time.
- Use multi-stage builds and slim runtime images, but measure startup and debug needs before stripping tools aggressively.
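The stable-to-volatile ordering and the cache mount might look like the following for a Python service. This is a sketch under stated assumptions: the python:3.12-slim tag is illustrative, requirements.txt is assumed to exist at the context root, and hardening steps (non-root user, healthcheck) are left out to keep the focus on layer order.

```dockerfile
# syntax=docker/dockerfile:1
# Illustrative tag; pin to the version or digest you have verified.
FROM python:3.12-slim
WORKDIR /app

# Stable layer: the dependency manifest changes far less often than source code.
COPY requirements.txt ./

# The cache mount keeps pip's download cache across builds without storing it
# in the image. Do not combine it with --no-cache-dir, which would bypass it.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Volatile layer last: editing source invalidates only the layers from here on.
COPY . .
```

With this ordering, a source-only change reuses the cached dependency layer, and a requirements.txt change still benefits from previously downloaded wheels in the cache mount.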
Best Practices
- Pin base image digests for sensitive workloads and track rebuild cadence for security updates.
- Run as non-root and drop capabilities unless the workload genuinely needs them.
- Preview prune and volume-removal commands; persistent data must never be collateral cleanup.
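For base images that do not ship a ready-made unprivileged user, the first two practices could be sketched as below. BASE_IMAGE is a hypothetical build argument, UID/GID 10001 is an arbitrary choice, and /app/run is a placeholder entrypoint; none of these come from this skill.

```dockerfile
# syntax=docker/dockerfile:1
# BASE_IMAGE is a hypothetical build argument. For sensitive workloads, pass a
# digest-pinned reference at build time, for example:
#   docker build --build-arg BASE_IMAGE=debian:bookworm-slim@sha256:<digest> .
ARG BASE_IMAGE=debian:bookworm-slim
FROM ${BASE_IMAGE}

# Fixed numeric IDs (10001 is an arbitrary choice) keep file ownership
# predictable across rebuilds and hosts.
RUN groupadd --system --gid 10001 app \
 && useradd --system --uid 10001 --gid app --no-create-home \
      --shell /usr/sbin/nologin app

WORKDIR /app
COPY --chown=10001:10001 . .

# A numeric USER works even when the runtime cannot resolve user names.
USER 10001:10001

# /app/run is a placeholder for the real entrypoint.
CMD ["/app/run"]
```

Capabilities cannot be dropped from inside a Dockerfile; that is a runtime setting, for example docker run --cap-drop=ALL, adding back only what the workload needs.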
Workflow
Step 1: Determine the domain
Based on the request:
- "Write a Dockerfile" / "containerize this app" -> Dockerfile
- "Set up docker compose" / "multi-service stack" -> Compose
- "Harden this" / "make PCI compliant" / "scan for vulnerabilities" -> Security
- "Sign images" / "generate SBOM" / "CI pipeline" -> Registry & CI
- "Use Podman" / "rootless containers" / "daemonless builds" -> Runtimes
- "Review this Dockerfile/compose" -> Apply production checklist + AI self-check
Step 2: Gather context
Before writing anything, determine:
- Application type: language, framework, build system
- Runtime: Bun, Node.js, Python, Go, Rust, Java - determines base image and build pattern
- Environment: dev (hot reload, debug) vs production (minimal, hardened)
- Base image: Alpine (small, musl) vs Debian-slim (glibc compat) vs distroless (no shell) vs Chainguard (minimal, targets zero known CVEs) vs scratch (static binaries)
- Secrets: how are they injected? (env vars, mounted files, Docker secrets, vault)
- Compliance: PCI CDE? Regulated? What scanning/signing is required?
- Target registry: Docker Hub, GHCR, private registry, OCI-compliant?
- AI/ML: GPU workload? Model serving? Docker Model Runner?
Step 3: Build
Follow the domain-specific section below. Always apply the production checklist (Step 4) and AI self-check before finishing.
Step 4: Validate
# Dockerfile
docker build --no-cache -t test-build .
docker history test-build --format "{{.Size}}\t{{.CreatedBy}}" | head -15
docker scout quickview test-build # vulnerability overview
docker scout cves test-build # detailed CVE list
# Compose
docker compose config # validate and render
docker compose --dry-run up # dry-run startup (Compose v5)
# Security
docker scout cves --only-severity critical,high <image>
cosign verify --key <key> <image> # verify signature
syft <image> -o spdx-json # generate SBOM
grype <image> # vulnerability scan (alternative to Scout)
trivy image <image> # use v0.70.0+; never v0.69.4-6
Dockerfile
Read references/dockerfile-patterns.md for complete, production-ready Dockerfile templates (Node.js/Bun, Python, Go, Rust, static site) and BuildKit syntax reference.
Base image selection
- Need a shell or package manager: use slim Debian or Ubuntu bases.
- Need the smallest static runtime: use distroless or scratch.
- Need a hardened minimal userspace: use Chainguard or another verified Wolfi-style base.
- Keep builders and runtimes separate; golang, rust, and other heavy toolchain images stay in build stages only.
See references/dockerfile-patterns.md for the actual language-by-language base recommendations and templates.
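To make the builder/runtime split concrete, here is a minimal sketch for a statically linked Go service. It is not one of the referenced templates. The golang:1.23 tag is illustrative, the ./cmd/app package path is hypothetical, and the module is assumed to have both go.mod and go.sum.

```dockerfile
# syntax=docker/dockerfile:1
# Build stage: the heavy toolchain image never reaches production.
# Illustrative tag; pin to the version or digest you have verified.
FROM golang:1.23 AS build
WORKDIR /src

# Module files first so dependency downloads stay cached.
COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download

COPY . .

# CGO_ENABLED=0 produces a static binary that needs no libc at runtime.
# ./cmd/app is a hypothetical package path.
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 go build -trimpath -ldflags="-s -w" -o /out/app ./cmd/app

# Runtime stage: no shell, no package manager; the nonroot tag runs unprivileged.
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
```

Because the runtime stage has no shell, curl, or wget, any health probe has to come from the binary itself or from the orchestrator.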
Key patterns
Multi-stage builds - the non-negotiable pattern for any compiled or transpiled language:
# syntax=docker/dockerfile:1
FROM node:22-slim AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci
COPY . .
RUN npm run build && npm prune --omi