Terraform & OpenTofu: Production Infrastructure-as-Code

Write, review, and architect Terraform/OpenTofu infrastructure - from individual resources to multi-account, PCI-compliant platform architectures. The goal is reproducible, drift-free, auditable infrastructure that passes both peer review and QSA assessment.

Target versions (May 2026): Terraform 1.14.9 (IBM/HashiCorp, BSL; 1.15.0-rc2 in progress), OpenTofu 1.11.6 (Linux Foundation, MPL). Helm provider v3.1+, K8s provider v3.0+, AWS provider v6.x, Azure v4.x, GCP v7.x.

This skill covers HCL, modules, operations, state, CI/CD, policy-as-code, audit trails, PCI-DSS 4.0 controls, drift detection, and CDE isolation.

Terraform vs OpenTofu (2026)

IBM acquired HashiCorp for $6.4B (closed Feb 2025). Terraform stays BSL 1.1; OpenTofu is Linux Foundation/MPL.

Choose Terraform: HCP/TFE, Stacks, or vendor support.
Choose OpenTofu: client-side state encryption, BSL concerns, the enabled meta-argument, OCI registries, or Linux Foundation governance.
Shared protocol: most providers still work on both, for now.
CDKTF: deprecated Dec 2025 and archived; migrate to HCL or AWS CDK.
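
As an illustration of the client-side state encryption point above, the following is a minimal sketch of an OpenTofu encryption block. It assumes a passphrase-based key provider; the variable name state_passphrase and the block labels are hypothetical, and a KMS-backed key provider may suit production better. Terraform does not accept this block, so it belongs only in OpenTofu-only codebases.

```hcl
# Hypothetical variable: supply it via TF_VAR_state_passphrase or a secrets
# manager, never from a committed tfvars file. OpenTofu requires a passphrase
# of at least 16 characters for the pbkdf2 key provider.
variable "state_passphrase" {
  type      = string
  sensitive = true
}

terraform {
  encryption {
    # Derive an encryption key from the passphrase.
    key_provider "pbkdf2" "main" {
      passphrase = var.state_passphrase
    }

    # AES-GCM method using the derived key.
    method "aes_gcm" "main" {
      keys = key_provider.pbkdf2.main
    }

    # Encrypt state, and refuse to read or write unencrypted state.
    state {
      method   = method.aes_gcm.main
      enforced = true
    }

    # Encrypt saved plan files as well.
    plan {
      method   = method.aes_gcm.main
      enforced = true
    }
  }
}
```

With enforced = true, an existing unencrypted state has to be migrated deliberately first, so treat enabling it as a planned change rather than a drive-by edit.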

When to use

Writing or reviewing Terraform/OpenTofu configurations
Designing module architecture or registry patterns
Planning state management, backend strategy, or migration
Setting up CI/CD pipelines for IaC (plan/apply workflows)
Implementing policy-as-code gates (Checkov, OPA, Sentinel)
PCI-DSS 4.0 compliance for infrastructure provisioning
Multi-account/multi-cloud architecture with blast radius controls
Reviewing AI-generated Terraform for security and correctness

When NOT to use

Kubernetes manifests or Helm charts (use kubernetes)
Read-only Kubernetes cluster health checks after provisioning or maintenance (use cluster-health)
Ansible playbooks or configuration management (use ansible)
Docker/container optimization (use docker)
CI/CD pipeline design (use ci-cd)
Database engine configuration, schema design, or migrations (use databases)
Security auditing application code (use security-audit)

AI Self-Check

AI should never own terraform apply. In March 2026, an AI-assisted Terraform workflow deleted production infrastructure through escalating cleanup logic. Plan output is reviewed by a human. Always.

AI tools consistently produce the same Terraform mistakes. Before returning any generated HCL, verify against this list:

Current source checked: dated versions, CLI flags, API names, and support windows are verified against primary docs before repeating them
Hidden state identified: local config, credentials, caches, contexts, branches, cluster targets, or previous runs are made explicit before acting
Verification is real: final checks exercise the actual runtime, parser, service, or integration point instead of only linting prose or happy paths
Routing overlap checked: overlapping skills, trigger terms, and "When NOT to use" boundaries are checked before returning guidance
Spec claims verified: claims about tool behavior, output contracts, or repo conventions are checked against current docs, scripts, or skill files
Provider docs checked: resource arguments, defaults, and deprecations match pinned provider versions
State impact reviewed: imports, moves, destroys, and replacements are visible in plan output before apply
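
One way to keep imports visible in plan output, as the last item asks, is a declarative import block rather than an out-of-band CLI import. This is a minimal sketch; the bucket name and resource label are hypothetical.

```hcl
# Declarative import: the plan reports this resource as "to import", so a
# reviewer sees the state change before anything is applied.
import {
  to = aws_s3_bucket.audit_logs
  id = "example-audit-logs" # hypothetical existing bucket name
}

# The configuration the imported bucket must match.
resource "aws_s3_bucket" "audit_logs" {
  bucket = "example-audit-logs"
}
```

Once the import has been applied, the import block can be removed in a follow-up change.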

Performance

Scope plans to changed stacks/modules during iteration, then run full plans before merge.
Use remote state and data sources sparingly; excessive cross-stack reads slow plans and create hidden coupling.
Cache providers in CI and pin versions to avoid repeated downloads and surprise upgrades.
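
A minimal sketch of the version-pinning half of that advice follows. The exact constraints are assumptions chosen to match the target versions named earlier; adjust them to whatever your organisation has vetted.

```hcl
terraform {
  # Assumed floor; both Terraform and OpenTofu evaluate this constraint
  # against their own version number.
  required_version = ">= 1.11.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.0" # any 6.x release, never 7.0
    }
  }
}

# Caching is configured outside HCL: set the TF_PLUGIN_CACHE_DIR environment
# variable in CI to a directory the runner persists between jobs, and commit
# .terraform.lock.hcl so every run resolves the same provider builds.
```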

Best Practices

Never let automation apply production plans without a reviewed plan artifact and human approval.
Use moved blocks for refactors instead of delete/recreate churn.
Protect stateful resources with backups, prevent_destroy, and explicit migration steps.
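
The refactor and protection practices above might look like the following sketch. The resource names, sizes, and identifiers are hypothetical; the point is the moved block and the lifecycle settings, not the specific database.

```hcl
# Rename without destroy/recreate: the plan shows a move, not a replacement.
moved {
  from = aws_db_instance.db
  to   = aws_db_instance.primary
}

resource "aws_db_instance" "primary" {
  identifier        = "example-primary"
  engine            = "postgres"
  instance_class    = "db.t4g.medium"
  allocated_storage = 50
  username          = "app_admin"

  # Let RDS generate and store the master password in Secrets Manager,
  # so no password is passed through Terraform variables.
  manage_master_user_password = true

  storage_encrypted       = true
  backup_retention_period = 14

  # API-level guard against deletion, plus a final snapshot if it is
  # ever deleted on purpose.
  deletion_protection       = true
  skip_final_snapshot       = false
  final_snapshot_identifier = "example-primary-final"

  lifecycle {
    # Any plan that would destroy this resource fails instead.
    prevent_destroy = true
  }
}
```

prevent_destroy only protects the resource while the block remains in configuration, which is why it is paired here with deletion_protection and backups.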

Workflow

Step 1: Determine the domain

Based on the request:

"Create a VPC/RDS/EC2/resource" -> HCL
"Create a reusable module" -> Modules
"Set up state backend" / "migrate state" -> Operations
"Make this PCI compliant" / "policy gates" -> Compliance
"Review this Terraform" -> Apply production checklist + critical rules + AI self-check
"Review S3 buckets" -> S3 hardening review (see below) + AI self-check

Step 2: Gather requirements

Before writing HCL, determine:

Cloud provider(s) and account/project structure
Resource type and its dependencies
Environment (dev/staging/prod) and promotion strategy
State backend and locking mechanism
Compliance scope: PCI CDE? Regulated? What tags/policies apply?
Existing modules: reuse before creating new ones
Secrets: how are they injected? (Vault, SSM, Secrets Manager - never tfvars)
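
To make the backend, locking, and secrets questions concrete, here is a minimal sketch assuming AWS, an S3 backend with native lockfile locking, and a secret stored in SSM Parameter Store. The bucket name, state key, region, and parameter path are all hypothetical.

```hcl
terraform {
  backend "s3" {
    bucket       = "example-tfstate-prod"      # hypothetical bucket
    key          = "network/terraform.tfstate" # hypothetical state key
    region       = "us-east-1"
    encrypt      = true # server-side encryption of the state object
    use_lockfile = true # S3-native locking; needs a recent Terraform/OpenTofu
  }
}

# Read the secret at plan/apply time instead of passing it through tfvars.
# Note: the decrypted value is still written to state, so the state backend
# itself must be encrypted and access-controlled.
data "aws_ssm_parameter" "db_password" {
  name            = "/example/prod/db/password" # hypothetical parameter path
  with_decryption = true
}
```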

Step 3: Build

Follow the domain-specific section below. Always run terraform fmt, terraform validate, and Checkov before finishing.

Step 4: Validate

terraform fmt -check -recursive                      # Format check
terraform validate                                   # Syntax + provider validation
tflint --recursive                                   # Provider-specific linting
checkov -d . --framework terraform                   # Security/compliance scan
terraform plan -out=plan.tfplan                      # Save the plan for review
terraform show -json plan.tfplan | conftest test -   # Policy-as-code gate (OPA)
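
The tflint step only performs provider-specific checks when a ruleset plugin is configured. A minimal sketch of a .tflint.hcl for an AWS codebase is below. The plugin version shown is illustrative; pin to a release you have verified exists, and run tflint --init once to install it.

```hcl
# .tflint.hcl (assumed to live at the repository root)

# Bundled Terraform-language rules.
plugin "terraform" {
  enabled = true
  preset  = "recommended"
}

# AWS ruleset: catches issues such as invalid instance types.
plugin "aws" {
  enabled = true
  version = "0.40.0" # illustrative; check the ruleset's releases page
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

# Fail the lint when a module does not declare required_version.
rule "terraform_required_version" {
  enabled = true
}
```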

HCL Patterns

Resource structure

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = var.instance_type
  subnet_id     = var.private_subnet_id

  root_block_device {
    encrypted   = true
    kms_key_id  = var.kms_key_arn
    volume_size = 20
  }

  metadata_options {
    http_endpoint = "enabled"
    http_tokens   = "required"  # IMDSv2 - enforce this always
  }

  tags = merge(var.common_tags, {
    Name = "${var.project}-web-${var.environment}"
  })

  lifecycle {
    create_before_destroy = true
  }
}

Key patterns

Variables: type them. Default non-sensitive ones. Mark secrets sensitive. Use validation blocks for constraints.

variable "environment" {
  type        = string
  description = "Deployment environment"
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Must be dev, staging, or prod."
  }
}

variable "db_password" {
  type      = string
  sensitive = true
}
