AI Infrastructure as Code Review: Terraform and Claude

Q: Can I review Terraform plan output instead of source files?

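Yes, and it is often better. Run terraform plan -out=tfplan followed by terraform show -json tfplan > plan.json, then pass that file to tf_reviewer.py with --format plan. (terraform plan -json on its own emits a stream of machine-readable log messages, not a single plan document.) The plan JSON contains resolved variable values and the planned attributes of every resource, with values known only after apply marked as unknown, which gives Claude much more to work with than the raw HCL source.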
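Plan JSON for a real module can be large, so it may help to trim it before sending. The following is a minimal sketch, assuming the plan document has the usual resource_changes list with address, type, and change fields; slim_plan is a hypothetical helper, not part of tf_reviewer.py. It also masks top-level attributes that Terraform flags as sensitive, on the assumption that you would rather not send those values to an external API.

```python
import json


def slim_plan(plan: dict) -> list[dict]:
    """Keep only the parts of each resource change that a reviewer needs."""
    slimmed = []
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        actions = change.get("actions", [])
        if actions == ["no-op"]:
            continue  # unchanged resources cost tokens without adding review value
        after = change.get("after") or {}
        sensitive = change.get("after_sensitive") or {}
        if not isinstance(sensitive, dict):
            sensitive = {}
        # Mask top-level attributes that Terraform marks as sensitive
        values = {
            key: "(sensitive)" if sensitive.get(key) is True else value
            for key, value in after.items()
        }
        slimmed.append({
            "address": rc.get("address"),
            "type": rc.get("type"),
            "actions": actions,
            "after": values,
        })
    return slimmed


if __name__ == "__main__":
    # Tiny hand-written example in the shape of a plan document
    example = {
        "resource_changes": [
            {
                "address": "aws_cloudwatch_log_group.app",
                "type": "aws_cloudwatch_log_group",
                "change": {"actions": ["create"], "after": {"name": "/app/logs"}},
            }
        ]
    }
    print(json.dumps(slim_plan(example), indent=2))
```

Masking has a cost: a hardcoded password that the provider marks as sensitive will no longer be visible to the reviewer from the plan alone, so reviewing the .tf source as well remains useful.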

Q: How do I handle Terraform modules that reference external sources?

Claude can only review what you send it. For thorough coverage, either send the .terraform directory's cached module source alongside your root module, or focus the review on the arguments your root module passes to the child module.

Q: What happens if Claude returns unexpected output?

Because the POC uses tool_choice to force Claude to call report_findings, the output arrives as a tool input shaped by your schema, not as free-form text. If the API call fails, the script prints the error to stderr and exits with a non-zero code. If the response contains no report_findings tool_use block, the script dumps the full response to stderr and exits the same way.

By Asif·June 5, 2026·26 min read·AI Use Cases·Updated June 15, 2026

Series
AI in Production: 30 Real-World Use Cases with Claude

Part 9 of 30

TL;DR

AI infrastructure as code review with Claude catches security misconfigs, open security groups, and missing tags that manual reviewers routinely miss under deadline pressure.
You can send a raw .tf file or the plan JSON from terraform show -json to Claude and get back structured JSON findings, ready to plug into CI pipelines or Slack alerts.
The POC uses Claude’s structured output via tool use so findings are machine-readable and share one consistent shape across runs.
Prompt caching keeps costs low when you run many reviews in quick succession, because the shared system prompt is read from cache instead of being billed at the full input rate.
Claude reports issues in five categories: security, cost, tagging, reliability (including drift risk), and naming. You control which categories matter.
End-to-end: install, configure, run, and parse findings with a single Python script of roughly 300 lines.

Why Your Terraform Reviews Need a Second Set of Eyes

Every team that operates cloud infrastructure has a Terraform review horror story. A security group with 0.0.0.0/0 ingress on port 22 that slipped through a Friday-afternoon PR. An S3 bucket with public ACLs because the author copy-pasted from a three-year-old blog post. An RDS instance with deletion_protection = false that nobody noticed until the staging environment vanished. These are not hypothetical: they are recurring patterns in incident post-mortems across companies of every size.

Manual Terraform review is slow and inconsistent. Even the best engineers miss things when they are reviewing their fifteenth PR of the week. Static analysis tools like tfsec and checkov are valuable, but they check against fixed rule lists. They cannot reason about your specific context: “this environment is production, cost anomalies matter, and every resource must carry a cost-center tag or finance will reject the bill.”

AI infrastructure as code review with Claude fills that gap. You send Claude the Terraform source or plan output, describe your organization’s standards, and get back a structured list of findings with severity, category, and remediation advice. Claude can apply contextual judgment that rule-based scanners cannot.
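One way to "describe your organization's standards" is to append them to the reviewer's system prompt as a second text block. This is a minimal sketch, assuming the system prompt is passed to the API as a list of text blocks, as the POC later in this article does; build_system_blocks and the example rules are hypothetical.

```python
def build_system_blocks(base_prompt: str, org_standards: list[str]) -> list[dict]:
    """Combine a generic reviewer prompt with organization-specific rules."""
    standards_text = "Organization-specific standards:\n" + "\n".join(
        f"- {rule}" for rule in org_standards
    )
    return [
        {"type": "text", "text": base_prompt},
        # cache_control on the last block marks the whole prefix as cacheable
        {
            "type": "text",
            "text": standards_text,
            "cache_control": {"type": "ephemeral"},
        },
    ]


if __name__ == "__main__":
    blocks = build_system_blocks(
        "You review Terraform for security, cost, and tagging problems.",
        [
            "This environment is production.",
            "Every resource must carry a cost-center tag.",
        ],
    )
    for block in blocks:
        print(block["text"])
        print("---")
```

Keeping the standards in their own block means the generic prompt stays identical across teams, and only the short rules list changes.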

This article builds a complete, runnable POC. It uses the Anthropic Python SDK, Claude’s tool-use feature for structured JSON output, and optional prompt caching to keep costs low when reviewing large plans repeatedly. If you have not read Part 3 on structured JSON output or Part 4 on prompt caching, skim those first, since this article builds on both patterns.

What AI Infrastructure as Code Review Actually Checks

Before writing code, it helps to be specific about what categories of problems Claude can find in Terraform. The list below comes from real review sessions, not marketing copy.

Security Misconfigurations

Security groups with overly broad ingress rules (0.0.0.0/0 on administrative ports).
S3 buckets missing server_side_encryption_configuration or with acl = "public-read".
RDS instances with publicly_accessible = true.
IAM roles with Action = "*" or Resource = "*" in inline policies.
ECS tasks running with host-network mode or privileged containers.
Secrets passed as plain environment variables instead of via SSM or Secrets Manager references.
KMS keys with key rotation disabled.

Cost and Sizing Issues

EC2 instances with unexpectedly large instance types (a t3.2xlarge for what looks like a dev environment).
NAT gateways deployed per-AZ in environments where a single NAT would suffice.
RDS instances without auto_minor_version_upgrade or with Multi-AZ enabled in non-production.
CloudWatch log groups with no retention_in_days (they accumulate forever).
Provisioned DynamoDB capacity with no autoscaling policy.

Tagging and Governance

Resources missing required tags (env, owner, cost-center, managed-by).
Tag values that do not match your naming convention.
Resources that are not covered by a default_tags provider block.

Reliability and Drift Risk

Stateful resources without lifecycle { prevent_destroy = true } (the default is false).
Hardcoded AMI IDs that will drift as AWS updates base images.
Missing depends_on that could cause race conditions during apply.
Resources referencing data sources that may not exist in all environments.

System Architecture of the Reviewer

[Figure: Terraform AI Review: Data Flow. Terraform .tf / plan.json → tf_reviewer.py (load + chunk file, build messages) → Claude API (tool-use call to report_findings) → JSON findings (severity, category, fix) → CI pipeline / Slack (block PR or alert)]

Figure 1: The reviewer reads a .tf file or plan JSON, sends it to Claude with a tool-use schema, and parses the structured findings for CI or alerting downstream.

The flow is simple by design. Your CI job reads the Terraform source (or the plan JSON produced by terraform show -json), calls the Python script, and receives an array of finding objects. Each finding carries enough data to fail a PR check or route to the correct team.
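Routing by team can be as simple as grouping findings on their category field. The sketch below assumes findings shaped like the report_findings schema used in this article; the channel names and the route_findings helper are made up for illustration.

```python
from collections import defaultdict

# Hypothetical ownership map: finding category -> team channel
CATEGORY_OWNERS = {
    "security": "#security-team",
    "cost": "#finops",
    "tagging": "#finops",
    "reliability": "#platform",
    "naming": "#platform",
}


def route_findings(findings: list[dict], owners: dict = CATEGORY_OWNERS,
                   default: str = "#platform") -> dict[str, list[dict]]:
    """Group findings by the team responsible for their category."""
    routed = defaultdict(list)
    for finding in findings:
        owner = owners.get(finding.get("category"), default)
        routed[owner].append(finding)
    return dict(routed)


if __name__ == "__main__":
    sample = [
        {"id": "SEC-001", "category": "security", "severity": "critical"},
        {"id": "COST-001", "category": "cost", "severity": "high"},
        {"id": "TAG-001", "category": "tagging", "severity": "medium"},
    ]
    for channel, items in route_findings(sample).items():
        print(channel, [item["id"] for item in items])
```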

The Claude Tool-Use Pattern for Structured Findings

The key design decision in this POC is using Claude’s tool-use feature (covered in Part 2) rather than asking Claude to produce JSON in the response text. Tool use returns the findings as an already-parsed object shaped by the tool’s input schema, so there is no JSON to extract from prose and no markdown fence to strip.

The tool is named report_findings. Its schema describes an array of finding objects plus a summary and two severity counts. By passing tool_choice={"type": "tool", "name": "report_findings"}, we force Claude to call that tool and only that tool, so block.input is a dict with those fields, never free-form text.

Key idea: Structured output via tool use is more reliable than asking Claude to “respond with JSON”. The model fills in your schema instead of formatting text, which removes most malformed-output handling from your downstream parser. Two cases still deserve a check: a response cut off at max_tokens, and a finding that omits a required field. See Part 3 for the full explanation of this pattern.
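A lightweight check on the returned object is still cheap insurance before a CI job acts on it. The sketch below assumes the field names and enum values of the report_findings schema defined later in this article; validate_result is a hypothetical helper, not part of the POC, and it covers only the fields the pipeline relies on.

```python
REQUIRED_FIELDS = ("id", "severity", "category", "resource",
                   "title", "description", "remediation")
SEVERITIES = {"critical", "high", "medium", "low", "info"}
CATEGORIES = {"security", "cost", "tagging", "reliability", "naming"}


def validate_result(result: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means usable."""
    findings = result.get("findings")
    if not isinstance(findings, list):
        return ["'findings' is missing or not a list"]

    errors = []
    for index, finding in enumerate(findings):
        if not isinstance(finding, dict):
            errors.append(f"finding {index}: not an object")
            continue
        for field in REQUIRED_FIELDS:
            value = finding.get(field)
            if not isinstance(value, str) or not value:
                errors.append(f"finding {index}: missing or empty '{field}'")
        if finding.get("severity") not in SEVERITIES:
            errors.append(f"finding {index}: unexpected severity")
        if finding.get("category") not in CATEGORIES:
            errors.append(f"finding {index}: unexpected category")
    return errors


if __name__ == "__main__":
    incomplete = {"findings": [{"id": "SEC-001", "severity": "critical"}]}
    for problem in validate_result(incomplete):
        print(problem)
```

A caller could treat a non-empty error list the same way as a missing tool call: log it and fail the job without trusting the findings.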

Complete POC: Terraform Security and Cost Reviewer

Installation

pip install anthropic python-dotenv

requirements.txt

anthropic>=0.28.0
python-dotenv>=1.0.0

.env Example

# .env  (never commit this file)
ANTHROPIC_API_KEY=sk-ant-...

sample_infra.tf (Test Input)

# sample_infra.tf
# Intentionally contains several problems for the reviewer to find.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

# Problem 1: No default_tags block on the provider - tags must be set per-resource.

resource "aws_security_group" "web" {
  name        = "web-sg"
  description = "Web server security group"
  vpc_id      = "vpc-0abc1234def56789"

  # Problem 2: SSH open to the world
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Problem 3: All traffic egress - acceptable but worth flagging
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Problem 4: Missing tags
}

resource "aws_instance" "app" {
  ami           = "ami-0c55b159cbfafe1f0"  # Problem 5: hardcoded AMI
  instance_type = "t3.2xlarge"             # Problem 6: suspiciously large for a sample

  vpc_security_group_ids = [aws_security_group.web.id]

  # Problem 7: no key_name - SSH access via security group is open but no key pair is set

  tags = {
    Name = "app-server"
    # Missing: env, owner, cost-center
  }
}

resource "aws_s3_bucket" "data" {
  bucket = "my-company-data-bucket"

  # Problem 8: no encryption configuration
  # Problem 9: no versioning
}

resource "aws_s3_bucket_acl" "data_acl" {
  bucket = aws_s3_bucket.data.id
  acl    = "public-read"  # Problem 10: PUBLIC READ on a bucket named "data"
}

resource "aws_db_instance" "postgres" {
  identifier        = "app-postgres"
  engine            = "postgres"
  engine_version    = "15.3"
  instance_class    = "db.t3.micro"
  allocated_storage = 20

  db_name  = "appdb"
  username = "admin"
  password = "hardcoded-password-123"  # Problem 11: hardcoded secret

  publicly_accessible    = true   # Problem 12: public DB
  skip_final_snapshot    = true   # Problem 13: no snapshot on destroy
  deletion_protection    = false  # Problem 14: deletion protection off

  tags = {
    Name = "app-postgres"
    # Missing required tags
  }
}

resource "aws_cloudwatch_log_group" "app" {
  name = "/app/logs"
  # Problem 15: no retention_in_days - logs accumulate forever
}

tf_reviewer.py (Full Source)

#!/usr/bin/env python3
"""
tf_reviewer.py
--------------
AI infrastructure as code review using Claude.
Reads a Terraform file (or plan JSON) and returns findings as structured JSON.

Usage:
    python tf_reviewer.py path/to/infra.tf
    python tf_reviewer.py path/to/plan.json --format plan
"""

import json
import sys
import os
import argparse
import textwrap
from pathlib import Path

import anthropic
from dotenv import load_dotenv

load_dotenv()

# ---------------------------------------------------------------------------
# Configuration
# ---------------------------------------------------------------------------

MODEL = "claude-sonnet-4-6"
MAX_TOKENS = 4096

# The JSON schema for a single finding
FINDING_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {
            "type": "string",
            "description": "Short unique identifier, e.g. SEC-001"
        },
        "severity": {
            "type": "string",
            "enum": ["critical", "high", "medium", "low", "info"],
            "description": "Severity level"
        },
        "category": {
            "type": "string",
            "enum": ["security", "cost", "tagging", "reliability", "naming"],
            "description": "Problem category"
        },
        "resource": {
            "type": "string",
            "description": "The Terraform resource address, e.g. aws_instance.app"
        },
        "attribute": {
            "type": "string",
            "description": "The specific attribute or block that has the problem"
        },
        "title": {
            "type": "string",
            "description": "One-line title of the finding"
        },
        "description": {
            "type": "string",
            "description": "Detailed explanation of why this is a problem"
        },
        "remediation": {
            "type": "string",
            "description": "Concrete Terraform snippet or action to fix the issue"
        }
    },
    "required": ["id", "severity", "category", "resource", "title", "description", "remediation"]
}

# Tool definition: Claude must call this tool with all findings
REPORT_TOOL = {
    "name": "report_findings",
    "description": (
        "Report all security, cost, tagging, reliability, and naming findings "
        "discovered in the Terraform source or plan. Call this tool exactly once "
        "with the complete list of findings."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "findings": {
                "type": "array",
                "items": FINDING_SCHEMA,
                "description": "Array of all findings. May be empty if no issues are found."
            },
            "summary": {
                "type": "string",
                "description": "One-paragraph executive summary of the review results."
            },
            "critical_count": {
                "type": "integer",
                "description": "Number of critical severity findings."
            },
            "high_count": {
                "type": "integer",
                "description": "Number of high severity findings."
            }
        },
        "required": ["findings", "summary", "critical_count", "high_count"]
    }
}

# ---------------------------------------------------------------------------
# System prompt (cached for cost efficiency on large files reviewed repeatedly)
# ---------------------------------------------------------------------------

SYSTEM_PROMPT = textwrap.dedent("""
    You are a senior cloud security and cost engineer who reviews Terraform infrastructure-as-code.
    Your job is to find ALL problems in the provided Terraform source or plan JSON, categorized as:

    SECURITY (highest priority):
    - Security groups with 0.0.0.0/0 ingress on any port, especially 22 (SSH), 3389 (RDP), 5432 (Postgres), 3306 (MySQL)
    - S3 buckets with public ACLs or missing server-side encryption
    - RDS instances or Redshift clusters with publicly_accessible = true
    - IAM policies with Action="*" or Resource="*"
    - Hardcoded secrets, passwords, or API keys in resource attributes
    - Resources missing encryption at rest or in transit settings
    - KMS keys with key rotation disabled
    - ECS tasks running privileged containers or with host network mode

    COST:
    - Oversized instance types for the apparent use case
    - CloudWatch log groups without retention_in_days (they accumulate indefinitely)
    - NAT gateways deployed unnecessarily per-AZ in non-production environments
    - Provisioned DynamoDB capacity without autoscaling
    - RDS Multi-AZ enabled in non-production environments
    - Resources without lifecycle rules that could accumulate storage costs

    TAGGING:
    - Resources missing required tags: env, owner, cost-center, managed-by
    - Provider blocks missing default_tags
    - Tag values that look like placeholders or test values

    RELIABILITY:
    - Stateful resources without lifecycle { prevent_destroy = true }
    - RDS instances with skip_final_snapshot = true or deletion_protection = false
    - Hardcoded AMI IDs that will drift
    - Missing depends_on for hidden dependencies that Terraform cannot infer from references

    NAMING:
    - Resource names that do not follow a clear convention
    - Ambiguous or generic names that will cause confusion at scale

    Rules for your review:
    1. Be thorough. Check every resource block.
    2. Assign severity: critical (immediate risk, e.g. exposed secrets or fully open SGs),
       high (significant risk), medium (should fix soon), low (best practice), info (FYI).
    3. For each finding, give a concrete remediation: an actual Terraform snippet or specific action.
    4. Do NOT invent problems that are not actually in the code.
    5. Call report_findings exactly once with the complete array.
""").strip()

# ---------------------------------------------------------------------------
# Core reviewer function
# ---------------------------------------------------------------------------

def review_terraform(terraform_content: str, file_label: str = "infra.tf") -> dict:
    """
    Send Terraform content to Claude for AI infrastructure as code review.
    Returns the parsed tool input dict from Claude's report_findings call.
    """
    client = anthropic.Anthropic()

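    # Build the user message. Plan JSON starts with "{", so label and fence it as JSON.
    is_plan_json = terraform_content.lstrip().startswith("{")
    kind = "plan JSON" if is_plan_json else "source file"
    fence = "json" if is_plan_json else "hcl"
    user_message = (
        f"Please review the following Terraform {kind} (`{file_label}`) "
        "and call report_findings with all issues you discover.\n\n"
        f"```{fence}\n{terraform_content}\n```"
    )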

    # Use cached system prompt for efficiency (saves tokens on repeated calls)
    system_block = [
        {
            "type": "text",
            "text": SYSTEM_PROMPT,
            "cache_control": {"type": "ephemeral"}
        }
    ]

    try:
        response = client.messages.create(
            model=MODEL,
            max_tokens=MAX_TOKENS,
            system=system_block,
            tools=[REPORT_TOOL],
            tool_choice={"type": "tool", "name": "report_findings"},
            messages=[
                {"role": "user", "content": user_message}
            ]
        )
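    except anthropic.APIError as exc:
        print(f"[ERROR] Claude API call failed: {exc}", file=sys.stderr)
        sys.exit(2)  # 2 = reviewer failure; 1 is reserved for findings at or above --fail-on

    # Extract the tool_use block
    tool_block = None
    for block in response.content:
        if block.type == "tool_use" and block.name == "report_findings":
            tool_block = block
            break

    if tool_block is None:
        print("[ERROR] Claude did not call report_findings. Response:", file=sys.stderr)
        print(response.model_dump_json(indent=2), file=sys.stderr)
        sys.exit(2)

    # A response cut off at max_tokens can carry an incomplete findings list
    if response.stop_reason == "max_tokens":
        print("[ERROR] Response hit MAX_TOKENS; findings may be truncated.", file=sys.stderr)
        sys.exit(2)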

    # Log token usage (useful for cost tracking)
    usage = response.usage
    cache_creation = getattr(usage, "cache_creation_input_tokens", 0) or 0  # field may be None
    cache_read = getattr(usage, "cache_read_input_tokens", 0) or 0
    print(
        f"[INFO] Tokens: input={usage.input_tokens} output={usage.output_tokens} "
        f"cache_created={cache_creation} cache_read={cache_read}",
        file=sys.stderr
    )

    return tool_block.input

# ---------------------------------------------------------------------------
# Output formatting
# ---------------------------------------------------------------------------

SEVERITY_ICON = {
    "critical": "[CRITICAL]",
    "high":     "[HIGH]    ",
    "medium":   "[MEDIUM]  ",
    "low":      "[LOW]     ",
    "info":     "[INFO]    ",
}

def print_findings(result: dict) -> None:
    """Pretty-print findings to stdout in a CI-friendly format."""
    findings = result.get("findings", [])
    summary = result.get("summary", "")
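    # Count from the findings themselves; the model-reported counts can disagree with the list
    critical = sum(1 for f in findings if f.get("severity") == "critical")
    high = sum(1 for f in findings if f.get("severity") == "high")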

    print("\n" + "=" * 70)
    print("  TERRAFORM AI REVIEW RESULTS")
    print("=" * 70)
    print(f"\nSummary: {summary}\n")
    print(f"Critical findings : {critical}")
    print(f"High findings     : {high}")
    print(f"Total findings    : {len(findings)}\n")
    print("-" * 70)

    # Sort by severity for readability
    severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}
    sorted_findings = sorted(findings, key=lambda f: severity_order.get(f.get("severity", "info"), 5))

    for finding in sorted_findings:
        sev = finding.get("severity", "info")
        icon = SEVERITY_ICON.get(sev, "[???]     ")
        print(f"{icon} {finding.get('id', 'N/A')} | {finding.get('resource', 'unknown')}")
        print(f"  Category : {finding.get('category', '?')}")
        print(f"  Title    : {finding.get('title', '')}")
        print(f"  Detail   : {finding.get('description', '')}")
        print(f"  Fix      : {finding.get('remediation', '')}")
        print()

    print("=" * 70)

# ---------------------------------------------------------------------------
# Exit code logic for CI integration
# ---------------------------------------------------------------------------

def exit_code(result: dict, fail_on: str = "high") -> int:
    """
    Return non-zero if any finding meets or exceeds the fail_on severity.
    fail_on: 'critical' | 'high' | 'medium' | 'low' | 'never'
    """
    if fail_on == "never":
        return 0
    order = ["critical", "high", "medium", "low", "info"]
    threshold = order.index(fail_on) if fail_on in order else 1
    for finding in result.get("findings", []):
        sev = finding.get("severity", "info")
        if sev in order and order.index(sev) <= threshold:
            return 1
    return 0

# ---------------------------------------------------------------------------
# Main entry point
# ---------------------------------------------------------------------------

def main():
    parser = argparse.ArgumentParser(description="AI Terraform reviewer powered by Claude")
    parser.add_argument("terraform_file", help="Path to .tf file or terraform plan JSON")
    parser.add_argument(
        "--format", choices=["hcl", "plan"], default="hcl",
        help="Input format: 'hcl' for .tf source, 'plan' for terraform show -json output"
    )
    parser.add_argument(
        "--output-json", metavar="FILE",
        help="Write raw JSON findings to this file in addition to stdout"
    )
    parser.add_argument(
        "--fail-on", choices=["critical", "high", "medium", "low", "never"],
        default="high",
        help="Exit with code 1 if any finding at or above this severity is found (default: high)"
    )
    args = parser.parse_args()

    tf_path = Path(args.terraform_file)
    if not tf_path.exists():
        print(f"[ERROR] File not found: {tf_path}", file=sys.stderr)
        sys.exit(2)

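    terraform_content = tf_path.read_text(encoding="utf-8")
    if args.format == "plan":
        # Fail fast when the plan file is not a single JSON document
        try:
            json.loads(terraform_content)
        except json.JSONDecodeError as exc:
            print(f"[ERROR] {tf_path} is not valid plan JSON: {exc}", file=sys.stderr)
            sys.exit(2)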
    print(f"[INFO] Reviewing {tf_path.name} ({len(terraform_content)} chars)...", file=sys.stderr)

    result = review_terraform(terraform_content, file_label=tf_path.name)

    # Print human-readable summary
    print_findings(result)

    # Optionally write raw JSON
    if args.output_json:
        out_path = Path(args.output_json)
        out_path.write_text(json.dumps(result, indent=2), encoding="utf-8")
        print(f"[INFO] JSON findings written to {out_path}", file=sys.stderr)

    # Also print JSON to stdout for piping
    print("\n--- RAW JSON ---")
    print(json.dumps(result, indent=2))

    sys.exit(exit_code(result, fail_on=args.fail_on))


if __name__ == "__main__":
    main()

Sample Run and Output

$ python tf_reviewer.py sample_infra.tf --fail-on high

[INFO] Reviewing sample_infra.tf (2847 chars)...
[INFO] Tokens: input=1843 output=2104 cache_created=612 cache_read=0

======================================================================
  TERRAFORM AI REVIEW RESULTS
======================================================================

Summary: This Terraform configuration has 4 critical and 5 high severity
findings that must be remediated before deploying to any environment.
The most urgent issues are a hardcoded database password, a publicly
accessible RDS instance, an S3 bucket with public-read ACL, and SSH
open to 0.0.0.0/0. Cost and tagging gaps are also present.

Critical findings : 4
High findings     : 5
Total findings    : 15

----------------------------------------------------------------------
[CRITICAL] SEC-001 | aws_db_instance.postgres
  Category : security
  Title    : Hardcoded database password in plain text
  Detail   : The password attribute contains a literal string. This will
             be stored in Terraform state (which may be stored in S3 or
             a remote backend) and in version control history.
  Fix      : Use a data source: password = data.aws_secretsmanager_secret_version.db.secret_string
             or a variable marked sensitive = true with the value supplied
             via environment variable TF_VAR_db_password.

[CRITICAL] SEC-002 | aws_s3_bucket_acl.data_acl
  Category : security
  Title    : S3 bucket set to public-read
  Detail   : A bucket named 'data' is explicitly granted public-read ACL.
             Any object uploaded to this bucket is world-readable.
  Fix      : Remove the aws_s3_bucket_acl resource entirely. Add:
             resource "aws_s3_bucket_public_access_block" "data" {
               bucket                  = aws_s3_bucket.data.id
               block_public_acls       = true
               block_public_policy     = true
               ignore_public_acls      = true
               restrict_public_buckets = true
             }

[CRITICAL] SEC-003 | aws_db_instance.postgres
  Category : security
  Title    : RDS instance publicly accessible
  Detail   : publicly_accessible = true exposes the database endpoint on
             a public IP. Combined with no VPC-level network isolation
             visible in this file, this is a direct exposure risk.
  Fix      : Set publicly_accessible = false. Access the DB from application
             servers within the same VPC using private subnets.

[CRITICAL] SEC-004 | aws_security_group.web
  Category : security
  Title    : SSH (port 22) open to 0.0.0.0/0
  Detail   : Any IP on the internet can attempt SSH connections. This is
             one of the most commonly exploited misconfigurations.
  Fix      : Restrict to your bastion or VPN CIDR:
             cidr_blocks = ["10.0.0.0/8"]
             or use AWS Systems Manager Session Manager and remove port 22 entirely.

[HIGH]     SEC-005 | aws_s3_bucket.data
  Category : security
  Title    : S3 bucket missing server-side encryption
  Detail   : No server_side_encryption_configuration block is defined.
             All objects are stored unencrypted.
  Fix      : resource "aws_s3_bucket_server_side_encryption_configuration" "data" {
               bucket = aws_s3_bucket.data.id
               rule {
                 apply_server_side_encryption_by_default {
                   sse_algorithm = "aws:kms"
                 }
               }
             }

[HIGH]     SEC-006 | aws_s3_bucket.data
  Category : security
  Title    : S3 bucket versioning not enabled
  Detail   : Without versioning, overwritten or deleted objects cannot be
             recovered. For a bucket named 'data' this is particularly risky.
  Fix      : resource "aws_s3_bucket_versioning" "data" {
               bucket = aws_s3_bucket.data.id
               versioning_configuration { status = "Enabled" }
             }

[HIGH]     REL-001 | aws_db_instance.postgres
  Category : reliability
  Title    : skip_final_snapshot = true and deletion_protection = false
  Detail   : Deleting this RDS instance (accidentally or via terraform destroy)
             will leave no snapshot. Data is irrecoverable.
  Fix      : Set deletion_protection = true and skip_final_snapshot = false.
             Add final_snapshot_identifier = "app-postgres-final-$(timestamp)".

[HIGH]     COST-001 | aws_cloudwatch_log_group.app
  Category : cost
  Title    : CloudWatch log group has no retention policy
  Detail   : Logs will accumulate indefinitely. At $0.03/GB/month this adds
             up quickly for a busy application.
  Fix      : Add retention_in_days = 30 (or your compliance-required value).

[HIGH]     COST-002 | aws_instance.app
  Category : cost
  Title    : Instance type t3.2xlarge may be oversized
  Detail   : t3.2xlarge (8 vCPU / 32 GB RAM) costs ~$240/month on-demand.
             Without context this appears large. Confirm this is intentional.
  Fix      : Verify sizing. If this is a dev/staging instance, consider
             t3.medium or t3.large with auto-stop schedules.

[MEDIUM]   TAG-001 | aws_instance.app
  Category : tagging
  Title    : Missing required tags: env, owner, cost-center
  Detail   : Only the Name tag is set. Finance teams cannot allocate costs
             without cost-center. Ops cannot identify owners without owner.
  Fix      : tags = { Name = "app-server", env = "production",
                      owner = "platform-team", cost-center = "eng-infra" }

[MEDIUM]   TAG-002 | aws_db_instance.postgres
  Category : tagging
  Title    : Missing required tags: env, owner, cost-center
  Detail   : Same tagging gap as the EC2 instance.
  Fix      : Add the same required tag set to the RDS resource.

[MEDIUM]   TAG-003 | provider.aws
  Category : tagging
  Title    : No default_tags block on the AWS provider
  Detail   : Without default_tags, every resource must set tags individually.
             A provider-level default_tags block ensures consistency.
  Fix      : provider "aws" { region = "us-east-1"
               default_tags { tags = { managed-by = "terraform",
                                       project = "my-app" } } }

[MEDIUM]   REL-002 | aws_instance.app
  Category : reliability
  Title    : Hardcoded AMI ID will drift
  Detail   : ami-0c55b159cbfafe1f0 is a specific AMI version. When AWS
             retires it or you move regions, this breaks.
  Fix      : Use a data source: data "aws_ami" "amazon_linux" {
               most_recent = true
               owners = ["amazon"]
               filter { name = "name" values = ["amzn2-ami-hvm-*-x86_64-gp2"] }
             }

[LOW]      REL-003 | aws_security_group.web
  Category : security
  Title    : Security group has no tags
  Detail   : Untagged security groups are hard to audit and correlate.
  Fix      : Add tags = { Name = "web-sg", env = "production" }

[INFO]     SEC-007 | aws_instance.app
  Category : security
  Title    : No key_name set on EC2 instance
  Detail   : SSH is open via the security group but no key pair is assigned.
             This may be intentional (e.g. using SSM only) but is worth confirming.
  Fix      : If SSH is needed: key_name = aws_key_pair.deploy.key_name.
             If SSM is used exclusively: remove port 22 ingress from the SG.

======================================================================

--- RAW JSON ---
{
  "findings": [
    {
      "id": "SEC-001",
      "severity": "critical",
      "category": "security",
      "resource": "aws_db_instance.postgres",
      "attribute": "password",
      "title": "Hardcoded database password in plain text",
      "description": "The password attribute contains a literal string...",
      "remediation": "Use data.aws_secretsmanager_secret_version or a sensitive variable..."
    }
    // ... 14 more findings
  ],
  "summary": "This Terraform configuration has 4 critical and 5 high severity findings...",
  "critical_count": 4,
  "high_count": 5
}

The exit code from the script is 1 (because high-severity findings exist), which causes a CI job to fail the PR. Pass --fail-on critical if you only want to block on critical findings during early adoption.

Wiring AI Infrastructure as Code Review Into CI

GitHub Actions Example

# .github/workflows/terraform-review.yml
name: Terraform AI Review

on:
  pull_request:
    paths:
      - '**.tf'
      - '**.tfvars'

jobs:
  ai-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: pip install anthropic python-dotenv

      - name: Run Terraform AI review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          python tf_reviewer.py infra/main.tf \
            --output-json findings.json \
            --fail-on high

      - name: Upload findings artifact
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: terraform-review-findings
          path: findings.json

GitLab CI / CD Example

terraform-ai-review:
  image: python:3.12-slim
  stage: test
  before_script:
    - pip install anthropic python-dotenv
  script:
    - python tf_reviewer.py infra/main.tf --fail-on high
  rules:
    - changes:
        - "**/*.tf"
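For the Slack side of the pipeline, the findings JSON can be condensed into a short message. This is a rough sketch, assuming a Slack incoming webhook that accepts a JSON body with a text field; build_slack_payload and post_to_slack are hypothetical helpers, and the webhook URL would come from your own CI secrets.

```python
import json
import urllib.request


def build_slack_payload(result: dict, max_items: int = 5) -> dict:
    """Summarize critical and high findings as a plain-text Slack message."""
    findings = result.get("findings", [])
    blocking = [f for f in findings if f.get("severity") in ("critical", "high")]
    lines = [
        f"Terraform AI review: {len(blocking)} blocking of {len(findings)} findings"
    ]
    for finding in blocking[:max_items]:
        lines.append(
            f"- [{finding['severity'].upper()}] "
            f"{finding.get('resource', 'unknown')}: {finding.get('title', '')}"
        )
    if len(blocking) > max_items:
        lines.append(f"... and {len(blocking) - max_items} more")
    return {"text": "\n".join(lines)}


def post_to_slack(webhook_url: str, payload: dict) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

In a CI job this would typically read findings.json, the file written by --output-json, and call post_to_slack only when the reviewer exits with code 1.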

Understanding the Cost and Latency Profile

Scenario	Model	Approx Input Tokens	Approx Output Tokens	Latency	Cost per Run
Small .tf file (200 lines)	claude-sonnet-4-6	1,800	2,000	8-12 s	~$0.024
Large .tf file (1,000 lines)	claude-sonnet-4-6	6,000	3,500	15-25 s	~$0.060
terraform plan JSON (~2,000 lines)	claude-sonnet-4-6	12,000	4,000	20-35 s	~$0.118
Same large file, cached system prompt	claude-sonnet-4-6	6,000 (1,800 cached)	3,500	12-20 s	~$0.043
Quick classification only	claude-haiku-4-5	6,000	2,000	5-10 s	~$0.009

At these prices, even if you run the reviewer on every PR that touches a .tf file, a team making 50 PRs per month against a typical-sized module spends roughly $3 to $6 per month. That is less than one hour of engineer time, which is what a thorough manual review actually costs.
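To redo this arithmetic with your own numbers, a small estimator is enough. The sketch below is an illustration only: the prices in the example call are placeholders, not Anthropic's actual rates, and estimate_run_cost is a hypothetical helper. It assumes cached reads are billed at a fraction of the normal input price and ignores any surcharge for cache writes.

```python
def estimate_run_cost(
    total_input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
    cached_read_tokens: int = 0,
    cache_read_multiplier: float = 0.1,
) -> float:
    """Return the estimated cost in dollars for one review call."""
    uncached = total_input_tokens - cached_read_tokens
    input_cost = uncached * input_price_per_mtok
    cached_cost = cached_read_tokens * input_price_per_mtok * cache_read_multiplier
    output_cost = output_tokens * output_price_per_mtok
    return (input_cost + cached_cost + output_cost) / 1_000_000


if __name__ == "__main__":
    # Placeholder prices in dollars per million tokens; substitute current rates.
    per_run = estimate_run_cost(
        total_input_tokens=6_000,
        output_tokens=3_500,
        input_price_per_mtok=2.0,
        output_price_per_mtok=10.0,
    )
    print(f"per run: ${per_run:.4f}  |  50 PRs per month: ${per_run * 50:.2f}")
```

Output tokens usually dominate the bill for this workload, so the length of the findings list matters more than the size of the input file.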

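The prompt caching feature (marked "cache_control": {"type": "ephemeral"} on the system prompt in our code) pays off when you review multiple files in the same process or the same file multiple times in quick succession. On the second and subsequent calls, the cached prefix, which covers the tool definitions as well as the 612-token system prompt, is read from cache at one-tenth the cost of a regular input token. The first call pays a small premium to write the cache, and the default cache entry expires after about five minutes without a hit, so the savings depend on calls arriving close together. Prefixes shorter than the model’s minimum cacheable length are not cached at all; check cache_read_input_tokens in the response usage to confirm you are getting hits. For a large review job processing 20 modules in a single pipeline run, this adds up to real savings.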

Model Choice: When to Use Which Tier

Model	Best For This Use Case	When to Avoid
claude-haiku-4-5	High-volume pre-screening: quickly flag files that have any issue before a deeper review	Final security sign-off; complex policy evaluation; large plan files
claude-sonnet-4-6	Standard PR review, production-grade findings, the default for this POC	Rarely; this model handles almost all Terraform review tasks well
claude-opus-4-8	Auditing very large, complex modules with many interdependencies; compliance certification reviews	Routine PR checks (cost is higher; Sonnet is sufficient for most cases)

A practical pattern for cost control: run claude-haiku-4-5 first. If it returns any critical or high findings, escalate to claude-sonnet-4-6 for the full detailed report. This two-stage approach cuts costs by 60-70 percent on repositories where most PRs are clean.
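The 60-70 percent figure can be sanity-checked with a little arithmetic. The sketch below is only an illustration: it takes the approximate per-run costs from the table above (Haiku pre-screen and Sonnet review of the same large file), and the function names and the escalation rate (the share of PRs that Haiku flags and that go on to Sonnet) are assumptions introduced here, not part of the POC.

```python
# Approximate per-run costs in USD, taken from the cost table above.
HAIKU_PRESCREEN_COST = 0.009
SONNET_FULL_REVIEW_COST = 0.060


def two_stage_cost(escalation_rate: float) -> float:
    """Expected cost per PR when every PR gets a Haiku pre-screen and only
    a fraction (escalation_rate) goes on to a full Sonnet review."""
    return HAIKU_PRESCREEN_COST + escalation_rate * SONNET_FULL_REVIEW_COST


def savings_vs_sonnet_only(escalation_rate: float) -> float:
    """Fraction of spend saved compared with running Sonnet on every PR."""
    return 1 - two_stage_cost(escalation_rate) / SONNET_FULL_REVIEW_COST


if __name__ == "__main__":
    # Hypothetical escalation rates: 15, 20, and 25 percent of PRs flagged.
    for rate in (0.15, 0.20, 0.25):
        print(
            f"escalation {rate:.0%}: "
            f"${two_stage_cost(rate):.4f} per PR, "
            f"{savings_vs_sonnet_only(rate):.0%} saved"
        )
```

With one PR in five escalated, the expected cost is about $0.021 per PR, a saving of roughly 65 percent. The saving shrinks as the escalation rate rises and turns negative once nearly every PR is escalated, so the pattern only helps when most PRs really are clean.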

Two-Stage Review: Cost Optimization

[Diagram: Stage 1 (claude-haiku-4-5, fast pre-screen) branches two ways. Clean: PASS, PR approved. Issues found: Stage 2 (claude-sonnet-4-6, full detailed report), which produces JSON findings, blocks the PR, and alerts the owner.]

Figure 2: Two-stage review pattern. Haiku handles fast pre-screening; Sonnet only runs when issues are detected. This typically cuts API spend by 60-70 percent on a clean codebase.

Common Pitfalls When Building Terraform AI Reviews

Sending the Wrong Input Format

The Terraform source (.tf files) and the plan JSON contain very different information. The plan JSON is more complete because it includes the resolved values of variables, data sources, and computed attributes. A security group whose CIDR block comes from a variable looks clean in the source file but reveals 0.0.0.0/0 in the plan. Where possible, feed the plan JSON for the most accurate review. Note that the full plan representation comes from saving the plan and rendering it (terraform plan -out=tfplan, then terraform show -json tfplan > plan.json); terraform plan -json on its own streams machine-readable log events rather than the resolved resource values. If your pipeline does not run terraform plan before the review, at minimum send all .tf files in the module, not just main.tf.
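If you stay with source review, one way to send the whole module rather than just main.tf is to concatenate its files before calling the reviewer. This is a minimal sketch under assumptions: bundle_module_source is a hypothetical helper (not part of tf_reviewer.py), and the per-file header comment is just one convention for letting findings cite the right file.

```python
from pathlib import Path


def bundle_module_source(module_dir: str) -> str:
    """Concatenate every .tf file directly inside module_dir into one string.

    Each file is preceded by a header comment so that findings can refer
    to the file a resource came from.
    """
    parts = []
    for tf_file in sorted(Path(module_dir).glob("*.tf")):
        body = tf_file.read_text(encoding="utf-8")
        parts.append(f"# ---- file: {tf_file.name} ----\n{body}")
    if not parts:
        raise FileNotFoundError(f"no .tf files found in {module_dir}")
    return "\n\n".join(parts)
```

The bundled string can then be written to a temporary file and passed to the reviewer in place of infra/main.tf. Only files directly in the directory are included; child module directories would need their own call.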

Trusting Findings Blindly in Production Gates

Claude can hallucinate. It occasionally flags things that are not problems (false positives) or misunderstands context-specific design choices. Use the AI reviewer as a signal, not as a final arbiter. The recommended pattern: AI review runs automatically and posts findings as PR comments; a human must explicitly dismiss critical findings before merging. This keeps the process fast while keeping humans in the loop for consequential decisions.

Sending Secrets to the API

If your Terraform files contain hardcoded secrets (which is itself a finding), those secrets will be sent to Anthropic’s API. Before sending any file, consider running a secrets scanner like TruffleHog or detect-secrets and redacting values. Better still, fix the hardcoded secret first. The Claude API is covered by Anthropic’s data usage policies and does not train on API calls by default, but it is still better practice not to transmit secrets unnecessarily.
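As a rough illustration of the redaction step, the sketch below masks quoted literal values assigned to attributes whose names look sensitive. It is a naive, assumption-laden example: the name fragments and the redact_secrets helper are hypothetical, and a regex like this is no substitute for a dedicated scanner.

```python
import re

# Hypothetical attribute-name fragments that suggest a secret value.
SENSITIVE_NAME = (
    r"[A-Za-z0-9_]*(?:password|secret|token|api_key|access_key)[A-Za-z0-9_]*"
)

# Matches lines such as:  db_password = "hunter2"
# Only quoted literals are matched; references like var.db_password are left alone.
ASSIGNMENT = re.compile(
    rf'^(\s*{SENSITIVE_NAME}\s*=\s*)"[^"]*"',
    re.IGNORECASE | re.MULTILINE,
)


def redact_secrets(hcl_text: str) -> str:
    """Replace quoted literal values of sensitive-looking attributes
    with a fixed placeholder, keeping the attribute name intact."""
    return ASSIGNMENT.sub(r'\1"<REDACTED>"', hcl_text)
```

Because the attribute name and the fact that it held a literal both survive, the reviewer can still report the hardcoded secret as a finding without ever seeing its value.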

Context Window Limits on Very Large Plans

The plan JSON for a large infrastructure project can exceed 100,000 tokens. Claude Sonnet’s context window handles this, but cost grows linearly with input size. For very large plans, consider chunking by resource type or by module, running a separate review per chunk, and aggregating findings in your script. The tf_reviewer.py scaffold above is easy to extend for this: loop over chunks and concatenate the findings arrays.
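The chunk-and-aggregate loop might look like the sketch below. It assumes the input is the plan representation produced by terraform show -json, where each entry in resource_changes carries a module_address unless it belongs to the root module. Both function names are hypothetical, and review_chunk stands in for whatever function in your script sends one chunk to Claude and returns a list of findings.

```python
import json
from collections import defaultdict


def chunk_plan_by_module(plan: dict) -> dict[str, list[dict]]:
    """Group the plan's resource_changes by module address.

    Root-module resources have no module_address key, so they are
    grouped under the label "root".
    """
    chunks = defaultdict(list)
    for change in plan.get("resource_changes", []):
        chunks[change.get("module_address", "root")].append(change)
    return dict(chunks)


def review_in_chunks(plan: dict, review_chunk) -> list[dict]:
    """Call review_chunk (a hypothetical callable: JSON string in,
    list of findings out) once per module and concatenate the results."""
    findings = []
    for module, changes in sorted(chunk_plan_by_module(plan).items()):
        chunk_json = json.dumps({"module": module, "resource_changes": changes})
        findings.extend(review_chunk(chunk_json))
    return findings
```

Chunking by module keeps related resources together, which matters for findings that depend on more than one resource. Issues that span modules can still be missed when each chunk is reviewed in isolation.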

Not Providing Organizational Context

The default system prompt in this POC applies general best practices. If your organization has specific rules (“all EC2 instances must use a specific set of AMIs”, “Multi-AZ is required on production RDS”, “VPC flow logs must be enabled”), add them to the system prompt. The more specific you are, the more relevant and actionable the findings become. Generic review prompts produce generic findings.
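One simple way to add such rules is to keep them in a list and append them to the base prompt as a numbered block. The sketch below is illustrative only: BASE_SYSTEM_PROMPT is a placeholder rather than the POC’s actual prompt, and build_system_prompt is a hypothetical helper.

```python
# Placeholder text; substitute the system prompt your reviewer already uses.
BASE_SYSTEM_PROMPT = (
    "You are a Terraform reviewer. Apply general security, cost, "
    "and tagging best practices."
)

# Example rules taken from the paragraph above.
ORG_RULES = [
    "Multi-AZ is required on production RDS.",
    "VPC flow logs must be enabled.",
]


def build_system_prompt(base: str, rules: list[str]) -> str:
    """Append organization-specific rules to the base prompt as a numbered list."""
    if not rules:
        return base
    numbered = "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, start=1))
    return (
        f"{base}\n\n"
        "Organization-specific rules (report any violation as a finding):\n"
        f"{numbered}"
    )


if __name__ == "__main__":
    print(build_system_prompt(BASE_SYSTEM_PROMPT, ORG_RULES))
```

Keeping the rules in version control next to the reviewer script means a policy change goes through the same PR process as the infrastructure it governs.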

Ignoring the Exit Code

The script exits with code 1 when findings meet the severity threshold. If your CI pipeline does not check exit codes or uses || true to suppress failures, the review becomes informational-only. Wire the exit code to an actual gate: the job must fail, and the failure must block the PR merge.
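For reference, the threshold logic behind that exit code can be sketched as follows. This is an assumption about how such a check might be written, not the POC’s actual code: the four severity levels and the exit_code_for helper are hypothetical, with only "high" and "critical" confirmed by the --fail-on examples above.

```python
import sys

# Assumed severity scale, lowest to highest.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def exit_code_for(findings: list[dict], fail_on: str = "high") -> int:
    """Return 1 if any finding is at or above the fail_on severity, else 0."""
    threshold = SEVERITY_ORDER.index(fail_on)
    for finding in findings:
        if SEVERITY_ORDER.index(finding["severity"]) >= threshold:
            return 1
    return 0


if __name__ == "__main__":
    # Illustrative findings; a real run would take these from the review.
    sample = [{"severity": "medium"}, {"severity": "high"}]
    sys.exit(exit_code_for(sample, fail_on="high"))
```

An unknown severity string raises ValueError here instead of passing silently, which is the safer failure mode for a merge gate.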

For more on integrating Claude outputs into automated pipelines, see the Part 5 code review bot and Part 6 on PR summaries, which follow the same structured-output pattern.

Frequently Asked Questions

Can I review Terraform plan output instead of source files?

Yes, and it is often better. Run terraform plan -out=tfplan, then terraform show -json tfplan > plan.json, and pass that file to tf_reviewer.py. (terraform plan -json by itself emits a stream of log events, not the plan representation.) The plan JSON contains resolved variable values, computed attributes, and the final resource graph, which gives Claude much more to work with than the raw HCL source. Security group CIDR blocks that come from variables are invisible in source but fully resolved in the plan. The script handles both formats: the --format plan flag tells Claude that the input is a plan JSON rather than HCL.

How does this compare to tfsec or Checkov?

Static analysis tools like tfsec and Checkov work from a fixed rule database. They are fast, deterministic, and free. Claude-based review is contextual, handles novel patterns, and can apply organization-specific rules described in natural language. The right answer is both: run tfsec/Checkov in a separate step for fast, zero-cost baseline checks, and use the Claude reviewer for deeper contextual analysis, cost reasoning, and custom governance rules. They complement each other rather than compete.

Will Claude miss findings that tfsec would catch?

On well-known, standard misconfigurations, Claude is generally thorough. But it can miss things, especially in complex module compositions where the issue only becomes visible after full resolution. This is why the two-tool approach (static analyzer plus AI reviewer) is the recommended pattern. Treat Claude’s findings as additive to, not a replacement for, your existing static analysis step.

How do I handle Terraform modules that reference external sources?

Claude can only review what you send it. If your root module uses source = "terraform-aws-modules/vpc/aws", Claude sees that reference but not the module’s internals. For thorough coverage: either send the .terraform directory’s cached module source alongside your root module, or focus the review on the arguments your root module passes to the child module (which Claude can still assess for security-sensitive argument values).

Is my Terraform source code stored by Anthropic?

Per Anthropic’s usage policy, API inputs are not used to train models by default. Anthropic retains API request data for a limited period for safety monitoring and abuse prevention. If you are working with highly sensitive infrastructure (defense, financial services, regulated data), review the current Anthropic data usage policy at anthropic.com/privacy and consult your legal team before sending production Terraform plans through any external API.

Can I run this as a scheduled audit rather than a PR check?

Yes. Point the script at your entire Terraform repository, collect all .tf files, review each module directory, and aggregate findings into a weekly report. This is useful for catching configuration drift: your Terraform source may look clean, but someone applied an out-of-band change via the console, and the next terraform plan will show a diff. Combine AI review with terraform plan drift detection for a comprehensive scheduled audit.
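The first step of a scheduled audit is finding the module directories to review. A minimal sketch, assuming a conventional layout where every directory containing .tf files is a module and the .terraform cache should be skipped; find_module_dirs is a hypothetical helper, not part of tf_reviewer.py.

```python
from pathlib import Path


def find_module_dirs(repo_root: str) -> list[Path]:
    """Return every directory under repo_root that contains at least one
    .tf file, skipping anything inside a .terraform cache directory."""
    root = Path(repo_root)
    module_dirs = {
        tf_file.parent
        for tf_file in root.rglob("*.tf")
        if ".terraform" not in tf_file.relative_to(root).parts
    }
    return sorted(module_dirs)


if __name__ == "__main__":
    for module_dir in find_module_dirs("."):
        print(module_dir)
```

Each returned directory can then be reviewed in turn, with the findings tagged by directory and merged into the weekly report.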

What happens if Claude returns unexpected output?

Because the POC uses tool_choice to force Claude to call report_findings, the output is always schema-constrained. If the API returns an error or Claude fails to call the tool (which should not happen with tool_choice forced), the script detects the missing tool_use block, prints an error to stderr, and exits with code 2. You should also wrap the call in retry logic for transient API errors, which the guardrails article (Part 25) covers in detail.
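The two safeguards described above, detecting a missing tool_use block and retrying transient errors, can be sketched roughly as follows. Both helpers are hypothetical and deliberately generic: make_request stands in for the function that calls the API, and the default retryable exceptions are Python built-ins chosen so the example runs without the SDK.

```python
import random
import sys
import time


def call_with_retries(make_request, max_attempts=3, base_delay=1.0,
                      retryable=(ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Call make_request, retrying retryable errors with exponential
    backoff plus jitter. The last failure is re-raised."""
    for attempt in range(1, max_attempts + 1):
        try:
            return make_request()
        except retryable as exc:
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.5)
            print(f"attempt {attempt} failed ({exc}); retrying in {delay:.1f}s",
                  file=sys.stderr)
            sleep(delay)


def extract_findings(content_blocks):
    """Return the input of the first report_findings tool_use block,
    or None if the response contains no such block."""
    for block in content_blocks:
        if (getattr(block, "type", None) == "tool_use"
                and getattr(block, "name", None) == "report_findings"):
            return block.input
    return None
```

In the real script you would likely pass the SDK’s own exception classes (for example anthropic.APIConnectionError and anthropic.RateLimitError) as retryable, and exit with code 2 when extract_findings returns None.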

Other articles in this series that apply the same structured-output and tool-use patterns: Part 2: Tool Use with Claude, Part 3: Structured JSON Output, Part 4: Prompt Caching, and Part 5: AI Code Review Bot. Back to the full AI in Production series.

External references: Anthropic tool use documentation, Terraform plan JSON format reference, tfsec documentation, Checkov by Bridgecrew.
