Why infrastructure as code matters
Managing infrastructure manually through web consoles or ad-hoc scripts creates problems that pile up over time: inconsistent environments, undocumented changes, impossible rollbacks, and the classic “it works on my machine” extended to entire servers.
Infrastructure as Code (IaC) fixes this by treating infrastructure like application code: it’s written, versioned, reviewed, tested, and applied through automated workflows. The benefits show up right away:
- Reproducibility: Spin up identical environments in minutes, not days.
- Version control: Every infrastructure change goes through a PR with code review.
- Documentation by default: The code is the documentation of what your infrastructure looks like.
- Disaster recovery: Rebuild everything from code if a region goes down.
- Cost visibility: Review infrastructure changes before they are applied (and before they start costing money).
Terraform vs other tools
Several IaC tools exist. Here’s how Terraform compares to the main alternatives:
| Feature | Terraform | Pulumi | CloudFormation | Ansible |
|---|---|---|---|---|
| Language | HCL (declarative) | Python, TypeScript, Go, etc. | JSON/YAML | YAML (procedural) |
| Cloud support | Multi-cloud | Multi-cloud | AWS only | Multi-cloud (via modules) |
| State management | Explicit state file | Pulumi Cloud by default; self-managed backends supported | Managed by AWS | Stateless |
| Learning curve | Moderate | Varies by language | Moderate | Low |
| Ecosystem | Huge provider ecosystem | Growing | AWS-only but deep | Huge role ecosystem |
| Best for | Multi-cloud infra | Teams that prefer general-purpose languages | AWS-only shops | Configuration management |
Terraform’s sweet spot is multi-cloud infrastructure provisioning with a declarative approach. If you’re on AWS only and want tight integration, CloudFormation is reasonable. If your team prefers writing Python over HCL, Pulumi deserves a look. But for most teams managing infrastructure across providers, Terraform is the pragmatic choice.
Core concepts
Providers
Providers are plugins that let Terraform interact with APIs — AWS, Azure, GCP, Kubernetes, GitHub, Cloudflare, and hundreds more.
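A minimal sketch of a provider configuration, assuming the AWS provider and the `us-east-1` region (both chosen here purely for illustration):

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws" # provider address in the Terraform Registry
      version = "~> 5.0"        # assumed version constraint
    }
  }
}

# Configure the plugin. Credentials are expected to come from the
# environment or the shared AWS config files, not from this file.
provider "aws" {
  region = "us-east-1" # assumed region
}
```

Running `terraform init` in the directory downloads the plugin declared under `required_providers`.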
Resources
Resources are the fundamental building blocks. Each resource block describes one infrastructure object.
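As an illustration, a single resource block might look like the sketch below. The bucket name and tags are hypothetical, and the block assumes an AWS provider is already configured.

```hcl
# "aws_s3_bucket" is the resource type; "logs" is the local name used to
# reference this object elsewhere, e.g. aws_s3_bucket.logs.arn.
resource "aws_s3_bucket" "logs" {
  bucket = "example-app-logs" # hypothetical; S3 bucket names must be globally unique

  tags = {
    Environment = "dev" # hypothetical tag values
  }
}
```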
State
Terraform maintains a state file that maps your configuration to real-world resources. This is how Terraform knows what exists, what needs to change, and what to destroy. The state file is critical. Losing it means Terraform loses track of your infrastructure.
Modules
Modules are reusable packages of Terraform configuration. Think of them as functions: they take inputs (variables), create resources, and produce outputs.
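The function analogy can be sketched with a module call. This example assumes the community `terraform-aws-modules/vpc/aws` module from the Terraform Registry; the name, CIDR ranges, and availability zones are illustrative values.

```hcl
# Inputs are passed as arguments, much like function parameters.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0" # assumed module version constraint

  name           = "example-vpc"
  cidr           = "10.0.0.0/16"
  azs            = ["us-east-1a", "us-east-1b"]
  public_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
}

# Outputs are read back as module.<name>.<output>, much like a return value.
output "vpc_id" {
  value = module.vpc.vpc_id
}
```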
Practical example: VPC + EC2
Here’s a complete example that provisions a VPC with a public subnet and an EC2 instance:
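A minimal sketch of such a configuration follows. The region, CIDR ranges, instance type, resource names, and the AMI name filter (intended to match Amazon Linux 2023) are all assumptions; adjust them for your account.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumed region
}

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true

  tags = {
    Name = "example-vpc"
  }
}

# The internet gateway plus the default route below are what make the subnet "public".
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.1.0/24"
  map_public_ip_on_launch = true
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}

# Look up a recent AMI instead of hardcoding an ID (name pattern is an assumption).
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-2023*-x86_64"]
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro"
  subnet_id     = aws_subnet.public.id

  tags = {
    Name = "example-web"
  }
}

output "instance_public_ip" {
  value = aws_instance.web.public_ip
}
```

Terraform works out the creation order from the references between resources: the instance depends on the subnet, which depends on the VPC. No security group is defined here, so the instance uses the VPC's default one.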
The plan/apply workflow
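Terraform follows a predictable workflow: `terraform init` downloads providers and sets up the backend, `terraform plan` previews the changes, `terraform apply` carries them out, and `terraform destroy` tears the infrastructure down when it is no longer needed.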
The terraform plan step is the most important. Never skip it. Always review the plan output before applying, especially in production. The plan shows you exactly what will be created, modified, or destroyed.
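In CI/CD pipelines, save the plan to a file (`terraform plan -out=tfplan`) and apply that exact file (`terraform apply tfplan`). This ensures that only the reviewed changes are applied: if the state has changed since the plan was created, Terraform rejects the saved plan as stale instead of applying something different.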
State management best practices
State management is where most Terraform problems originate. Follow these practices:
Use a remote backend
Never store state locally or in Git. Use a remote backend with encryption and locking:
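A minimal sketch of an S3 backend with encryption and DynamoDB-based locking. The bucket, key, and table names are hypothetical, and both the bucket and the table are assumed to exist already.

```hcl
terraform {
  backend "s3" {
    bucket         = "example-terraform-state" # hypothetical bucket name
    key            = "prod/terraform.tfstate"  # path of the state object inside the bucket
    region         = "us-east-1"               # assumed region
    encrypt        = true                      # server-side encryption of the state object
    dynamodb_table = "terraform-locks"         # hypothetical table used for state locking
  }
}
```

Backend blocks cannot reference variables, so these values are usually literals or are supplied through `terraform init -backend-config=...`.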
The DynamoDB table provides state locking. This prevents two people or pipelines from modifying the same infrastructure at the same time.
Organize state by component
Don’t put all your infrastructure in one state file. Split by component or team:
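One way to sketch the split, assuming a hypothetical layout with one directory per component, is to give each component its own backend key:

```hcl
# networking/backend.tf (hypothetical file in a hypothetical layout)
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"      # hypothetical shared bucket
    key            = "networking/terraform.tfstate" # unique key per component
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

# A sibling directory such as database/ would carry the same block with
# key = "database/terraform.tfstate", giving it its own state file and its own lock.
```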
Smaller state files mean faster plans, smaller blast radius, and fewer teams competing for locks.
Use terraform_remote_state sparingly
You can reference outputs from other state files, but use it carefully. Over-reliance on remote state creates tight coupling between components. Prefer passing values through variables or a parameter store.
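The two approaches can be compared in a short sketch. The bucket, state key, parameter name, and the `vpc_id` output are hypothetical, and the parameter-store variant assumes the networking component publishes its VPC ID to AWS SSM Parameter Store.

```hcl
# Option 1: read another component's outputs directly from its state.
# The consumer now needs read access to the whole networking state file.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"
    key    = "networking/terraform.tfstate"
    region = "us-east-1"
  }
}

# Option 2 (looser coupling): read a single published value from a parameter store.
data "aws_ssm_parameter" "vpc_id" {
  name = "/network/vpc_id" # hypothetical parameter name
}

resource "aws_security_group" "app" {
  name   = "example-app"
  vpc_id = data.aws_ssm_parameter.vpc_id.value

  # With option 1 this would instead be:
  # vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```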
Tips for production use
- Pin provider versions. Use `~>` constraints to allow minor and patch updates while blocking the next major version: `version = "~> 5.0"`.
- Use workspaces carefully. Workspaces are useful for simple environment separation but get confusing at scale. Using separate directories per environment is usually clearer.
- Implement a CI/CD pipeline for Terraform. Run `terraform plan` on PRs and post the output as a PR comment. Run `terraform apply` only after merge and approval.
- Use `prevent_destroy` for critical resources. This lifecycle rule stops accidental destruction of databases or persistent storage:

  ```hcl
  resource "aws_db_instance" "main" {
    # ...

    lifecycle {
      prevent_destroy = true
    }
  }
  ```

- Tag everything. Use a `default_tags` block in the provider to ensure every resource gets standard tags (environment, team, project).
- Use `tflint` and `checkov`. Lint your Terraform code and scan for security misconfigurations before applying.

  ```bash
  tflint --init
  tflint
  checkov -d .
  ```

- Import existing resources. If you have manually created infrastructure, use `terraform import` to bring it under management instead of recreating it.
- Review the plan diff carefully. A resource showing “destroy and recreate” might cause downtime. Understand which changes are in-place versus destructive.
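The tagging tip can be sketched as follows, assuming the AWS provider; the region and tag values are placeholders.

```hcl
provider "aws" {
  region = "us-east-1" # assumed region

  # Applied to every taggable resource this provider creates;
  # tags set on an individual resource override these on key conflicts.
  default_tags {
    tags = {
      Environment = "production"  # placeholder values
      Team        = "platform"
      Project     = "example-app"
    }
  }
}
```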
Terraform is one of those tools that rewards discipline. The more consistently you follow these practices, the more confidently your team manages infrastructure at scale.