<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Automation on Adur</title><link>https://adurrr.github.io/en/tags/automation/</link><description>Recent content in Automation on Adur</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Mon, 20 Oct 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://adurrr.github.io/en/tags/automation/index.xml" rel="self" type="application/rss+xml"/><item><title>Lessons learned in DevSecOps</title><link>https://adurrr.github.io/en/p/lessons-learned-in-devsecops/</link><pubDate>Mon, 20 Oct 2025 00:00:00 +0000</pubDate><guid>https://adurrr.github.io/en/p/lessons-learned-in-devsecops/</guid><description>&lt;p&gt;DevSecOps gets thrown around in job descriptions and conference talks a lot. But behind the buzzword are real lessons that only come from doing the work. From building pipelines that break when you add security gates, to watching teams ignore the tools you spent months deploying, to finally finding what actually works.&lt;/p&gt;
&lt;p&gt;These are lessons we learned the hard way. They&amp;rsquo;re opinionated, practical, shaped by experience.&lt;/p&gt;
&lt;h2 id="security-is-everyones-responsibility"&gt;Security is everyone&amp;rsquo;s responsibility
&lt;/h2&gt;&lt;p&gt;Sounds like a break room poster, but it&amp;rsquo;s the most important lesson here. If security is only the security team&amp;rsquo;s job, you&amp;rsquo;ve lost.&lt;/p&gt;
&lt;p&gt;Developers make security decisions every time they write code, whether they know it or not. How they validate input. How they handle secrets. How they configure network access. Every PR is a security event.&lt;/p&gt;
&lt;p&gt;What works: make security part of the normal development workflow, not a gate at the end. Developers learn when they get fast feedback on security issues in their PR. They resent finding out three weeks later from an auditor.&lt;/p&gt;
&lt;p&gt;We&amp;rsquo;ve seen this repeatedly: teams that treat security as shared responsibility find fewer critical vulnerabilities in production. Teams that silo it find them in the news.&lt;/p&gt;
&lt;h2 id="automate-everything-you-can"&gt;Automate everything you can
&lt;/h2&gt;&lt;p&gt;Manual security processes do not scale. Period. If your security review is a human reading a checklist, it will be skipped under deadline pressure, inconsistently applied, and resented by everyone involved.&lt;/p&gt;
&lt;p&gt;Automate the things that can be automated:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Dependency scanning&lt;/strong&gt; in every CI build (Dependabot, Snyk, Trivy)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Static analysis&lt;/strong&gt; on every pull request (Semgrep, SonarQube)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Secret detection&lt;/strong&gt; as a pre-commit hook and CI check (gitleaks, detect-secrets)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Container image scanning&lt;/strong&gt; before deployment (Trivy, Grype)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Infrastructure as Code scanning&lt;/strong&gt; (tfsec, Checkov, KICS)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Compliance as Code&lt;/strong&gt; for runtime policy enforcement (OPA, Kyverno)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The goal is not to catch everything automatically. The goal is to catch the easy stuff automatically so that human reviewers can focus on the hard stuff: business logic flaws, design-level security issues, threat modeling.&lt;/p&gt;
&lt;h2 id="start-small"&gt;Start Small
&lt;/h2&gt;&lt;p&gt;One of the biggest mistakes we have made is trying to secure everything at once. You roll out SAST, DAST, SCA, container scanning, IaC scanning, and runtime protection in one quarter. The result? Alert fatigue, developer rebellion, and a wall of unresolved findings that nobody looks at.&lt;/p&gt;
&lt;p&gt;Start with one tool, one pipeline, one team. Get it working well. Get developers comfortable with it. Resolve the false positives. Tune the rules. Then expand.&lt;/p&gt;
&lt;p&gt;A practical progression:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Month 1&lt;/strong&gt;: Secret detection in pre-commit hooks and CI. This is uncontroversial and catches real issues.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Month 2&lt;/strong&gt;: Dependency scanning with automated PR creation for updates. Developers see the value immediately.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Month 3&lt;/strong&gt;: Container image scanning blocking deployments of critical/high vulnerabilities.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Month 4+&lt;/strong&gt;: Static analysis, gradually expanding rule sets.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Each step should be stable before moving to the next. Rushing creates noise, and noise teaches people to ignore alerts.&lt;/p&gt;
&lt;h2 id="blameless-culture-matters"&gt;Blameless culture matters
&lt;/h2&gt;&lt;p&gt;When a security incident happens because someone pushed a secret to a public repo, or because a vulnerability was not patched in time, the response matters more than the incident itself.&lt;/p&gt;
&lt;p&gt;If people get blamed, they hide things. They do not report near-misses. They cover up mistakes. And the next incident will be worse because nobody shared the lessons from the last one.&lt;/p&gt;
&lt;p&gt;Blameless postmortems are not about letting people off the hook. They are about understanding systemic failures. Why was it possible to push a secret? Why was there no scanning? Why was the patching process slow? Fix the system, not the person.&lt;/p&gt;
&lt;p&gt;We have found that teams with genuinely blameless cultures have significantly better security postures. People report suspicious things. They ask for help early. They flag risks before they become incidents.&lt;/p&gt;
&lt;h2 id="tooling-is-not-enough-without-culture-change"&gt;Tooling is not enough without culture change
&lt;/h2&gt;&lt;p&gt;We once deployed a comprehensive security scanning pipeline with beautiful dashboards, Slack notifications, Jira ticket creation, the works. Six months later, there were 3,000 unresolved findings and the Slack channel was muted by every developer.&lt;/p&gt;
&lt;p&gt;The tools were fine. The culture was not ready.&lt;/p&gt;
&lt;p&gt;Before you deploy tooling, invest in:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Training&lt;/strong&gt;: Developers need to understand why the tool exists and how to act on its findings.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ownership&lt;/strong&gt;: Someone needs to own the backlog of findings and triage them. If nobody owns it, nobody does it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SLAs&lt;/strong&gt;: Define clear timelines for remediating findings by severity. Critical gets 48 hours. High gets a week. Medium gets a sprint. Low gets a quarter.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Feedback loops&lt;/strong&gt;: When a tool produces a false positive, there must be an easy way to report it and get the rule tuned. Otherwise, developers learn to ignore everything.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="invest-in-developer-experience-for-security-tools"&gt;Invest in developer experience for security tools
&lt;/h2&gt;&lt;p&gt;If your security tool makes developers&amp;rsquo; lives harder, they will find a way around it. This is not a character flaw. It is human nature and good engineering instinct: remove obstacles to shipping.&lt;/p&gt;
&lt;p&gt;The security tools that get adopted are the ones that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Run fast&lt;/strong&gt;: A SAST scan that takes 20 minutes will be bypassed. One that takes 30 seconds will be tolerated.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Integrate natively&lt;/strong&gt;: Show results in the PR, not in a separate portal. Nobody wants to log into another dashboard.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Have low false positive rates&lt;/strong&gt;: Every false positive erodes trust. Invest time in tuning.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Provide actionable guidance&lt;/strong&gt;: &amp;ldquo;SQL injection vulnerability on line 42&amp;rdquo; is useless without &amp;ldquo;here is how to fix it.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fail gracefully&lt;/strong&gt;: If the scanner is down, the pipeline should warn, not block. Availability of the development pipeline is non-negotiable.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We think of it this way: if a developer has to change their workflow to accommodate a security tool, the tool has failed. The best security tooling is invisible.&lt;/p&gt;
&lt;h2 id="monitoring-and-observability-are-non-negotiable"&gt;Monitoring and observability are non-negotiable
&lt;/h2&gt;&lt;p&gt;You cannot secure what you cannot see. Security monitoring is not optional, and it is not something you bolt on after the fact.&lt;/p&gt;
&lt;p&gt;What this means in practice:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Centralized logging&lt;/strong&gt;: All application, infrastructure, and security tool logs in one place. If you have to SSH into a box to read logs, you are already behind.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Audit trails&lt;/strong&gt;: Who did what, when, and from where. Every deployment, every config change, every access request.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Alerting on anomalies&lt;/strong&gt;: Not just &amp;ldquo;is the service up?&amp;rdquo; but &amp;ldquo;is this access pattern normal?&amp;rdquo; Unusual API call volumes, access from new locations, privilege escalations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Runtime security&lt;/strong&gt;: Tools like Falco for container runtime monitoring. Know when something unexpected happens in production.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Monitoring is also how you prove to auditors and customers that your security controls are working. &amp;ldquo;Trust us&amp;rdquo; is not a compliance strategy.&lt;/p&gt;
&lt;h2 id="open-source-is-your-ally"&gt;Open source is your ally
&lt;/h2&gt;&lt;p&gt;Some of the best security tools available are open source. Trivy, Falco, OPA, Semgrep, gitleaks, cosign, KICS, Checkov. The ecosystem is rich and maturing fast.&lt;/p&gt;
&lt;p&gt;Benefits of open source security tooling:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Transparency&lt;/strong&gt;: You can read the rules and understand exactly what is being checked.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Community&lt;/strong&gt;: Thousands of contributors finding edge cases and adding detection rules.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No vendor lock-in&lt;/strong&gt;: You can switch tools without renegotiating a contract.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cost&lt;/strong&gt;: Start for free, scale as needed.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This does not mean commercial tools have no place. Some provide valuable aggregation, management, and support. But you can build a very solid security pipeline with open source tools alone, and we think every team should start there.&lt;/p&gt;
&lt;h2 id="continuous-learning-is-essential"&gt;Continuous learning is essential
&lt;/h2&gt;&lt;p&gt;The threat landscape changes constantly. The tools change. The best practices evolve. What was considered secure two years ago might have a CVE today.&lt;/p&gt;
&lt;p&gt;What we do to stay current:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Dedicate time for learning&lt;/strong&gt;: At least a few hours per sprint for the team to read about new vulnerabilities, tools, and techniques. This is not a nice-to-have. It is a professional requirement.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Run internal CTFs and tabletop exercises&lt;/strong&gt;: Nothing teaches security like trying to break things. Regular exercises keep skills sharp and reveal gaps in your defenses.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Participate in the community&lt;/strong&gt;: Attend meetups, contribute to open source, read advisories. The security community is generous with knowledge. Take advantage of it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Review and update&lt;/strong&gt;: Quarterly reviews of your security tooling, policies, and incident response procedures. What worked last quarter may not work next quarter.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="final-thoughts"&gt;Final Thoughts
&lt;/h2&gt;&lt;p&gt;DevSecOps isn&amp;rsquo;t a destination. There&amp;rsquo;s no point where you say &amp;ldquo;we&amp;rsquo;re done, we&amp;rsquo;re secure.&amp;rdquo; It&amp;rsquo;s a continuous practice of reducing risk, improving visibility, building a culture where security is as natural as writing tests.&lt;/p&gt;
&lt;p&gt;The most important lesson: perfect is the enemy of good. A basic security pipeline that developers actually use beats a comprehensive one they bypass. Start where you are, improve iteratively, never stop.&lt;/p&gt;</description></item><item><title>LLMOps: integrating LLMs into DevOps workflows</title><link>https://adurrr.github.io/en/p/llmops-integrating-llms-into-devops-workflows/</link><pubDate>Sun, 15 Jun 2025 00:00:00 +0000</pubDate><guid>https://adurrr.github.io/en/p/llmops-integrating-llms-into-devops-workflows/</guid><description>&lt;p&gt;LLMs have moved beyond chatbots. They&amp;rsquo;re now embedded in engineering workflows where they automate tedious tasks, speed incident response, and boost developer productivity. But deploying an LLM into a production DevOps pipeline is fundamentally different from using ChatGPT in a browser.&lt;/p&gt;
&lt;p&gt;This guide covers what LLMOps means in practice, where LLMs fit into DevOps, architecture patterns that work, and pitfalls to avoid.&lt;/p&gt;
&lt;h2 id="what-is-llmops"&gt;What is LLMOps?
&lt;/h2&gt;&lt;p&gt;LLMOps is the practices, tools, and infrastructure needed to operationalize LLMs. It extends MLOps but addresses challenges unique to language models:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Model selection vs. model training&lt;/strong&gt;: Most teams consume pre-trained models (via APIs or self-hosted inference) rather than training from scratch. The operational focus shifts to prompt engineering, fine-tuning, and retrieval-augmented generation (RAG).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cost management&lt;/strong&gt;: LLM inference is expensive. Token-based pricing means costs scale with usage in ways that are harder to predict than traditional compute.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Non-determinism&lt;/strong&gt;: LLMs produce variable outputs for the same input, which complicates testing, validation, and reproducibility.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Latency&lt;/strong&gt;: Response times of seconds (not milliseconds) require different architectural patterns than traditional microservices.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;LLMOps is not a separate discipline. It is an extension of your existing DevOps and MLOps practices, adapted for the specific operational characteristics of language models.&lt;/p&gt;
&lt;h2 id="practical-use-cases-in-devops"&gt;Practical use cases in DevOps
&lt;/h2&gt;&lt;p&gt;Here is where LLMs are delivering real value in DevOps workflows today:&lt;/p&gt;
&lt;h3 id="automated-code-review"&gt;Automated code review
&lt;/h3&gt;&lt;p&gt;LLMs can provide a first-pass review of pull requests, catching common issues like missing error handling, security anti-patterns, inconsistent naming, or missing tests. They do not replace human reviewers but reduce the burden of repetitive feedback.&lt;/p&gt;
&lt;h3 id="incident-summarization"&gt;Incident summarization
&lt;/h3&gt;&lt;p&gt;When an incident fires at 3 AM, the on-call engineer needs context fast. An LLM can ingest alert data, recent deployment logs, related runbooks, and previous incident reports to produce a concise summary of what is likely going wrong and what was done last time.&lt;/p&gt;
&lt;h3 id="log-analysis"&gt;Log analysis
&lt;/h3&gt;&lt;p&gt;LLMs are surprisingly effective at pattern recognition in unstructured log data. Feed them a block of error logs and they can identify the root cause faster than manual grep sessions, especially for unfamiliar systems.&lt;/p&gt;
&lt;h3 id="documentation-generation"&gt;Documentation generation
&lt;/h3&gt;&lt;p&gt;Generating draft documentation from code, API schemas, or Terraform modules. The output needs human review, but it eliminates the blank-page problem and keeps docs closer to current state.&lt;/p&gt;
&lt;h3 id="infrastructure-as-code-generation"&gt;Infrastructure as Code generation
&lt;/h3&gt;&lt;p&gt;Given a natural language description of desired infrastructure, LLMs can generate Terraform, Ansible, or Kubernetes manifests as a starting point. Useful for scaffolding, not for production-ready code without review.&lt;/p&gt;
&lt;h2 id="architecture-patterns-for-llm-integration"&gt;Architecture patterns for LLM integration
&lt;/h2&gt;&lt;h3 id="pattern-1-api-gateway-to-external-llm"&gt;Pattern 1: API gateway to external LLM
&lt;/h3&gt;&lt;p&gt;The simplest approach. Your application calls an external LLM API (OpenAI, Anthropic, etc.) through a centralized gateway that handles authentication, rate limiting, logging, and cost tracking.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;[CI/CD Pipeline] --&amp;gt; [API Gateway] --&amp;gt; [External LLM API]
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; [Logging &amp;amp; Metrics]
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; [Cost Tracking]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;: No infrastructure to manage, access to the most capable models, fast to implement.
&lt;strong&gt;Cons&lt;/strong&gt;: Data leaves your network, vendor lock-in, variable latency, ongoing API costs.&lt;/p&gt;
&lt;h3 id="pattern-2-self-hosted-inference"&gt;Pattern 2: Self-hosted inference
&lt;/h3&gt;&lt;p&gt;Run open-weight models (Llama, Mistral, etc.) on your own infrastructure using inference servers like vLLM or Ollama.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;[CI/CD Pipeline] --&amp;gt; [Load Balancer] --&amp;gt; [vLLM / Ollama Instance(s)]
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; [GPU Node Pool]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;: Data stays internal, predictable costs at scale, no vendor dependency, full control over model versions.
&lt;strong&gt;Cons&lt;/strong&gt;: Requires GPU infrastructure, operational overhead, smaller models may be less capable.&lt;/p&gt;
&lt;h3 id="pattern-3-rag-enhanced-pipeline"&gt;Pattern 3: RAG-enhanced pipeline
&lt;/h3&gt;&lt;p&gt;Combine an LLM with a retrieval system that provides relevant context from your own knowledge base (runbooks, documentation, past incidents). This dramatically improves response quality for domain-specific tasks.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;[Query] --&amp;gt; [Embedding Model] --&amp;gt; [Vector DB Search] --&amp;gt; [Context + Query] --&amp;gt; [LLM] --&amp;gt; [Response]
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; |
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; [Your Knowledge Base]
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; (runbooks, docs, etc.)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;This pattern is particularly powerful for incident response and documentation tasks where the LLM needs your organization&amp;rsquo;s specific context.&lt;/p&gt;
&lt;h2 id="key-considerations"&gt;Key considerations
&lt;/h2&gt;&lt;h3 id="cost"&gt;Cost
&lt;/h3&gt;&lt;p&gt;LLM API costs can be surprising. A code review pipeline that processes 50 PRs per day with large diffs can easily run hundreds of dollars per month. Strategies to control costs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Set token limits per request&lt;/li&gt;
&lt;li&gt;Cache common queries and responses&lt;/li&gt;
&lt;li&gt;Use smaller models for simpler tasks (triage with a small model, escalate to a larger one)&lt;/li&gt;
&lt;li&gt;Monitor token usage per pipeline and set alerts&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="latency"&gt;Latency
&lt;/h3&gt;&lt;p&gt;LLM responses take seconds, not milliseconds. Design your integrations as asynchronous processes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Post code review comments after the fact, do not block the PR&lt;/li&gt;
&lt;li&gt;Process incident data in the background, push results to a Slack channel&lt;/li&gt;
&lt;li&gt;Use streaming responses where possible to improve perceived performance&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="hallucinations"&gt;Hallucinations
&lt;/h3&gt;&lt;p&gt;LLMs will confidently generate plausible-sounding but incorrect information. This is a critical concern for DevOps tasks where bad advice can cause outages.&lt;/p&gt;
&lt;p&gt;Mitigations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Always present LLM output as suggestions, never as authoritative actions&lt;/li&gt;
&lt;li&gt;Require human approval before any LLM-generated change is applied&lt;/li&gt;
&lt;li&gt;Use RAG to ground responses in verified documentation&lt;/li&gt;
&lt;li&gt;Implement output validation (e.g., lint generated IaC before presenting it)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="security"&gt;Security
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Data exposure&lt;/strong&gt;: Anything you send to an external LLM API may be used for training or stored. Never send secrets, credentials, or sensitive customer data.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Prompt injection&lt;/strong&gt;: Malicious content in code, logs, or user input can manipulate LLM behavior. Sanitize inputs and validate outputs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Supply chain&lt;/strong&gt;: LLM-generated code may introduce vulnerabilities. Run all generated code through your existing security scanning pipeline.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="tools-and-platforms"&gt;Tools and platforms
&lt;/h2&gt;&lt;h3 id="langchain"&gt;LangChain
&lt;/h3&gt;&lt;p&gt;A framework for building LLM-powered applications. Useful for orchestrating multi-step chains (e.g., retrieve context, format prompt, call LLM, parse output). Supports many LLM providers and has good tooling for RAG pipelines.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;span class="lnt"&gt;8
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain.chat_models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;Review this code diff for security issues and suggest fixes:&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="si"&gt;{diff}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;gpt-4o&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;diff&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;code_diff&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;h3 id="vllm"&gt;vLLM
&lt;/h3&gt;&lt;p&gt;A high-throughput inference engine for self-hosted models. Supports PagedAttention for efficient memory management and continuous batching for high throughput.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Start a vLLM server&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;python -m vllm.entrypoints.openai.api_server &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; --model mistralai/Mistral-7B-Instruct-v0.2 &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; --port &lt;span class="m"&gt;8000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Exposes an OpenAI-compatible API, so you can swap between self-hosted and external APIs with minimal code changes.&lt;/p&gt;
&lt;h3 id="ollama"&gt;Ollama
&lt;/h3&gt;&lt;p&gt;The easiest way to run LLMs locally for development and testing. Great for prototyping pipelines before committing to infrastructure.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Pull and run a model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;ollama pull llama3
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;ollama run llama3 &lt;span class="s2"&gt;&amp;#34;Summarize this error log: [paste log]&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Serve as an API&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;ollama serve
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Then call http://localhost:11434/api/generate&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;h2 id="example-automated-pr-review-pipeline"&gt;Example: Automated PR review pipeline
&lt;/h2&gt;&lt;p&gt;Here is a conceptual pipeline for automated PR review using an LLM:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;span class="lnt"&gt;19
&lt;/span&gt;&lt;span class="lnt"&gt;20
&lt;/span&gt;&lt;span class="lnt"&gt;21
&lt;/span&gt;&lt;span class="lnt"&gt;22
&lt;/span&gt;&lt;span class="lnt"&gt;23
&lt;/span&gt;&lt;span class="lnt"&gt;24
&lt;/span&gt;&lt;span class="lnt"&gt;25
&lt;/span&gt;&lt;span class="lnt"&gt;26
&lt;/span&gt;&lt;span class="lnt"&gt;27
&lt;/span&gt;&lt;span class="lnt"&gt;28
&lt;/span&gt;&lt;span class="lnt"&gt;29
&lt;/span&gt;&lt;span class="lnt"&gt;30
&lt;/span&gt;&lt;span class="lnt"&gt;31
&lt;/span&gt;&lt;span class="lnt"&gt;32
&lt;/span&gt;&lt;span class="lnt"&gt;33
&lt;/span&gt;&lt;span class="lnt"&gt;34
&lt;/span&gt;&lt;span class="lnt"&gt;35
&lt;/span&gt;&lt;span class="lnt"&gt;36
&lt;/span&gt;&lt;span class="lnt"&gt;37
&lt;/span&gt;&lt;span class="lnt"&gt;38
&lt;/span&gt;&lt;span class="lnt"&gt;39
&lt;/span&gt;&lt;span class="lnt"&gt;40
&lt;/span&gt;&lt;span class="lnt"&gt;41
&lt;/span&gt;&lt;span class="lnt"&gt;42
&lt;/span&gt;&lt;span class="lnt"&gt;43
&lt;/span&gt;&lt;span class="lnt"&gt;44
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;# .github/workflows/llm-review.yml&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;LLM Code Review&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;on&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;pull_request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;types&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="l"&gt;opened, synchronize]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;jobs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;llm-review&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;runs-on&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;ubuntu-latest&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;Checkout&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;uses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;actions/checkout@v4&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;with&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;fetch-depth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;Get diff&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;diff&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;run&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="sd"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; git diff origin/${{ github.base_ref }}...HEAD &amp;gt; diff.txt&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;Run LLM review&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;LLM_API_KEY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;${{ secrets.LLM_API_KEY }}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;run&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="sd"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; python scripts/llm_review.py \
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; --diff diff.txt \
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; --model gpt-4o \
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; --max-tokens 2000 \
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; --output review.json&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;Post review comments&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;uses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;actions/github-script@v7&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;with&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;script&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="sd"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; const review = require(&amp;#39;./review.json&amp;#39;);
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; await github.rest.pulls.createReview({
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; owner: context.repo.owner,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; repo: context.repo.repo,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; pull_number: context.issue.number,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; body: review.summary,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; event: &amp;#39;COMMENT&amp;#39;,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; comments: review.line_comments
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; });&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;The review script would:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Read the diff&lt;/li&gt;
&lt;li&gt;Split large diffs into chunks that fit within the model&amp;rsquo;s context window&lt;/li&gt;
&lt;li&gt;For each chunk, construct a prompt asking for security issues, bugs, and style problems&lt;/li&gt;
&lt;li&gt;Aggregate results and format as GitHub review comments&lt;/li&gt;
&lt;li&gt;Include confidence scores and always mark output as AI-generated&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="guardrails-and-responsible-use"&gt;Guardrails and responsible use
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Label all LLM output clearly&lt;/strong&gt; as AI-generated. Engineers should know when they are reading machine output.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Never auto-merge or auto-apply&lt;/strong&gt; LLM suggestions. Keep a human in the loop for all changes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Log all prompts and responses&lt;/strong&gt; for debugging and audit purposes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Set spending limits&lt;/strong&gt; and alerts on LLM API usage.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Review prompt templates regularly&lt;/strong&gt; to ensure they do not leak sensitive information.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Test for bias and errors&lt;/strong&gt; with representative samples before deploying to production workflows.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="getting-started-recommendations"&gt;Getting started recommendations
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Pick one use case&lt;/strong&gt; - Don&amp;rsquo;t try to LLM-enable everything at once. Start low-risk: documentation drafts, commit message suggestions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Start with an external API&lt;/strong&gt; - Don&amp;rsquo;t invest in GPU infrastructure until you&amp;rsquo;ve validated the use case. Use OpenAI or Anthropic to prototype.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Measure everything&lt;/strong&gt; - Track cost per invocation, latency, user satisfaction, error rates from day one.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Build an evaluation framework&lt;/strong&gt; - Create a test suite of known-good inputs and expected outputs. Run it against every prompt change or model update.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Plan your data strategy&lt;/strong&gt; - Decide early what data you&amp;rsquo;ll and won&amp;rsquo;t send to external APIs. Document clearly.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Iterate on prompts&lt;/strong&gt; - Prompt engineering is iterative. Version control prompts, treat as code.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;LLMs are a powerful tool for DevOps automation, but they&amp;rsquo;re exactly that: a tool. They work best when thoughtfully integrated into existing workflows, with clear boundaries on what they can and cannot do autonomously.&lt;/p&gt;</description></item><item><title>DevSecOps maturity model</title><link>https://adurrr.github.io/en/p/devsecops-maturity-model/</link><pubDate>Sun, 08 Oct 2023 10:00:00 +0100</pubDate><guid>https://adurrr.github.io/en/p/devsecops-maturity-model/</guid><description>&lt;h2 id="why-a-maturity-model-helps"&gt;Why a Maturity Model Helps
&lt;/h2&gt;&lt;p&gt;Most teams know they should &amp;ldquo;shift security left,&amp;rdquo; but knowing where to start is the hard part. A maturity model gives you a structured way to assess your current state, identify gaps, and plan a realistic roadmap for improvement.&lt;/p&gt;
&lt;p&gt;Without a model, security improvements tend to be reactive (triggered by incidents or audit findings rather than deliberate planning). A maturity model turns security from a fire drill into an engineering discipline with measurable progress.&lt;/p&gt;
&lt;p&gt;The model described here has five levels. The goal is not to rush to the highest level but to make steady, sustainable progress. Each level builds on the previous one.&lt;/p&gt;
&lt;h2 id="the-five-maturity-levels"&gt;The Five Maturity Levels
&lt;/h2&gt;&lt;h3 id="level-1-ad-hoc"&gt;Level 1: Ad-Hoc
&lt;/h3&gt;&lt;p&gt;At this level, security is an afterthought. There are no formal processes, and security activities happen sporadically if at all.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What it looks like:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;No security testing in CI/CD pipelines.&lt;/li&gt;
&lt;li&gt;Vulnerabilities discovered in production or by external parties.&lt;/li&gt;
&lt;li&gt;No dedicated security tooling.&lt;/li&gt;
&lt;li&gt;Developers have little to no security training.&lt;/li&gt;
&lt;li&gt;Incident response is improvised.&lt;/li&gt;
&lt;li&gt;Compliance is addressed manually before audits.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Typical tools:&lt;/strong&gt; None specifically for security. Maybe a firewall and antivirus.&lt;/p&gt;
&lt;h3 id="level-2-reactive"&gt;Level 2: Reactive
&lt;/h3&gt;&lt;p&gt;Security is recognized as important, but the approach is reactive. The team responds to vulnerabilities and incidents but doesn&amp;rsquo;t proactively prevent them.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What it looks like:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Basic static analysis (SAST) runs occasionally, but findings are not always addressed.&lt;/li&gt;
&lt;li&gt;Dependency scanning is done manually or on an ad-hoc basis.&lt;/li&gt;
&lt;li&gt;There&amp;rsquo;s some security documentation, but it&amp;rsquo;s outdated.&lt;/li&gt;
&lt;li&gt;Incident response exists as a documented process, though it&amp;rsquo;s rarely practiced.&lt;/li&gt;
&lt;li&gt;Security reviews happen late in the development cycle (right before release).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Typical tools:&lt;/strong&gt; SonarQube (basic rules), OWASP Dependency-Check, manual penetration testing.&lt;/p&gt;
&lt;h3 id="level-3-proactive"&gt;Level 3: Proactive
&lt;/h3&gt;&lt;p&gt;Security is integrated into the development workflow. The team actively seeks to prevent vulnerabilities rather than just reacting to them.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What it looks like:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;SAST and DAST run automatically in CI/CD pipelines.&lt;/li&gt;
&lt;li&gt;Dependency scanning with automated alerts for known vulnerabilities.&lt;/li&gt;
&lt;li&gt;Container image scanning before deployment (Trivy, Grype).&lt;/li&gt;
&lt;li&gt;Infrastructure as Code is scanned for misconfigurations (Checkov, tfsec).&lt;/li&gt;
&lt;li&gt;Threat modeling is performed for new features and architecture changes.&lt;/li&gt;
&lt;li&gt;Security champions exist within development teams.&lt;/li&gt;
&lt;li&gt;Blameless postmortems are conducted after security incidents.&lt;/li&gt;
&lt;li&gt;Regular security training for developers.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Typical tools:&lt;/strong&gt; Semgrep, Trivy, Checkov, OWASP ZAP, HashiCorp Vault, Falco.&lt;/p&gt;
&lt;h3 id="level-4-optimized"&gt;Level 4: Optimized
&lt;/h3&gt;&lt;p&gt;Security is deeply embedded in every stage of the software lifecycle. Metrics drive decisions, and the team continuously improves based on data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What it looks like:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Security gates in pipelines that block deployment if critical issues are found.&lt;/li&gt;
&lt;li&gt;Mean time to remediate (MTTR) is tracked and continuously reduced.&lt;/li&gt;
&lt;li&gt;Software Bill of Materials (SBOM) generated for every release.&lt;/li&gt;
&lt;li&gt;Signed artifacts and verified supply chain.&lt;/li&gt;
&lt;li&gt;Automated compliance checks mapped to frameworks (SOC2, ISO 27001, PCI-DSS).&lt;/li&gt;
&lt;li&gt;Runtime security monitoring with automated response (Falco + custom rules).&lt;/li&gt;
&lt;li&gt;Regular red team exercises and chaos engineering for security.&lt;/li&gt;
&lt;li&gt;Security metrics are part of engineering dashboards.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Typical tools:&lt;/strong&gt; Sigstore/cosign, OPA/Gatekeeper, Kyverno, SIEM integration, automated compliance platforms.&lt;/p&gt;
&lt;h3 id="level-5-innovative"&gt;Level 5: Innovative
&lt;/h3&gt;&lt;p&gt;Security is a competitive advantage. The team contributes to the broader security community and pushes the state of the art.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What it looks like:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Bug bounty programs actively managed.&lt;/li&gt;
&lt;li&gt;Custom security tooling developed for organization-specific risks.&lt;/li&gt;
&lt;li&gt;Machine learning applied to anomaly detection and threat hunting.&lt;/li&gt;
&lt;li&gt;Security is a feature sold to customers (certifications, transparency reports).&lt;/li&gt;
&lt;li&gt;Active participation in open-source security projects.&lt;/li&gt;
&lt;li&gt;Zero-trust architecture fully implemented.&lt;/li&gt;
&lt;li&gt;Policy as code governs all infrastructure and application security.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Typical tools:&lt;/strong&gt; Custom-built platforms, eBPF-based security tools, advanced SIEM with ML, zero-trust service mesh.&lt;/p&gt;
&lt;h2 id="key-dimensions"&gt;Key Dimensions
&lt;/h2&gt;&lt;p&gt;A maturity model isn&amp;rsquo;t one-dimensional. Assess your organization across these dimensions, as progress is rarely uniform:&lt;/p&gt;
&lt;h3 id="code-security"&gt;Code Security
&lt;/h3&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Level&lt;/th&gt;
 &lt;th&gt;Practices&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Ad-Hoc&lt;/td&gt;
 &lt;td&gt;No code scanning&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Reactive&lt;/td&gt;
 &lt;td&gt;Occasional SAST, manual code reviews for security&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Proactive&lt;/td&gt;
 &lt;td&gt;Automated SAST/DAST in CI, security-focused code review guidelines&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Optimized&lt;/td&gt;
 &lt;td&gt;Custom rules for organization-specific patterns, MTTR tracked&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Innovative&lt;/td&gt;
 &lt;td&gt;AI-assisted code review, automatic fix suggestions&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="infrastructure-security"&gt;Infrastructure Security
&lt;/h3&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Level&lt;/th&gt;
 &lt;th&gt;Practices&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Ad-Hoc&lt;/td&gt;
 &lt;td&gt;Manual server configuration, no hardening standards&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Reactive&lt;/td&gt;
 &lt;td&gt;Basic hardening checklists, occasional audits&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Proactive&lt;/td&gt;
 &lt;td&gt;IaC scanning, automated hardening, CIS benchmarks&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Optimized&lt;/td&gt;
 &lt;td&gt;Policy as code (OPA), drift detection, automated remediation&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Innovative&lt;/td&gt;
 &lt;td&gt;Self-healing infrastructure, zero-trust networking&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="monitoring-and-detection"&gt;Monitoring and Detection
&lt;/h3&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Level&lt;/th&gt;
 &lt;th&gt;Practices&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Ad-Hoc&lt;/td&gt;
 &lt;td&gt;No security monitoring&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Reactive&lt;/td&gt;
 &lt;td&gt;Basic log collection, manual review after incidents&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Proactive&lt;/td&gt;
 &lt;td&gt;Centralized logging, alerting on known patterns, runtime monitoring&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Optimized&lt;/td&gt;
 &lt;td&gt;SIEM with correlation rules, automated response playbooks&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Innovative&lt;/td&gt;
 &lt;td&gt;ML-based anomaly detection, threat hunting programs&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="incident-response"&gt;Incident Response
&lt;/h3&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Level&lt;/th&gt;
 &lt;th&gt;Practices&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Ad-Hoc&lt;/td&gt;
 &lt;td&gt;No process, ad-hoc response&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Reactive&lt;/td&gt;
 &lt;td&gt;Documented runbooks, rarely tested&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Proactive&lt;/td&gt;
 &lt;td&gt;Regular tabletop exercises, blameless postmortems, on-call rotation&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Optimized&lt;/td&gt;
 &lt;td&gt;Automated incident classification, SLA-driven response times&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Innovative&lt;/td&gt;
 &lt;td&gt;Chaos engineering for security, automated containment&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="compliance"&gt;Compliance
&lt;/h3&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Level&lt;/th&gt;
 &lt;th&gt;Practices&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;Ad-Hoc&lt;/td&gt;
 &lt;td&gt;Manual evidence collection before audits&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Reactive&lt;/td&gt;
 &lt;td&gt;Spreadsheet-based tracking, periodic reviews&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Proactive&lt;/td&gt;
 &lt;td&gt;Automated evidence collection, continuous monitoring&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Optimized&lt;/td&gt;
 &lt;td&gt;Compliance as code, real-time dashboards, automated reporting&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;Innovative&lt;/td&gt;
 &lt;td&gt;Continuous certification, public transparency reports&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="self-assessment-checklist"&gt;Self-Assessment Checklist
&lt;/h2&gt;&lt;p&gt;Rate your organization on each item (Yes / Partial / No):&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Build Phase:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; SAST runs automatically on every pull request.&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Dependency scanning alerts on known CVEs.&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Container images are scanned before being pushed to a registry.&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; IaC templates are scanned for misconfigurations.&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Secrets detection prevents credentials from being committed.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Deploy Phase:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Security gates can block deployment for critical findings.&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Artifacts are signed and signatures are verified.&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; SBOM is generated for every release.&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Infrastructure changes go through policy-as-code validation.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Run Phase:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Runtime security monitoring is active (Falco, Sysdig, etc.).&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Centralized logging with security-relevant alerts.&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Network segmentation limits blast radius.&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Secrets are managed through a dedicated vault.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Culture and Process:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Developers receive regular security training.&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Security champions are embedded in development teams.&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Blameless postmortems are conducted after incidents.&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Threat modeling is part of the design process for new features.&lt;/li&gt;
&lt;li&gt;&lt;input disabled="" type="checkbox"&gt; Security metrics are tracked and reviewed regularly.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="roadmap-for-progression"&gt;Roadmap for Progression
&lt;/h2&gt;&lt;p&gt;Moving up the maturity levels doesn&amp;rsquo;t happen overnight. Here&amp;rsquo;s a practical roadmap:&lt;/p&gt;
&lt;h3 id="from-ad-hoc-to-reactive-3-6-months"&gt;From Ad-Hoc to Reactive (3-6 months)
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Add a SAST tool to your CI pipeline (start with Semgrep - it has good defaults and is fast).&lt;/li&gt;
&lt;li&gt;Enable dependency scanning (GitHub Dependabot, or &lt;code&gt;trivy fs&lt;/code&gt; in CI).&lt;/li&gt;
&lt;li&gt;Document your incident response process, even if it&amp;rsquo;s simple.&lt;/li&gt;
&lt;li&gt;Run a single security training session for the team.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id="from-reactive-to-proactive-6-12-months"&gt;From Reactive to Proactive (6-12 months)
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Add container image scanning and IaC scanning to pipelines.&lt;/li&gt;
&lt;li&gt;Implement secrets detection in pre-commit hooks (&lt;code&gt;gitleaks&lt;/code&gt;, &lt;code&gt;detect-secrets&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Appoint security champions in each team.&lt;/li&gt;
&lt;li&gt;Start threat modeling for major features.&lt;/li&gt;
&lt;li&gt;Conduct your first blameless postmortem after an incident.&lt;/li&gt;
&lt;li&gt;Deploy runtime monitoring (Falco).&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id="from-proactive-to-optimized-12-18-months"&gt;From Proactive to Optimized (12-18 months)
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Implement security gates that can block deployments.&lt;/li&gt;
&lt;li&gt;Track MTTR and set reduction targets.&lt;/li&gt;
&lt;li&gt;Generate SBOMs and sign artifacts.&lt;/li&gt;
&lt;li&gt;Implement policy-as-code for infrastructure (OPA/Gatekeeper).&lt;/li&gt;
&lt;li&gt;Map automated checks to compliance frameworks.&lt;/li&gt;
&lt;li&gt;Integrate security metrics into engineering dashboards.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id="from-optimized-to-innovative-18-months"&gt;From Optimized to Innovative (18+ months)
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Launch a bug bounty program.&lt;/li&gt;
&lt;li&gt;Build custom security tooling for organization-specific risks.&lt;/li&gt;
&lt;li&gt;Implement zero-trust architecture.&lt;/li&gt;
&lt;li&gt;Run regular red team exercises.&lt;/li&gt;
&lt;li&gt;Contribute to open-source security projects.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="cultural-aspects"&gt;Cultural Aspects
&lt;/h2&gt;&lt;p&gt;Tools and processes are necessary but insufficient. Culture determines whether security practices actually stick.&lt;/p&gt;
&lt;h3 id="blameless-postmortems"&gt;Blameless Postmortems
&lt;/h3&gt;&lt;p&gt;When a security incident occurs, the instinct is often to find someone to blame. This drives people to hide mistakes and cover up near-misses. Blameless postmortems flip this around: they focus on systemic failures and process improvements rather than individual fault. The question changes from &amp;ldquo;who made this mistake?&amp;rdquo; to &amp;ldquo;what allowed this mistake to happen, and how do we prevent it?&amp;rdquo;&lt;/p&gt;
&lt;h3 id="security-champions"&gt;Security Champions
&lt;/h3&gt;&lt;p&gt;A security champion is a developer who takes on extra responsibility for security within their team. They are not full-time security engineers &amp;mdash; they are developers who act as a bridge between the security team and the development team. Their role includes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Reviewing security-relevant pull requests.&lt;/li&gt;
&lt;li&gt;Staying current on security topics and sharing knowledge.&lt;/li&gt;
&lt;li&gt;Participating in threat modeling sessions.&lt;/li&gt;
&lt;li&gt;Being the first point of contact for security questions.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This model scales far better than having a central security team review everything.&lt;/p&gt;
&lt;h3 id="making-security-easy"&gt;Making Security Easy
&lt;/h3&gt;&lt;p&gt;If security practices are painful, people will find workarounds. The goal is to make security the easiest path:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Provide secure templates and starter projects.&lt;/li&gt;
&lt;li&gt;Automate as much as possible so developers don&amp;rsquo;t have to remember manual steps.&lt;/li&gt;
&lt;li&gt;Give fast feedback. A SAST scan that takes 30 minutes will be ignored; one that takes 30 seconds will be used.&lt;/li&gt;
&lt;li&gt;Celebrate security improvements just as you celebrate feature delivery.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="conclusion"&gt;Conclusion
&lt;/h2&gt;&lt;p&gt;A DevSecOps maturity model is a compass, not a destination. The value comes from honest self-assessment, setting realistic goals, and making steady progress. Start where you are, pick the dimension where improvement will have the most impact, and build from there. Security is a team sport. The best security cultures are built incrementally, one practice at a time.&lt;/p&gt;</description></item><item><title>Introduction to AIOps: intelligent IT operations</title><link>https://adurrr.github.io/en/p/introduction-to-aiops-intelligent-it-operations/</link><pubDate>Mon, 05 Dec 2022 00:00:00 +0000</pubDate><guid>https://adurrr.github.io/en/p/introduction-to-aiops-intelligent-it-operations/</guid><description>&lt;h2 id="what-is-aiops"&gt;What is AIOps?
&lt;/h2&gt;&lt;p&gt;AIOps (Artificial Intelligence for IT Operations) applies machine learning and data analytics to operational data (logs, metrics, events, traces) to automate and improve workflows. Gartner coined the term in 2017, but the idea is simple: use algorithms to handle the volume and complexity that humans can&amp;rsquo;t manage manually.&lt;/p&gt;
&lt;p&gt;In practical terms, AIOps platforms ingest data from monitoring tools, APM systems, log aggregators, and event sources. They apply ML models to detect anomalies, correlate events, identify root causes, and in some cases trigger automated remediation. The goal is to reduce mean time to detection (MTTD) and mean time to resolution (MTTR) while freeing operations teams from alert fatigue.&lt;/p&gt;
&lt;h2 id="why-traditional-monitoring-falls-short"&gt;Why traditional monitoring falls short
&lt;/h2&gt;&lt;p&gt;Monitoring used to work fine. You had a few servers, a handful of apps, and a limited set of metrics to watch. A static CPU threshold or log regex was enough.&lt;/p&gt;
&lt;p&gt;Modern infrastructure broke that model:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Scale&lt;/strong&gt;: A medium Kubernetes cluster generates millions of metrics and logs per minute. You can&amp;rsquo;t humanly watch dashboards at that scale.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Complexity&lt;/strong&gt;: Microservices create tangled dependency graphs. One user request might touch dozens of services. Finding what caused a latency spike means correlating data across all of them.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dynamic environments&lt;/strong&gt;: Auto-scaling, ephemeral containers, and serverless functions mean baselines constantly shift. Static thresholds explode with false positives.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Alert fatigue&lt;/strong&gt;: Teams get buried in alerts. When 90% is noise, that critical 10% disappears. Engineers start ignoring everything.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;AIOps doesn&amp;rsquo;t replace monitoring. It layers on top of what you already have and makes it smarter.&lt;/p&gt;
&lt;h2 id="key-capabilities"&gt;Key capabilities
&lt;/h2&gt;&lt;h3 id="1-anomaly-detection"&gt;1. Anomaly detection
&lt;/h3&gt;&lt;p&gt;Instead of static thresholds, AIOps uses ML models (often time-series analysis, clustering, or autoencoders) to learn what &amp;ldquo;normal&amp;rdquo; looks like for each metric and service. When behavior deviates significantly from the learned baseline, an anomaly is flagged.&lt;/p&gt;
&lt;p&gt;This handles the dynamic baseline problem. If your application normally sees a traffic spike every Monday at 9 AM, the model learns that pattern and does not alert on it. But an unexpected spike at 3 AM on a Wednesday gets flagged.&lt;/p&gt;
&lt;h3 id="2-event-correlation"&gt;2. Event correlation
&lt;/h3&gt;&lt;p&gt;A single infrastructure issue can generate hundreds or thousands of related alerts across different monitoring tools. AIOps correlates these events — grouping them by time, topology, and causal relationships — to present a single incident instead of a wall of alerts.&lt;/p&gt;
&lt;p&gt;For example, a network switch failure might trigger alerts on: the switch itself, all connected servers (connectivity lost), all applications on those servers (health check failures), and downstream services (timeout errors). An AIOps platform correlates all of these into one incident: &amp;ldquo;Network switch X failed.&amp;rdquo;&lt;/p&gt;
&lt;h3 id="3-root-cause-analysis"&gt;3. Root cause analysis
&lt;/h3&gt;&lt;p&gt;Beyond correlation, AIOps attempts to identify the root cause of an incident. By understanding the topology of your infrastructure and the causal chain of events, it can suggest that the network switch failure is the root cause, rather than presenting the application timeout as an independent issue.&lt;/p&gt;
&lt;p&gt;This is where the value becomes tangible. Instead of an on-call engineer spending 30 minutes tracing through dashboards and logs, the platform surfaces the probable root cause immediately.&lt;/p&gt;
&lt;h3 id="4-auto-remediation"&gt;4. Auto-remediation
&lt;/h3&gt;&lt;p&gt;The most mature AIOps implementations close the loop by triggering automated remediation actions. If a known pattern is detected (disk filling up, a pod in CrashLoopBackOff, a runaway process consuming memory), the platform can execute predefined runbooks automatically.&lt;/p&gt;
&lt;p&gt;Examples:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Restart a crashed pod or service.&lt;/li&gt;
&lt;li&gt;Scale up a deployment when anomalous load is detected.&lt;/li&gt;
&lt;li&gt;Clear a log directory when disk usage exceeds a dynamic threshold.&lt;/li&gt;
&lt;li&gt;Trigger a failover when a primary database becomes unresponsive.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Auto-remediation requires careful design. Start with low-risk actions and expand as confidence grows.&lt;/p&gt;
&lt;h2 id="common-platforms-and-tools"&gt;Common platforms and tools
&lt;/h2&gt;&lt;p&gt;The AIOps landscape includes both commercial platforms and open-source building blocks:&lt;/p&gt;
&lt;h3 id="commercial-platforms"&gt;Commercial platforms
&lt;/h3&gt;&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Platform&lt;/th&gt;
 &lt;th&gt;Strengths&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Dynatrace&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Strong auto-discovery, AI engine (Davis), full-stack observability&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Datadog&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Unified monitoring + ML-powered alerting, Watchdog anomaly detection&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Splunk ITSI&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Powerful log analytics + ML toolkit, good for event correlation&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Moogsoft&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Pioneered AIOps space, strong event correlation and noise reduction&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;BigPanda&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Event correlation and automation focused, integrates with existing tools&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;PagerDuty&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Incident management with ML-driven noise reduction and smart grouping&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="open-source-building-blocks"&gt;Open-source building blocks
&lt;/h3&gt;&lt;p&gt;You can assemble an AIOps-like stack from open-source components:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Data collection&lt;/strong&gt;: Prometheus, Grafana Agent, OpenTelemetry Collector, Fluentd/Fluent Bit.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Data storage&lt;/strong&gt;: Prometheus (metrics), Elasticsearch/OpenSearch (logs), Jaeger/Tempo (traces).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Anomaly detection&lt;/strong&gt;: Facebook Prophet, Isolation Forest (scikit-learn), luminol, Grafana ML.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Event correlation&lt;/strong&gt;: Custom logic on top of event streams, or StackStorm for event-driven automation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Alerting and automation&lt;/strong&gt;: Alertmanager, Grafana OnCall, StackStorm, Rundeck.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Building a custom AIOps stack is significantly more work than using a commercial platform, but it gives you full control and avoids vendor lock-in. A reasonable middle ground is using a commercial platform for core AIOps capabilities while keeping your data pipeline open-source.&lt;/p&gt;
&lt;h2 id="practical-use-cases"&gt;Practical use cases
&lt;/h2&gt;&lt;h3 id="noise-reduction-in-alert-management"&gt;Noise reduction in alert management
&lt;/h3&gt;&lt;p&gt;A team receiving 500+ alerts per day implements AIOps event correlation. Related alerts are grouped into incidents, duplicates are suppressed, and flapping alerts are silenced. Alert volume drops by 80%, and the on-call engineer can focus on actual incidents.&lt;/p&gt;
&lt;h3 id="proactive-capacity-planning"&gt;Proactive capacity planning
&lt;/h3&gt;&lt;p&gt;AIOps models analyze historical resource usage trends and predict when capacity limits will be reached. Instead of reacting to a disk-full alert at 2 AM, the platform predicts the issue two weeks in advance and creates a ticket for the team to address during business hours.&lt;/p&gt;
&lt;h3 id="faster-incident-response"&gt;Faster incident response
&lt;/h3&gt;&lt;p&gt;During a production outage, the AIOps platform correlates alerts across the monitoring stack, identifies the root cause (a recent deployment that introduced a memory leak), and surfaces the relevant deployment commit. MTTR drops from 45 minutes to 10 minutes.&lt;/p&gt;
&lt;h3 id="automated-scaling"&gt;Automated scaling
&lt;/h3&gt;&lt;p&gt;The platform detects anomalous traffic patterns that deviate from the learned baseline. Instead of waiting for CPU to hit 80% (the static threshold), it triggers a scale-up action based on the rate of change, ensuring capacity is ready before users experience degradation.&lt;/p&gt;
&lt;h2 id="how-aiops-fits-into-devops-workflows"&gt;How AIOps fits into DevOps workflows
&lt;/h2&gt;&lt;p&gt;AIOps is not a replacement for DevOps practices. It is an enhancement layer:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Code ──&amp;gt; CI/CD Pipeline ──&amp;gt; Deploy ──&amp;gt; Observe ──&amp;gt; AIOps Layer ──&amp;gt; Act
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; │ │
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; Monitoring Stack ML Models
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; (metrics, logs, (anomaly detection,
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; traces, events) correlation, RCA)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Developers&lt;/strong&gt; benefit from faster root cause identification when their code causes issues in production.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Operations&lt;/strong&gt; teams benefit from noise reduction, automated remediation, and proactive alerting.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SRE teams&lt;/strong&gt; benefit from data-driven SLO tracking and error budget burn rate analysis.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;AIOps works best when your observability foundation is solid. If you are not collecting good data (structured logs, meaningful metrics, distributed traces), ML models will not produce meaningful insights. Fix your observability first, then layer AIOps on top.&lt;/p&gt;
&lt;h2 id="getting-started-a-pragmatic-path"&gt;Getting started: A pragmatic path
&lt;/h2&gt;&lt;p&gt;If AIOps sounds useful, here&amp;rsquo;s a practical approach:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Audit your current observability stack.&lt;/strong&gt; What data are you collecting? Do you have structured logs? Consistently labeled metrics? Traces across services? AIOps can only work with good data.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Start with noise reduction.&lt;/strong&gt; This is the lowest-hanging fruit. Implement alert grouping and deduplication. Even basic rules-based correlation (before any ML) will reduce alert fatigue significantly.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Add anomaly detection to key metrics.&lt;/strong&gt; Pick 3-5 critical business and infrastructure metrics. Apply a time-series anomaly detection model. Facebook Prophet or Prometheus recording rules with seasonal adjustments are good starting points.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Implement automated remediation for known issues.&lt;/strong&gt; Identify the top 5 recurring incidents. Write runbooks for them. Automate the runbooks using StackStorm, Rundeck, or your platform&amp;rsquo;s automation engine.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Evaluate a commercial platform when complexity demands it.&lt;/strong&gt; If you have hundreds of services, multiple monitoring tools, and a growing operations team, the investment in a commercial AIOps platform may be justified by the reduction in MTTR alone.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Measure the impact.&lt;/strong&gt; Track MTTD, MTTR, alert-to-incident ratio, and false positive rate. Without metrics, you can&amp;rsquo;t prove AIOps is worth the investment.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;AIOps isn&amp;rsquo;t magic. It&amp;rsquo;s a set of techniques that, applied to solid operational data, can reduce the burden on ops teams and improve reliability. Start small, measure everything, and scale what actually works.&lt;/p&gt;</description></item><item><title>Infrastructure as code with Terraform: a practical guide</title><link>https://adurrr.github.io/en/p/infrastructure-as-code-with-terraform-a-practical-guide/</link><pubDate>Sat, 10 Sep 2022 00:00:00 +0000</pubDate><guid>https://adurrr.github.io/en/p/infrastructure-as-code-with-terraform-a-practical-guide/</guid><description>&lt;h2 id="why-infrastructure-as-code-matters"&gt;Why infrastructure as code matters
&lt;/h2&gt;&lt;p&gt;Managing infrastructure manually through web consoles or ad-hoc scripts creates problems that pile up over time: inconsistent environments, undocumented changes, impossible rollbacks, and the classic &amp;ldquo;it works on my machine&amp;rdquo; extended to entire servers.&lt;/p&gt;
&lt;p&gt;Infrastructure as Code (IaC) fixes this by treating infrastructure like application code: it&amp;rsquo;s written, versioned, reviewed, tested, and applied through automated workflows. The benefits show up right away:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Reproducibility&lt;/strong&gt;: Spin up identical environments in minutes, not days.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Version control&lt;/strong&gt;: Every infrastructure change goes through a PR with code review.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Documentation by default&lt;/strong&gt;: The code &lt;em&gt;is&lt;/em&gt; the documentation of what your infrastructure looks like.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Disaster recovery&lt;/strong&gt;: Rebuild everything from code if a region goes down.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cost visibility&lt;/strong&gt;: Review infrastructure changes before they are applied (and before they start costing money).&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="terraform-vs-other-tools"&gt;Terraform vs other tools
&lt;/h2&gt;&lt;p&gt;Several IaC tools exist. Here&amp;rsquo;s how Terraform compares to the main alternatives:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
 &lt;tr&gt;
 &lt;th&gt;Feature&lt;/th&gt;
 &lt;th&gt;Terraform&lt;/th&gt;
 &lt;th&gt;Pulumi&lt;/th&gt;
 &lt;th&gt;CloudFormation&lt;/th&gt;
 &lt;th&gt;Ansible&lt;/th&gt;
 &lt;/tr&gt;
 &lt;/thead&gt;
 &lt;tbody&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Language&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;HCL (declarative)&lt;/td&gt;
 &lt;td&gt;Python, TypeScript, Go, etc.&lt;/td&gt;
 &lt;td&gt;JSON/YAML&lt;/td&gt;
 &lt;td&gt;YAML (procedural)&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Cloud support&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Multi-cloud&lt;/td&gt;
 &lt;td&gt;Multi-cloud&lt;/td&gt;
 &lt;td&gt;AWS only&lt;/td&gt;
 &lt;td&gt;Multi-cloud (via modules)&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;State management&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Explicit state file&lt;/td&gt;
 &lt;td&gt;Managed by Pulumi service&lt;/td&gt;
 &lt;td&gt;Managed by AWS&lt;/td&gt;
 &lt;td&gt;Stateless&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Learning curve&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Moderate&lt;/td&gt;
 &lt;td&gt;Varies by language&lt;/td&gt;
 &lt;td&gt;Moderate&lt;/td&gt;
 &lt;td&gt;Low&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Ecosystem&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Huge provider ecosystem&lt;/td&gt;
 &lt;td&gt;Growing&lt;/td&gt;
 &lt;td&gt;AWS-only but deep&lt;/td&gt;
 &lt;td&gt;Huge role ecosystem&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
 &lt;td&gt;&lt;strong&gt;Best for&lt;/strong&gt;&lt;/td&gt;
 &lt;td&gt;Multi-cloud infra&lt;/td&gt;
 &lt;td&gt;Teams that prefer general-purpose languages&lt;/td&gt;
 &lt;td&gt;AWS-only shops&lt;/td&gt;
 &lt;td&gt;Configuration management&lt;/td&gt;
 &lt;/tr&gt;
 &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Terraform&amp;rsquo;s sweet spot is multi-cloud infrastructure provisioning with a declarative approach. If you&amp;rsquo;re on AWS only and want tight integration, CloudFormation is reasonable. If your team prefers writing Python over HCL, Pulumi deserves a look. But for most teams managing infrastructure across providers, Terraform is the pragmatic choice.&lt;/p&gt;
&lt;h2 id="core-concepts"&gt;Core concepts
&lt;/h2&gt;&lt;h3 id="providers"&gt;Providers
&lt;/h3&gt;&lt;p&gt;Providers are plugins that let Terraform interact with APIs — AWS, Azure, GCP, Kubernetes, GitHub, Cloudflare, and hundreds more.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-hcl" data-lang="hcl"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;terraform&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;required_providers&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; aws&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;hashicorp/aws&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;~&amp;gt; 5.0&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;aws&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;eu-west-1&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;h3 id="resources"&gt;Resources
&lt;/h3&gt;&lt;p&gt;Resources are the fundamental building blocks. Each resource block describes one infrastructure object.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;span class="lnt"&gt;8
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-hcl" data-lang="hcl"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;aws_instance&amp;#34; &amp;#34;web&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; ami&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;ami-0c55b159cbfafe1f0&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; instance_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;t3.micro&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; Name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;web-server&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;h3 id="state"&gt;State
&lt;/h3&gt;&lt;p&gt;Terraform maintains a &lt;strong&gt;state file&lt;/strong&gt; that maps your configuration to real-world resources. This is how Terraform knows what exists, what needs to change, and what to destroy. The state file is critical. Losing it means Terraform loses track of your infrastructure.&lt;/p&gt;
&lt;h3 id="modules"&gt;Modules
&lt;/h3&gt;&lt;p&gt;Modules are reusable packages of Terraform configuration. Think of them as functions: they take inputs (variables), create resources, and produce outputs.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-hcl" data-lang="hcl"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;vpc&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;terraform-aws-modules/vpc/aws&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;5.1.0&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;my-vpc&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; cidr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;10.0.0.0/16&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; azs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;eu-west-1a&amp;#34;, &amp;#34;eu-west-1b&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; private_subnets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;10.0.1.0/24&amp;#34;, &amp;#34;10.0.2.0/24&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; public_subnets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;10.0.101.0/24&amp;#34;, &amp;#34;10.0.102.0/24&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; enable_nat_gateway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;h2 id="practical-example-vpc--ec2"&gt;Practical example: VPC + EC2
&lt;/h2&gt;&lt;p&gt;Here&amp;rsquo;s a complete example that provisions a VPC with a public subnet and an EC2 instance:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt; 10
&lt;/span&gt;&lt;span class="lnt"&gt; 11
&lt;/span&gt;&lt;span class="lnt"&gt; 12
&lt;/span&gt;&lt;span class="lnt"&gt; 13
&lt;/span&gt;&lt;span class="lnt"&gt; 14
&lt;/span&gt;&lt;span class="lnt"&gt; 15
&lt;/span&gt;&lt;span class="lnt"&gt; 16
&lt;/span&gt;&lt;span class="lnt"&gt; 17
&lt;/span&gt;&lt;span class="lnt"&gt; 18
&lt;/span&gt;&lt;span class="lnt"&gt; 19
&lt;/span&gt;&lt;span class="lnt"&gt; 20
&lt;/span&gt;&lt;span class="lnt"&gt; 21
&lt;/span&gt;&lt;span class="lnt"&gt; 22
&lt;/span&gt;&lt;span class="lnt"&gt; 23
&lt;/span&gt;&lt;span class="lnt"&gt; 24
&lt;/span&gt;&lt;span class="lnt"&gt; 25
&lt;/span&gt;&lt;span class="lnt"&gt; 26
&lt;/span&gt;&lt;span class="lnt"&gt; 27
&lt;/span&gt;&lt;span class="lnt"&gt; 28
&lt;/span&gt;&lt;span class="lnt"&gt; 29
&lt;/span&gt;&lt;span class="lnt"&gt; 30
&lt;/span&gt;&lt;span class="lnt"&gt; 31
&lt;/span&gt;&lt;span class="lnt"&gt; 32
&lt;/span&gt;&lt;span class="lnt"&gt; 33
&lt;/span&gt;&lt;span class="lnt"&gt; 34
&lt;/span&gt;&lt;span class="lnt"&gt; 35
&lt;/span&gt;&lt;span class="lnt"&gt; 36
&lt;/span&gt;&lt;span class="lnt"&gt; 37
&lt;/span&gt;&lt;span class="lnt"&gt; 38
&lt;/span&gt;&lt;span class="lnt"&gt; 39
&lt;/span&gt;&lt;span class="lnt"&gt; 40
&lt;/span&gt;&lt;span class="lnt"&gt; 41
&lt;/span&gt;&lt;span class="lnt"&gt; 42
&lt;/span&gt;&lt;span class="lnt"&gt; 43
&lt;/span&gt;&lt;span class="lnt"&gt; 44
&lt;/span&gt;&lt;span class="lnt"&gt; 45
&lt;/span&gt;&lt;span class="lnt"&gt; 46
&lt;/span&gt;&lt;span class="lnt"&gt; 47
&lt;/span&gt;&lt;span class="lnt"&gt; 48
&lt;/span&gt;&lt;span class="lnt"&gt; 49
&lt;/span&gt;&lt;span class="lnt"&gt; 50
&lt;/span&gt;&lt;span class="lnt"&gt; 51
&lt;/span&gt;&lt;span class="lnt"&gt; 52
&lt;/span&gt;&lt;span class="lnt"&gt; 53
&lt;/span&gt;&lt;span class="lnt"&gt; 54
&lt;/span&gt;&lt;span class="lnt"&gt; 55
&lt;/span&gt;&lt;span class="lnt"&gt; 56
&lt;/span&gt;&lt;span class="lnt"&gt; 57
&lt;/span&gt;&lt;span class="lnt"&gt; 58
&lt;/span&gt;&lt;span class="lnt"&gt; 59
&lt;/span&gt;&lt;span class="lnt"&gt; 60
&lt;/span&gt;&lt;span class="lnt"&gt; 61
&lt;/span&gt;&lt;span class="lnt"&gt; 62
&lt;/span&gt;&lt;span class="lnt"&gt; 63
&lt;/span&gt;&lt;span class="lnt"&gt; 64
&lt;/span&gt;&lt;span class="lnt"&gt; 65
&lt;/span&gt;&lt;span class="lnt"&gt; 66
&lt;/span&gt;&lt;span class="lnt"&gt; 67
&lt;/span&gt;&lt;span class="lnt"&gt; 68
&lt;/span&gt;&lt;span class="lnt"&gt; 69
&lt;/span&gt;&lt;span class="lnt"&gt; 70
&lt;/span&gt;&lt;span class="lnt"&gt; 71
&lt;/span&gt;&lt;span class="lnt"&gt; 72
&lt;/span&gt;&lt;span class="lnt"&gt; 73
&lt;/span&gt;&lt;span class="lnt"&gt; 74
&lt;/span&gt;&lt;span class="lnt"&gt; 75
&lt;/span&gt;&lt;span class="lnt"&gt; 76
&lt;/span&gt;&lt;span class="lnt"&gt; 77
&lt;/span&gt;&lt;span class="lnt"&gt; 78
&lt;/span&gt;&lt;span class="lnt"&gt; 79
&lt;/span&gt;&lt;span class="lnt"&gt; 80
&lt;/span&gt;&lt;span class="lnt"&gt; 81
&lt;/span&gt;&lt;span class="lnt"&gt; 82
&lt;/span&gt;&lt;span class="lnt"&gt; 83
&lt;/span&gt;&lt;span class="lnt"&gt; 84
&lt;/span&gt;&lt;span class="lnt"&gt; 85
&lt;/span&gt;&lt;span class="lnt"&gt; 86
&lt;/span&gt;&lt;span class="lnt"&gt; 87
&lt;/span&gt;&lt;span class="lnt"&gt; 88
&lt;/span&gt;&lt;span class="lnt"&gt; 89
&lt;/span&gt;&lt;span class="lnt"&gt; 90
&lt;/span&gt;&lt;span class="lnt"&gt; 91
&lt;/span&gt;&lt;span class="lnt"&gt; 92
&lt;/span&gt;&lt;span class="lnt"&gt; 93
&lt;/span&gt;&lt;span class="lnt"&gt; 94
&lt;/span&gt;&lt;span class="lnt"&gt; 95
&lt;/span&gt;&lt;span class="lnt"&gt; 96
&lt;/span&gt;&lt;span class="lnt"&gt; 97
&lt;/span&gt;&lt;span class="lnt"&gt; 98
&lt;/span&gt;&lt;span class="lnt"&gt; 99
&lt;/span&gt;&lt;span class="lnt"&gt;100
&lt;/span&gt;&lt;span class="lnt"&gt;101
&lt;/span&gt;&lt;span class="lnt"&gt;102
&lt;/span&gt;&lt;span class="lnt"&gt;103
&lt;/span&gt;&lt;span class="lnt"&gt;104
&lt;/span&gt;&lt;span class="lnt"&gt;105
&lt;/span&gt;&lt;span class="lnt"&gt;106
&lt;/span&gt;&lt;span class="lnt"&gt;107
&lt;/span&gt;&lt;span class="lnt"&gt;108
&lt;/span&gt;&lt;span class="lnt"&gt;109
&lt;/span&gt;&lt;span class="lnt"&gt;110
&lt;/span&gt;&lt;span class="lnt"&gt;111
&lt;/span&gt;&lt;span class="lnt"&gt;112
&lt;/span&gt;&lt;span class="lnt"&gt;113
&lt;/span&gt;&lt;span class="lnt"&gt;114
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-hcl" data-lang="hcl"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;terraform&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; required_version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt; &amp;#34;&amp;gt;&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="m"&gt;5&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="err"&gt;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;required_providers&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; aws&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;hashicorp/aws&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;~&amp;gt; 5.0&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;aws&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;eu-west-1&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;span class="c1"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# --- Networking ---
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;aws_vpc&amp;#34; &amp;#34;main&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; cidr_block&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;10.0.0.0/16&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; enable_dns_support&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; enable_dns_hostnames&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; Name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;main-vpc&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;aws_subnet&amp;#34; &amp;#34;public&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; vpc_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;id&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; cidr_block&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;10.0.1.0/24&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; availability_zone&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;eu-west-1a&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; map_public_ip_on_launch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; Name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;public-subnet&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;aws_internet_gateway&amp;#34; &amp;#34;gw&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; vpc_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;id&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; Name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;main-igw&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;aws_route_table&amp;#34; &amp;#34;public&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; vpc_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;id&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;route&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; cidr_block&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;0.0.0.0/0&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; gateway_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;aws_internet_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;gw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;id&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; Name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;public-rt&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;aws_route_table_association&amp;#34; &amp;#34;public&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; subnet_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;aws_subnet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;id&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; route_table_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;aws_route_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;id&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;span class="c1"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# --- Security Group ---
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;aws_security_group&amp;#34; &amp;#34;web&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;web-sg&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;Allow HTTP and SSH&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; vpc_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;id&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;ingress&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; from_port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; to_port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; protocol&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;tcp&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; cidr_blocks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;0.0.0.0/0&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;ingress&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; from_port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;22&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; to_port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;22&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; protocol&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;tcp&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; cidr_blocks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;YOUR_IP/32&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="c1"&gt; # Restrict to your IP
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;egress&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; from_port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; to_port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; protocol&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;-1&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; cidr_blocks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;0.0.0.0/0&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;span class="c1"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# --- EC2 Instance ---
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;aws_instance&amp;#34; &amp;#34;web&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; ami&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;ami-0c55b159cbfafe1f0&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; instance_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;t3.micro&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; subnet_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;aws_subnet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;id&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; vpc_security_group_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;aws_security_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;web&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; Name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;web-server&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;span class="c1"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# --- Outputs ---
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;instance_public_ip&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;aws_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;web&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;public_ip&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;vpc_id&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;id&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;h2 id="the-planapply-workflow"&gt;The plan/apply workflow
&lt;/h2&gt;&lt;p&gt;Terraform follows a predictable workflow:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# 1. Initialize - download providers and modules&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;terraform init
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# 2. Format - ensure consistent code style&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;terraform fmt
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# 3. Validate - check syntax and configuration&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;terraform validate
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# 4. Plan - preview what will change (critical step!)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;terraform plan -out&lt;span class="o"&gt;=&lt;/span&gt;tfplan
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# 5. Apply - execute the plan&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;terraform apply tfplan
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# 6. Destroy - tear down all resources (when needed)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;terraform destroy
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;The &lt;code&gt;terraform plan&lt;/code&gt; step is the most important. Never skip it. Always review the plan output before applying, especially in production. The plan shows you exactly what will be created, modified, or destroyed.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Example plan output&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Plan: &lt;span class="m"&gt;6&lt;/span&gt; to add, &lt;span class="m"&gt;0&lt;/span&gt; to change, &lt;span class="m"&gt;0&lt;/span&gt; to destroy.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;In CI/CD pipelines, save the plan to a file (&lt;code&gt;-out=tfplan&lt;/code&gt;) and apply that exact plan. This prevents race conditions where infrastructure changes between the plan and apply steps.&lt;/p&gt;
&lt;h2 id="state-management-best-practices"&gt;State management best practices
&lt;/h2&gt;&lt;p&gt;State management is where most Terraform problems originate. Follow these practices:&lt;/p&gt;
&lt;h3 id="use-a-remote-backend"&gt;Use a remote backend
&lt;/h3&gt;&lt;p&gt;Never store state locally or in Git. Use a remote backend with encryption and locking:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;span class="lnt"&gt;8
&lt;/span&gt;&lt;span class="lnt"&gt;9
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-hcl" data-lang="hcl"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;terraform&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;backend&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;s3&amp;#34;&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;my-terraform-state&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;prod/networking/terraform.tfstate&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;eu-west-1&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; encrypt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; dynamodb_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;terraform-locks&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;The DynamoDB table provides &lt;strong&gt;state locking&lt;/strong&gt;. This prevents two people or pipelines from modifying the same infrastructure at the same time.&lt;/p&gt;
&lt;h3 id="organize-state-by-component"&gt;Organize state by component
&lt;/h3&gt;&lt;p&gt;Don&amp;rsquo;t put all your infrastructure in one state file. Split by component or team:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-gdscript3" data-lang="gdscript3"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;environments&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="n"&gt;prod&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="err"&gt;│&lt;/span&gt; &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="n"&gt;networking&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="c1"&gt;# VPC, subnets, routes&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="err"&gt;│&lt;/span&gt; &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="n"&gt;compute&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="c1"&gt;# EC2, ASGs, load balancers&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="err"&gt;│&lt;/span&gt; &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="c1"&gt;# RDS instances&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="err"&gt;│&lt;/span&gt; &lt;span class="err"&gt;└──&lt;/span&gt; &lt;span class="n"&gt;monitoring&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="c1"&gt;# CloudWatch, alerts&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="err"&gt;└──&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="n"&gt;networking&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="n"&gt;compute&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="err"&gt;└──&lt;/span&gt; &lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Smaller state files mean faster plans, smaller blast radius, and fewer teams competing for locks.&lt;/p&gt;
&lt;h3 id="use-terraform_remote_state-sparingly"&gt;Use &lt;code&gt;terraform_remote_state&lt;/code&gt; sparingly
&lt;/h3&gt;&lt;p&gt;You can reference outputs from other state files, but use it carefully. Over-reliance on remote state creates tight coupling between components. Prefer passing values through variables or a parameter store.&lt;/p&gt;
&lt;h2 id="tips-for-production-use"&gt;Tips for production use
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Pin provider versions.&lt;/strong&gt; Use &lt;code&gt;~&amp;gt;&lt;/code&gt; constraints to allow patch updates but prevent breaking changes: &lt;code&gt;version = &amp;quot;~&amp;gt; 5.0&amp;quot;&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Use workspaces carefully.&lt;/strong&gt; Workspaces are useful for simple environment separation but get confusing at scale. Separate directories per environment is usually clearer.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Implement a CI/CD pipeline for Terraform.&lt;/strong&gt; Run &lt;code&gt;terraform plan&lt;/code&gt; on PRs and post the output as a PR comment. Run &lt;code&gt;terraform apply&lt;/code&gt; only after merge and approval.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;prevent_destroy&lt;/code&gt; for critical resources.&lt;/strong&gt; This lifecycle rule stops accidental destruction of databases or persistent storage:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-hcl" data-lang="hcl"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;aws_db_instance&amp;#34; &amp;#34;main&amp;#34;&lt;/span&gt; {&lt;span class="c1"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt; # ...
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;lifecycle&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt; prevent_destroy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Tag everything.&lt;/strong&gt; Use a &lt;code&gt;default_tags&lt;/code&gt; block in the provider to ensure every resource gets standard tags (environment, team, project).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;tflint&lt;/code&gt; and &lt;code&gt;checkov&lt;/code&gt;.&lt;/strong&gt; Lint your Terraform code and scan for security misconfigurations before applying.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;tflint --init
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;tflint
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;checkov -d .
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Import existing resources.&lt;/strong&gt; If you have manually created infrastructure, use &lt;code&gt;terraform import&lt;/code&gt; to bring it under management instead of recreating it.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Review the plan diff carefully.&lt;/strong&gt; A resource showing &amp;ldquo;destroy and recreate&amp;rdquo; might cause downtime. Understand which changes are in-place versus destructive.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Terraform is one of those tools that rewards discipline. The more consistently you follow these practices, the more confidently your team manages infrastructure at scale.&lt;/p&gt;</description></item></channel></rss>