Cloud Engineer Job Description Template (2026)
A free, copy-ready Cloud Engineer job description covering responsibilities, must-have skills, tools, seniority variants, and KPIs. Written for hiring managers, not for SEO filler.
Key facts
- Role: Cloud Engineer
- Reports to: Head of Platform
- Must-have skills: 8 items
- Seniority tiers: Junior / Mid / Senior
- KPIs defined: 6 metrics
- Starting price (offshore): $3,400/month
Role summary
A Cloud Engineer owns the cloud platform: designing multi-account AWS, Azure, or GCP architectures, writing Terraform modules consumed by product teams, hardening IAM and networking, running FinOps to keep the bill honest, and mapping controls to SOC 2, HIPAA, or PCI scope. Focused on cloud services, landing zones, and platform economics rather than CI/CD pipelines or deploy cadence.
Responsibilities
- • Design and evolve the landing zone: multi-account AWS Organizations / Azure management groups / GCP organization hierarchy with environment and workload separation.
- • Write Terraform 1.5+ or Pulumi modules with remote state (S3 + DynamoDB lock, Terraform Cloud, or GCS), workspace strategy, and a private module registry.
- • Architect VPC/VNet topologies including subnets, route tables, NAT, Transit Gateway / VPC peering / Private Service Connect, and egress control via firewall or PrivateLink.
- • Design least-privilege IAM: roles, SCPs, permission boundaries, Azure PIM, GCP IAM conditions, and eliminate long-lived access keys via OIDC federation.
- • Operate managed Kubernetes (EKS 1.28+, GKE, AKS) at the platform layer: cluster autoscaler, Karpenter, IRSA/Workload Identity, cluster upgrades.
- • Run Well-Architected or equivalent reviews against operational excellence, security, reliability, performance efficiency, and cost optimization pillars.
- • Build FinOps reporting: tagging policies, Cost Explorer dashboards, anomaly detection, commitment discount (SP/RI/CUD) modeling, chargeback/showback to product teams.
- • Target 99.9%+ availability on critical workloads through multi-AZ deployment by default and multi-region failover for tier-1 services.
- • Define RTO and RPO per service, design backup strategies (AWS Backup, Azure Backup, GCP Backup & DR) that meet them, and run quarterly restore tests.
- • Map cloud controls to SOC 2 CC, HIPAA Security Rule, and PCI DSS requirements; wire up AWS Config, Azure Policy, or GCP Organization Policy for continuous compliance.
- • Manage secrets infrastructure: Vault, AWS Secrets Manager, Azure Key Vault, or GCP Secret Manager — with rotation, audit, and zero hardcoded credentials.
- • Partner with DevOps/platform engineers on the boundary between cloud platform (owned here) and deployment tooling (owned there).
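The RTO/RPO responsibility above comes down to two comparisons per service, sketched here in Python. This is an illustrative sketch only: the `ServiceRecoveryPlan` class, the service names, and every duration are hypothetical, and it assumes the worst-case data loss equals one full backup interval.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class ServiceRecoveryPlan:
    name: str
    rpo: timedelta               # maximum tolerable data loss
    rto: timedelta               # maximum tolerable time to restore
    backup_interval: timedelta   # time between successive backups
    measured_restore: timedelta  # duration observed in the last restore test

    def meets_rpo(self) -> bool:
        # Assumption: a failure just before the next backup loses
        # one full backup interval of data.
        return self.backup_interval <= self.rpo

    def meets_rto(self) -> bool:
        # The restore test result is the evidence for the RTO claim.
        return self.measured_restore <= self.rto


# Hypothetical services and figures, for illustration only.
plans = [
    ServiceRecoveryPlan("orders-db", rpo=timedelta(minutes=15), rto=timedelta(hours=1),
                        backup_interval=timedelta(minutes=5),
                        measured_restore=timedelta(minutes=40)),
    ServiceRecoveryPlan("reports", rpo=timedelta(hours=4), rto=timedelta(hours=8),
                        backup_interval=timedelta(hours=24),
                        measured_restore=timedelta(hours=2)),
]

for plan in plans:
    print(f"{plan.name}: RPO met={plan.meets_rpo()}, RTO met={plan.meets_rto()}")
```

In this sample the second service restores quickly enough but backs up too rarely for its RPO, which is the kind of gap a quarterly restore test is meant to surface.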
Must-have skills
- • 4+ years hands-on with at least one major cloud (AWS, Azure, or GCP) in a production account you owned.
- • Strong Terraform 1.5+ or Pulumi — writing reusable modules, managing remote state, handling drift and refactor (moved blocks, import blocks).
- • Deep IAM design experience: roles, policies, boundaries, federation (OIDC/SAML), cross-account access patterns.
- • VPC/VNet networking fluency: subnets, route tables, NAT, peering, Transit Gateway or equivalent, security groups/NSGs, PrivateLink/Private Endpoints.
- • Managed Kubernetes operation: EKS/GKE/AKS cluster lifecycle, node group / node pool strategy, IRSA or Workload Identity.
- • Cost analysis fluency with Cost Explorer, Kubecost, or Cloudability — has shipped a measurable cost reduction.
- • At least one compliance framework done in anger (SOC 2, HIPAA, PCI, ISO 27001).
- • Strong written English for architecture docs and async review.
Nice-to-have skills
- • Secondary cloud proficiency (hires on AWS with real Azure or GCP experience are rare and valuable).
- • Service Catalog, Proton, or internal developer platform (IDP) experience — Backstage, Humanitec, or similar.
- • AWS Solutions Architect Professional, GCP Professional Cloud Architect, or Azure Solutions Architect Expert certification.
- • Data platform experience: Snowflake, BigQuery, Redshift, or Databricks networking and IAM integration.
- • Experience supporting FedRAMP or GovCloud workloads.
Tools and technology
- AWS / Azure / GCP
- Terraform 1.5+ / Pulumi
- AWS Organizations / Control Tower
- Kubernetes 1.28+ (EKS/GKE/AKS)
- HashiCorp Vault / Secrets Manager
- AWS Config / Azure Policy / GCP Org Policy
- CloudWatch / Azure Monitor / GCP Operations
- Cost Explorer / Kubecost / Cloudability
- AWS IAM Identity Center (formerly AWS SSO) / Okta / Microsoft Entra ID (formerly Azure AD)
- Checkov / tfsec / Prowler
Reporting structure
Reports to the Head of Platform, Head of Infrastructure, or CTO. Partners with DevOps on the pipeline / platform boundary, with Security on IAM and compliance evidence, with Finance on FinOps reporting and budgets, and with product engineering teams consuming platform modules.
Seniority variants
How responsibilities shift across junior, mid, and senior levels.
Junior (1-3 years)
- • Implement scoped Terraform changes under senior review against existing module patterns.
- • Triage cost and security findings from automated tools (Prowler, Checkov, AWS Trusted Advisor).
- • Maintain tagging hygiene and fix drift reports.
- • Document runbooks and participate in restore tests.
Mid (3-5 years)
- • Own a workload area end to end: networking, IAM, Terraform modules, and cost.
- • Lead Well-Architected reviews on specific pillars with written recommendations.
- • Drive a cost optimization project with measured savings target.
- • Partner with compliance on evidence collection for an audit cycle.
Senior (6+ years)
- • Set the landing zone strategy: account vending, SCPs, network topology, identity federation.
- • Lead multi-region or multi-cloud architecture decisions with explicit trade-off analysis.
- • Own the FinOps program and present to leadership on spend and commitment strategy.
- • Mentor mid/junior engineers and represent cloud platform in cross-org architecture reviews.
Success metrics (KPIs)
- • Tier-1 workload availability at or above 99.9% (43 minutes of downtime per month max).
- • Cloud spend variance to budget within ±5% month over month with explained anomalies.
- • Zero long-lived IAM access keys in production; 100% of workloads on OIDC/Workload Identity.
- • Terraform drift rate under 5% of resources; drift resolved within 48 hours of detection.
- • Audit findings (SOC 2/HIPAA/PCI) closed within the auditor-agreed window — zero critical findings unaddressed.
- • Restore test cadence: every critical service restore-tested at least quarterly with written evidence.
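The arithmetic behind the availability, spend-variance, and drift KPIs can be checked in a few lines of Python. This is a minimal sketch, not a monitoring implementation: the function names, the 30-day month, and the sample figures are assumptions chosen for illustration.

```python
def downtime_budget_minutes(slo_percent: float, days_in_month: int = 30) -> float:
    """Maximum downtime per month (in minutes) allowed by an availability target."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (1 - slo_percent / 100)


def spend_variance_percent(actual: float, budget: float) -> float:
    """Signed variance of actual spend against budget, as a percentage of budget."""
    return (actual - budget) / budget * 100


def drift_rate_percent(drifted_resources: int, total_resources: int) -> float:
    """Share of managed resources whose live state differs from the Terraform state."""
    return drifted_resources / total_resources * 100


# Hypothetical sample figures, for illustration only.
budget_minutes = downtime_budget_minutes(99.9)  # assumes a 30-day month
variance = spend_variance_percent(actual=103_000, budget=100_000)
drift = drift_rate_percent(drifted_resources=12, total_resources=400)

print(f"Downtime budget at 99.9%: {budget_minutes:.1f} min/month")
print(f"Spend variance: {variance:+.1f}% (target: within ±5%)")
print(f"Drift rate: {drift:.1f}% (target: under 5%)")
```

With a 30-day month, 99.9% works out to 43.2 minutes, which is where the 43-minute figure comes from; a 31-day month allows about 44.6 minutes.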
Full JD (copy-ready)
Paste this into your ATS or careers page. Edit the company name and any bracketed placeholders.
# Cloud Engineer: Job Description

## Role summary

A Cloud Engineer owns the cloud platform: designing multi-account AWS, Azure, or GCP architectures, writing Terraform modules consumed by product teams, hardening IAM and networking, running FinOps to keep the bill honest, and mapping controls to SOC 2, HIPAA, or PCI scope. Focused on cloud services, landing zones, and platform economics rather than CI/CD pipelines or deploy cadence.

## Responsibilities

- Design and evolve the landing zone: multi-account AWS Organizations / Azure management groups / GCP organization hierarchy with environment and workload separation.
- Write Terraform 1.5+ or Pulumi modules with remote state (S3 + DynamoDB lock, Terraform Cloud, or GCS), workspace strategy, and a private module registry.
- Architect VPC/VNet topologies including subnets, route tables, NAT, Transit Gateway / VPC peering / Private Service Connect, and egress control via firewall or PrivateLink.
- Design least-privilege IAM: roles, SCPs, permission boundaries, Azure PIM, GCP IAM conditions, and eliminate long-lived access keys via OIDC federation.
- Operate managed Kubernetes (EKS 1.28+, GKE, AKS) at the platform layer: cluster autoscaler, Karpenter, IRSA/Workload Identity, cluster upgrades.
- Run Well-Architected or equivalent reviews against operational excellence, security, reliability, performance efficiency, and cost optimization pillars.
- Build FinOps reporting: tagging policies, Cost Explorer dashboards, anomaly detection, commitment discount (SP/RI/CUD) modeling, chargeback/showback to product teams.
- Target 99.9%+ availability on critical workloads through multi-AZ deployment by default and multi-region failover for tier-1 services.
- Define RTO and RPO per service, design backup strategies (AWS Backup, Azure Backup, GCP Backup & DR) that meet them, and run quarterly restore tests.
- Map cloud controls to SOC 2 CC, HIPAA Security Rule, and PCI DSS requirements; wire up AWS Config, Azure Policy, or GCP Organization Policy for continuous compliance.
- Manage secrets infrastructure: Vault, AWS Secrets Manager, Azure Key Vault, or GCP Secret Manager, with rotation, audit, and zero hardcoded credentials.
- Partner with DevOps/platform engineers on the boundary between cloud platform (owned here) and deployment tooling (owned there).

## Must-have skills

- 4+ years hands-on with at least one major cloud (AWS, Azure, or GCP) in a production account you owned.
- Strong Terraform 1.5+ or Pulumi: writing reusable modules, managing remote state, handling drift and refactor (moved blocks, import blocks).
- Deep IAM design experience: roles, policies, boundaries, federation (OIDC/SAML), cross-account access patterns.
- VPC/VNet networking fluency: subnets, route tables, NAT, peering, Transit Gateway or equivalent, security groups/NSGs, PrivateLink/Private Endpoints.
- Managed Kubernetes operation: EKS/GKE/AKS cluster lifecycle, node group / node pool strategy, IRSA or Workload Identity.
- Cost analysis fluency with Cost Explorer, Kubecost, or Cloudability; has shipped a measurable cost reduction.
- At least one compliance framework done in anger (SOC 2, HIPAA, PCI, ISO 27001).
- Strong written English for architecture docs and async review.

## Nice-to-have skills

- Secondary cloud proficiency (hires on AWS with real Azure or GCP experience are rare and valuable).
- Service Catalog, Proton, or internal developer platform (IDP) experience: Backstage, Humanitec, or similar.
- AWS Solutions Architect Professional, GCP Professional Cloud Architect, or Azure Solutions Architect Expert certification.
- Data platform experience: Snowflake, BigQuery, Redshift, or Databricks networking and IAM integration.
- Experience supporting FedRAMP or GovCloud workloads.

## Tools and technology

- AWS / Azure / GCP
- Terraform 1.5+ / Pulumi
- AWS Organizations / Control Tower
- Kubernetes 1.28+ (EKS/GKE/AKS)
- HashiCorp Vault / Secrets Manager
- AWS Config / Azure Policy / GCP Org Policy
- CloudWatch / Azure Monitor / GCP Operations
- Cost Explorer / Kubecost / Cloudability
- AWS IAM Identity Center (formerly AWS SSO) / Okta / Microsoft Entra ID (formerly Azure AD)
- Checkov / tfsec / Prowler

## Reporting structure

Reports to the Head of Platform, Head of Infrastructure, or CTO. Partners with DevOps on the pipeline / platform boundary, with Security on IAM and compliance evidence, with Finance on FinOps reporting and budgets, and with product engineering teams consuming platform modules.

## Success metrics (KPIs)

- Tier-1 workload availability at or above 99.9% (43 minutes of downtime per month max).
- Cloud spend variance to budget within ±5% month over month with explained anomalies.
- Zero long-lived IAM access keys in production; 100% of workloads on OIDC/Workload Identity.
- Terraform drift rate under 5% of resources; drift resolved within 48 hours of detection.
- Audit findings (SOC 2/HIPAA/PCI) closed within the auditor-agreed window, with zero critical findings unaddressed.
- Restore test cadence: every critical service restore-tested at least quarterly with written evidence.
Frequently asked questions
What does a Cloud Engineer do day-to-day?
A Cloud Engineer owns the cloud platform: designing multi-account AWS, Azure, or GCP architectures, writing Terraform modules consumed by product teams, hardening IAM and networking, running FinOps to keep the bill honest, and mapping controls to SOC 2, HIPAA, or PCI scope. Focused on cloud services, landing zones, and platform economics rather than CI/CD pipelines or deploy cadence.
How many years of experience should a mid-level Cloud Engineer have?
A mid-level Cloud Engineer typically has 3-5 years of experience. At that level they should own a workload area end to end: networking, IAM, Terraform modules, and cost.
Which KPIs should I hold a Cloud Engineer accountable to?
The most important KPIs for a Cloud Engineer are:
- • Tier-1 workload availability at or above 99.9% (43 minutes of downtime per month max).
- • Cloud spend variance to budget within ±5% month over month with explained anomalies.
- • Zero long-lived IAM access keys in production; 100% of workloads on OIDC/Workload Identity.
- • Terraform drift rate under 5% of resources; drift resolved within 48 hours of detection.
Multi-cloud or single cloud — which do you recommend?
Single cloud for almost everyone. Multi-cloud sounds like resilience but in practice it doubles operational cost, cuts your leverage on volume discounts, slows down your engineers because nobody knows both well, and rarely delivers the portability promise. Real multi-cloud makes sense when a specific customer contract demands it, when you need a service that only one provider offers, or when regulatory rules require data residency in a region the primary cloud does not serve. Your cloud engineer will ask which of those applies before writing Terraform for a second provider.
How do they approach FinOps and cloud cost cuts?
Measure first, cut second, automate third. The standard approach is to collect two weeks of baseline data through Cost Explorer, Cloudability, or Kubecost to see where the money actually goes, then target the top three line items. Typical savings come from right-sizing oversized compute, buying Reserved Instances or Savings Plans for steady-state workloads, adding S3 lifecycle rules, autoscaling spiky workloads, killing abandoned resources, and reducing cross-AZ or cross-region egress. A senior cloud engineer will often find that 25 to 40 percent of the bill is waste in their first month, without touching production capacity.
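The "measure first" step can be sketched as a small ranking over baseline cost data. This is an illustration under stated assumptions: the `baseline_rows` figures are made up, the `top_line_items` helper is hypothetical, and a real baseline would come from an export out of whichever cost tool is in use.

```python
from collections import defaultdict

# Hypothetical two-week baseline: (service, cost in USD) rows as they might
# appear in a cost tool export. The values are invented for illustration.
baseline_rows = [
    ("EC2", 18_400.0),
    ("S3", 3_100.0),
    ("Data transfer", 6_250.0),
    ("RDS", 9_800.0),
    ("EC2", 1_600.0),
    ("CloudWatch", 1_150.0),
]


def top_line_items(rows, n=3):
    """Sum cost per service and return the n largest with their share of the total."""
    totals = defaultdict(float)
    for service, cost in rows:
        totals[service] += cost
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(service, cost, cost / grand_total * 100) for service, cost in ranked[:n]]


top_three = top_line_items(baseline_rows)
for service, cost, share in top_three:
    print(f"{service:<15} ${cost:>10,.2f}  {share:5.1f}% of baseline")
```

Ranking by share of total spend keeps the first round of cuts focused on the few line items that can move the bill, rather than on whatever is easiest to delete.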
Written by Syed Ali
Founder, Remoteria
Syed Ali founded Remoteria after a decade building distributed teams across 4 continents. He has helped 500+ companies source, vet, onboard, and scale pre-vetted offshore talent in engineering, design, marketing, and operations.
- • 10+ years building distributed remote teams
- • 500+ successful offshore placements across US, UK, EU, and APAC
- • Specialist in offshore vetting and cross-timezone team integration
Last updated: April 12, 2026