You have a containerized service. You want it running in AWS. You create an ECS cluster, write a task definition, and hit deploy. A minute later, your container is up and serving traffic. You never touched a server.
That convenience hides a remarkable amount of engineering. Somewhere in AWS, a scheduler decided where your task would run, a microVM was created and booted in milliseconds, a network interface was injected into your VPC, and your container image was pulled and started inside an isolated environment that shares no kernel with any other customer's workload. None of that is visible to you. All of it is Fargate.
This article opens the hood. By the end, you will understand exactly what happens between submitting a task and your container receiving its first request, why Fargate can make strong security isolation guarantees, and where the boundaries of the system sit.
The Big Picture
Fargate is not an orchestrator. It is a compute provider. The orchestration layer — deciding what runs, how many copies, which container image, what environment variables — stays with ECS or EKS. Fargate's job is to answer one question: given a resource request (CPU, memory) and a container specification, provide an isolated environment to run it, with no customer-managed infrastructure underneath.
Before Fargate, running containers on ECS meant provisioning and managing a fleet of EC2 instances as your cluster's capacity. You chose instance types, patched operating systems, scaled the fleet ahead of demand, and paid for idle capacity. Fargate moves that responsibility entirely to AWS. You describe what you want to run. AWS figures out where.
Fargate integrates with two orchestrators:
- Amazon ECS — AWS's native container orchestrator. Tasks are the unit of deployment. Services manage long-running tasks with load balancing and desired-count maintenance.
- Amazon EKS — Managed Kubernetes. Fargate Profiles map Kubernetes pods to Fargate compute based on namespace and label selectors.
In both cases, Fargate sits below the orchestration layer as an invisible fleet of compute capacity. The orchestrator thinks in tasks or pods. Fargate thinks in microVMs.
The key architectural insight is that Fargate tasks are not processes on a shared host. They are lightweight virtual machines. That distinction drives most of Fargate's security and networking design.
The Control Plane
When you submit a task to ECS or a pod to EKS with a Fargate profile, the request moves through the orchestrator's control plane before Fargate sees it.
In ECS, the control plane accepts the `RunTask` API call, validates the task definition, checks capacity provider configuration, and places the task. When the capacity provider is `FARGATE` or `FARGATE_SPOT`, ECS sends a placement request to the Fargate control plane rather than selecting an EC2 instance from a cluster.
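To make the routing concrete, here is a sketch of the request body for that `RunTask` call. The cluster, task definition, subnet, and security group identifiers are hypothetical; with boto3 this dict would be passed as `ecs.run_task(**req)`.

```python
# Sketch: an ECS RunTask request targeting Fargate capacity.
# All resource names/IDs are hypothetical placeholders.
def build_run_task_request(cluster: str, task_def: str,
                           subnets: list[str], security_groups: list[str],
                           use_spot: bool = False) -> dict:
    """Build the request body for an ECS run_task call on Fargate."""
    return {
        "cluster": cluster,
        "taskDefinition": task_def,
        "count": 1,
        # This is what routes placement to the Fargate control plane
        # instead of selecting an EC2 instance from the cluster.
        "capacityProviderStrategy": [
            {"capacityProvider": "FARGATE_SPOT" if use_spot else "FARGATE",
             "weight": 1}
        ],
        # awsvpc networking: Fargate attaches a dedicated ENI in these subnets.
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": security_groups,
                "assignPublicIp": "DISABLED",
            }
        },
    }

req = build_run_task_request("prod", "web:42", ["subnet-aaa"], ["sg-bbb"])
```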
In EKS with Fargate Profiles, a Fargate profile defines a set of rules: a namespace, optional pod selectors. When the Kubernetes scheduler sees a pod that matches a profile, it marks the pod as schedulable on Fargate. The EKS Fargate admission controller intercepts the pod spec and routes it to the Fargate compute provider. From Kubernetes' perspective, each Fargate pod appears to run on its own dedicated node; this is not an abstraction leak but literally true at the VM level.
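The matching rule a Fargate profile applies can be sketched in a few lines: a pod is routed to Fargate when its namespace matches a selector and every label in that selector is present on the pod. The profile shape below mirrors the namespace/labels fields of the EKS Fargate profile API; the names are hypothetical.

```python
# Sketch of EKS Fargate profile matching: namespace must match, and every
# selector label must be present on the pod with the same value.
def matches_fargate_profile(profile: dict, pod_namespace: str,
                            pod_labels: dict) -> bool:
    for selector in profile["selectors"]:
        if pod_namespace != selector["namespace"]:
            continue
        wanted = selector.get("labels", {})
        # All selector labels must match; extra pod labels are fine.
        if all(pod_labels.get(k) == v for k, v in wanted.items()):
            return True
    return False

profile = {"selectors": [{"namespace": "checkout",
                          "labels": {"compute": "fargate"}}]}
matches_fargate_profile(profile, "checkout",
                        {"compute": "fargate", "app": "api"})  # True
matches_fargate_profile(profile, "default",
                        {"compute": "fargate"})                # False
```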
The Fargate control plane receives the placement request and does several things in parallel: it selects a physical host from AWS's fleet, creates the microVM, attaches networking, and returns a handle to the orchestrator. The orchestrator then tracks the task or pod lifecycle normally. The underlying host is never exposed.
Task Isolation: Firecracker MicroVMs
This is the engineering decision that defines Fargate's security model. Every Fargate task runs inside a Firecracker microVM: a lightweight virtual machine monitor developed by AWS and open-sourced in 2018.
Firecracker was built specifically for serverless workloads. Traditional hypervisors (KVM, Xen) carry significant overhead: full device emulation, large memory footprints, slow boot times. Firecracker strips the VM down to the minimum required to run a Linux guest: a minimal device model, a KVM-based VMM written in Rust, and nothing else. A Firecracker microVM can boot a kernel and reach userspace in well under 200 milliseconds. Memory overhead per VM is measured in megabytes, not gigabytes.
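The minimalism shows in Firecracker's control interface: a small REST API served over a Unix socket. The sequence below sketches the handful of calls a VMM client makes to configure and boot a microVM roughly sized like a small Fargate task. The endpoint paths follow Firecracker's published API; the kernel and rootfs file paths are hypothetical.

```python
# Sketch of the Firecracker API call sequence that boots a microVM.
# Each entry is (HTTP method, path, JSON body) against the Unix socket API.
def firecracker_boot_sequence(vcpus: int, mem_mib: int,
                              kernel: str, rootfs: str) -> list[tuple]:
    return [
        # Size the guest: vCPUs and memory, nothing else to configure.
        ("PUT", "/machine-config",
         {"vcpu_count": vcpus, "mem_size_mib": mem_mib}),
        # Point at an uncompressed kernel image and minimal boot args.
        ("PUT", "/boot-source",
         {"kernel_image_path": kernel,
          "boot_args": "console=ttyS0 reboot=k panic=1"}),
        # Attach the root filesystem as a virtio block device.
        ("PUT", "/drives/rootfs",
         {"drive_id": "rootfs", "path_on_host": rootfs,
          "is_root_device": True, "is_read_only": False}),
        # Start the guest; boot-to-userspace is on the order of 100-200 ms.
        ("PUT", "/actions", {"action_type": "InstanceStart"}),
    ]

seq = firecracker_boot_sequence(1, 2048, "/img/vmlinux", "/img/rootfs.ext4")
```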
Each Fargate task gets exactly one microVM. The containers in your task definition run inside that VM as regular processes. From the container's perspective, it is on a normal Linux host. From AWS's perspective, it is in a hard VM boundary that shares no kernel, no memory, and no process namespace with any other customer's task, regardless of what physical host they happen to share.
This is the critical difference from traditional container runtimes. A container on a shared EC2 host shares the kernel with every other container on that host, so a kernel vulnerability can potentially allow escape from the container. A Fargate task cannot escape its microVM without exploiting the hypervisor itself, a far smaller and far harder attack surface.
Inside each microVM, AWS runs a Fargate agent: a small process responsible for pulling container images, starting the container runtime, reporting task status back to the ECS or EKS control plane, and streaming logs. The agent is part of the AWS-managed layer; you do not interact with it directly, but it is what makes the task visible to your orchestrator.
Networking: ENI Injection and VPC Routing
Fargate tasks always use the `awsvpc` network mode, which gives them fully VPC-native networking, and this is one of the more architecturally interesting parts of the system.
Every Fargate task gets its own Elastic Network Interface (ENI) injected directly into your VPC. This is not a shared ENI with port mapping. It is a dedicated network interface with its own private IP address drawn from your subnet's CIDR range. The task is a first-class citizen of your VPC, indistinguishable at the network layer from an EC2 instance.
This has direct implications for security groups. You apply security groups to the ENI, not to a host. Traffic rules apply at the task boundary, not the instance boundary. Two tasks in the same service can have different security group configurations if needed, though in practice they typically share one.
For outbound internet access, a Fargate task in a private subnet routes through a NAT Gateway exactly as an EC2 instance would. For inbound traffic from a load balancer, the ALB or NLB target group registers the task's ENI IP directly. This is IP-mode target registration: the load balancer sends traffic straight to the task's private IP, bypassing any host-level port mapping layer entirely.
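ECS performs this registration for you, but it is worth seeing the shape of it. The sketch below builds the parameters for an ELBv2 `register_targets` call in IP mode; the target group ARN and task IPs are hypothetical.

```python
# Sketch: registering Fargate task ENI IPs with an IP-mode target group,
# as ECS does on your behalf. ARN and addresses are hypothetical; with
# boto3 this maps to elbv2.register_targets(**params).
def build_register_targets(target_group_arn: str, task_ips: list[str],
                           port: int = 8080) -> dict:
    return {
        "TargetGroupArn": target_group_arn,
        # In an IP-mode target group, each target is a private IP rather
        # than an instance ID: traffic goes straight to the task's ENI.
        "Targets": [{"Id": ip, "Port": port} for ip in task_ips],
    }

params = build_register_targets(
    "arn:aws:elasticloadbalancing:...:targetgroup/web/abc",
    ["10.0.1.17", "10.0.2.43"])
```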
There is one important caveat to the ENI-per-task model: address and quota limits. ENIs draw from a per-region account quota, and each task's ENI consumes an IP address from its subnet. In high-scale deployments, running hundreds of Fargate tasks in a single small subnet can exhaust the available IP space. Subnet sizing and IP address planning matter more with Fargate than with traditional EC2 deployments, where many containers share one host interface.
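The arithmetic for subnet capacity is simple enough to sketch with the standard library. AWS reserves five addresses in every subnet (network address, VPC router, DNS, one reserved for future use, and broadcast); everything else is available for ENIs.

```python
import ipaddress

# Rough capacity planning for ENI-per-task: the usable IPs in a subnet,
# minus the five addresses AWS reserves, bound how many Fargate tasks
# (plus any other ENIs) can live there.
AWS_RESERVED_IPS = 5  # network, VPC router, DNS, future use, broadcast

def max_fargate_tasks(cidr: str, other_enis: int = 0) -> int:
    subnet = ipaddress.ip_network(cidr)
    return subnet.num_addresses - AWS_RESERVED_IPS - other_enis

max_fargate_tasks("10.0.1.0/24")  # 251
max_fargate_tasks("10.0.2.0/28")  # 11: a /28 is easy to exhaust at scale
```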
IAM and Identity
Fargate tasks interact with two distinct IAM roles, and conflating them is a common source of confusion.
The task execution role is assumed by the Fargate agent, not your application. It grants the agent permission to do its setup work: pulling images from ECR, fetching secrets from Secrets Manager or Parameter Store to inject as environment variables, and writing logs to CloudWatch. If your container image is private or your task definition references secrets, this role needs the right permissions. Your application never uses it.
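As a sketch, an execution role policy covering the agent's two common jobs, pulling a private ECR image and shipping logs, looks roughly like this. It mirrors the shape of the AWS-managed `AmazonECSTaskExecutionRolePolicy`; the log-group ARN pattern is a hypothetical example.

```python
# Minimal task execution role policy sketch: permissions the Fargate agent
# (not your application) uses during task setup.
EXECUTION_ROLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {   # Authenticate to ECR and pull image layers.
            "Effect": "Allow",
            "Action": ["ecr:GetAuthorizationToken",
                       "ecr:BatchGetImage",
                       "ecr:GetDownloadUrlForLayer"],
            "Resource": "*",
        },
        {   # Ship container stdout/stderr to CloudWatch Logs.
            "Effect": "Allow",
            "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": "arn:aws:logs:*:*:log-group:/ecs/*",
        },
    ],
}
```

If the task definition also references Secrets Manager or Parameter Store entries, the corresponding read permissions belong in this policy too, not in the task role.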
The task role is assumed by your application code. It is the identity your containers use to call AWS APIs at runtime: reading from S3, writing to DynamoDB, publishing to SNS. AWS injects credentials for this role into the task via a task metadata endpoint inside the microVM. The SDK in your container picks these up automatically via the standard credential provider chain.
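The mechanism behind that automatic pickup is an environment variable: ECS sets `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` in the container, and the SDK's credential chain resolves it against the link-local endpoint. A sketch of that resolution, with an illustrative relative URI (the real value is a per-task identifier):

```python
# How the SDK credential chain locates task role credentials inside the
# microVM: the relative URI from the environment is appended to the
# link-local ECS credentials endpoint.
ECS_CREDENTIALS_HOST = "http://169.254.170.2"

def task_role_credentials_url(env: dict) -> "str | None":
    relative = env.get("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI")
    return ECS_CREDENTIALS_HOST + relative if relative else None

# The relative path below is illustrative only.
task_role_credentials_url(
    {"AWS_CONTAINER_CREDENTIALS_RELATIVE_URI": "/v2/credentials/abc"})
# → "http://169.254.170.2/v2/credentials/abc"
```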
The metadata endpoint (`169.254.170.2`) is accessible only from within the task's microVM. An application in a different task, even on the same physical host, cannot reach it. This is another place where VM-level isolation provides a concrete security benefit over shared-host container runtimes.
Storage
Fargate's storage model is deliberately constrained, which reflects its design philosophy: stateless compute.
Every task gets ephemeral storage: a local filesystem backed by the microVM's virtual disk. The default is 20 GB; you can configure up to 200 GB in the task definition. This storage is destroyed when the task stops. It is suitable for temporary files, build artifacts, and cache during a single task execution. Nothing on ephemeral storage survives a task restart.
For persistent storage, Fargate integrates with Amazon EFS. You mount an EFS file system as a volume in your task definition, and the Fargate agent handles the mount inside the microVM using the EFS mount helper. Multiple tasks across multiple availability zones can mount the same EFS file system simultaneously, which is the correct pattern for shared configuration files, ML model weights, or any state that needs to outlive individual tasks.
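In the task definition, an EFS mount takes two fragments: a volume entry at the task level and a mount point in the container definition. The field names below are real ECS task definition fields; the file system ID and paths are hypothetical.

```python
# Sketch of the task definition fragments that mount an EFS file system
# into a Fargate container. IDs and paths are hypothetical placeholders.
def efs_volume_config(name: str, file_system_id: str) -> dict:
    # Goes in the task definition's top-level "volumes" list.
    return {
        "name": name,
        "efsVolumeConfiguration": {
            "fileSystemId": file_system_id,
            "transitEncryption": "ENABLED",  # TLS via the EFS mount helper
        },
    }

def efs_mount_point(volume_name: str, container_path: str) -> dict:
    # Goes in the container definition's "mountPoints" list.
    return {"sourceVolume": volume_name,
            "containerPath": container_path,
            "readOnly": False}

volume = efs_volume_config("models", "fs-0123abcd")
mount = efs_mount_point("models", "/opt/ml/models")
```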
Fargate does not support EBS volumes. EBS is a block device tied to an availability zone; its lifecycle model does not fit the transient, placement-agnostic nature of Fargate tasks. If your workload needs EBS, it needs EC2.
Fargate Spot
Fargate Spot is the interruption-tolerant capacity tier. AWS runs Fargate on a large fleet of physical hosts. When spare capacity is available, it can be offered at a significant discount (typically 50–70% below standard Fargate pricing). When AWS needs that capacity back, it sends a two-minute termination notice to the Fargate agent, which propagates a `SIGTERM` to your containers.
The interruption model is the same as EC2 Spot at the concept level, but the operational surface is simpler. You do not manage instance types or launch templates. You set a capacity provider strategy in ECS that mixes `FARGATE` and `FARGATE_SPOT` with a weight and base. A common pattern is to set `FARGATE` base to 1 (at least one standard task always running) and weight `FARGATE_SPOT` higher so additional capacity scales onto Spot.
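The effect of base and weight on a desired count can be sketched numerically. This is a simplified model of the allocation (ECS's exact rounding behavior is not specified here): the base is satisfied on standard Fargate first, then the remainder is split by weight ratio.

```python
# Simplified sketch of how a capacity provider strategy splits a desired
# task count: base first, then the remainder proportional to the weights.
def split_tasks(desired: int, base: int,
                fargate_weight: int, spot_weight: int) -> dict:
    on_fargate = min(base, desired)          # base always lands on FARGATE
    remaining = desired - on_fargate
    total_weight = fargate_weight + spot_weight
    extra_fargate = round(remaining * fargate_weight / total_weight)
    return {"FARGATE": on_fargate + extra_fargate,
            "FARGATE_SPOT": remaining - extra_fargate}

# base=1, weights 1:3 -> one guaranteed standard task, rest mostly on Spot.
split_tasks(desired=10, base=1, fargate_weight=1, spot_weight=3)
# → {"FARGATE": 3, "FARGATE_SPOT": 7}
```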
Fargate Spot is appropriate for stateless, horizontally-scaled services where losing a task is a non-event: the load balancer routes traffic to surviving tasks and ECS replaces the interrupted task, potentially on standard capacity if Spot is unavailable. It is not appropriate for tasks that hold in-memory state, maintain long-lived connections, or perform work that cannot be safely interrupted and restarted.
Task Launch Lifecycle
Tracing a task from API call to container start shows how the components above connect in practice. A cold launch proceeds roughly as follows:

1. The orchestrator's control plane validates the request and forwards a placement request to the Fargate control plane.
2. Fargate selects a physical host, boots a Firecracker microVM (well under 200 ms), and attaches a fresh ENI in your subnet.
3. The Fargate agent inside the microVM assumes the task execution role, pulls the container image, and injects any configured secrets.
4. The containers start, the task transitions to RUNNING, and, if the task sits behind a load balancer, it begins receiving traffic once health checks pass.
The dominant factor in launch latency is image pull time. If your image is large, most of the 30–90 second launch window is spent transferring layers from ECR. Because each task boots a fresh microVM, there is no warm local image cache to rely on: every launch pulls the image. Keeping images small and using ECR in the same region as your tasks are the two highest-leverage optimizations for startup latency.
Failure Modes and Fault Tolerance
Fargate shifts operational responsibility to AWS but does not eliminate failure modes. Understanding where failures can occur helps you design resilient services on top of Fargate rather than assuming the platform is unconditionally reliable.
Task-level failures are the most common. Your application crashes, hits an OOM condition, or fails its health check. The ECS service scheduler (or, on EKS, the Kubernetes controllers) detects the failure and launches a replacement task. The replacement goes through the full launch lifecycle, so there is a gap between the failed task stopping and the replacement becoming healthy. Running a minimum of two tasks behind a load balancer ensures that a single task failure does not cause downtime.
Placement failures occur when Fargate cannot provision capacity. This can happen during regional capacity events, when requesting very large task sizes, or when Fargate Spot capacity is unavailable in the requested AZ. ECS reports a `RESOURCE:FARGATE` placement failure reason. Distributing tasks across multiple availability zones and mixing Fargate Spot with standard Fargate capacity reduces the likelihood of a placement failure taking down all capacity simultaneously.
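For alerting, it helps to triage stopped-task reasons into coarse classes so capacity problems are not paged as application bugs. The sketch below keys off reason-string patterns modeled on what ECS emits (`RESOURCE:FARGATE`, Spot interruption messages, OOM kills); the exact strings your account sees may vary, so treat the patterns as illustrative.

```python
# Sketch: bucket ECS stopped-task reasons into coarse failure classes.
# Pattern strings are modeled on ECS output but are illustrative.
def classify_stop_reason(stopped_reason: str) -> str:
    reason = stopped_reason.lower()
    if "resource:fargate" in reason or "capacity is unavailable" in reason:
        return "placement/capacity"   # Fargate could not provision
    if "spot" in reason and "interrupt" in reason:
        return "spot-reclaim"         # two-minute warning path
    if "outofmemory" in reason:
        return "oom"                  # container exceeded its memory limit
    if "health check" in reason:
        return "health-check"         # LB or container health check failed
    return "other"

classify_stop_reason("RESOURCE:FARGATE")                 # "placement/capacity"
classify_stop_reason("Your Spot Task was interrupted.")  # "spot-reclaim"
```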
ENI exhaustion is a subnet-level failure. If your subnet runs out of available IP addresses, new task ENIs cannot be allocated and task launches fail. Monitoring available IP addresses in subnets used by Fargate is an operational concern that is easy to overlook until it causes an incident at scale.
Control plane availability is an AWS responsibility. The ECS and EKS control planes are multi-AZ services with high availability SLAs. Running tasks are not affected by transient control plane issues; the Fargate agent on running tasks continues operating independently. New deployments and scaling operations will queue or fail during an outage, but existing tasks keep running.
Fargate Spot interruptions require application-level handling. When the two-minute warning arrives, your containers receive `SIGTERM`. If your application does not handle `SIGTERM` gracefully (draining in-flight requests, closing database connections), it will be forcibly terminated after the grace period. Proper signal handling and a connection draining period on your load balancer target group are essential for Spot workloads.
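A minimal graceful-shutdown pattern looks like this: trap `SIGTERM`, flip a flag instead of exiting, and let the serving loop drain on its own before the grace period expires. The cleanup hooks named in the comment are hypothetical placeholders for your own teardown logic.

```python
import signal
import threading

# Spot-friendly shutdown sketch: on SIGTERM, stop taking new work and
# drain in-flight work within the two-minute window before SIGKILL.
shutdown = threading.Event()

def handle_sigterm(signum, frame):
    # Flip a flag rather than exiting immediately; the serving loop
    # finishes in-flight work and returns on its own.
    shutdown.set()

signal.signal(signal.SIGTERM, handle_sigterm)

def serve_forever(poll_seconds: float = 1.0):
    while not shutdown.is_set():
        # ... accept and process one unit of work ...
        shutdown.wait(poll_seconds)
    # Drain phase, before the grace period expires:
    # close_connections(); flush_metrics()   # hypothetical cleanup hooks
```

The same flag should also flip your health check to failing, so the load balancer stops routing new requests to the task while it drains.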
Summary
Fargate's core engineering decision is the microVM. By running every task in a Firecracker VM rather than as a process on a shared host, AWS can make a security isolation guarantee that would be impossible with a traditional shared-kernel container runtime. That decision ripples outward: the networking model (a dedicated ENI per task rather than shared host networking), the IAM model (credentials available only from within the VM), and the storage model (ephemeral by default, EFS for persistence) all follow from the same principle of hard per-task boundaries.
The tradeoff is startup latency and density. microVMs take longer to boot than containers on a warm host, and ENI-per-task means you are constrained by subnet IP space at scale. These are known and accepted costs for the operational simplicity and security posture Fargate provides.
For ECS users, Fargate is a capacity provider. For EKS users, it is a node provider. In both cases, the compute is invisible until something goes wrong. Understanding the layer below the orchestrator gives you the tools to diagnose failures, plan for scale, and make deliberate choices about when Fargate is the right fit and when EC2 gives you capabilities you actually need.