You spin up an instance, SSH in, and it just works. You get a Linux prompt, a fixed amount of memory, network access, and a disk. Where that instance actually runs, what it shares with other instances, how your keystrokes travel from your laptop to that shell — all of it is invisible. That invisibility is the product. But understanding what is underneath it changes how you design for failure, choose instance types, tune networking, and debug the subtle problems that only appear under load.
This post covers the full internal architecture of EC2: the Nitro System that underpins every modern instance, the hypervisor layer, how instances are scheduled and launched, how networking and storage are wired together, and what happens when things go wrong. General AWS familiarity is assumed; deep AWS experience is not required.
The big picture
EC2 is, at its core, a virtualisation platform. You request compute, and AWS carves virtual machines out of physical host servers in its data centers. For the first decade of EC2's existence, this was done with a modified version of the open-source Xen hypervisor. The host CPU ran Xen, and a privileged virtual machine called Dom0 handled I/O on behalf of all the other guest VMs.
Dom0 was a bottleneck. It consumed a significant share of the host's CPU and memory just to exist. Every network packet and every disk I/O call passed through it. By the early 2010s this design was limiting how much of a host's resources AWS could actually sell to customers.
The answer was the Nitro System: a purpose-built set of hardware cards and a lightweight hypervisor that move all I/O work off the host CPU entirely. Nitro is not a product customers interact with directly. It is the infrastructure layer that every modern EC2 instance type runs on top of. Understanding Nitro is the key to understanding EC2.
The Nitro System
Nitro is the collective name for the hardware cards, firmware, and lightweight hypervisor that AWS built to replace the Dom0 architecture. It has three distinct parts, each solving a different problem.
Nitro Cards
Nitro Cards are custom ASICs (application-specific integrated circuits) that plug into the host server's PCIe bus. There are two main variants. The Nitro VPC Card handles all network I/O: it implements the full VPC networking stack in hardware, including encapsulation, routing, and security group enforcement. The Nitro EBS Card handles all storage I/O: it manages the NVMe-over-Fabrics connection to EBS volumes and the encryption of data in transit to storage.
The critical implication: because these cards handle I/O independently, the host CPU is freed entirely for the guest. When AWS says a `c7g` instance gives you 100% of host CPU, that's only possible because no CPU cycles are spent on network or storage processing. There is no software I/O path running on the main processor.
The Nitro Security Chip and Controller
The Nitro Security Chip is a hardware root of trust embedded on the host motherboard. It controls and attests the firmware of every component on the host at boot time. If anything in the boot chain has been tampered with, the host will not come online. This is how AWS makes the "no AWS operator access" guarantee: the security chip enforces it at the hardware level, not just in policy.
The Nitro Controller is a microcontroller that manages the host's lifecycle: provisioning, monitoring, and coordinating with the regional EC2 control plane. When you call `RunInstances`, the controller is what receives the instruction and coordinates the boot sequence.
The Nitro Hypervisor
For virtualised instance types (which is most of them), the Nitro Hypervisor provides CPU and memory virtualisation. It is built on a heavily stripped-down version of KVM (Kernel-based Virtual Machine) with only the features needed: vCPU scheduling, memory isolation, and a thin emulation layer for the guest's boot devices. It runs directly on the host hardware with no general-purpose OS underneath it.
Because Nitro Cards handle all I/O, the hypervisor itself is tiny. Its attack surface is dramatically smaller than Xen + Dom0. It consumes negligible CPU and memory. The practical result: on a host with 192 vCPUs available, customers can purchase instances that add up to exactly 192 vCPUs, with nothing lost to hypervisor overhead.
Virtualisation and instance types
Modern EC2 instances run under the HVM (Hardware Virtual Machine) virtualisation mode. Under HVM, the guest OS boots as if it were running on real hardware. The hypervisor intercepts privileged instructions transparently using Intel VT-x or AMD-V CPU extensions. The guest OS requires no modification; a standard Linux kernel or Windows Server image works without change.
The older PV (paravirtual) mode required a modified guest kernel that was aware it was running in a VM. PV is effectively deprecated; no current-generation instance types use it.
For bare-metal instances (the `*.metal` types), there is no hypervisor at all. The guest OS runs directly on the physical host's CPUs. The Nitro Cards are still present, so networking and storage work identically to virtualised instances. Bare metal is chosen when you need to run your own hypervisor (VMware on AWS), need access to specific hardware features not exposed through virtualisation, or have licensing models tied to physical CPU sockets.
Instance types encode a significant amount of information about the underlying hardware. The family letter (`c` for compute-optimised, `m` for general purpose, `r` for memory-optimised, `g` for GPU, `i` for I/O-optimised) tells you the workload the host is tuned for. The generation number (`7` in `c7g`) tells you which generation of Nitro and underlying hardware. The suffix (`g` for Graviton, `n` for enhanced networking, `d` for local NVMe) tells you specific hardware features present on that host class.
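The naming convention can be decoded mechanically. A minimal sketch of such a decoder — the suffix table below covers only the letters mentioned above (the real scheme has more, e.g. `a`, `e`, `z`):

```python
import re

# Suffix meanings limited to those discussed in the text; the real
# naming scheme includes additional letters not modelled here.
SUFFIXES = {"g": "Graviton (ARM)", "n": "enhanced networking", "d": "local NVMe"}

def parse_instance_type(name: str) -> dict:
    """Split an instance type like 'c7g.16xlarge' into its components."""
    family_part, _, size = name.partition(".")
    m = re.fullmatch(r"([a-z]+)(\d+)([a-z]*)", family_part)
    if not m:
        raise ValueError(f"unrecognised instance type: {name}")
    family, generation, suffixes = m.groups()
    return {
        "family": family,                                   # c, m, r, g, i, ...
        "generation": int(generation),                      # Nitro/hardware generation
        "features": [SUFFIXES.get(s, s) for s in suffixes], # hardware feature letters
        "size": size,
    }

print(parse_instance_type("c7g.16xlarge"))
```

Reading the name this way is often enough to answer "does this host class have local NVMe?" without opening the instance type documentation.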
Instance lifecycle and the scheduler
Behind every `RunInstances` API call is a regional control plane that has to find a suitable physical host, configure it, and boot your instance. This process involves several interacting systems.
The EC2 control plane
The EC2 control plane is a regional service. It maintains a database of every physical host in the region: their capacity, health, current allocations, Availability Zone membership, and placement group membership. When a launch request arrives, the control plane runs a placement algorithm to find a host that satisfies all constraints.
Those constraints include: the requested instance type (which maps to a specific host hardware class), the requested Availability Zone, any explicit placement group or Dedicated Host, the current free capacity on candidate hosts, and internal health and maintenance state. This is not a simple first-fit algorithm; AWS has filed patents on its bin-packing approach, which attempts to optimise across the fleet for both capacity utilisation and blast radius isolation.
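The real placement algorithm is proprietary, but the hard-constraint filtering stage it must perform can be sketched. In this toy model the host fields and the best-fit sort are illustrative assumptions, not AWS's actual data model:

```python
from dataclasses import dataclass

@dataclass
class Host:
    host_id: str
    hardware_class: str   # which instance family this host class serves
    az: str
    free_vcpus: int
    healthy: bool

def candidate_hosts(hosts, hardware_class, az, vcpus_needed):
    """Filter the fleet down to hosts satisfying every hard constraint,
    then sort by least free capacity (best-fit) to pack hosts tightly.
    The real scheduler also weighs blast radius and fleet-wide utilisation."""
    ok = [h for h in hosts
          if h.healthy
          and h.hardware_class == hardware_class
          and h.az == az
          and h.free_vcpus >= vcpus_needed]
    return sorted(ok, key=lambda h: h.free_vcpus)

fleet = [
    Host("h1", "c7g", "us-east-1a", 64, True),
    Host("h2", "c7g", "us-east-1a", 8, True),
    Host("h3", "c7g", "us-east-1b", 128, True),
    Host("h4", "c7g", "us-east-1a", 16, False),   # unhealthy: excluded
]
print([h.host_id for h in candidate_hosts(fleet, "c7g", "us-east-1a", 4)])  # ['h2', 'h1']
```

Even this simplified version shows why capacity errors are AZ- and type-specific: the constraint set shrinks the candidate pool before any packing decision is made.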
The launch sequence
Once a host is selected, the Nitro Controller on that host receives the launch instruction. It configures the Nitro Cards with the instance's VPC and EBS parameters, sets up the memory allocation and vCPU assignment in the hypervisor, and boots the instance from the specified AMI. The entire sequence from API call to a running instance is typically under 60 seconds for a pre-warmed AMI, though it can be longer for large instances or when the regional control plane is under load.
Pending, running, stopping, stopped
The instance state machine has more nuance than the console surface suggests. Pending means the host has been selected and the boot sequence is in progress; billing has not started yet. Running means the boot sequence has completed and billing has begun — note that an instance can be running while its status checks are still failing. Stopping is a soft transition that gracefully shuts down the guest OS and releases the vCPU allocation; EBS volumes remain attached. Stopped instances retain their instance ID, attached EBS volumes, and Elastic IP associations, but the host reservation is released: AWS may place the instance on a different physical host when you start it again.
This is a significant operational point. A stopped-then-restarted instance will often have a different underlying host, a different public IP (unless you use an Elastic IP), and potentially sit on a different rack within the same Availability Zone. Do not assume host affinity persists across stop/start cycles.
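The lifecycle above can be made explicit as a small state machine. This is a toy model covering only the transitions discussed here (the real lifecycle also includes shutting-down, terminated, and rebooting), and the host names are made up:

```python
# Legal transitions in the simplified lifecycle discussed above.
TRANSITIONS = {
    "pending":  {"running"},
    "running":  {"stopping"},
    "stopping": {"stopped"},
    "stopped":  {"pending"},   # starting again may select a NEW host
}

class Instance:
    def __init__(self):
        self.state = "pending"
        self.host = "host-A"

    def transition(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        if self.state == "stopped" and new_state == "pending":
            self.host = "host-B"   # host reservation was released on stop
        self.state = new_state

i = Instance()
for s in ["running", "stopping", "stopped", "pending"]:
    i.transition(s)
print(i.state, i.host)   # pending host-B — the restart landed on a new host
```

The point of the `host-B` reassignment is exactly the operational caveat above: nothing in the API contract promises the same physical host after a stop/start cycle.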
Spot instances and interruption
Spot Instances are the same physical instances running the same Nitro hardware, but priced against spare capacity in the fleet. The EC2 control plane can reclaim a Spot Instance with a two-minute warning when it needs the capacity back. From a hardware architecture standpoint there is nothing different about a Spot Instance; it is an allocation policy, not a different hardware class.
Networking: VPC, ENI, and ENA
Every EC2 instance is attached to a VPC (Virtual Private Cloud). The VPC is a logically isolated network that you define: its CIDR range, subnets, route tables, internet gateways, and NAT gateways. The physical implementation of VPC networking is handled entirely by the Nitro VPC Card on the host.
Elastic Network Interfaces
An instance's connection to its VPC is represented by an ENI (Elastic Network Interface). An ENI is a virtual network card. It carries one or more private IP addresses from the subnet CIDR, an optional public IP or Elastic IP, one or more security groups, and a MAC address. The Nitro VPC Card maintains the full state of the ENI in hardware.
Most instance types support multiple ENIs. This is used for network segmentation (management traffic on one ENI, application traffic on another), for licensing tied to MAC address, and for deploying network appliances that must bridge multiple VPCs or subnets. When you attach or detach an ENI, you are reconfiguring the Nitro Card; the guest OS sees a new virtual NIC appear or disappear.
Elastic Network Adapter
The ENA (Elastic Network Adapter) is the virtual NIC driver that the guest OS uses to talk to the Nitro VPC Card. It is designed for high throughput and low latency. On current-generation instances, ENA supports up to 100 Gbps of network bandwidth, and does so through a queue-based design where the guest directly writes to ring buffers that the Nitro Card DMA-reads, bypassing the hypervisor on the data path. The guest's TCP/IP stack still runs normally; ENA replaces only the hardware interface layer.
Security groups as hardware enforcement
Security groups are not a software firewall running on the instance. They are rules enforced by the Nitro VPC Card before any packet reaches the guest. A packet destined for your instance is evaluated against the security group rules in hardware on the host. If the rule does not permit it, the packet never reaches the guest OS, regardless of what firewall rules (or lack thereof) exist inside the instance.
This distinction matters for two reasons. First, security groups are stateful at the hardware level: if you permit inbound port 80, the return traffic is automatically allowed without a corresponding outbound rule. Second, you cannot accidentally bypass a security group from inside the instance; there is no guest-accessible path around the Nitro Card.
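The stateful behaviour can be illustrated with a toy evaluator. The rule format and connection tracking here are heavy simplifications — the real card tracks full 5-tuples and protocol state — but the asymmetry it demonstrates (inbound rule required, return traffic free) is the one described above:

```python
# Toy stateful rule evaluation: inbound rules plus a connection-tracking
# table that automatically admits return traffic for permitted flows.
class StatefulFirewall:
    def __init__(self, inbound_allowed_ports):
        self.inbound_allowed = set(inbound_allowed_ports)
        self.tracked = set()   # flows admitted by an inbound rule

    def inbound(self, src, src_port, dst_port):
        if dst_port in self.inbound_allowed:
            self.tracked.add((src, src_port, dst_port))
            return True
        return False

    def outbound(self, dst, dst_port, src_port):
        # Return traffic for a tracked flow needs no outbound rule.
        return (dst, dst_port, src_port) in self.tracked

fw = StatefulFirewall({80})
assert fw.inbound("1.2.3.4", 51000, 80)       # client connects to port 80
assert fw.outbound("1.2.3.4", 51000, 80)      # reply allowed statefully
assert not fw.inbound("1.2.3.4", 51000, 22)   # port 22 never opened
```

The second point in the paragraph above has no code analogue by design: because this evaluation happens on the Nitro Card, nothing running inside the guest can alter or bypass it.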
Placement groups
AWS exposes three placement group strategies that influence how the control plane schedules instances onto physical hosts.
A cluster placement group asks the scheduler to place all instances onto hosts that are physically close to each other, typically in the same rack or row of racks. This minimises network latency between instances and is the configuration required to achieve the highest bandwidth between instances, including the 100 Gbps figures AWS publishes. Use this for tightly coupled HPC workloads, Kafka clusters, or anything where inter-node latency is in the critical path.
A spread placement group is the opposite: instances are placed on distinct underlying hardware, specifically on distinct racks with separate power and network. The maximum is seven instances per AZ per group. Use this for small critical workloads where you need to guarantee that a single hardware failure cannot take out more than one instance simultaneously.
A partition placement group divides instances into logical partitions, where each partition is guaranteed to be on separate physical hardware from the others, but instances within a partition may share hardware. You can have up to seven partitions per AZ, with no limit on instances per partition. This is designed for distributed systems (HDFS, Cassandra, HBase) that are already rack-aware and handle their own replication; the partition metadata is exposed to the instances so the application can make placement-aware decisions.
Storage: EBS, instance store, and the NVMe path
EC2 instances have two fundamentally different storage options. They use different hardware paths, have different durability characteristics, and suit different workloads. Understanding the difference is essential for avoiding data loss and for selecting instance types correctly.
EBS: network-attached block storage
EBS (Elastic Block Store) volumes are block devices that live on a separate AWS storage fleet, not on the EC2 host. They are attached to your instance over a high-speed network, managed by the Nitro EBS Card. From the guest's perspective, an EBS volume looks like a local NVMe block device. Underneath, every read and write is a network I/O request from the Nitro Card to the EBS storage nodes.
This design gives EBS three important properties. First, durability: EBS volumes replicate data synchronously within an Availability Zone. If the physical drive backing a volume fails, the volume is automatically remounted from a replica without the guest noticing. Second, persistence: an EBS volume exists independently of the instance. Stop or terminate the instance and the volume remains (unless you checked "delete on termination"). This is what makes EBS the correct choice for root volumes and any state you care about. Third, mobility: EBS volumes can be detached from one instance and attached to another in the same AZ, enabling maintenance and migration workflows that are impossible with local storage.
EBS comes in several volume types. gp3 is the current general-purpose SSD: up to 16,000 IOPS and 1,000 MB/s throughput, configurable independently of volume size. io2 Block Express is the high-performance tier: up to 256,000 IOPS with sub-millisecond latency, built on a dedicated high-speed fabric between the Nitro EBS Card and the storage fleet. st1 and sc1 are HDD-backed options for sequential throughput workloads (data warehouses, log archives) where IOPS requirements are low but GB-per-dollar matters.
EBS bandwidth is constrained at two levels. Each instance type has a maximum aggregate EBS bandwidth (a `c7g.xlarge` has up to 10 Gbps of EBS bandwidth; a `c7g.16xlarge` has 30 Gbps). Each volume also has its own IOPS and throughput limits. Your bottleneck will be whichever constraint you hit first. This is a common source of puzzlement when an application is slow: the instance has spare CPU and network capacity, but EBS throughput is saturated.
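Because of the two-level constraint, effective throughput is simply the minimum of the instance cap and the aggregate volume caps. A quick back-of-the-envelope helper — the limits hardcoded below are worked examples, not a complete table:

```python
def effective_ebs_throughput_mbps(instance_limit_mbps, volume_limits_mbps):
    """Throughput is capped by whichever limit you hit first: the
    instance's aggregate EBS bandwidth, or the sum of per-volume limits."""
    return min(instance_limit_mbps, sum(volume_limits_mbps))

# An instance with ~1,250 MB/s aggregate EBS bandwidth (10 Gbps)
# driving four gp3 volumes provisioned at 500 MB/s each:
print(effective_ebs_throughput_mbps(1250, [500, 500, 500, 500]))  # 1250
# The same volumes on a larger instance (30 Gbps ≈ 3,750 MB/s):
print(effective_ebs_throughput_mbps(3750, [500, 500, 500, 500]))  # 2000
```

The first case is the puzzling one in practice: each volume individually has headroom, yet the instance-level cap silently clips the aggregate.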
Instance store: local NVMe
Instance store volumes are NVMe SSDs attached directly to the EC2 host server. There is no network hop. The guest accesses them directly through the Nitro hypervisor at near-native speed. Instance store throughput and IOPS significantly exceed what EBS can deliver, and latency is much lower.
The tradeoff is absolute: instance store data is lost when the instance stops, is terminated, or if the underlying host fails. There is no replication, no snapshot mechanism via AWS, and no way to recover. Instance store is ephemeral by design.
The use cases are specific. Instance store is appropriate for buffers, caches, scratch space for data processing, and replicated data where the application layer handles redundancy across multiple instances (Cassandra nodes, HDFS datanodes, Kafka replicas). It is never appropriate as the sole copy of important data.
Instance types with local NVMe are designated by a `d` suffix (`m6id`, `i4i`, `r6id`). The `i` family is specifically designed for maximum local storage I/O: `i4i.32xlarge` offers 30 TB of local NVMe with up to 3.3 million read IOPS.
Request lifecycle: what happens when your app reads from EBS
To ground all of this in a concrete end-to-end flow, let's trace a single read I/O from application to disk.
Your application issues a `read()` syscall. The guest OS kernel handles this as a standard block device request on `/dev/nvme0n1`. The NVMe driver in the guest kernel places an I/O command in a submission queue. The Nitro Hypervisor virtualises access to that queue, passing the command to the Nitro EBS Card via a shared memory region — no CPU interrupt is needed.
The Nitro EBS Card takes the command and issues an NVMe-over-Fabrics (NVMe/TCP or NVMe/RoCE, depending on the host generation) request over AWS's internal network fabric to the EBS storage node holding the requested block. If the volume is encrypted (encryption by default is an account-level setting you can enable), the Nitro Card handles AES-256 encryption and decryption transparently; data in transit is never plaintext, and the guest never handles key material.
The EBS storage node retrieves the block, returns it to the Nitro EBS Card. The card places the data in the completion queue. The NVMe driver in the guest receives the completion, and the guest kernel returns the data to the application. The round-trip latency for a `gp3` volume at this point is typically 1–3ms. For `io2 Block Express` on a supported instance type, this drops to sub-millisecond.
At no point in this path did the read touch the host CPU for I/O processing. The only CPU involved was the guest's own vCPU executing the application and kernel code. That is the Nitro performance promise made tangible.
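The submission/completion handshake in the trace above can be sketched as a pair of FIFO queues. This is a toy model — real NVMe queues are ring buffers with doorbell registers, and the "card" here is just a Python function standing in for hardware:

```python
from collections import deque

# Toy model of the NVMe queue pair from the trace above: the guest
# places commands in a submission queue; the "Nitro card" services
# them and posts results to a completion queue.
submission_q = deque()
completion_q = deque()
storage = {0: b"hello", 1: b"world"}   # stand-in for EBS blocks

def guest_read(lba):
    submission_q.append({"op": "read", "lba": lba})

def nitro_card_service():
    # In reality this work runs on the card, off the host CPU entirely.
    while submission_q:
        cmd = submission_q.popleft()
        completion_q.append({"lba": cmd["lba"], "data": storage[cmd["lba"]]})

guest_read(1)
nitro_card_service()
print(completion_q.popleft())   # {'lba': 1, 'data': b'world'}
```

The shape of the model is the point: the guest and the card never call each other directly; they communicate only through shared queues, which is what lets the I/O path avoid the host CPU.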
Failure modes and fault tolerance
EC2 exposes its failure model through a specific taxonomy. Understanding what can fail, and what AWS does and does not protect you against, is what separates a resilient architecture from one that fails quietly.
Hardware failure: the host
Physical host hardware fails. When AWS detects an underlying hardware failure affecting a running instance, it raises a scheduled maintenance event where possible, or in the case of sudden failure, the instance enters a stopped or terminated state depending on the shutdown behaviour configured. For instances with stop shutdown behaviour, the instance can be restarted on new hardware automatically.
AWS publishes two status checks visible in the EC2 console. The System Status Check tests AWS infrastructure: connectivity to the instance, network reachability, and underlying hardware health. The Instance Status Check tests the guest: whether the OS has booted correctly and is responding. A failing system status check is AWS's problem. A failing instance status check is yours. Both are surfaced as CloudWatch metrics and can trigger automated recovery via EC2 Auto Recovery, which stops and restarts the instance on a new host with all EBS volumes intact.
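The division of responsibility between the two checks can be written down as a triage function. The decision mirrors the rule of thumb above; the action strings are illustrative, not AWS output:

```python
def triage(system_check_ok: bool, instance_check_ok: bool) -> str:
    """Rule of thumb from the text: a failing system check is AWS's
    problem (Auto Recovery moves the instance to a new host); a failing
    instance check is yours (debug the guest OS)."""
    if not system_check_ok:
        return "aws-side: wait for or trigger EC2 Auto Recovery"
    if not instance_check_ok:
        return "guest-side: inspect console output, boot logs, OS health"
    return "healthy"

print(triage(True, False))
```

In practice you would alarm on the corresponding CloudWatch metrics rather than poll, but the branching logic for the on-call runbook is the same.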
Availability Zone isolation
An Availability Zone is AWS's fault isolation boundary at the physical layer. Each AZ is one or more data centers with independent power, cooling, and networking. A hardware failure or power event that affects one AZ should not affect another. EBS volumes are AZ-scoped: a volume exists in exactly one AZ and cannot be directly attached to an instance in a different AZ. This is intentional; cross-AZ EBS access would cross a network boundary and introduce latency and cost.
Building resilience against AZ failure requires running instances in multiple AZs with application-layer replication or a load balancer. A single-instance deployment in one AZ has no protection against AZ-level events.
EBS volume failure
EBS volumes replicate within an AZ. A single drive failure does not cause volume failure. However, an AZ-level event or a software bug in the EBS service can make a volume unavailable. EBS snapshots to S3 are the recovery path; they cross the AZ boundary because S3 is a regional service. Taking regular snapshots is the first-line defence against EBS data loss.
Instance store and the stop/terminate event
As noted: instance store data is lost on stop, termination, and host hardware failure. This is not a failure mode to mitigate; it is a fundamental property. If you run workloads on instance store, the application must treat local storage as a cache, not a store, and must be able to reconstruct or replicate data from another source.
Spot interruption
A Spot Instance that is interrupted by AWS gets a two-minute warning via the EC2 instance metadata service at `http://169.254.169.254/latest/meta-data/spot/interruption-notice`. A well-designed Spot workload polls this endpoint and triggers graceful shutdown: checkpointing state, draining in-flight work, and deregistering from load balancers. This is the correct architecture for batch processing, CI runners, and stateless web tiers running on Spot.
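A minimal watcher for that endpoint might look like the sketch below. The HTTP fetch is injected as a callable so the shutdown logic is testable off-instance; a real implementation would fetch the metadata URL above every few seconds (with an IMDSv2 session token) and treat a 404 as "no interruption pending":

```python
import json

def check_interruption(fetch):
    """fetch() returns the response body, or None when there is no
    pending interruption (the metadata service returns 404)."""
    body = fetch()
    if body is None:
        return None
    return json.loads(body)   # e.g. {"action": "terminate", "time": "..."}

def handle(notice, drain, checkpoint):
    if notice is not None:
        checkpoint()   # persist in-flight state
        drain()        # deregister from load balancers, stop taking work

# Simulated poll: first no notice, then a termination notice arrives.
responses = iter([None, '{"action": "terminate", "time": "2025-01-01T00:00:00Z"}'])
events = []
for _ in range(2):
    notice = check_interruption(lambda: next(responses))
    handle(notice, drain=lambda: events.append("drained"),
           checkpoint=lambda: events.append("checkpointed"))
print(events)   # ['checkpointed', 'drained']
```

The two-minute budget is the design constraint: checkpointing and draining must reliably complete inside it, which is why Spot-friendly workloads keep in-flight state small.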
Summary
EC2 is a virtualisation platform built on a purpose-designed hardware stack. The Nitro System is the central architectural insight: by moving all I/O processing off the host CPU onto dedicated ASIC cards, AWS eliminated the overhead of the Dom0 era and made the "100% of host resources" promise credible. Every component you interact with as an EC2 user — network bandwidth, EBS throughput, security group enforcement, instance isolation — is implemented in Nitro hardware, not software running on the same CPU you are billed for.
The instance lifecycle is managed by a regional control plane that selects hosts based on capacity, type, AZ, and placement constraints. Stopped instances release their host reservation; do not assume the same physical host on restart. Instance types encode hardware characteristics directly in their names.
Networking is rooted in the VPC and implemented by the Nitro VPC Card. Security groups are hardware-enforced, not a guest-side firewall. ENA provides the high-throughput NIC interface. Placement groups give you control over physical proximity and fault isolation.
Storage comes in two forms with fundamentally different tradeoffs. EBS is persistent, replicated, and network-attached via the Nitro EBS Card; use it for everything you cannot afford to lose. Instance store is ephemeral, local, and fast; use it as a cache or for replicated workloads that handle their own redundancy.
Fault tolerance in EC2 requires explicit design. AWS handles host hardware recovery via Auto Recovery and scheduled maintenance events, but AZ-level resilience requires multi-AZ architecture, and data durability requires EBS snapshots or application-level replication. Understanding the failure boundaries is what separates robust production deployments from systems that fail silently and expensively.