A pictorial representation of AWS EBS Architecture
I’m not known for creating pretty pictures, and this is definitely not a pretty one, but hopefully it will help visualise how AWS EBS fits together. I’m hoping someone will feel so appalled at my terrible diagram that they’ll feel obliged to come up with a pretty one.
I drafted this after reading the incredibly detailed post-mortem on the EBS problems AWS experienced in the US East region on 21 April 2011, in which they explained the EBS architecture.
I have pulled out the following points from the post-mortem to help explain how a normally functioning EBS cluster works:
An EBS Cluster exists within an Availability Zone
An EBS Cluster manages a set of EBS Nodes
The EBS Nodes store replicas of EBS volume data and serve read & write requests to EC2
EBS Nodes communicate with other EBS nodes, with EC2 instances, and with the EBS control plane services via a high-bandwidth network
A secondary, lower-capacity network is also used as a back-up network to allow EBS nodes to reliably communicate with other nodes in the EBS cluster and to provide overflow capacity for data replication
If an EBS node loses connectivity to a node to which it is replicating data, it assumes the other node has failed. To preserve durability, it must find a new node to which it can replicate its data (this is called re-mirroring). As part of the re-mirroring process, the EBS node searches its EBS cluster for another node with enough available server space, establishes connectivity with that node, and propagates the volume data to it
The control plane services accept user requests and propagate them to the appropriate EBS cluster. There is one set of EBS control plane services per EC2 Region, but the control plane itself is highly distributed across the Availability Zones to provide availability and fault tolerance. These control plane services also act as the authority to the EBS clusters when they elect primary replicas for each volume in the cluster (for consistency, there must be only a single primary replica for each volume at any time)
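
To make the re-mirroring point above a bit more concrete, here is a toy Python sketch of the search step. It is only my reading of the description, not how AWS implements it: the `Node` class, its fields, the function names and the first-fit selection rule are all made up for illustration, and real concerns such as network transfer, primary election and retries are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for an EBS node; every field here is illustrative."""
    name: str
    free_gb: int
    reachable: bool = True
    volumes: dict = field(default_factory=dict)  # volume id -> size in GB


def find_remirror_target(cluster, source, lost_peer, size_gb):
    """Return the first reachable node (not source, not the lost peer) with enough free space."""
    for node in cluster:
        if node is source or node is lost_peer:
            continue
        if node.reachable and node.free_gb >= size_gb:
            return node
    return None  # nothing suitable found in this cluster


def remirror(cluster, source, lost_peer, volume_id):
    """Copy one volume from source to a new peer; returns the new peer or None."""
    size_gb = source.volumes[volume_id]
    target = find_remirror_target(cluster, source, lost_peer, size_gb)
    if target is None:
        return None
    # "Propagate the volume data": modelled here as simple bookkeeping.
    target.volumes[volume_id] = size_gb
    target.free_gb -= size_gb
    return target


# node-a holds vol-1 and has lost contact with its replica on node-b.
a = Node("node-a", free_gb=0, volumes={"vol-1": 100})
b = Node("node-b", free_gb=500, reachable=False)
c = Node("node-c", free_gb=50)    # too small for a 100 GB volume
d = Node("node-d", free_gb=200)
cluster = [a, b, c, d]

new_peer = remirror(cluster, a, b, "vol-1")
print(new_peer.name)  # node-d
```

Note that the search is confined to the node's own cluster, which matches the point that a cluster lives within a single Availability Zone. If no node has enough free space, the sketch simply returns `None`.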