August 25, 2019

Amazon ECS - Deep dive & Demystify ECS Optimized AMI

AWS Elastic Container Service (ECS) makes it very easy for anyone to run and manage their Docker container applications in the cloud. All one has to do is create a cluster and provide the name of the Docker image to pull and run as a service in the cluster.

There's a lot of magic happening behind the scenes, and it's not necessary for everyone to know about it. But if you are curious, this article is for you. Here we take a deep dive into how AWS ECS works internally. This should help you understand ECS better and give you a sneak peek at the secret sauce behind it.

ECS: 10,000 Foot Overview

1. All Docker container images are tagged and stored in AWS ECR (Elastic Container Registry).
2. The ECS Task Definition contains the location of the Docker image and the tag that needs to be run on ECS as a container.
3. ECS reads the task definition and schedules the tasks on the EC2 instances registered with it.
4. Each EC2 instance (usually running the Amazon Linux or Amazon Linux 2 ECS Optimized AMI) has an ECS-AGENT which pulls the Docker image from ECR and runs it on the EC2 instance.

This looks dead simple. The EC2 instances are usually managed through an auto-scaling group, allowing the number of tasks/containers running in the ECS cluster to grow or shrink based on different CloudWatch metrics associated with the application.
Fig 1: AWS ECS
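
To make steps 1 and 2 of the overview a little more concrete, here is a minimal sketch of what a task definition pointing at an ECR image might look like. The helper names (ecr_image_uri, build_task_definition) and the account ID, region, repository and tag are made up for illustration, and only a handful of task definition fields are shown; a real task definition usually carries more settings.

```python
import json


def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Build an ECR image URI: <account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def build_task_definition(family: str, image: str, memory_mib: int = 256) -> dict:
    """Return a minimal task definition with a single essential container."""
    return {
        "family": family,
        "containerDefinitions": [
            {
                "name": family,        # container name (reusing the family name here)
                "image": image,        # where the ECS agent pulls the image from
                "memory": memory_mib,  # hard memory limit in MiB
                "essential": True,     # the task stops if this container stops
            }
        ],
    }


if __name__ == "__main__":
    # All values below are placeholders, not real resources.
    image = ecr_image_uri("123456789012", "us-east-1", "my-app", "1.0.0")
    print(json.dumps(build_task_definition("my-app", image), indent=2))
```

The resulting JSON is the kind of document ECS reads in step 3 before scheduling the task on a registered EC2 instance.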

ECS: Deep Dive

Running containers in ECS can be broken down into two parts:
1. Scheduling: ECS takes care of scheduling containers (known as tasks in ECS) on different EC2 instances.
2. Orchestration: Once the task is scheduled on an EC2 instance, the ECS-AGENT takes care of running the image as a container.
The bulk of the magic happens within the "ECS Optimized Machine Image" that is running on the EC2 instances!

AWS ECS Optimized AMI

Fig 2: Internals of ECS Optimized AMI (Amazon Linux 2)
The ECS Optimized AMI consists of:
1. Amazon Linux or Amazon Linux 2 (an RPM-based distribution similar to CentOS)
2. Docker
3. Comes pre-loaded with the following core packages:

The machine has Docker running as a daemon. There is also a startup script which pulls and runs the ECS-Agent as a daemon. The ECS-Agent is the core of the Elastic Container Service, so it is necessary to keep it running.

The following are the tasks that the ECS-Agent does:

1. Parse the ECS Task Definition, pull the Docker image from ECR and run the image on the EC2 instance based on the run config provided in the ECS Task Definition.
2. Clean up old, unused images of applications that were run on the EC2 instance.
3. Rotate the log files generated by the ECS-Agent.
4. Integrate with AWS IAM to generate temporary credentials based on the IAM Role associated with the ECS Task (configured in the task definition) and make them available to the running ECS Task.
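
Point 4 can be seen from inside a running task. As a rough sketch: when a task has an IAM role, the agent sets the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable in the container, and the temporary credentials can be fetched from the link-local address 169.254.170.2. The function names below are my own, and in practice the AWS SDKs do this lookup for you, so treat this as an illustration rather than something you need to write.

```python
import json
import os
import urllib.request

# Link-local address on which task role credentials are served.
CREDENTIALS_HOST = "http://169.254.170.2"


def credentials_url(environ=None):
    """Return the task credentials URL, or None when no task role is available."""
    environ = os.environ if environ is None else environ
    relative_uri = environ.get("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI")
    if not relative_uri:
        return None
    return CREDENTIALS_HOST + relative_uri


def fetch_task_credentials(timeout=2.0):
    """Fetch the temporary credentials; only works from inside an ECS task."""
    url = credentials_url()
    if url is None:
        raise RuntimeError("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is not set")
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # The JSON body includes AccessKeyId, SecretAccessKey, Token and Expiration.
        return json.load(response)
```

Calling fetch_task_credentials() outside of ECS simply raises, because the environment variable is only present in containers started for a task with a role.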

Fig 3: ECS Agent Service [/etc/systemd/system/ecs-agent.service]

Cloud-Init

The cloud-init package is now the de facto standard on most Linux distributions. It performs early initialization of cloud instances.
It is also responsible for running the "user-data" scripts that are provided when the EC2 instance starts up. Without the cloud-init package the user-data script won't run. The user data provided to the EC2 instance is stored under "/var/lib/cloud/instance/user-data.txt".
Cloud-init also lets you configure scripts that run every time the OS boots on the EC2 instance, once per instance, or only once. Scripts can be placed in the appropriate directory under "/var/lib/cloud/scripts".
Instance information is fetched from the AWS EC2 metadata endpoint and stored in /run/cloud-init/instance-data.json.
Fig 4: Cloud-Init Directories
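
If you want to poke around these locations on an instance yourself, a small sketch like the one below can help. It only uses the two paths mentioned above; the function names are my own, and the layout of the JSON file varies between cloud-init versions, so the code makes no assumptions about its keys.

```python
import json
from pathlib import Path

# Paths as mentioned in the text above.
INSTANCE_DATA = Path("/run/cloud-init/instance-data.json")
SCRIPTS_ROOT = Path("/var/lib/cloud/scripts")


def load_instance_data(path=INSTANCE_DATA):
    """Load the instance data JSON written by cloud-init."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def list_cloud_init_scripts(root=SCRIPTS_ROOT):
    """Map each subdirectory of the scripts root to its sorted script names."""
    root = Path(root)
    if not root.is_dir():
        return {}
    return {
        sub.name: sorted(p.name for p in sub.iterdir() if p.is_file())
        for sub in sorted(root.iterdir())
        if sub.is_dir()
    }


if __name__ == "__main__":
    print(list_cloud_init_scripts())
    if INSTANCE_DATA.exists():
        print(sorted(load_instance_data().keys()))
```

Reading these files may require root privileges, depending on how the instance is set up.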

Container Storage Setup

The ECS Optimized AMI has built-in scripts to spin up a 22GB EBS volume and attach it to the EC2 instance as part of the boot process. This additional volume is used as scratch space for Docker, the ECS-AGENT container and other containers that run on the EC2 instance. The additional volume is not used for storing anything related to the operating system running on EC2.
The Container Storage Setup utility is used to prepare and set up the volume for use with the Docker storage driver (Docker versions >= 1.8).

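AMI                          | Docker Version  | Docker Storage Driver                                | Service & System Manager
-----------------------------|-----------------|------------------------------------------------------|-------------------------
ECS Optimized Amazon Linux   | < 17.06.02-ee5  | Device Mapper (based on logical volume mounts)       | /etc/init.d
ECS Optimized Amazon Linux 2 | >= 17.06.02-ee5 | Overlay2 (much faster than the device-mapper driver) | systemd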
Fig 5: Volumes on ECS Optimized AMI based EC2 instances
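
To check which storage driver an instance actually ends up with, one option is to ask the Docker daemon directly. The sketch below shells out to "docker info --format {{.Driver}}" and compares the result against the table above. The AMI labels ("amazon-linux", "amazon-linux-2") and the function names are hypothetical, chosen just for this example; "devicemapper" and "overlay2" are the driver names as Docker reports them.

```python
import subprocess

# Expected driver per AMI, taken from the table above.
# The dictionary keys are made-up labels for this sketch.
EXPECTED_DRIVER = {
    "amazon-linux": "devicemapper",
    "amazon-linux-2": "overlay2",
}


def current_storage_driver():
    """Ask the local Docker daemon which storage driver it is using."""
    result = subprocess.run(
        ["docker", "info", "--format", "{{.Driver}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def driver_matches(ami_family, driver):
    """Return True when the driver is the one expected for the given AMI label."""
    return EXPECTED_DRIVER.get(ami_family) == driver


if __name__ == "__main__":
    driver = current_storage_driver()
    print(driver, driver_matches("amazon-linux-2", driver))
```

Running the script needs a reachable Docker daemon and permission to talk to it, which is usually the case for root on an ECS instance.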

Other observations


1. The ECS-Agent container and any other containers on the EC2 instance should be run in HOST networking mode; otherwise the container will not be able to reach the AWS container metadata endpoint to retrieve the ECS Task IAM role credentials or other metadata related to the container.