Auto scaling Java REST APIs using Amazon ECS with Fargate
Learn how to use the ECS service with the Fargate serverless compute engine to run dockerized Java REST APIs and scale them automatically.
In this blog post, the following AWS services are used:
For auto scaling Java REST APIs using the EC2 Auto Scaling service, see the previous blog post. Some of the auto scaling and application load balancing concepts covered in that post also apply to the ECS with Fargate service, and thus will not be repeated here.
ECS with Fargate Service
When using ECS, you need to choose between two types of compute engines:
EC2
Fargate
Fargate is a serverless compute engine, so there are no virtual server machines (EC2 instances) to manage when running your web applications and (micro)services.
In ECS you'll need to create a cluster. A cluster contains one or more services. Each service runs tasks based on a task definition, and adds features such as auto scaling on top of that task definition. A service can also make use of an application load balancer, which routes traffic to the multiple running tasks that are part of the service. A task definition is the template of a task: it contains one or more container definitions specifying which docker images to run in docker containers when a task is launched. Related container definitions can be grouped together in one task definition, for which a service with auto scaling enabled can then run multiple tasks.
Within the ECS cluster, it is also possible to run tasks without creating a service first. However, features such as auto scaling cannot be applied to such standalone tasks.
See below for a class model that represents the relationships among the entities described above.
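The relationships among these entities can also be sketched as simple Java types. This is an illustrative model of my own, not AWS SDK classes:

```java
import java.util.List;

// Illustrative sketch of the ECS entity relationships; not AWS SDK classes.
// A container definition specifies a docker image and port to run.
record ContainerDefinition(String name, String image, int containerPort) {}

// A task definition is the template of a task; it groups container definitions.
record TaskDefinition(String family, List<ContainerDefinition> containerDefinitions) {}

// A task is a running instance of a task definition.
record Task(String taskDefinitionFamily, String status) {}

// A service runs and scales tasks based on one task definition.
record Service(String name, TaskDefinition taskDefinition, List<Task> tasks) {}

// A cluster contains one or more services.
record Cluster(String name, List<Service> services) {}

class ClusterModelDemo {
    public static void main(String[] args) {
        TaskDefinition td = new TaskDefinition("apollo-missions-api",
                List.of(new ContainerDefinition("apollo-missions-api", "apollo-missions-api:1.0.0", 8080)));
        Cluster cluster = new Cluster("isaacdeveloperblog",
                List.of(new Service("apollo-missions-api", td, List.of(new Task("apollo-missions-api", "RUNNING")))));
        System.out.println(cluster.services().size()); // prints 1
    }
}
```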
ECS with Fargate sandbox environment
Setting up the sandbox environment
Components and Services
The sandbox environment consists of the following components and services:
Apollo Missions API
Apache JMeter (used for load testing)
Elastic Container Service (ECS)
Cluster
Task Definition
Service
Elastic Container Registry (ECR)
Repository
Docker Image
Elastic Load Balancing (ELB)
Target Group
Load Balancer
CloudWatch
Apollo Missions API
Apollo Missions API is a simple REST API written in Java using the Quarkus framework. This Java application runs in a docker container as part of a task launched by the ECS. Later on, I will describe how to create a docker image, and upload it to ECR to be used by ECS.
The application consists of the following endpoints:
/missions/manned
/missions/manned/{missionId}
/longComputation
/health
The first two endpoints provide some basic data regarding the manned Apollo Missions. The /longComputation endpoint is used by Apache JMeter for load testing purposes: creating load on the ECS service will trigger an alarm in CloudWatch and cause a scaling policy to take a scale out action. The /health endpoint is used by the ELB load balancer to monitor the health of the Java application (whether it is running and accepting and successfully processing HTTP requests).
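The /longComputation endpoint only needs to burn CPU so that load tests drive up the service's CPU utilization. A minimal plain-Java sketch of what such a handler body could look like (hypothetical; the actual Apollo Missions API implementation may differ):

```java
// Hypothetical sketch of a CPU-intensive handler body such as /longComputation
// might use; the real Apollo Missions API implementation may differ.
public class LongComputation {

    // Burns CPU deterministically by repeatedly mixing a counter into a hash.
    public static long compute(int iterations) {
        long result = 17;
        for (int i = 0; i < iterations; i++) {
            result = 31 * result + Long.hashCode(result ^ i);
        }
        return result;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        compute(5_000_000);
        System.out.printf("Took %d ms%n", (System.nanoTime() - start) / 1_000_000);
    }
}
```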
Elastic Container Service (ECS)
A cluster is created which will run the sandbox components. Furthermore, a task definition is created from which tasks can be created to run the REST API microservice in a docker container. Finally, a service in ECS is created to run the tasks and enable the auto scaling of the REST API microservice.
Elastic Container Registry (ECR)
A repository is created to which a docker image for the REST API microservice is uploaded. This docker image will be referenced in the task definition of ECS.
Elastic Load Balancing (ELB)
A target group is created with health checks. The /health endpoint of the Java application is used by the health checks. The target group allows the created application load balancer to route the HTTP traffic to the ECS tasks launched by the ECS Auto Scaling service.
CloudWatch
You can use this service to see which alarms went off, and to view ECS and ELB statistics such as CPU utilization and request count.
Setup
The setup in AWS is done using the AWS Management Console. The values mentioned here regarding Availability Zones (AZs) are for the Europe (Frankfurt) eu-central-1 region. You can of course use a different region but then you will need to provide the values for the default public subnets and AZs of your chosen region.
Docker Image in ECR
Create a docker image of the REST API microservice. Within the Apollo Missions API code repository, look for the section 'Run Quarkus in JVM mode in a docker container'.
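A Quarkus project normally generates its own `src/main/docker/Dockerfile.jvm`, which is the one to prefer. For illustration, a minimal equivalent for JVM mode could look roughly like this (base image and paths are assumptions, not taken from the Apollo Missions API repository):

```dockerfile
# Minimal illustrative Dockerfile for running a Quarkus app in JVM mode.
# Prefer the Dockerfile.jvm generated by Quarkus in your own project.
FROM eclipse-temurin:11-jre
WORKDIR /deployments
# Copy the runner jar and its dependencies produced by `mvn package`.
COPY target/lib/ lib/
COPY target/*-runner.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
```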
Now, create a repository in ECR (e.g. isaacdeveloperblog) and upload version 1.0.0 of the REST API image to the ECR repository. Once uploaded, click on the image and copy the Image URI; you will need this URI for your task definition in ECS.
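The upload can be done with the Docker CLI and AWS CLI along these lines; `<account-id>` is a placeholder, and the ECR console's 'View push commands' button shows the exact sequence for your repository:

```shell
# Authenticate the Docker CLI against your ECR registry (replace <account-id>).
aws ecr get-login-password --region eu-central-1 \
  | docker login --username AWS --password-stdin <account-id>.dkr.ecr.eu-central-1.amazonaws.com

# Tag the locally built image and push it to the ECR repository.
docker tag apollo-missions-api:1.0.0 \
  <account-id>.dkr.ecr.eu-central-1.amazonaws.com/isaacdeveloperblog/apollo-missions-api:1.0.0
docker push <account-id>.dkr.ecr.eu-central-1.amazonaws.com/isaacdeveloperblog/apollo-missions-api:1.0.0
```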
ECS Cluster
Navigate to ECS and click on Clusters. In the Clusters section, click on the Create Cluster button. Choose the Networking only cluster template, which is powered by AWS Fargate.
Provide the following values.
Cluster name
isaacdeveloperblog
Create VPC
Tick the box ‘Create a new VPC for this cluster’.
Leave the default values for CIDR block, Subnet 1 and Subnet 2 as is.
Now click on the Create button and a new ECS cluster will be created for the sandbox environment. It may take a couple of minutes before all resources within the cluster are created.
Task Definition
Go to the Task Definitions section, and click on the Create new Task Definition button. For the launch type select Fargate.
Provide the following values, and create the task definition.
Task Definition Name
apollo-missions-api
Task Role
None (default)
Network Mode
awsvpc (default)
Task execution role
ecsTaskExecutionRole (default)
Task memory (GB)
2GB
Task CPU (vCPU)
1 vCPU
Container Definitions - Container Name
apollo-missions-api
Container Definitions - Image
Paste here your uploaded Image URI.
Example: 984818921620.dkr.ecr.eu-central-1.amazonaws.com/isaacdeveloperblog/apollo-missions-api:1.0.0
Container Definitions – Port mapping
Container port: 8080
Protocol: tcp
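The same task definition can also be expressed in JSON, roughly as follows. This is a trimmed sketch: the console generates the full document, `executionRoleArn` would in reality be the full IAM role ARN, and the image URI is the example from above:

```json
{
  "family": "apollo-missions-api",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "1024",
  "memory": "2048",
  "executionRoleArn": "ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "apollo-missions-api",
      "image": "984818921620.dkr.ecr.eu-central-1.amazonaws.com/isaacdeveloperblog/apollo-missions-api:1.0.0",
      "portMappings": [
        { "containerPort": 8080, "protocol": "tcp" }
      ]
    }
  ]
}
```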
Elastic Load Balancing
Navigate to the EC2 service and click on the Load Balancers section. Click on the Create Load Balancer button, and select the Application Load Balancer.
Now create an ELB application load balancer with the values below; only values that must be specified or that differ from the defaults are listed.
Step 1: Configure Load Balancer
Name
apollo-missions-api-lb
Listeners – Load Balancer Protocol
HTTP: 80
VPC
Select the VPC which belongs to the ECS cluster.
Availability Zones
Select all available AZs:
eu-central-1a
eu-central-1b
Step 2: Configure Security Settings
Step 3: Configure Security Groups
Create a new security group
Security group name: apollo-missions-api-lb-sg
Type: HTTP
Protocol: TCP
Port Range: 80
Source: Anywhere
Step 4: Configure Routing
Create a new target group
New target group
Name: apollo-missions-api-tg
Target type: IP
Protocol: HTTP
Port: 8080
Health checks
Protocol: HTTP
Path: /health
Service
Go to the Services tab within the newly created ECS cluster, and click on the Create button.
Provide the following values, and create the service.
Step 1: Configure service
Launch type
Fargate
Task Definition
Family: apollo-missions-api
Revision: 1 (latest)
Platform version
LATEST (default)
Cluster
isaacdeveloperblog
Service name
apollo-missions-api
Service type
REPLICA (default)
Number of tasks
1
Minimum healthy percent
100 (default)
Maximum percent
200 (default)
Deployment type
Rolling update (default)
Step 2: Configure network
Cluster VPC
Select the VPC which belongs to the ECS cluster.
Subnets
Select all available subnets of the VPC.
Configure Security Groups
Create a new security group
Security group name: apollo-missions-api-ecs-sg
Type: Custom TCP
Protocol: TCP
Port Range: 8080
Source: Anywhere
Auto-assign public IP
ENABLED (default)
Health check grace period
30
Load balancing
Application Load Balancer
Load balancer name: apollo-missions-api-lb
Container to load balance
Container name : port: apollo-missions-api:8080:8080
Click on the Add to load balancer button.
Production listener port: 80:HTTP
Target group name: apollo-missions-api-tg
Step 3: Set Auto Scaling
Service Auto Scaling
Choose ‘Configure Service Auto Scaling to adjust your service’s desired count’.
Minimum number of tasks
1
Desired number of tasks
1
Maximum number of tasks
10
IAM role for Service Auto Scaling
ecsAutoscaleRole (default)
Scaling policy type
Target tracking
Policy name
CPU-Utilization
ECS service metric
ECSServiceAverageCPUUtilization
Target value
50
Scale-out cooldown period
60 seconds
Scale-in cooldown period
60 seconds
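With target tracking, CloudWatch compares the average CPU utilization against the 50% target, and the scaling policy adjusts the desired task count roughly proportionally, clamped between the minimum and maximum. A simplified sketch of that calculation (my own approximation; the actual AWS algorithm also involves cooldowns and alarm evaluation periods):

```java
// Simplified sketch of target-tracking scaling math; the real AWS
// implementation also applies cooldowns, alarm evaluation periods, etc.
public class TargetTrackingSketch {

    // New desired count ~= ceil(current * actualMetric / targetMetric),
    // clamped to the configured [min, max] task range.
    public static int desiredTasks(int current, double actualCpu, double targetCpu, int min, int max) {
        int desired = (int) Math.ceil(current * actualCpu / targetCpu);
        return Math.max(min, Math.min(max, desired));
    }

    public static void main(String[] args) {
        // 2 tasks at 80% average CPU with a 50% target -> scale out to 4 tasks.
        System.out.println(desiredTasks(2, 80.0, 50.0, 1, 10)); // prints 4
        // 4 tasks at 10% CPU -> scale in, but never below the minimum of 1.
        System.out.println(desiredTasks(4, 10.0, 50.0, 1, 10)); // prints 1
    }
}
```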
It may take some time before the service is created. When the service is created, a task will be started by the service.
Running the sandbox environment
If you’ve successfully set up the sandbox environment, you should see the first ECS task launched by the ECS Auto Scaling service to meet the desired capacity of 1.
Wait until the last status becomes RUNNING.
When the ECS task was launched, the ECS Auto Scaling service registered it with the target group so that the application load balancer can route traffic to it.
You can access the REST API of that task directly, or via the application load balancer.
ECS Task: http://<task network interface public IPv4 DNS>:8080/missions/manned
ELB: http://<ELB load balancer’s DNS name>/missions/manned
Now click on the apollo-missions-api service in ECS and open the Events tab. There you can see that the service has started one task and reached a steady state.
Let's update the ECS service and manually set both the minimum and the desired number of tasks to 2. Once done, verify in the Events tab that a new task has been created.
To trigger a scale out action by the ECS Auto Scaling service, you will need to put some extra load on the ECS service. You can use the provided JMeter project as part of the Apollo Missions API source code. The project file is located in the resources/jmeter folder. This project has been created with Apache JMeter version 5.2.1.
Once the project is opened in JMeter, provide the ELB load balancer’s DNS name for the field Server Name or IP of the /longComputation endpoint.
Hit the play button to send HTTP requests to the ELB load balancer. Ten concurrent users will continuously send HTTP requests for a period of 20 minutes.
Click on the Summary Report and watch the statistics.
In CloudWatch you can see that the two alarms created by the ECS Auto Scaling service are in the OK state.
After some time, the alarm for the condition ‘CPUUtilization > 50 for 3 datapoints within 3 minutes’ will go off, and this event will trigger a scale out action in ECS.
In the ECS service's Events tab, we can see new log entries mentioning the scale out action.
If we now look in the Tasks tab, we can see that two additional tasks have been launched.
The number of ECS tasks increases gradually over time to meet the desired capacity dynamically adjusted by the ECS Auto Scaling service. Once we stop the JMeter test, CPU utilization drops drastically, so multiple scale in actions take place, decreasing the number of running ECS tasks until only the minimum number remains.