This is the eleventh part of Building Microservice Applications with Azure Container Apps and Dapr. The topics we’ll cover are:
- Tutorial for building Microservice Applications with Azure Container Apps and Dapr – Part 1
- Deploy backend API Microservice to Azure Container Apps – Part 2
- Communication between Microservices in Azure Container Apps – Part 3
- Dapr Integration with Azure Container Apps – Part 4
- Azure Container Apps State Store With Dapr State Management API – Part 5
- Azure Container Apps Async Communication with Dapr Pub/Sub API – Part 6
- Azure Container Apps with Dapr Bindings Building Block – Part 7
- Azure Container Apps Monitoring and Observability with Application Insights – Part 8
- Continuous Deployment for Azure Container Apps using GitHub Actions – Part 9
- Use Bicep to Deploy Dapr Microservices Apps to Azure Container Apps – Part 10
- Azure Container Apps Auto Scaling with KEDA – (This Post)
- Azure Container Apps Volume Mounts using Azure Files – Part 12
- Integrate Health probes in Azure Container Apps – Part 13
Azure Container Apps Auto Scaling with KEDA
In this post, we will explore how to configure auto scaling rules in Container Apps. In my opinion, auto scaling is one of the key features of any serverless hosting platform: you want your application to respond dynamically to increased workload demand so that it maintains availability and performance.
Container Apps support horizontal scaling (scaling out) by adding more replicas (new instances of the Container App) and splitting the workload across them so the work is processed in parallel. When demand decreases, Container Apps scale in by removing underutilized replicas according to your configured scaling rule. With this approach, you pay only for the replicas provisioned during the period of increased demand, and you can also configure the scaling rule to scale to zero replicas, which means no charges are incurred when your Container App is scaled to zero.
Azure Container Apps supports the following scaling triggers:
- HTTP traffic: scaling based on the number of concurrent HTTP requests to your revision (see the sketch after this list).
- CPU or memory usage: scaling based on the amount of CPU utilized or memory consumed by a replica.
- Azure Storage Queues: scaling based on the number of messages in an Azure Storage queue.
- Event-driven using KEDA: scaling based on event triggers, such as the number of messages in an Azure Service Bus topic or the number of blobs in an Azure Blob Storage container.
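For illustration, here is a minimal sketch of what an HTTP concurrency rule could look like from the Azure CLI. The app name $FRONTEND_SVC_NAME, the rule name, and the threshold of 20 concurrent requests are placeholder assumptions, not values from this tutorial:

```powershell
## Hypothetical example: scale a frontend app on concurrent HTTP requests
## $FRONTEND_SVC_NAME and the threshold of 20 are placeholders
az containerapp update `
  --name $FRONTEND_SVC_NAME `
  --resource-group $RESOURCE_GROUP `
  --min-replicas 0 `
  --max-replicas 5 `
  --scale-rule-name "http-rule" `
  --scale-rule-type "http" `
  --scale-rule-http-concurrency 20
```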
As I covered in the initial posts, Azure Container Apps builds on different open source technologies, and KEDA is one of them: it enables event-driven autoscaling. KEDA is installed by default when you provision your Container App, so we don't need to worry about installing it. All we need to focus on is enabling and configuring our Container App scaling rules.
In this post, I will be focusing on event-driven autoscaling using KEDA.
The source code for this tutorial is available on GitHub. You can check the demo application too.
An overview of KEDA
KEDA stands for Kubernetes Event-Driven Autoscaler. It is an open-source project initially started by Microsoft and Red Hat to allow any Kubernetes workload to benefit from the event-driven architecture model. Prior to KEDA, horizontally scaling a Kubernetes deployment was achieved through the Horizontal Pod Autoscaler (HPA), which relies on resource metrics such as memory and CPU to determine when additional replicas should be deployed. Within any enterprise application, there will be other, external metrics you want to scale on: think of the lag on a Kafka topic, the length of an Azure Service Bus queue, or metrics obtained from a Prometheus query. KEDA exists to fill this gap: it provides a framework for scaling based on events, working in conjunction with HPA scaling based on CPU and memory, and offers more than 50 scalers to pick from based on your business needs.
Configure the Scaling Rule in the Backend Background Processor Project
We need to configure our backend background processor service, named “tasksmanager-backend-processor”, to scale out and increase the number of replicas based on the number of messages in the topic named “tasksavedtopic”. If our service is under a heavy workload and a single replica is not able to cope with the number of messages on the topic, we need the Container App to spin up more replicas to parallelize the processing of messages on this topic.
So our requirements for scaling the backend processor are as follows:
- For every 10 messages on the Azure Service Bus topic, scale out by one replica.
- When there are no messages on the topic, scale in to zero replicas.
- The maximum number of replicas should not exceed 5.
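As a rough worked example of how this plays out: KEDA (via the underlying HPA) computes the desired replica count as approximately ceil(active messages / messageCount), capped by the configured maximum. So 42 messages on the topic would request ceil(42 / 10) = 5 replicas, while 500 messages would request 50 but be capped at our maximum of 5.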
To achieve this, we will start by looking into the KEDA Azure Service Bus scaler. This specification describes the azure-servicebus trigger, which works with both Azure Service Bus queues and topics. Let's take a look at the YAML file below, which contains a template for the KEDA specification:
```yaml
triggers:
  - type: azure-servicebus
    metadata:
      # Required: queueName OR topicName and subscriptionName
      queueName: functions-sbqueue
      # or
      topicName: functions-sbtopic
      subscriptionName: sbtopic-sub1
      # Optional, required when pod identity is used
      namespace: service-bus-namespace
      # Optional, can use TriggerAuthentication as well.
      # This must be a connection string for a queue itself, and not a namespace-level
      # (e.g. RootAccessPolicy) connection string: https://github.com/kedacore/keda/issues/215
      connectionFromEnv: SERVICEBUS_CONNECTIONSTRING_ENV_NAME
      messageCount: "5" # Optional. Count of messages to trigger scaling on. Default: 5 messages
      cloud: Private # Optional. Default: AzurePublicCloud
      endpointSuffix: servicebus.airgap.example # Required when cloud=Private
```
Let’s review the parameters:
- The property type is set to “azure-servicebus”; each KEDA scaler specification has a unique type.
- One of the properties queueName or topicName should be provided. In our case it will be “topicName”, with the value “tasksavedtopic”.
- The property subscriptionName will be set to “tasksmanager-backend-processor”. This represents the subscription associated with the topic; it is not needed when using queues.
- The property connectionFromEnv will be set to reference a secret stored in our Container App. We will not use the Azure Service Bus shared access policy (connection string) directly; instead, the shared access policy will be stored in the Container App secrets, and the secret will be referenced here. Please note that the Service Bus shared access policy needs to be of type Manage, which is required for KEDA to be able to get metrics from Service Bus and read the length of messages in the queue or topic (see the sketch after this list).
- The property messageCount is used to decide when scaling out should be triggered. In our case, it will be set to 10.
- The property cloud represents the name of the cloud environment that the Service Bus namespace belongs to.
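As mentioned in the connectionFromEnv bullet above, the shared access policy must include Manage rights. If you prefer not to reuse the namespace-level RootManageSharedAccessKey, a dedicated policy could be created as in the sketch below; the policy name “keda-scaler-policy” is just an assumed example:

```powershell
## Hypothetical example: create a dedicated shared access policy with Manage rights
## Azure requires Manage to be granted together with Listen and Send
az servicebus namespace authorization-rule create `
  --resource-group $RESOURCE_GROUP `
  --namespace-name $NamespaceName `
  --name keda-scaler-policy `
  --rights Manage Listen Send
```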
Note about authentication: the KEDA scaler for Azure Service Bus supports different authentication mechanisms, such as pod managed identity, Azure AD workload identity, and a shared access policy (connection string). When using KEDA with Azure Container Apps, at the time of writing this post, the only supported authentication mechanism is a connection string.
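For reference, a Service Bus connection string for a shared access policy has the familiar shape below (the namespace and key are placeholders); this is the value we will store as a Container App secret in the next step:

```
Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=<key>
```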
Azure Container Apps has its own proprietary schema that maps the KEDA scaler template to its own when defining a custom scale rule. You can define this scaling rule via Container Apps ARM templates, a YAML manifest, the Azure CLI, or the Azure Portal. In this post, I will cover how to do it from the Azure Portal and the Azure CLI.
Step 1: Create a new secret in Container App
Let’s now create a secret named “svcbus-connstring” in our Container App named “tasksmanager-backend-processor”. This secret will contain the value of the Azure Service Bus shared access policy (connection string) with the “Manage” policy. To do so, run the first command below in the Azure CLI to get the connection string, and then add the secret using the second command:
```powershell
## List the Service Bus access policy RootManageSharedAccessKey
$RESOURCE_GROUP = "tasks-tracker-rg"
$NamespaceName = "tasksTracker"

az servicebus namespace authorization-rule keys list `
  --resource-group $RESOURCE_GROUP `
  --namespace-name $NamespaceName `
  --name RootManageSharedAccessKey `
  --query primaryConnectionString `
  --output tsv

## Create a new secret named 'svcbus-connstring' in the backend processor container app
az containerapp secret set `
  --name $BACKEND_SVC_NAME `
  --resource-group $RESOURCE_GROUP `
  --secrets "svcbus-connstring=<Connection String from Service Bus>"
```
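To double-check that the secret landed in the Container App, you could list the secret names as a quick sanity check (this sketch assumes $BACKEND_SVC_NAME is still set from the earlier posts):

```powershell
## List the names of secrets stored in the backend processor container app
az containerapp secret list `
  --name $BACKEND_SVC_NAME `
  --resource-group $RESOURCE_GROUP `
  --query "[].name" `
  --output table
```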
Step 2: Create a Custom Scaling Rule from Azure CLI
Now we are ready to add a new custom scaling rule that matches the business requirements. To do so, we need to run the Azure CLI command below:
Note: I had to update the az containerapp extension in order to create a scaling rule from the CLI. To update it, you can run the command az extension update --name containerapp
```powershell
az containerapp update `
  --name $BACKEND_SVC_NAME `
  --resource-group $RESOURCE_GROUP `
  --min-replicas 0 `
  --max-replicas 5 `
  --scale-rule-name "queue-length" `
  --scale-rule-type "azure-servicebus" `
  --scale-rule-auth "connection=svcbus-connstring" `
  --scale-rule-metadata "topicName=tasksavedtopic" `
                        "subscriptionName=tasksmanager-backend-processor" `
                        "namespace=tasksTracker" `
                        "messageCount=10" `
                        "connectionFromEnv=svcbus-connstring"
```
What we have done is the following:
- Setting the minimum number of replicas to zero means that this Container App can scale in to zero replicas if there are no new messages on the topic.
- Setting the maximum number of replicas to 5 means that this Container App will not exceed 5 replicas, regardless of the number of messages on the topic.
- Setting a friendly name for the scale rule, “queue-length”, which will be visible in the Azure Portal.
- Setting the scale rule type to “azure-servicebus”; this is important, as it tells KEDA which type of scaler our Container App is configuring.
- Setting the authentication mechanism to type “connection” and indicating which secret reference will be used, in our case “svcbus-connstring”.
- Setting the “metadata” dictionary of the scale rule; these entries match the metadata properties in the KEDA template we discussed earlier.
Once you run this command, the custom scale rule will be created, and we can navigate to the Azure Portal to see the details.
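If you prefer to verify from the CLI instead of the Portal, a quick query like the sketch below should echo back the scale configuration (the min/max replicas and the rule we just defined):

```powershell
## Inspect the scale settings of the current Container App template
az containerapp show `
  --name $BACKEND_SVC_NAME `
  --resource-group $RESOURCE_GROUP `
  --query "properties.template.scale"
```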
Step 3: Create a Custom Scaling Rule from the Azure Portal
Select your Container App named “tasksmanager-backend-processor” and navigate to the “Scale” tab. Select your minimum and maximum replica count, click “Add Scale Rule”, set the values as shown below, and click Save. This will create a new revision of the Container App with the scale rule applied.
Step 4: Run an end-to-end test and generate a load of messages
Now we are ready to test our Azure Service Bus scaling rule. To generate a load of messages, we can use Service Bus Explorer under our Azure Service Bus namespace: navigate to Azure Service Bus, select your topic and subscription, and then select “Service Bus Explorer”.
The message structure our backend processor expects is the JSON below. Copy this message, click the “Send messages” button, paste the message content, set the content type to “application/json”, check the “Repeat Send” checkbox, select 500 messages with an interval of 5 ms between them, and click “Send” when you are ready.
```json
{
  "data": {
    "isCompleted": false,
    "isOverDue": false,
    "taskAssignedTo": "temp@mail.com",
    "taskCreatedBy": "Readiness Prob",
    "taskCreatedOn": "2022-08-18T12:45:22.0984036Z",
    "taskDueDate": "2022-08-19T12:45:22.0983978Z",
    "taskId": "6a051aeb-f567-40dd-a434-39927f2b93c5",
    "taskName": "Health Readiness Task"
  }
}
```
Step 5: Verify that multiple replicas are created
If everything is set up correctly, 5 replicas will be created based on the number of messages we generated on the topic. There are various ways to verify this:
- You can look at the “Live Metrics” view in Application Insights; you will see instantly that there are 5 “tasksmanager-backend-processor” replicas provisioned, working in parallel to consume the messages.
- You can check the Container App's “Console” tab, where you will see the replicas in the drop-down list.
- You can use the Azure CLI command below to list the names of the replicas:
```powershell
## Query the number and names of replicas
az containerapp replica list `
  --name $BACKEND_SVC_NAME `
  --resource-group $RESOURCE_GROUP `
  --query "[].name"
```
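To watch the scale-out (and the later scale-in) happen over time, you could also poll the replica count in a small loop; this is a rough sketch, not part of the original tutorial:

```powershell
## Poll the replica count every 15 seconds; stop with Ctrl+C
while ($true) {
    $count = az containerapp replica list `
        --name $BACKEND_SVC_NAME `
        --resource-group $RESOURCE_GROUP `
        --query "length([])" `
        --output tsv
    Write-Host "$(Get-Date -Format T) replicas: $count"
    Start-Sleep -Seconds 15
}
```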
Note about KEDA Scale In:
Container Apps implements the KEDA ScaledObject with the following default settings:
- pollingInterval: 30 seconds. This is the interval to check each trigger on. By default, KEDA will check each trigger source on every ScaledObject every 30 seconds.
- cooldownPeriod: 300 seconds. The period to wait after the last trigger is reported active before scaling in the resource back to 0. By default, it’s 5 minutes (300 seconds).
Currently, there is no way to override these values, but there is an open issue on the Container Apps repo that the product group is tracking; 5 minutes might be a long period to wait for instances to be scaled in after they finish processing messages.
That’s it for now. Hopefully you find the KEDA framework easy to work with for scaling your Container Apps.