Adrian Cantrill’s SAA-C02 study course: HA & Scaling section: ASG Scaling Policies
In this lesson we looked at auto scaling group scaling policies. To begin with, we learned that auto scaling groups don’t need scaling policies; they can have none and they work just fine. If there are no scaling policies, it just means an ASG has static values for min size, max size, and desired capacity.
When you hear the term manual scaling, that actually refers to when you manually adjust these values. This is useful in urgent or testing situations, or when you need to hold capacity at a fixed number of instances, for example as a cost control measure.
In addition to manual scaling we also have different types of dynamic scaling, which allow you to scale the capacity of your auto scaling group in response to changing demand. There are a few different types of dynamic scaling, which are reviewed below. At a high level, each of these adjusts the desired capacity of an Auto Scaling Group based on certain criteria.
Simple scaling: Defines actions which occur when an alarm moves into an alarm state. This helps infrastructure scale out and in based on demand. The problem is that this scaling is inflexible. It’s adding or removing a static amount based on the state of an alarm; it’s simple, but not very efficient.
Step scaling: Increases or decreases the desired capacity based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach. You can define upper and lower bounds for each step. Step scaling is better than simple because it allows you to adjust to changing load patterns on the system.
Target tracking: Comes with a predefined set of metrics. Currently, these are average CPU utilization, average network in, average network out and ALB request count per target. The premise is simple enough: you define an ideal value, the target that you want to track against for that metric. The auto scaling group tries to keep the metric at the value you want, and it adjusts the capacity as required to make that happen. The further away the actual value of the metric is from your target value, the more extreme the action, either adding or removing compute.
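To make the "further away, more extreme" idea concrete, here is a minimal Python sketch. It assumes a plain proportional rule (capacity scaled by actual metric over target metric). This is not AWS's actual algorithm, which is managed for you; the function name and parameters are hypothetical and only illustrate the behaviour.

```python
import math


def target_tracking_capacity(current: int, metric: float, target: float,
                             min_size: int, max_size: int) -> int:
    """Estimate a new desired capacity from a proportional rule (an assumption).

    If the metric is above the target, capacity grows; if it is below, it shrinks.
    The result is always kept between min_size and max_size.
    """
    # Proportional estimate: twice the target load suggests twice the instances.
    proposed = math.ceil(current * metric / target)
    # The desired capacity can never leave the group's min/max bounds.
    return max(min_size, min(max_size, proposed))


# Target of 50% average CPU, group currently running 4 instances (bounds 1..10).
slightly_over = target_tracking_capacity(4, 60.0, 50.0, 1, 10)  # small breach
far_over = target_tracking_capacity(4, 80.0, 50.0, 1, 10)       # large breach
under = target_tracking_capacity(4, 25.0, 50.0, 1, 10)          # below target
```

With these made-up numbers, a small breach adds one instance, a large breach adds three, and a metric at half the target halves the group.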
Scaling based on SQS – Approximate number of messages visible: This is a common architecture for a worker pool where you can increase or decrease capacity based on approximate number of messages visible. As more messages are added to the queue, the Auto Scaling Group increases in capacity to process messages and then as the queue empties the group scales back to reduce costs.
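The worker pool pattern can be sketched in a few lines of Python. This is only an illustration: messages_per_worker is a hypothetical figure you would pick yourself (how much backlog one instance can comfortably handle), and the function does not call any AWS API.

```python
import math


def workers_for_queue(messages_visible: int, messages_per_worker: int,
                      min_size: int, max_size: int) -> int:
    """Pick a worker count from the approximate number of visible messages.

    messages_per_worker is an assumed tuning value, not something AWS provides.
    """
    # One worker per chunk of backlog, rounded up so no messages are left over.
    needed = math.ceil(messages_visible / messages_per_worker)
    # Stay inside the auto scaling group's min and max size.
    return max(min_size, min(max_size, needed))


# Assumed settings: each worker handles about 100 queued messages, group size 1..20.
busy = workers_for_queue(1000, 100, 1, 20)     # queue filling up: scale out
flooded = workers_for_queue(5000, 100, 1, 20)  # capped at max size
empty = workers_for_queue(0, 100, 1, 20)       # queue drained: back to min size
```

As the queue grows the worker count rises until it hits max size, and once the queue empties the group falls back to min size to reduce costs.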
One area of confusion is the difference between simple scaling and step scaling. AWS recommends step scaling over simple scaling at this point in time, and it’s important to understand why. With simple scaling we create a new alarm or use an existing one as a guide. This works, but it’s not very flexible. With simple scaling you’re adding or removing the same amount no matter how extreme the increases and decreases in the metric that you’re monitoring.
With step scaling, you’re more flexible. You’re still checking an alarm, but you can define rules with steps. We always have at least the minimum number of instances as defined in the auto scaling group. Step scaling is great for variable load where you need to control how systems scale in and out. It allows you to handle large increases and decreases in load much better than simple scaling, because how extreme the increase or decrease is determines how many units of compute are added or removed. It’s not static like simple scaling. That’s the main difference between simple and step: the ability to scale in different ways based on how extreme the load changes are.
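The difference between simple and step scaling is easier to see side by side. The sketch below is a simplified model, not the AWS API: the Step class, the function names, the 50% CPU threshold and the step sizes are all assumptions chosen for illustration. Step bounds are expressed relative to the alarm threshold, so a bound of 10 means "10 points above the threshold".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    lower: float            # breach size where this step starts (inclusive)
    upper: Optional[float]  # breach size where it ends (exclusive); None = no limit
    adjustment: int         # instances to add (positive) or remove (negative)


def simple_scaling(metric: float, threshold: float, adjustment: int) -> int:
    """Same fixed adjustment whenever the alarm is breached, however badly."""
    return adjustment if metric >= threshold else 0


def step_scaling(metric: float, threshold: float, steps: List[Step]) -> int:
    """Pick the adjustment from the step that contains the size of the breach."""
    breach = metric - threshold
    for step in steps:
        if breach >= step.lower and (step.upper is None or breach < step.upper):
            return step.adjustment
    return 0  # no step matched, so the capacity is left alone


# Assumed scale-out rules for an alarm at 50% CPU.
STEPS = [
    Step(lower=0, upper=10, adjustment=1),    # 50% to under 60%: add 1
    Step(lower=10, upper=30, adjustment=2),   # 60% to under 80%: add 2
    Step(lower=30, upper=None, adjustment=4), # 80% and above: add 4
]

mild_simple = simple_scaling(55, 50, 1)     # adds 1
severe_simple = simple_scaling(85, 50, 1)   # still adds 1
mild_step = step_scaling(55, 50, STEPS)     # adds 1
severe_step = step_scaling(85, 50, STEPS)   # adds 4
```

Simple scaling gives the same answer for a mild and a severe breach, while step scaling adds more instances the further the metric is past the threshold.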