Friday 12/17/21 Cloud Studies Update: AWS HA & Scaling

Adrian Cantrill’s SAA-C02 study course, 60 minutes: AWS HA & Scaling section: ‘Regional and Global AWS Architecture’, ‘Evolution of the Elastic Load Balancer’

Regional and Global AWS Architecture

The lesson started with a look at global applications, with an emphasis on Netflix. It was explained that Netflix is a global application, but it’s also a collection of smaller regional applications which make up the Netflix global platform: discrete blocks of infrastructure which operate independently and are duplicated across different regions around the world.

The next things we discussed were three distinct types of architectures:

1. small-scale architectures, which only exist in one region or one country

2. systems which exist in one region or country, but which have a disaster recovery (DR) requirement; if the primary region fails, they fail over to a secondary region

3. systems that operate within multiple regions and need to operate through failure in one or more of those regions

After this we looked at major architectural components which map onto AWS products and services:


– Global Service Location & Discovery (when you type an address into your browser, how does your machine discover where to point)

– Content Delivery (CDN) and optimisation (how does the content or data for an application get to users globally; are there pockets of storage distributed globally, or is it pulled from a central location)

– Global Health Checks & Failover (detecting if infrastructure in one location is healthy or not and moving customers to another country as required)


– Regional Entry Point

– Regional Scaling & Resilience

– Application Services and Components

Some points:

– Globally, DNS is used for service discovery, region-based health checks, and request routing

– DNS could be configured to point at one or more service endpoints

– Another valid configuration is to send customers to their nearest location

– The key thing for the global architecture is that it has health checks

– Regardless of where regions are located, a content delivery network can be used at the global level

– This ensures that content is cached locally, as close to customers as possible

– All the cache locations are located globally, and pull content from the origin location as required

– From this global perspective, the function of the architecture at this level is to get customers through to a suitable infrastructure location, making sure any regional failures are isolated and sessions are moved to alternative regions. It attempts to direct customers to a local region, at least if the business has multiple locations, and to improve caching using a content delivery network such as CloudFront. If this part of the architecture works well, customers will be directed towards a region that has infrastructure for the application.
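The global routing idea in the points above can be sketched in a few lines of Python. This is only an illustration of the decision being made, not how Route 53 or any AWS service is implemented; the `Region` class, the latency figures, and the `pick_region` helper are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    latency_ms: float  # hypothetical latency measured from the customer
    healthy: bool      # hypothetical result of a global health check


def pick_region(regions):
    """Return the lowest-latency healthy region, or None if none is healthy."""
    healthy = [r for r in regions if r.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda r: r.latency_ms)


regions = [
    Region("us-east-1", 20.0, healthy=False),  # nearest, but failing its check
    Region("us-west-2", 70.0, healthy=True),
    Region("eu-west-1", 95.0, healthy=True),
]

choice = pick_region(regions)
print(choice.name)  # us-west-2
```

The nearest region is skipped because its health check failed, so the customer is sent to the next closest healthy one, which is the failover behaviour described above.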

If, for instance, the region is one of the US ones, traffic enters one specific region, either into a VPC or via public space AWS services. It was pointed out that the most effective way of thinking about systems architecture from a regional standpoint is as a collection of regions making up a whole. If you think about AWS products and services, very few are actually global. Most of them run in a region, and many of those regions make up AWS. Thinking in this way is more efficient, and it makes designing a large platform much easier.

Initially, communications from your customers will generally enter at the web tier. Generally this will be a region-based AWS service, such as an Application Load Balancer or API Gateway, depending on the architecture that the application uses. The purpose of the web tier is to act as an entry point for your region-based applications or application components. It abstracts your customers away from the underlying infrastructure, which means that the infrastructure behind it can scale, fail, or change without impacting customers. The functionality offered to the consumer via the web tier is provided by the compute tier, using services such as EC2, Lambda, or containers running on the Elastic Container Service (ECS). So, for example, a load balancer might use EC2 to provide compute services through to the customers.

The compute tier will consume storage services, another part of all AWS architectures, and this tier will use services such as EBS, EFS, and even S3 for things like media storage. Many global architectures also utilize CloudFront, the global content delivery network within AWS, and CloudFront is capable of using S3 as an origin for media.

All these tiers are separate components from one another and can consume services from each other. For instance, CloudFront could directly access S3 to fetch content for delivery to a global audience.

In addition to file storage, most environments require data storage, and within AWS this is delivered using products like Amazon Aurora, DynamoDB, and Redshift for data warehousing. In order to improve performance, most applications don’t directly access the database. Instead they go via a caching layer: products like ElastiCache for general caching, or DynamoDB Accelerator, known as DAX, when using DynamoDB. This way, reads to the database can be minimized. Applications will consult the cache first, and only if the data isn’t present in the cache will the database be consulted and the contents of the cache updated. Because caching is generally in memory, it’s cheap and fast. Databases tend to be expensive based on the volume of data required vs. cache and normal data storage. So, where possible, you need to offload reads from the database into the caching layer to improve performance and reduce costs.
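The cache-first read described above can be sketched like this. It is a minimal in-memory illustration, assuming plain Python dictionaries as stand-ins for both the cache (ElastiCache or DAX) and the database; the `CacheFirstReader` class and its `db_reads` counter are invented for the example and are not an AWS API.

```python
class CacheFirstReader:
    """Consult the cache first; fall back to the database only on a miss."""

    def __init__(self, database):
        self.database = database  # dict standing in for the database
        self.cache = {}           # dict standing in for the in-memory cache
        self.db_reads = 0         # counts how often the database is consulted

    def get(self, key):
        if key in self.cache:
            return self.cache[key]       # cache hit: no database read
        self.db_reads += 1               # cache miss: read from the database
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value      # update the cache for next time
        return value


reader = CacheFirstReader({"user:1": "Alice"})
reader.get("user:1")  # miss, goes to the database
reader.get("user:1")  # hit, served from the cache
print(reader.db_reads)  # 1
```

Two reads of the same key only cost one database read, which is the offloading effect the lesson was getting at.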

Lastly, AWS has a suite of products designed specifically to provide application services, things like Kinesis, Step Functions, SQS, and SNS, all of which provide some type of functionality to applications: either simple functionality like email or notifications, or functionality which can change an application’s architecture, such as when you decouple components using queues.
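To make the decoupling point concrete, here is a small sketch using Python's standard `queue` module as an in-memory stand-in for a queue service such as SQS. The function names and the "transcode" job are hypothetical; the point is only that the producer and the consumer never call each other directly.

```python
import queue

jobs = queue.Queue()  # in-memory stand-in for a managed queue


def web_tier_submit(video_id):
    """Producer: drop a job on the queue and return immediately."""
    jobs.put({"video_id": video_id})


def worker_drain():
    """Consumer: process whatever is waiting, independently of the producer."""
    processed = []
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            break
        processed.append(f"transcoded {job['video_id']}")
    return processed


web_tier_submit("v1")
web_tier_submit("v2")
results = worker_drain()
print(results)  # ['transcoded v1', 'transcoded v2']
```

Because the two sides only share the queue, either one can scale, fail, or be replaced without the other needing to change.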

We will be looking at all of these components and how we can use them together as solutions architects to build platforms. We need to get used to thinking of architectures from a global and regional perspective, as well as understanding that application architecture is generally built using components from all of the different tiers: the web tier, the compute tier, caching, storage, database, and application services.

Evolution of the Elastic Load Balancer

Elastic Load Balancer was introduced in 2009 with the Classic Load Balancer. There are currently three different types of Elastic Load Balancers available within AWS. If you see the term ELB or elastic load balancers, then it refers to the whole family, all three of them. The load balancers are split between version one and version two. At this point you should avoid using the version one load balancer and aim to migrate off it onto the version two products, which should be preferred for any new deployments.

The original ‘Classic Load Balancer’ can load balance HTTP and HTTPS, as well as lower level protocols, but it isn’t really a layer 7 device. It doesn’t really understand HTTP, and it can’t make decisions based on HTTP protocol features. Classic load balancers lack much of the advanced functionality of the version two load balancers, and they can be significantly more expensive to use. One common limitation is that classic load balancers only support one SSL certificate per load balancer, which means for larger deployments you might need hundreds or thousands of classic load balancers, and these could be consolidated down to a single version two load balancer. Remember: for any questions or real world situations, you should not default to using classic load balancers.

Now to the new version two load balancers. The first is the Application Load Balancer or ALB, and these are truly layer 7 devices, so application layer devices. They support HTTP, HTTPS, and the WebSocket protocols. They’re generally the type of load balancer that you’d pick for any scenarios that use any of these protocols.

There are also Network Load Balancers or NLBs, which are also version two devices, but these support the TCP, TLS, and UDP protocols. So network load balancers are the type of load balancer that you would pick for any applications which don’t use HTTP or HTTPS. For example, if you wanted to load balance email servers or SSH servers, or a game which used a custom protocol rather than HTTP or HTTPS, then you would use a network load balancer.
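As a memory aid for the ALB vs. NLB split, the choice can be written as a small lookup. This is just my own summary of the protocols listed in these notes; `choose_load_balancer` is a made-up helper, not anything from AWS.

```python
# Protocols as listed in the notes above
ALB_PROTOCOLS = {"HTTP", "HTTPS", "WEBSOCKET"}
NLB_PROTOCOLS = {"TCP", "TLS", "UDP"}


def choose_load_balancer(protocol):
    """Return 'ALB' or 'NLB' for a protocol name, ignoring case."""
    name = protocol.upper()
    if name in ALB_PROTOCOLS:
        return "ALB"
    if name in NLB_PROTOCOLS:
        return "NLB"
    raise ValueError(f"protocol not covered in these notes: {protocol}")


print(choose_load_balancer("https"))  # ALB
print(choose_load_balancer("udp"))    # NLB
```

An SSH or email server would be passed in here as `"tcp"`, since those run over TCP rather than HTTP or HTTPS.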

In general, version 2 load balancers are faster and support target groups and rules, which allow you to use a single load balancer for multiple things or handle the load balancing differently based on which customers are using it. For the exam you really need to be able to pick between network load balancers and application load balancers for a specific situation.
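Here is a rough sketch of what rules and target groups make possible on a single load balancer. It assumes host and path based rules, which an ALB does support, but the rule table, the `example.com` hostnames, and the target group names are all invented for illustration, and real rules are evaluated by AWS rather than by code like this.

```python
# (host header, path prefix, target group), checked in order; all hypothetical
RULES = [
    ("api.example.com", "/", "api-targets"),
    ("www.example.com", "/images/", "media-targets"),
]
DEFAULT_TARGET_GROUP = "web-targets"


def route(host, path):
    """Return the target group of the first matching rule, else the default."""
    for rule_host, prefix, target_group in RULES:
        if host == rule_host and path.startswith(prefix):
            return target_group
    return DEFAULT_TARGET_GROUP


print(route("api.example.com", "/v1/users"))         # api-targets
print(route("www.example.com", "/images/logo.png"))  # media-targets
print(route("www.example.com", "/index.html"))       # web-targets
```

One load balancer ends up serving three different groups of targets, which is the "single load balancer for multiple things" idea from the lesson.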

Published by pauldparadis

Working towards cloud networking security as a profession.
