Adrian Cantrill’s SAA-C02 study course, 60 minutes: HA & Scaling section: ELB part 2
Elastic Load Balancer Architecture – Part 2
For this lesson, we looked at load balancing in a multi-tiered application. We looked at a specific scenario that started with a VPC and two availability zones inside the VPC. From there we added an internet-facing load balancer; a web instance auto scaling group providing the front-end capability of the application; an internal load balancer with only private IP addresses allocated to its nodes; an auto scaling group for the application instances, which would be used by the web servers; and then a pair of Aurora database instances. So, all in all, we were looking at three tiers: web, application, and database.
Without load balancers, everything would be tied to everything else. The user would have to communicate with a specific instance in the web tier; if this instance failed or scaled, the user’s experience would be disrupted. The instance the user is connected to would itself connect to a specific instance in the application tier, and if that instance failed or scaled, the user’s experience would also be disrupted. To improve this architecture, we could place load balancers between the tiers to abstract one tier from another.
How this changes things is that the user actually communicates with an ELB node, and the ELB node sends this connection through to a particular web server. The end user has no knowledge of which server they’re actually connected to because they are communicating via a load balancer. If instances were added or removed the user would be unaware of this fact because they’re abstracted away from the physical infrastructure by the load balancer.
The web instance the end user is using would need to communicate with an instance of the application tier, and it would do this via an internal load balancer. This represents an abstraction of communication. The web instance the end user is connected to would not be aware of the physical deployment of the application tier; it would not be aware of how many instances exist, nor which one it’s actually communicating with. To complete this architecture, the application server that’s being used would use the database tier for any persistent data storage needs.
Without using load balancers with this architecture, all the tiers are tightly coupled together; they need an awareness of each other. The end user would be connecting to a specific instance in the web tier, this would be connecting to a specific instance in the application tier, and all of these tiers would need to have an awareness of each other. Load balancers remove some of this coupling. They loosen the coupling. This allows the tiers to operate independently of each other because of this abstraction. Crucially, it allows the tiers to scale independently of each other.
For instance, using our example, if the load on the application tier increased beyond the ability of two instances to service that load, then the application tier could grow independently of anything else. The web tier could continue using it with no disruption or reconfiguration because it’s abstracted away from the physical layout of this tier. Because it’s communicating via a load balancer, it has no awareness of what’s happening within the application tier.
After this we looked at cross-zone load balancing. As an example, a user was browsing a particular website, one which was using a load balancer. Using their device, the user browses to the DNS name for the application, which is actually the DNS name of the load balancer. We now know that a load balancer has at least one node per availability zone that it’s configured for, and the DNS name for the load balancer will direct any incoming requests equally across all of the nodes of the load balancer. However much load gets directed at the load balancer’s DNS name, an equal percentage will be distributed to each node in each AZ.
Originally, load balancers were restricted in terms of how they could distribute the connections that they received. Each load balancer node could only distribute connections to instances within its own AZ. This historic limitation could lead to a substantially imbalanced distribution of load across AZs if, for instance, one AZ had four EC2 instances and another AZ had only one.
The fix for this was a feature known as cross-zone load balancing. Its name gives away what it does: it allows every load balancer node to distribute any connections that it receives equally across all registered instances in all availability zones. This results in a much more even distribution of incoming load. This feature was originally not enabled by default, but if you’re deploying an Application Load Balancer, it comes enabled as standard.
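The effect of cross-zone load balancing can be shown with a quick calculation using the earlier example: four instances in one AZ, one in the other, one load balancer node per AZ, and an equal DNS split between the nodes (all assumptions carried over from the scenario above):

```python
# Per-instance share of total traffic, with and without cross-zone balancing.
# Assumes one LB node per AZ and an equal DNS split between the nodes.
instances_per_az = {"AZ-A": 4, "AZ-B": 1}
node_share = 1 / len(instances_per_az)          # each node receives 50%

# Without cross-zone: each node only forwards within its own AZ.
without = {az: node_share / n for az, n in instances_per_az.items()}
print(without)            # {'AZ-A': 0.125, 'AZ-B': 0.5}

# With cross-zone: every node spreads over all registered instances.
with_cross_zone = 1 / sum(instances_per_az.values())
print(with_cross_zone)    # 0.2, i.e. an even 20% per instance
```

So without the feature, the lone instance in AZ-B receives four times the load of each instance in AZ-A; with it, every instance receives an equal 20%.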
– ELB is a DNS A record pointing at 1+ nodes per AZ: one elastic load balancer node in one subnet in each availability zone that the load balancer is configured in. Creating a load balancer also creates a DNS record for it, which spreads the incoming requests over all of the active nodes for that load balancer.
– Nodes (in one subnet per AZ) can scale: You start with a certain number of nodes, maybe one node per availability zone, but the load balancer will scale automatically if additional load is placed on it. Cross-zone load balancing means that nodes can distribute requests to instances in other availability zones; historically this was disabled, meaning connections could be relatively imbalanced. For Application Load Balancers, cross-zone load balancing is enabled by default.
– Load balancers come in two types:
– Internet-facing means nodes have public IPv4 addresses
– Internal means nodes have private IP addresses only
– EC2 doesn’t need to be public to work with an LB: EC2 instances don’t need public IP addressing to work with an internet-facing load balancer. An internet-facing load balancer has public IP addresses on its nodes, so it can accept connections from the public internet and balance these across both public and private EC2 instances.
– Listener configuration controls what the LB does: Load balancers are configured via listener configuration, which, as the name suggests, controls which protocols and ports the load balancer listens on.
– 8+ free IPs per subnet, and a /27 subnet to allow scaling: Load balancers require eight or more free IP addresses in each subnet they are deployed into. Strictly speaking, a /28 would be enough (16 addresses, minus the 5 AWS reserves in every subnet, leaves 11 free), but the AWS documentation suggests a /27 to allow room for scaling.
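The subnet-sizing arithmetic can be checked with Python's ipaddress module. The five-address reservation is AWS's standard per-subnet reservation (network address, VPC router, DNS, one reserved for future use, and broadcast):

```python
import ipaddress

AWS_RESERVED = 5  # network, VPC router, DNS, reserved-for-future-use, broadcast

def usable_ips(cidr):
    """Addresses left for hosts in a VPC subnet after AWS's reservations."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED

print(usable_ips("10.0.0.0/28"))  # 11 -> enough for the 8+ an ELB needs
print(usable_ips("10.0.0.0/27"))  # 27 -> the documented recommendation
```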