High Availability and Global Infrastructure on AWS
This content is from the lesson "2.2.1 High Availability and Global Infrastructure" in our comprehensive course.
High Availability and Global Infrastructure are fundamental components for building resilient applications on AWS.
This lesson covers AWS global infrastructure, high availability principles, failover strategies, distributed design patterns, and the use of AWS managed services for highly available architectures.
____
How It Works & Core Attributes:
AWS Global Infrastructure:

Infrastructure Components:
- What AWS Global Infrastructure is: AWS global infrastructure consists of regions, availability zones, and edge locations distributed worldwide. This infrastructure provides the foundation for building highly available and fault-tolerant applications
- AWS Regions: Separate geographic areas, each containing multiple Availability Zones. Each region is isolated from and independent of the others, so a failure in one region does not spread to another
- Availability Zones (AZs): One or more discrete data centers within a region, each with redundant power, networking, and connectivity. AZs are connected to each other with high-bandwidth, low-latency networking
- Edge Locations: Points of presence (PoPs) that cache content closer to users. Edge locations are used by CloudFront to deliver content with low latency and high transfer speeds
Infrastructure Benefits:
- Global Infrastructure Benefits: AWS global infrastructure provides high availability, fault tolerance, and low latency. You can deploy applications across multiple regions and AZs to ensure they remain available even if one region or AZ fails
__
High Availability Principles:
Availability Fundamentals:
- What High Availability is: High availability refers to the ability of a system to remain operational and accessible for a high percentage of time, usually expressed as an uptime percentage such as 99.9% or 99.99%. High availability is achieved through redundancy, fault tolerance, and automatic failover mechanisms
- Redundancy: Having multiple copies of critical components to ensure that if one fails, others can take over. This includes redundant servers, databases, and network connections
- Fault Tolerance: The ability of a system to continue operating even when some components fail. Fault-tolerant systems can handle failures gracefully without service interruption
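The effect of redundancy on availability can be estimated with simple arithmetic. The sketch below assumes that the redundant copies fail independently of each other and that any single copy is enough to serve traffic; real systems rarely meet both assumptions exactly, so treat the result as an upper bound. The function names and the 99% figure are illustrative only.

```python
def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components when any one is sufficient.

    Assumes failures are independent, which is an idealization.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The system is down only when every copy is down at the same time.
    return 1.0 - (1.0 - component_availability) ** copies


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime in minutes over a 365-day year."""
    return (1.0 - availability) * 365 * 24 * 60


if __name__ == "__main__":
    single = 0.99  # hypothetical availability of one component
    for copies in (1, 2, 3):
        combined = parallel_availability(single, copies)
        minutes = downtime_minutes_per_year(combined)
        print(f"{copies} copies: {combined:.6f} ({minutes:.1f} min/year down)")
```

Under these assumptions, adding a second copy of a 99% available component raises the estimate to roughly 99.99%, which is why spreading components across Availability Zones is such a common first step.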
Availability Mechanisms:
- Automatic Failover: The process of automatically switching from a failed component to a working backup component. Automatic failover removes the need for manual intervention and keeps any service interruption short, although users may still notice a brief disruption while the switch takes place
- Health Checks: Regular monitoring of system components to detect failures quickly. Health checks can trigger automatic failover when they detect that a component is not responding properly
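Health checks usually avoid reacting to a single bad probe. A minimal sketch of that idea follows: a target changes state only after several consecutive results contradict its current state. The class name and default thresholds are assumptions for illustration and do not reflect the defaults of any particular AWS service.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Tracks one target's health from a stream of pass/fail probe results."""

    unhealthy_threshold: int = 3  # consecutive failures before marking unhealthy
    healthy_threshold: int = 2    # consecutive passes before marking healthy again
    healthy: bool = True
    _streak: int = 0              # consecutive results contradicting current state

    def record(self, check_passed: bool) -> bool:
        """Record one probe result and return the current health state."""
        if check_passed == self.healthy:
            # Result agrees with the current state, so any streak is broken.
            self._streak = 0
        else:
            self._streak += 1
            needed = self.healthy_threshold if check_passed else self.unhealthy_threshold
            if self._streak >= needed:
                self.healthy = check_passed
                self._streak = 0
        return self.healthy


if __name__ == "__main__":
    tracker = HealthTracker()
    for result in (False, False, False, True, True):
        print(result, "->", "healthy" if tracker.record(result) else "unhealthy")
```

Requiring consecutive results trades detection speed for stability: lower thresholds detect failures sooner but make the target more likely to flap on transient errors.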
__
Failover Strategies:
Failover Types:
- What Failover is: Failover is the process of switching from a failed component to a backup component. Failover strategies determine how and when this switching occurs
- Active-Passive Failover: One component (active) handles all traffic while another component (passive) waits in standby. If the active component fails, the passive component takes over
- Active-Active Failover: Multiple components handle traffic simultaneously. If one component fails, the remaining components continue to handle traffic
Failover Implementation:
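- DNS Failover: Using DNS to route traffic to healthy endpoints. If health checks mark an endpoint as unhealthy, DNS responses are changed to return a healthy endpoint instead. Clients pick up the change only after their cached records expire, so record TTLs affect how quickly failover takes effect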
- Application-Level Failover: Applications detect failures and switch to healthy components. This can include database failover, load balancer failover, and service failover
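The difference between active-passive and active-active failover comes down to how endpoints are selected. The sketch below shows both selection rules side by side. The endpoint names and the `is_healthy` callback are hypothetical; in practice the health signal would come from a health-checking service rather than a hard-coded set.

```python
from typing import Callable, List, Sequence


def active_passive(primary: str, standby: str, is_healthy: Callable[[str], bool]) -> str:
    """Return the primary while it is healthy, otherwise fall back to the standby."""
    if is_healthy(primary):
        return primary
    if is_healthy(standby):
        return standby
    raise RuntimeError("no healthy endpoint available")


def active_active(endpoints: Sequence[str], is_healthy: Callable[[str], bool]) -> List[str]:
    """Return every healthy endpoint; traffic is shared among all of them."""
    healthy = [endpoint for endpoint in endpoints if is_healthy(endpoint)]
    if not healthy:
        raise RuntimeError("no healthy endpoint available")
    return healthy


if __name__ == "__main__":
    down = {"primary.example.com"}  # hypothetical failed endpoint

    def check(endpoint: str) -> bool:
        return endpoint not in down

    print(active_passive("primary.example.com", "standby.example.com", check))
    print(active_active(["a.example.com", "primary.example.com", "b.example.com"], check))
```

Active-passive keeps spare capacity idle until it is needed, while active-active uses all capacity all the time but requires the remaining endpoints to absorb the extra load when one fails.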
__
Distributed Design Patterns:
Design Patterns:
- What Distributed Design is: Distributed design involves spreading application components across multiple locations to improve availability, performance, and fault tolerance
- Microservices Architecture: Breaking applications into small, independent services that can be deployed and scaled independently. Microservices improve fault tolerance by isolating failures to individual services
- Event-Driven Architecture: Using events to communicate between services. Event-driven architectures are loosely coupled and can handle failures gracefully
Fault Tolerance Patterns:
- Circuit Breaker Pattern: A design pattern that prevents cascading failures by temporarily stopping requests to a failing service. Circuit breakers can automatically recover when the service becomes healthy again
- Bulkhead Pattern: Isolating resources so that a failure in one part of the system doesn't affect other parts. This can include separate database connections, thread pools, and service instances
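The circuit breaker pattern described above can be sketched as a small state machine with closed, open, and half-open states. This is one simple variant, not a reference implementation: the class name, thresholds, and the injectable `clock` parameter (used to make the timing testable) are all assumptions for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout  # seconds to wait before a trial call
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # time the circuit last opened, or None if closed

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"  # timeout elapsed, one trial call is allowed
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("circuit is open; request rejected")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.state == "half-open" or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each request to the downstream service in `breaker.call(...)` and treat `CircuitOpenError` as a signal to use a fallback or fail fast, which is what keeps a struggling service from being flooded with retries.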
__
AWS Managed Services for High Availability:
Service Categories:
- What AWS Managed Services are: AWS managed services handle the undifferentiated heavy lifting of infrastructure management, including high availability features. These services automatically provide redundancy and failover capabilities
- Amazon Comprehend: A natural language processing service that can analyze text for sentiment, entities, and key phrases. Comprehend is highly available and can be used across multiple regions
- Amazon Polly: A text-to-speech service that converts text into natural-sounding speech. Polly provides high availability and can be used for applications that need speech synthesis
Database and Cache Services:
- RDS Multi-AZ: Amazon RDS Multi-AZ deployments provide high availability by maintaining a synchronously replicated standby replica in a different Availability Zone. If the primary fails, RDS automatically fails over to the standby, which becomes the new primary
- ElastiCache Multi-AZ: ElastiCache replication groups can be configured with Multi-AZ and automatic failover. If the primary node fails, ElastiCache automatically promotes a replica to primary
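As a rough illustration, Multi-AZ is typically a setting chosen when the resource is created. The sketch below only builds the keyword arguments that might be passed to the boto3 calls `create_db_instance` (RDS) and `create_replication_group` (ElastiCache); it does not call AWS. Every value shown (identifiers, engine, instance sizes) is a hypothetical placeholder, and a real deployment needs further settings such as networking and credentials.

```python
def multi_az_db_params(identifier: str) -> dict:
    """Keyword arguments for an RDS instance with a Multi-AZ standby (illustrative values)."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",                 # placeholder engine
        "DBInstanceClass": "db.t3.micro",  # placeholder size
        "AllocatedStorage": 20,            # GiB, placeholder
        "MasterUsername": "admin",         # placeholder
        "ManageMasterUserPassword": True,  # let RDS manage the password secret
        "MultiAZ": True,                   # request a standby in another AZ
    }


def multi_az_cache_params(group_id: str) -> dict:
    """Keyword arguments for an ElastiCache replication group with failover (illustrative values)."""
    return {
        "ReplicationGroupId": group_id,
        "ReplicationGroupDescription": "example group with one replica",
        "Engine": "redis",
        "CacheNodeType": "cache.t3.micro",  # placeholder size
        "NumCacheClusters": 2,              # one primary plus one replica
        "AutomaticFailoverEnabled": True,
        "MultiAZEnabled": True,
    }


if __name__ == "__main__":
    # With boto3 installed and credentials configured, these could be used as:
    #   boto3.client("rds").create_db_instance(**multi_az_db_params("example-db"))
    #   boto3.client("elasticache").create_replication_group(**multi_az_cache_params("example-cache"))
    print(multi_az_db_params("example-db"))
    print(multi_az_cache_params("example-cache"))
```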
__
Basic Networking Concepts:
Networking Fundamentals:
- What Basic Networking is: Understanding fundamental networking concepts is essential for designing highly available architectures. This includes routing, load balancing, and network segmentation
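- Route Tables: Control how traffic is routed within your VPC and to external networks. Route tables are static and not health-aware on their own, so routing around a failed component requires updating the routes or placing a health-checking service such as a load balancer in front of the targets
- VPC Design: Virtual Private Cloud design affects high availability. A VPC spans all AZs in its region, but each subnet resides in exactly one AZ, so resources should be placed in subnets across at least two AZs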
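A multi-AZ subnet layout can be planned with nothing more than CIDR arithmetic. The sketch below carves a VPC address range into one public and one private subnet per Availability Zone using Python's standard `ipaddress` module. The CIDR block, AZ names, and /24 subnet size are assumptions chosen for illustration.

```python
import ipaddress
from typing import Dict, Sequence


def plan_subnets(vpc_cidr: str, az_names: Sequence[str], new_prefix: int = 24) -> Dict[str, Dict[str, str]]:
    """Assign one public and one private subnet to each AZ, in address order."""
    network = ipaddress.ip_network(vpc_cidr)
    # Generator yielding consecutive, non-overlapping subnets of the VPC range.
    subnets = network.subnets(new_prefix=new_prefix)
    plan = {}
    for az in az_names:
        plan[az] = {
            "public": str(next(subnets)),
            "private": str(next(subnets)),
        }
    return plan


if __name__ == "__main__":
    layout = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b"])
    for az, tiers in layout.items():
        print(az, tiers)
```

Giving every AZ the same set of tiers means an instance group or database can be moved or duplicated across zones without redesigning the address plan.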
Network Services:
- Load Balancing: Load balancers distribute traffic across multiple targets and can perform health checks to route traffic only to healthy targets
- Network Segmentation: Dividing networks into smaller segments to isolate failures and improve security. Network segmentation can prevent failures from affecting the entire system
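The load-balancing behavior described above, distributing requests while skipping unhealthy targets, can be sketched as a round-robin selector. This is a simplified model for illustration, not how any specific AWS load balancer is implemented; the class and target names are hypothetical.

```python
class RoundRobinBalancer:
    """Cycles through targets in order, skipping any marked unhealthy."""

    def __init__(self, targets):
        self._targets = list(targets)
        self._healthy = set(self._targets)  # all targets start healthy
        self._next = 0                      # index of the next target to try

    def mark(self, target, healthy: bool) -> None:
        """Update a target's health, typically from a health check result."""
        if healthy:
            self._healthy.add(target)
        else:
            self._healthy.discard(target)

    def pick(self):
        """Return the next healthy target, or raise if none are healthy."""
        for _ in range(len(self._targets)):
            target = self._targets[self._next]
            self._next = (self._next + 1) % len(self._targets)
            if target in self._healthy:
                return target
        raise RuntimeError("no healthy targets")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["a", "b", "c"])
    balancer.mark("b", healthy=False)
    print([balancer.pick() for _ in range(4)])
```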
____
Analogy: A Global Airline Network
Imagine you're managing a worldwide airline network that needs to serve passengers reliably across multiple continents.
AWS Regions: Your major hub airports in different continents that operate independently. Each hub serves its local region and connects to other hubs for global coverage.
Availability Zones: Your multiple runways and terminals at each hub airport. If one runway is closed, you can still operate using the others without disruption.
Edge Locations: Your smaller regional airports that serve local communities. These provide faster access for passengers who don't need to travel to major hubs.
Failover Strategies: Your backup planes and crews ready to take over if the primary ones fail. The system automatically switches to backups when needed.
Distributed Design: Your multiple routes between cities so if one route is blocked, others are available. This ensures passengers can always reach their destinations.
Health Checks: Your continuous monitoring of aircraft and crew status. When a problem is detected, the affected plane is taken out of service and a backup takes over before passengers are stranded.
____
Common Applications:
- Global Web Applications: Applications that need to serve users worldwide with low latency and high availability
- E-commerce Platforms: Online stores that need to remain available 24/7 to avoid losing sales
- Financial Services: Banking and payment applications that require high availability and fault tolerance
- Healthcare Systems: Medical applications that need to remain available for patient care
____
Quick Note: The "High Availability Foundation"
- Design applications to span multiple Availability Zones for fault tolerance
- Use AWS managed services that provide built-in high availability features
- Implement automatic failover mechanisms to handle component failures
- Monitor system health and set up alerts for potential issues
- Test failover scenarios regularly to ensure they work as expected