## Overview
This study guide covers the core architectural concepts tested on the AWS Cloud Practitioner exam, including the Well-Architected Framework's six pillars, fundamental cloud design principles, disaster recovery strategies, and high availability patterns. Mastering these topics will help you understand how AWS recommends building reliable, secure, cost-effective systems — and why those recommendations exist.
---
## Well-Architected Framework
### What It Is
The AWS Well-Architected Framework is a set of best practices and guiding principles for evaluating and improving cloud architectures. It is organized into six pillars, each addressing a distinct dimension of a well-designed system.
### The Six Pillars
| Pillar | Core Focus |
|---|---|
| Operational Excellence | Running and monitoring systems to deliver business value |
| Security | Protecting data, systems, and assets |
| Reliability | Recovering from failures and meeting demand |
| Performance Efficiency | Using resources efficiently as needs evolve |
| Cost Optimization | Avoiding unnecessary spend |
| Sustainability | Minimizing environmental impact |
### Pillar Deep Dives
#### Operational Excellence
• Focuses on automating operations, responding to events, and continuously improving processes.
#### Security
• Recommends a strong identity foundation and the principle of least privilege — users and services get only the permissions they absolutely need.
• Requires protecting data in transit and at rest.
• Key concepts: IAM, encryption, MFA, VPC controls.
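As a rough illustration of least privilege, the sketch below builds an IAM-style policy document as a Python dictionary and checks it for broad wildcards. The bucket name `example-reports-bucket` and the helper `has_wildcard` are hypothetical, invented for this example; the field names (`Version`, `Statement`, `Effect`, `Action`, `Resource`) follow the standard IAM JSON policy layout.

```python
import json

# Hypothetical bucket name, used only for illustration.
BUCKET = "example-reports-bucket"

# Least privilege: allow reading objects under one prefix and nothing else.
read_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": [f"arn:aws:s3:::{BUCKET}/reports/*"],
        }
    ],
}

# Counter-example: every S3 action on every resource.
broad_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": ["s3:*"], "Resource": ["*"]}],
}

def has_wildcard(policy: dict) -> bool:
    """Return True if any statement uses a broad action or resource wildcard."""
    for statement in policy["Statement"]:
        actions = statement["Action"]
        resources = statement["Resource"]
        if "*" in actions or "s3:*" in actions or "*" in resources:
            return True
    return False

print(json.dumps(read_only_policy, indent=2))
print("read_only_policy too broad:", has_wildcard(read_only_policy))
print("broad_policy too broad:", has_wildcard(broad_policy))
```

The check is deliberately simple. A real review would also look at conditions, principals, and other services.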
#### Reliability
• Focuses on fault tolerance, automatic recovery, and the ability to dynamically acquire resources to meet demand.
• Key concepts: Multi-AZ deployments, Auto Scaling, backups, health checks.
#### Performance Efficiency
• Goal: use the right type and size of resources to meet system requirements efficiently.
• Efficiency must be maintained as demand changes and technology evolves.
• Key concepts: right-sizing, serverless, caching, selecting optimal compute.
#### Cost Optimization
• Goal: avoid unnecessary costs through right-sizing, eliminating idle resources, and choosing appropriate pricing models.
• Key pricing models: On-Demand (no commitment), Reserved Instances and Savings Plans (predictable workloads), Spot Instances (flexible/interruptible workloads).
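To see why a commitment-based model suits predictable workloads, the following sketch compares a pay-per-hour rate with a reserved rate. Both hourly rates are made-up numbers for illustration, not real AWS prices, and the sketch assumes the reserved rate is billed for every hour of the month whether or not the instance runs.

```python
HOURS_PER_MONTH = 730  # common approximation: 8,760 hours / 12 months

# Hypothetical hourly rates in USD. These are NOT real AWS prices.
PAY_PER_HOUR_RATE = 0.10
RESERVED_RATE = 0.06  # assumed to be billed for every hour of the term

def monthly_cost(hourly_rate: float, hours: float) -> float:
    """Cost of running one instance for the given number of hours."""
    return round(hourly_rate * hours, 2)

def cheaper_option(hours_used: float) -> str:
    """Pick the cheaper model for a workload that runs `hours_used` per month."""
    pay_per_hour = monthly_cost(PAY_PER_HOUR_RATE, hours_used)
    reserved = monthly_cost(RESERVED_RATE, HOURS_PER_MONTH)
    return "reserved" if reserved < pay_per_hour else "pay-per-hour"

print("Always-on workload:", cheaper_option(730))
print("Occasional workload:", cheaper_option(200))
```

Under these assumed rates, the commitment only pays off when the instance runs most of the month, which is the sense in which reserved pricing fits steady workloads.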
#### Sustainability (Added 2021)
• Focuses on reducing energy consumption and carbon footprint.
• Strategies: maximize utilization, use managed/serverless services, choose Regions with a lower carbon footprint.
### Key Terms
• Principle of Least Privilege – Grant only the minimum permissions required
• Right-sizing – Matching instance/resource type and size to actual workload needs
• AWS Well-Architected Tool – Console-based tool to assess workloads against the six pillars
### Watch Out For
> ⚠️ Exam questions may ask you to match a scenario to the correct pillar. Remember:
> - Cost = avoid waste, right-size, use Savings Plans or Reserved Instances
> - Reliability = recover from failure, handle demand spikes
> - Security = least privilege, encryption, identity
> - Sustainability is the newest pillar (2021) — don't forget it exists!
---
## Design Principles
### Core AWS Architectural Design Principles
#### Loose Coupling
• Definition: Design systems so that components have minimal dependencies on each other.
• A failure in one component is isolated and does not cascade to others.
• Example: Use SQS queues between services so a slow consumer doesn't block a producer.
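The queue example can be simulated locally. In this sketch Python's `queue.Queue` stands in for an SQS queue (an assumption made so the code runs without AWS), and the function names are illustrative. The producer finishes enqueuing without waiting for the slow consumer.

```python
import queue
import threading
import time

# queue.Queue stands in for an SQS queue in this local simulation.
orders = queue.Queue()
processed = []

def producer(count: int) -> None:
    """Enqueue messages without waiting for the consumer."""
    for i in range(count):
        orders.put(f"order-{i}")

def slow_consumer() -> None:
    """Process messages at its own pace until a None sentinel arrives."""
    while True:
        message = orders.get()
        if message is None:
            break
        time.sleep(0.01)  # simulate slow work
        processed.append(message)

worker = threading.Thread(target=slow_consumer)
worker.start()

start = time.perf_counter()
producer(5)  # only enqueues; does not wait for processing
producer_seconds = time.perf_counter() - start

orders.put(None)  # tell the consumer to stop after draining the queue
worker.join()

print(f"Producer finished in {producer_seconds:.4f}s")
print("Processed:", processed)
```

The queue absorbs the difference in speed between the two sides, so the producer only depends on the queue being available, not on the consumer.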
#### Design for Failure
• Definition: Assume any component can fail at any time.
• Architect systems to automatically detect, respond to, and recover from failures with little or no downtime.
• "Everything fails all the time" — Werner Vogels, Amazon CTO.
#### High Availability (HA) vs. Fault Tolerance
| Concept | Definition | Downtime? |
|---|---|---|
| High Availability | Quickly recovers from failure; minimizes downtime | Brief interruption possible |
| Fault Tolerance | Continues operating without interruption even when a component fails | Zero downtime |
> HA = fast recovery. Fault Tolerance = no interruption at all.
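"Brief interruption possible" can be made concrete by converting an availability target into allowed downtime per year. The targets below are common illustrative figures, not values taken from this guide or from any specific AWS SLA.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year

def max_downtime_minutes(availability_percent: float) -> float:
    """Downtime budget per year implied by an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return round(MINUTES_PER_YEAR * unavailable_fraction, 1)

# Illustrative targets only.
for target in (99.0, 99.9, 99.99):
    print(f"{target}% available -> {max_downtime_minutes(target)} minutes/year")
```

A highly available design works within a small downtime budget like this; a fault-tolerant design aims for the budget to go unused when a single component fails.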
#### Stateless Design
• Definition: Store session/state data externally (e.g., in ElastiCache or DynamoDB) rather than on individual instances.
• Any instance can handle any request → enables horizontal scaling.
• Stateless architectures are easier to scale and more resilient.
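A minimal sketch of the idea, assuming a plain dictionary as a stand-in for an external store such as ElastiCache or DynamoDB. The class and method names are hypothetical. Because no server keeps session data itself, consecutive requests from one user can land on different servers.

```python
import itertools

# A plain dict stands in for an external store (e.g., ElastiCache or DynamoDB).
session_store: dict[str, dict] = {}

class StatelessServer:
    def __init__(self, name: str) -> None:
        self.name = name  # no per-user data is kept on the instance

    def add_to_cart(self, session_id: str, item: str) -> None:
        session = session_store.setdefault(session_id, {"cart": []})
        session["cart"].append(item)

    def view_cart(self, session_id: str) -> list:
        return session_store.get(session_id, {"cart": []})["cart"]

servers = [StatelessServer("a"), StatelessServer("b")]
round_robin = itertools.cycle(servers)

next(round_robin).add_to_cart("user-1", "book")  # handled by server a
next(round_robin).add_to_cart("user-1", "lamp")  # handled by server b
cart = next(round_robin).view_cart("user-1")     # handled by server a again
print(cart)
```

Adding a third server to the list would require no change to the session handling, which is what makes horizontal scaling straightforward.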
#### Elasticity
• Definition: The ability to automatically scale resources up or down based on demand.
• You pay only for what you use — a core cloud advantage over on-premises.
• Enabled by: Auto Scaling, serverless compute, managed databases.
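One way to picture elasticity is a proportional scaling rule: if average CPU is above a target, add capacity; if it is below, remove capacity. The function below is a simplified sketch of that idea with hypothetical parameter names and values. It is not the exact algorithm any AWS service uses.

```python
import math

def desired_capacity(current: int, cpu_percent: float, target_percent: float,
                     minimum: int, maximum: int) -> int:
    """Scale the fleet in proportion to how far CPU is from the target,
    then clamp the result to the allowed range."""
    raw = math.ceil(current * cpu_percent / target_percent)
    return max(minimum, min(maximum, raw))

# Hypothetical fleet of 4 instances with a 50% CPU target, limits 2..10.
busy = desired_capacity(4, cpu_percent=90, target_percent=50, minimum=2, maximum=10)
quiet = desired_capacity(4, cpu_percent=20, target_percent=50, minimum=2, maximum=10)
print("During a spike:", busy)
print("During a lull:", quiet)
```

The minimum and maximum bounds keep the fleet from shrinking to nothing or growing without limit.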
#### Remove Undifferentiated Heavy Lifting
• Definition: Offload tasks like patching, scaling, backups, and infrastructure management to AWS managed services.
• Lets development teams focus on business value instead of operational overhead.
• Examples: Use RDS instead of self-managed databases; use Lambda instead of managing EC2 servers.
### Key Terms
• Loose Coupling – Minimizing dependencies between components
• Stateless Architecture – No instance stores session state locally
• Elasticity – Auto-scaling to match demand in real time
• Horizontal Scaling – Adding more instances (vs. vertical = bigger instances)
• Undifferentiated Heavy Lifting – Generic operational tasks AWS handles for you
### Watch Out For
> ⚠️ Don't confuse elasticity (automatic scaling with demand) with scalability (the ability to scale, which may be manual). Elasticity is dynamic and automatic.
> ⚠️ Stateless ≠ Stateful. If an instance stores user session data locally, it's stateful. This hampers horizontal scaling because each user must keep hitting the same server, and the session is lost if that server fails.
---
## Disaster Recovery
### Key Metrics
| Term | Full Name | Measures |
|---|---|---|
| RPO | Recovery Point Objective | Max acceptable data loss (measured in time) |
| RTO | Recovery Time Objective | Max acceptable downtime before system is restored |
> Memory Tip: RPO = how much recent data you can afford to lose (the Point in time you restore to). RTO = how long until you're back (Time to recover).
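A small worked example, using made-up numbers, shows how the two metrics are checked independently. The `RecoveryPlan` class is hypothetical; it assumes the worst-case data loss equals the time between backups.

```python
from dataclasses import dataclass

@dataclass
class RecoveryPlan:
    backup_interval_hours: float   # time between backups
    restore_duration_hours: float  # time to rebuild and restore service

    def worst_case_data_loss_hours(self) -> float:
        # A failure just before the next backup loses one full interval.
        return self.backup_interval_hours

    def meets(self, rpo_hours: float, rto_hours: float) -> bool:
        return (self.worst_case_data_loss_hours() <= rpo_hours
                and self.restore_duration_hours <= rto_hours)

# Hypothetical plan: nightly backups, six hours to restore.
nightly = RecoveryPlan(backup_interval_hours=24, restore_duration_hours=6)

strict = nightly.meets(rpo_hours=4, rto_hours=8)    # RPO violated, RTO fine
relaxed = nightly.meets(rpo_hours=24, rto_hours=8)  # both satisfied
print("Meets 4h RPO / 8h RTO:", strict)
print("Meets 24h RPO / 8h RTO:", relaxed)
```

Here the restore time is acceptable in both cases; only the backup frequency decides whether the RPO is met.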
### The Four DR Strategies (Lowest → Highest Cost)
```
Backup & Restore → Pilot Light → Warm Standby → Multi-Site Active/Active
[Cheapest, highest RTO/RPO]             [Most expensive, lowest RTO/RPO]
```
#### 1. Backup & Restore
• Data is backed up regularly; infrastructure is rebuilt from scratch during a disaster.
• Lowest cost, highest RTO/RPO.
#### 2. Pilot Light
• A minimal version of the environment runs continuously in a secondary Region.
• Only core systems (e.g., database replica) are active — like a pilot light on a furnace.
• Scale up remaining infrastructure quickly when disaster strikes.
#### 3. Warm Standby
• A scaled-down but fully functional version of the environment runs in another Region.
• Faster recovery than Pilot Light; more expensive.
#### 4. Multi-Site Active/Active
• Full infrastructure runs simultaneously in multiple Regions.
• Near-zero RTO and RPO — traffic is actively served from all sites.
• Highest cost.
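Scenario questions usually reduce to "pick the cheapest strategy that still meets the RTO and RPO". The sketch below encodes that rule. The hour values attached to each strategy are illustrative assumptions chosen only to preserve the ordering described above; they are not AWS-published figures.

```python
# (name, assumed RTO in hours, assumed RPO in hours), ordered cheapest first.
# The hour values are illustrative assumptions, not AWS-published figures.
STRATEGIES = [
    ("Backup & Restore", 24.0, 24.0),
    ("Pilot Light", 1.0, 0.25),
    ("Warm Standby", 0.25, 0.1),
    ("Multi-Site Active/Active", 0.0, 0.0),  # "near-zero" approximated as 0
]

def cheapest_strategy(rto_hours: float, rpo_hours: float) -> str:
    """Return the first (cheapest) strategy that meets both targets."""
    for name, rto, rpo in STRATEGIES:
        if rto <= rto_hours and rpo <= rpo_hours:
            return name
    return STRATEGIES[-1][0]  # fall back to the most capable option

print(cheapest_strategy(rto_hours=48, rpo_hours=48))
print(cheapest_strategy(rto_hours=2, rpo_hours=0.5))
print(cheapest_strategy(rto_hours=0.5, rpo_hours=0.1))
print(cheapest_strategy(rto_hours=0.01, rpo_hours=0.01))
```

Walking the list from cheapest to most expensive mirrors how the exam expects you to reason: do not pay for Multi-Site when Backup & Restore already satisfies the requirement.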
### Key Terms
• RPO (Recovery Point Objective) – Maximum tolerable data loss in time
• RTO (Recovery Time Objective) – Maximum tolerable downtime
• Pilot Light – Minimal core systems always running, ready to scale
• Multi-Site Active/Active – Full redundancy across regions, lowest RTO/RPO
### Watch Out For
> ⚠️ The exam often asks which strategy has the lowest RTO/RPO (Multi-Site Active/Active) or which is least expensive (Backup & Restore).
> ⚠️ Pilot Light ≠ Warm Standby. Pilot Light has only core components running; Warm Standby has a full but scaled-down environment.
---
## High Availability & Resilience
### AWS Global Infrastructure Basics
#### Availability Zones (AZs)
• Definition: One or more discrete data centers with redundant power, networking, and connectivity within an AWS Region.
• AZs are physically separated and isolated from each other's failures.
• Connected by high-speed, low-latency networking.
#### Why Deploy Across Multiple AZs?
• If one AZ goes down, workloads continue in remaining AZs.
• Improves high availability; with enough redundant capacity in each AZ, the workload can also be fault tolerant.
• AWS best practice: always use at least two AZs.
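The benefit of a second AZ can be estimated with basic probability. This sketch assumes AZ failures are independent and uses a made-up per-AZ availability of 99%; neither the figure nor the independence assumption comes from AWS documentation.

```python
def combined_availability(per_az: float, az_count: int) -> float:
    """Probability that at least one of several independent AZs is up."""
    return 1 - (1 - per_az) ** az_count

# Hypothetical per-AZ availability of 99%.
one_az = combined_availability(0.99, 1)
two_az = combined_availability(0.99, 2)
three_az = combined_availability(0.99, 3)

print(f"1 AZ:  {one_az:.6f}")
print(f"2 AZs: {two_az:.6f}")
print(f"3 AZs: {three_az:.6f}")
```

Each additional AZ multiplies the chance of total outage by the failure probability of one AZ, which is why the jump from one AZ to two gives the largest gain.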
### Key AWS Services for High Availability
#### Elastic Load Balancing (ELB)
• Purpose: Distributes incoming traffic across multiple healthy targets in multiple AZs.
• Automatically stops routing to unhealthy targets (health checks).
• Types: Application Load Balancer (ALB), Network Load Balancer (NLB), Gateway Load Balancer.
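The health-check behavior can be sketched as round-robin routing that skips unhealthy targets. The `LoadBalancer` class and target names are hypothetical, and real ELB routing algorithms are more involved than this simplified model.

```python
from itertools import cycle

class LoadBalancer:
    def __init__(self, targets: dict[str, bool]) -> None:
        self.targets = targets              # target name -> passing health check?
        self._order = cycle(list(targets))  # fixed round-robin order

    def route(self) -> str:
        """Return the next healthy target, trying each target at most once."""
        for _ in range(len(self.targets)):
            candidate = next(self._order)
            if self.targets[candidate]:
                return candidate
        raise RuntimeError("no healthy targets")

lb = LoadBalancer({"az1-web": True, "az2-web": True})
before_failure = [lb.route(), lb.route()]

lb.targets["az1-web"] = False  # the health check for this target starts failing
after_failure = [lb.route() for _ in range(3)]

print("Before failure:", before_failure)
print("After failure: ", after_failure)
```

Once the first target is marked unhealthy, all traffic flows to the target in the other AZ, which only works because a second AZ has a target to receive it.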
#### Auto Scaling Groups
• Purpose: Automatically launches or terminates EC2 instances based on:
- Demand (scale out when CPU is high, scale in when low)
- Health checks (replace unhealthy instances automatically)
• Maintains availability while optimizing cost.
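The health-check side of an Auto Scaling group behaves like a reconcile loop: drop unhealthy instances, then launch or terminate until the fleet matches the desired size. The function below is a simplified model with made-up instance IDs, not how the service is implemented.

```python
import itertools

_ids = itertools.count(1)

def launch() -> str:
    """Pretend to launch an instance and return a made-up instance ID."""
    return f"i-{next(_ids):04d}"

def reconcile(fleet: dict[str, bool], desired: int) -> dict[str, bool]:
    """Drop unhealthy instances, then launch or terminate to reach `desired`."""
    healthy = [name for name, ok in fleet.items() if ok]
    healthy = healthy[:desired]    # scale in if above the desired size
    while len(healthy) < desired:  # replace failed instances or scale out
        healthy.append(launch())
    return {name: True for name in healthy}

fleet = {"i-a": True, "i-b": False}  # one instance is failing its health check
replaced = reconcile(fleet, desired=2)
scaled_in = reconcile(replaced, desired=1)

print("After replacement:", replaced)
print("After scale-in:   ", scaled_in)
```

The same loop handles both concerns from the list above: a change in `desired` models demand-driven scaling, and an unhealthy entry models instance replacement.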
#### Amazon Route 53
• AWS's DNS and traffic routing service.
• Supports routing policies for high availability:
- Failover routing – Route to backup endpoint if primary is unhealthy
- Latency-based routing – Route users to the lowest-latency Region
• Enables global availability across multiple AWS Regions.
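The two routing policies differ in what they optimize, which a small model makes visible. The `Endpoint` class, the latency figures, and the choice to skip unhealthy endpoints in latency routing are assumptions for illustration; this does not call the Route 53 API.

```python
from dataclasses import dataclass

@dataclass
class Endpoint:
    region: str
    healthy: bool
    latency_ms: float  # latency from the user, assumed to be measured elsewhere

def failover(primary: Endpoint, secondary: Endpoint) -> Endpoint:
    """Failover routing: use the primary unless its health check fails."""
    return primary if primary.healthy else secondary

def lowest_latency(endpoints: list[Endpoint]) -> Endpoint:
    """Latency-based routing: pick the fastest endpoint among healthy ones."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        raise RuntimeError("no healthy endpoints")
    return min(healthy, key=lambda e: e.latency_ms)

# Hypothetical latencies for a user located in Europe.
us = Endpoint("us-east-1", healthy=True, latency_ms=95.0)
eu = Endpoint("eu-west-1", healthy=True, latency_ms=20.0)

normal = failover(primary=us, secondary=eu).region
fastest = lowest_latency([us, eu]).region

us.healthy = False
during_outage = failover(primary=us, secondary=eu).region

print("Failover (normal):", normal)
print("Latency-based:    ", fastest)
print("Failover (outage):", during_outage)
```

Failover routing keeps sending traffic to the primary even when the secondary is faster; latency-based routing makes the opposite trade-off.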
### High Availability Architecture Pattern
```
              Internet
                 ↓
 Route 53 (DNS + Failover Routing)
                 ↓
       Elastic Load Balancer
          ↓             ↓
       [AZ-1]        [AZ-2]
    EC2 Instance  EC2 Instance
          ↕             ↕
       Multi-AZ RDS Database
```
### Key Terms
• Availability Zone (AZ) – Isolated data center(s) within a Region
• Elastic Load Balancing (ELB) – Distributes traffic across healthy targets
• Auto Scaling Group – Automatically adjusts EC2 fleet size
• Amazon Route 53 – DNS with intelligent traffic routing
• Health Check – Automated probe to verify a resource is operational
### Watch Out For
> ⚠️ Regions vs. AZs: A Region is a geographic area with multiple AZs. Deploying across AZs protects against data center failures. Deploying across Regions protects against regional outages.
> ⚠️ ELB alone is not enough for HA — you need instances in multiple AZs for ELB to route around failures.
> ⚠️ Auto Scaling maintains availability and controls cost — it can scale in (terminate instances) as well as scale out (launch instances).
---
## Quick Review Checklist
### Well-Architected Framework
• [ ] Name all six pillars in order: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, Sustainability
• [ ] Know which pillar = least privilege (Security)
• [ ] Know which pillar = recover from failure (Reliability)
• [ ] Know which pillar = avoid waste (Cost Optimization)
• [ ] Know the Sustainability pillar was added in 2021
• [ ] Know the AWS Well-Architected Tool assesses workloads in the console
### Design Principles
• [ ] Define loose coupling and why it prevents cascading failures
• [ ] Define design for failure — assume any component can fail
• [ ] Distinguish high availability (fast recovery) from fault tolerance (no interruption)
• [ ] Explain stateless design and why it enables horizontal scaling
• [ ] Define elasticity as automatic, demand-driven scaling
• [ ] Understand "remove undifferentiated heavy lifting" = use managed services
### Disaster Recovery
• [ ] Define RPO (data loss tolerance) vs. RTO (downtime tolerance)
• [ ] Rank the four DR strategies from cheapest/slowest to most expensive/fastest: Backup & Restore → Pilot Light → Warm Standby → Multi-Site Active/Active
• [ ] Know Pilot Light = only core components running (e.g., DB replica)
• [ ] Know Multi-Site Active/Active = near-zero RTO and RPO, highest cost
### High Availability & Resilience
• [ ] Define an Availability Zone as isolated data center(s) within a Region
• [ ] Know why multi-AZ deployments improve HA and fault tolerance
• [ ] Know ELB routes traffic to healthy targets across AZs
• [ ] Know Auto Scaling Groups replace unhealthy instances and adjust fleet size
• [ ] Know Route 53 provides failover and latency-based routing across Regions
---
Focus your final review on scenario-based questions: given a business requirement (e.g., "zero data loss" or "minimize cost"), identify the correct DR strategy, design principle, or Well-Architected pillar.