Domain 1: Describe Cloud Concepts

High Availability and Scalability

Two of the most important cloud benefits are keeping your apps running when things fail and growing your resources to match demand. Here's how Azure delivers both.

Why do availability and scalability matter?

Simple explanation

Imagine your favourite coffee shop.

High availability = the shop is open every day, even if one barista calls in sick. There’s always someone to make your coffee because they have backup staff.

Scalability = during the morning rush, the shop opens extra registers and calls in more baristas. When it’s quiet in the afternoon, they close the extra registers. They match capacity to demand.

In cloud computing, your apps need to be “always open” (available) and able to “call in more baristas” (scale) when traffic spikes.

High availability: keeping things running

When Peak Roasters launches their online ordering system, they can’t afford downtime. Every minute the ordering page is down, they lose sales.

High availability means: even if a server crashes, the app keeps running because it’s deployed across multiple servers (or even multiple data centres).

How Azure delivers high availability

| Mechanism | What It Does | Example |
| --- | --- | --- |
| Redundancy | Multiple copies of your app across servers | 3 VMs behind a load balancer |
| Load balancing | Distributes traffic across healthy instances | Azure Load Balancer |
| Availability zones | Separate physical locations within a region | Zone 1, Zone 2, Zone 3 |
| Region pairs | Azure matches regions for disaster recovery | Australia East + Australia Southeast |
| Auto-restart | Failed VMs automatically restart on healthy hardware | Azure fabric controller |
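
To make the redundancy and load balancing rows concrete, here is a minimal sketch of sending requests only to healthy instances. The function name, the VM names, and the round-robin order are all assumptions chosen for illustration; this is not a description of how Azure Load Balancer distributes traffic internally.

```python
from itertools import cycle


def route_requests(instances: dict[str, bool], request_count: int) -> list[str]:
    """Assign each request to the next healthy instance in round-robin order."""
    healthy = [name for name, is_healthy in instances.items() if is_healthy]
    if not healthy:
        raise RuntimeError("no healthy instances available")
    rotation = cycle(healthy)
    return [next(rotation) for _ in range(request_count)]


# Hypothetical pool: three VMs, one of which has failed its health check.
pool = {"vm-1": True, "vm-2": False, "vm-3": True}
print(route_requests(pool, 4))  # ['vm-1', 'vm-3', 'vm-1', 'vm-3']
```

The failed VM never receives traffic, so users keep getting served while it is restarted or replaced.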

SLAs: measuring availability

Azure measures availability using Service Level Agreements (SLAs), which are guarantees expressed as uptime percentages:

| SLA | Downtime Per Year | Downtime Per Month |
| --- | --- | --- |
| 99% | 3.65 days | 7.3 hours |
| 99.9% | 8.76 hours | 43.8 minutes |
| 99.95% | 4.38 hours | 21.9 minutes |
| 99.99% | 52.6 minutes | 4.38 minutes |
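
The figures in the table can be reproduced with a little arithmetic. The sketch below assumes a 365-day year and treats a month as one twelfth of a year; the function name is made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(sla_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the maximum downtime (in hours) that still meets the SLA."""
    return period_hours * (1 - sla_percent / 100)


for sla in (99.0, 99.9, 99.95, 99.99):
    per_year = allowed_downtime_hours(sla)
    per_month = allowed_downtime_hours(sla, HOURS_PER_YEAR / 12)
    print(f"{sla}%: {per_year:.2f} hours/year, {per_month * 60:.1f} minutes/month")
```

Each extra nine cuts the allowed downtime by a factor of ten, which is why the jump from 99.9% to 99.99% is so significant.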

Key exam concept: Higher SLAs require more redundancy, which costs more. A single VM might offer a 99.9% SLA, two VMs in an availability set might offer 99.95%, and two VMs across availability zones might offer 99.99%.

Exam tip: The nines matter

The exam may ask about SLA percentages and what they translate to in real downtime. Key numbers to remember:

  • 99.9% (three nines) = about 8.76 hours of downtime per year
  • 99.99% (four nines) = about 52.6 minutes of downtime per year

When you combine services that depend on each other, the combined SLA is lower than either individual SLA. If Service A has 99.9% and Service B has 99.9%, together they offer 99.9% x 99.9% ≈ 99.8% uptime.
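
The multiplication rule is easy to check in code. This is a small sketch that assumes the services are chained, so the application is only up when every service is up; the function name is hypothetical.

```python
from math import prod


def composite_sla(*sla_percents: float) -> float:
    """Combined SLA (in percent) when every listed service must be up at once."""
    return prod(p / 100 for p in sla_percents) * 100


print(f"{composite_sla(99.9, 99.9):.4f}%")  # 99.8001%, usually rounded to 99.8%
```

Adding a third dependent service lowers the result again, so every extra dependency in the chain costs some availability.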

Scalability: matching resources to demand

Scalability means you can add (or remove) resources based on demand. There are two types:

Vertical vs horizontal scaling

| Feature | Vertical Scaling (Scale Up/Down) | Horizontal Scaling (Scale Out/In) |
| --- | --- | --- |
| What changes | Size of a single resource | Number of resource instances |
| Example | Upgrade a VM from 2 CPU/4 GB to 8 CPU/32 GB | Go from 1 VM to 5 VMs behind a load balancer |
| Analogy | Replacing a small truck with a bigger truck | Adding more trucks to the fleet |
| Limit | Max hardware size of the machine | Virtually unlimited |
| Downtime | Usually requires a restart | No downtime; new instances are added live |
| Best for | Databases, single-instance apps | Web apps, APIs, stateless services |

Scaling in action: Summit Construction

Summit Construction’s project portal normally handles 50 users. But during quarterly reviews, 500 project managers log in simultaneously.

Without cloud: They’d need to buy servers capable of handling 500 users, even though they only need that capacity 4 times a year. Those servers sit idle 95% of the time.

With Azure: The portal runs on 2 VMs normally. During quarterly reviews, it automatically scales out to 10 VMs. After the review, it scales back to 2. They only pay for the extra VMs during those peak periods.
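
A rough way to see the saving is to count VM-hours. The sketch below takes the 2-VM and 10-VM figures from the example, but the length of each review period (one week) is an assumption made purely for illustration, as is comparing against a fixed 10-VM deployment as a stand-in for buying peak-sized servers.

```python
HOURS_PER_YEAR = 365 * 24


def vm_hours_fixed(peak_vms: int) -> int:
    """VM-hours per year when peak capacity runs all year."""
    return peak_vms * HOURS_PER_YEAR


def vm_hours_scaled(base_vms: int, peak_vms: int, peak_hours: int) -> int:
    """VM-hours per year when extra VMs run only during peak periods."""
    return base_vms * HOURS_PER_YEAR + (peak_vms - base_vms) * peak_hours


# From the example: 2 VMs normally, 10 VMs during each of 4 quarterly reviews.
# Assumption: each review period lasts one week (7 * 24 hours).
peak_hours = 4 * 7 * 24
fixed = vm_hours_fixed(10)
scaled = vm_hours_scaled(2, 10, peak_hours)
print(fixed, scaled, f"{1 - scaled / fixed:.0%} fewer VM-hours")
```

Under these assumptions, scaling out only for the reviews uses roughly a quarter of the VM-hours of running peak capacity year-round.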

Elasticity: automatic scaling

Elasticity is a specific type of scalability where resources automatically increase and decrease based on demand, without human intervention.

| Concept | Definition |
| --- | --- |
| Scalability | The system can handle more load by adding resources |
| Elasticity | The system automatically adds and removes resources based on actual demand |

Think of a rubber band: it stretches when pulled and snaps back when released. An elastic cloud system stretches with traffic spikes and contracts when traffic drops.

Azure services that provide elasticity:

  • Virtual Machine Scale Sets: automatically add/remove VMs based on CPU, memory, or custom metrics
  • Azure App Service: auto-scale web apps based on request count or schedule
  • Azure Functions: scale automatically from zero to many instances

Real-world: Harbour Health during flu season

Harbour Health’s patient portal sees 10x normal traffic during flu season. With Azure’s elasticity:

  • Normal: 3 VMs, handling 500 concurrent users
  • Flu season peak: Auto-scales to 15 VMs, handling 5,000 concurrent users
  • After flu season: Automatically scales back to 3 VMs

Total extra cost: only the additional VMs during the 6-week peak period, not year-round.
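
The kind of rule an autoscaler applies can be sketched in a few lines. The 3-VM minimum and 15-VM maximum come from the example above; the CPU thresholds, the one-VM step, and the function name are assumptions for illustration and are not Azure defaults.

```python
def desired_instances(
    current: int,
    avg_cpu_percent: float,
    min_instances: int = 3,         # normal capacity from the example
    max_instances: int = 15,        # flu-season peak from the example
    scale_out_above: float = 70.0,  # hypothetical threshold
    scale_in_below: float = 30.0,   # hypothetical threshold
    step: int = 1,
) -> int:
    """Return the instance count an autoscale rule would aim for next."""
    if avg_cpu_percent > scale_out_above:
        target = current + step
    elif avg_cpu_percent < scale_in_below:
        target = current - step
    else:
        target = current
    # Never go below the minimum or above the maximum.
    return max(min_instances, min(max_instances, target))


print(desired_instances(3, 85.0))   # 4: busy, so add a VM
print(desired_instances(15, 95.0))  # 15: already at the maximum
print(desired_instances(3, 10.0))   # 3: quiet, but already at the minimum
```

Evaluating a rule like this repeatedly is what lets capacity stretch during the peak and contract afterwards without anyone intervening.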

Flashcards

Question: What is high availability in cloud computing?

Answer: The ability of a system to remain operational and accessible even when components fail. Achieved through redundancy, load balancing, availability zones, and region pairs.

Question: What is the difference between vertical scaling and horizontal scaling?

Answer: Vertical scaling (scale up) = making a single resource bigger (more CPU, RAM). Horizontal scaling (scale out) = adding more instances of a resource. Horizontal is preferred for cloud apps because it's virtually unlimited and requires no downtime.

Question: What is the difference between scalability and elasticity?

Answer: Scalability means the system CAN handle more load. Elasticity means the system AUTOMATICALLY adjusts resources based on demand, scaling up during peaks and down during quiet periods.

Question: If Service A has a 99.9% SLA and Service B has a 99.9% SLA, what is the combined SLA?

Answer: About 99.8%, calculated by multiplying: 0.999 x 0.999 ≈ 0.998. The combined SLA is always LOWER than the individual SLAs.

Knowledge Check

Summit Construction's project portal normally serves 50 users but gets 500 users during quarterly reviews. Which scaling approach is MOST appropriate?

An Azure VM has a 99.9% SLA. What does this mean in practical terms?

Which cloud characteristic allows resources to automatically increase and decrease based on demand without manual intervention?


Next up: more cloud benefits (reliability, predictability, security, governance, and manageability in the cloud).