Question 1

Where do I check if it's actually a Microsoft outage and not a tenant issue?

Accepted Answer

Check three places, in order: (1) the Service Health Dashboard inside the M365 Admin Centre, which is Microsoft's authoritative incident feed for YOUR tenant and includes incidents that only affect specific tenants; (2) status.office.com, the public summary, which is useful for customer-facing comms; (3) Twitter/X (@MSFT365Status) for fast confirmation from outside the admin centre. The Service Health RSS feed lets you wire alerts into Teams or your ITSM tool. If only YOUR users are affected and Service Health shows nothing, the cause is likely a tenant config issue, not a Microsoft outage.
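The decision rule in this answer can be sketched as a small function. This is only an illustration: the names (Verdict, triage) and the boolean inputs are hypothetical, and you set the flags by hand after looking at each source. Nothing here calls a Microsoft API.

```python
from enum import Enum


class Verdict(Enum):
    MICROSOFT_INCIDENT = "Microsoft incident"
    LIKELY_TENANT_ISSUE = "likely tenant config issue"
    NO_ISSUE_DETECTED = "no issue detected"


def triage(users_affected: bool,
           service_health_incident: bool,
           public_status_incident: bool = False) -> Verdict:
    """Classify an outage report using the checks described in the answer.

    All inputs are hypothetical flags filled in manually:
    - users_affected: your own users are reporting problems
    - service_health_incident: the tenant's Service Health shows an incident
    - public_status_incident: the public status page or @MSFT365Status
      reports an incident
    """
    # Any Microsoft-side confirmation means this is a Microsoft incident.
    if service_health_incident or public_status_incident:
        return Verdict.MICROSOFT_INCIDENT
    # Users are affected but Microsoft reports nothing: suspect the tenant.
    if users_affected:
        return Verdict.LIKELY_TENANT_ISSUE
    return Verdict.NO_ISSUE_DETECTED
```

For example, triage(users_affected=True, service_health_incident=False) returns Verdict.LIKELY_TENANT_ISSUE, which is the "check your own config first" case.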

Question 2

What's the right comms cadence during an outage?

Accepted Answer

Send the first message within 15 minutes of confirming the outage; even if you have no details yet, acknowledge it. Then update every 30-60 minutes, whether or not you have new information. Going silent is the worst option: your helpdesk inbox fills up and trust erodes. Keep messages short: what's broken / what we know / what we're doing / when the next update is due. Use ALL channels (Teams banner, intranet, email, phone tree if mass-impact). Most orgs under-communicate; few over-communicate.
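As a rough sketch of that cadence, the helper below computes message deadlines from the time the outage was confirmed and fills in the four-part message template. The function names and default values are assumptions for illustration; the 30-minute interval is simply the low end of the 30-60 minute range.

```python
from datetime import datetime, timedelta


def update_schedule(confirmed_at: datetime,
                    follow_ups: int = 3,
                    first_within_min: int = 15,
                    interval_min: int = 30) -> list[datetime]:
    """Return the deadline for the first message plus each follow-up."""
    first = confirmed_at + timedelta(minutes=first_within_min)
    # Index 0 is the first acknowledgement; the rest are regular updates.
    return [first + timedelta(minutes=interval_min * i)
            for i in range(follow_ups + 1)]


def format_update(broken: str, known: str, doing: str,
                  next_update: datetime) -> str:
    """Build the short four-part status message."""
    return (
        f"What's broken: {broken}\n"
        f"What we know: {known}\n"
        f"What we're doing: {doing}\n"
        f"Next update: {next_update:%H:%M}"
    )


if __name__ == "__main__":
    schedule = update_schedule(datetime(2024, 1, 1, 9, 0))
    print(format_update(
        broken="Exchange Online mail flow",
        known="Microsoft has not yet confirmed an incident",
        doing="Monitoring Service Health and opening a ticket",
        next_update=schedule[1],
    ))
```

Sending each update on schedule, even when the "what we know" line has not changed, is the point of fixing the deadlines up front.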

Question 3

When do I escalate to Microsoft and how?

Accepted Answer

Escalate immediately if: (a) Microsoft hasn't acknowledged the incident after 30+ minutes of clear customer impact, (b) the impact is business-critical, or (c) you need an ETA for executive/customer comms. Routes: a Premier/Unified support ticket with appropriate severity, your account team / Customer Success Manager, or your Cloud Solution Architect (CSA) if you have a relationship. Quote the official Microsoft incident ID in your ticket. Don't just open a Sev-A and wait; open the ticket AND call the support number for the fastest response.
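The three escalation triggers can be written down as a checklist function, which is handy if you want the runbook logic in a script or bot. This is a minimal sketch; the function name and parameters are hypothetical, and the 30-minute threshold comes from trigger (a).

```python
def escalation_reasons(minutes_of_impact: int,
                       microsoft_acknowledged: bool,
                       business_critical: bool,
                       need_eta: bool,
                       ack_threshold_min: int = 30) -> list[str]:
    """Return the triggers that apply; an empty list means no escalation yet."""
    reasons = []
    # (a) clear customer impact with no acknowledgement from Microsoft
    if not microsoft_acknowledged and minutes_of_impact >= ack_threshold_min:
        reasons.append("no Microsoft acknowledgement after 30+ minutes of impact")
    # (b) the impact is business-critical
    if business_critical:
        reasons.append("business-critical impact")
    # (c) an ETA is needed for executive or customer comms
    if need_eta:
        reasons.append("ETA needed for executive/customer comms")
    return reasons
```

Any non-empty result means escalate now; the returned strings can go straight into the ticket description alongside the Microsoft incident ID.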

Question 4

What's a PIR and should I demand one?

Accepted Answer

A PIR is a Post-Incident Review: Microsoft's written analysis of what happened, the root cause, and what they're changing to prevent recurrence. It is available in the Service Health Dashboard 5-10 business days after major incidents. Yes, demand one for any meaningful impact. Review it with your team, update your runbook with the lessons, and use it to improve YOUR detection and communication for next time. Don't just file the PIR; close the loop.

M365 Service Outage Runbook — Free Mind Map

Frequently Asked Questions