System Outage Communication Process
Purpose
Clear, timely communication during system outages helps manage stakeholder expectations, maintain client trust, and ensure that business continuity plans are activated promptly. This document defines the process for communicating planned and unplanned IT system outages to all affected parties within Global Bank.
Policy Reference: IT-INC-004
Applies To: IT Operations, IT Service Desk, Communications Team, all affected stakeholders
Types of Outages
| Type | Description | Notification Timing |
|---|---|---|
| Planned Maintenance | Scheduled maintenance windows for patching, upgrades, or infrastructure changes | Minimum 5 business days' advance notice |
| Emergency Change | Urgent unscheduled change required to address a critical vulnerability or imminent risk | Minimum 2 hours' advance notice (where possible) |
| Unplanned Outage | Unexpected system failure or service disruption | Initial notification within 15 minutes of detection (P1 and P2 incidents) |
Communication Channels
Outage notifications are distributed through multiple channels to ensure maximum reach:
| Channel | Used For | Managed By |
|---|---|---|
| IT Status Page | All outages — real-time status updates | IT Operations |
| Email (mass notification) | Planned maintenance and major unplanned outages | IT Service Desk |
| Microsoft Teams — #it-status channel | Real-time updates for all outage types | IT Operations |
| SMS (emergency) | P1 outages affecting client-facing systems | IT Operations (via PagerDuty) |
| Desktop notification (banner) | Major outages affecting all users | IT Operations (via endpoint management) |
| Intranet banner | Planned maintenance announcements | Communications Team |
Planned Maintenance Process
Step 1: Change Request
All planned maintenance must be approved by the Change Advisory Board (CAB) before any communication is issued. The change owner submits a Request for Change (RFC) through the IT Service Portal at least 10 business days before the proposed maintenance window, which leaves time for CAB review ahead of the initial notification due 5 business days before the window.
Step 2: Communication Planning
Once the RFC is approved, the change owner completes the Outage Communication Template (available on the IT Service Portal) with the following information:
- System(s) affected
- Date and time of maintenance window (including timezone)
- Expected duration
- Impact description (what users will experience)
- Workarounds available during the outage
- Rollback plan summary
- Contact point for queries
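As an illustration only, the template fields above could be captured in a small record that flags incomplete entries before a notification is drafted. The class name `OutageCommunication`, its field names, and the `missing_fields` helper are hypothetical; the actual template on the IT Service Portal may be structured differently.

```python
from dataclasses import dataclass, fields
from datetime import datetime, timedelta


@dataclass
class OutageCommunication:
    """Hypothetical record mirroring the Outage Communication Template fields."""
    systems_affected: list[str]
    window_start: datetime          # should carry a timezone
    expected_duration: timedelta
    impact_description: str
    workarounds: str
    rollback_summary: str
    contact_point: str

    def missing_fields(self) -> list[str]:
        """Return the names of fields that are empty or lack a timezone."""
        problems = [f.name for f in fields(self) if not getattr(self, f.name)]
        if self.window_start.tzinfo is None:
            problems.append("window_start (timezone missing)")
        return problems
```

A change owner's draft with an empty workaround field and a timezone-naive start time would be reported as having two gaps, prompting completion before the template is submitted.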
Step 3: Notification Schedule
| Timing | Action |
|---|---|
| 5 business days before | Initial notification via email and intranet banner |
| 1 business day before | Reminder notification via email and Teams |
| 1 hour before | Final reminder via Teams and IT Status Page |
| Start of maintenance | Status Page updated to "In Progress" |
| Completion | All-clear notification via all channels; Status Page updated to "Operational" |
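The schedule above can be sketched as a small date calculation. This is a minimal example that assumes business days are Monday to Friday and ignores public holidays; the function names `subtract_business_days` and `notification_schedule` are illustrative and not part of any existing tooling.

```python
from datetime import datetime, timedelta, timezone


def subtract_business_days(moment: datetime, days: int) -> datetime:
    """Step back `days` weekdays (Mon-Fri). Public holidays are ignored (assumption)."""
    current = moment
    remaining = days
    while remaining > 0:
        current -= timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def notification_schedule(window_start: datetime) -> dict[str, datetime]:
    """Map each pre-maintenance notification to its send time."""
    return {
        "initial_notice": subtract_business_days(window_start, 5),
        "reminder": subtract_business_days(window_start, 1),
        "final_reminder": window_start - timedelta(hours=1),
        "in_progress": window_start,
    }


# Example: a window starting Saturday 15 March 2025 at 22:00 UTC
schedule = notification_schedule(datetime(2025, 3, 15, 22, 0, tzinfo=timezone.utc))
```

For that Saturday window, the initial notice falls on Monday 10 March and the reminder on Friday 14 March, both at the window's time of day.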
Unplanned Outage Process
Step 1: Detection and Assessment
When an unplanned outage is detected (via monitoring alerts, user reports, or SOC notification), the IT Operations duty manager assesses the impact and assigns a priority level in accordance with the IT Incident Reporting Procedure (IT-INC-001).
Step 2: Initial Notification (within 15 minutes)
For P1 and P2 incidents, the IT Service Desk issues an initial notification within 15 minutes of detection. The notification includes:
- Affected system(s)
- Nature of the issue (to the extent known)
- Current impact
- Estimated time to resolution, if known; otherwise, the time of the next update
Step 3: Ongoing Updates
| Priority | Update Frequency | Channels |
|---|---|---|
| P1 — Critical | Every 30 minutes | Status Page, Teams, Email, SMS |
| P2 — High | Every 60 minutes | Status Page, Teams, Email |
| P3 — Medium | Every 4 hours | Status Page, Teams |
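The update cadence in the table above lends itself to a simple lookup. The sketch below assumes priorities are identified by the strings "P1" to "P3"; `UPDATE_RULES` and `next_update_due` are hypothetical names used only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Update interval and channels per priority, taken from the table above.
UPDATE_RULES = {
    "P1": (timedelta(minutes=30), ["Status Page", "Teams", "Email", "SMS"]),
    "P2": (timedelta(minutes=60), ["Status Page", "Teams", "Email"]),
    "P3": (timedelta(hours=4), ["Status Page", "Teams"]),
}


def next_update_due(priority: str, last_update: datetime) -> tuple[datetime, list[str]]:
    """Return when the next update is due and which channels it goes to."""
    interval, channels = UPDATE_RULES[priority]
    return last_update + interval, channels


# Example: a P1 update sent at 09:00 UTC means the next one is due at 09:30 UTC
due, channels = next_update_due("P1", datetime(2025, 3, 13, 9, 0, tzinfo=timezone.utc))
```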
Step 4: Resolution and All-Clear
Once the issue is resolved:
- The IT Status Page is updated to "Operational" with a summary of the issue and resolution.
- An all-clear email is sent to all previously notified stakeholders.
- A brief post is made in the Teams #it-status channel confirming resolution.
- For P1 incidents, a preliminary Root Cause Analysis (RCA) summary is distributed within 24 hours, with a full RCA report following within 5 business days.
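For P1 incidents, the two RCA deadlines can be derived from the resolution time. The sketch below assumes both deadlines are counted from the moment of resolution and that business days are Monday to Friday with no holiday calendar; neither assumption is stated in this procedure, and the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def add_business_days(moment: datetime, days: int) -> datetime:
    """Step forward `days` weekdays (Mon-Fri). Public holidays are ignored (assumption)."""
    current = moment
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def rca_deadlines(resolved_at: datetime) -> dict[str, datetime]:
    """Latest times for the preliminary RCA summary and the full RCA report."""
    return {
        "preliminary_summary": resolved_at + timedelta(hours=24),
        "full_report": add_business_days(resolved_at, 5),
    }


# Example: a P1 incident resolved on Thursday 13 March 2025 at 09:00 UTC
deadlines = rca_deadlines(datetime(2025, 3, 13, 9, 0, tzinfo=timezone.utc))
```

In this example the preliminary summary is due by Friday 14 March at 09:00 UTC and the full report by Thursday 20 March.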
IT Status Page
The IT Status Page is the authoritative source for real-time information on system availability. It is accessible at status.globalbank.com and displays the current status of all major systems and services. Employees are encouraged to check the Status Page before contacting the Service Desk during a suspected outage.
Roles and Responsibilities
| Role | Responsibility |
|---|---|
| Change Owner | Prepares communication content for planned maintenance |
| IT Operations Duty Manager | Authorises and coordinates unplanned outage communications |
| IT Service Desk | Distributes notifications via email and manages inbound queries |
| Communications Team | Manages intranet content and supports external client communications where required |
| Senior Management | Receives executive briefings for P1 incidents and approves external client notifications |
Contact
- IT Status Page: status.globalbank.com
- IT Operations: itops@globalbank.com | Ext. 2100
- IT Service Desk: servicedesk@globalbank.com | Ext. 2000