P1 Major Incident Process – Summary On A Page


Any Incident where the impact is extreme should be classed as a P1. This could be a whole country being down or a critical team, e.g., Traders or Finance, unable to work.


The P1 SLA is 45 minutes. All Major Incidents must be progressed by all teams via telephone until the incident is resolved. Due to the nature of the B2B business, it is no longer acceptable to wait for a response by email.

  1. Service Desk ensures that the Impact and Urgency of the Incident are clearly understood when the ticket is initially raised. All relevant information is captured, and any additional information is added to the ticket, e.g., screenshots, visible errors.
  2. Service Desk assigns the ticket to the relevant 2nd Level support team and verbally confirms with that team that ownership is acknowledged. If no owner is confirmed, Service Desk escalates to the Team Manager.
  3. Service Desk sends communications to all affected parties via the MI portal. (Target - 10 minutes from ticket being raised)
  4. 2nd Level support investigates and records all actions in the ticket so that the steps already taken are visible. If further information is required, 2nd Level support contacts the end user directly by phone or tells Service Desk what questions to ask.
  5. 2nd Level support updates Service Desk on findings and next steps, e.g., if the incident is understood and additional time is required for an offer feed to be restarted, this should be stated in the communications. (Target - 25 minutes from ticket being owned)
  6. Service Desk sends the 2nd communication via the MI portal with the findings and next steps. If root cause is known at this stage, customer expectations should be set for time to resolve, in plain English rather than technical terminology. (Target - 35 minutes from ticket being raised)
  7. 2nd Level support verbally informs Service Desk that the incident is believed to be resolved.
  8. Service Desk contacts the person who reported the Major Incident by phone to confirm service is restored.
  9. Service Desk sends the 3rd communication via the MI portal confirming the incident is resolved. (Target - 45 minutes from ticket being raised)
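
The timed targets in the steps above can be turned into concrete clock times for a given ticket. The following is a minimal sketch only: the function names, the checkpoint labels, and the idea of tracking completed checkpoints as a set are illustrative assumptions and do not describe any existing Service Desk tooling. The minute values are the targets stated in steps 3, 5, 6 and 9.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Set

# Targets in minutes, taken from steps 3, 5, 6 and 9 of the process.
FIRST_COMMS_AFTER_RAISED = 10
SECOND_LEVEL_UPDATE_AFTER_OWNED = 25
SECOND_COMMS_AFTER_RAISED = 35
RESOLUTION_COMMS_AFTER_RAISED = 45


def p1_checkpoints(raised_at: datetime, owned_at: datetime) -> Dict[str, datetime]:
    """Return the target time for each timed step of a P1 incident.

    The checkpoint names are hypothetical labels chosen for this sketch.
    """
    return {
        "first_communication": raised_at + timedelta(minutes=FIRST_COMMS_AFTER_RAISED),
        "second_level_update": owned_at + timedelta(minutes=SECOND_LEVEL_UPDATE_AFTER_OWNED),
        "second_communication": raised_at + timedelta(minutes=SECOND_COMMS_AFTER_RAISED),
        "resolution_communication": raised_at + timedelta(minutes=RESOLUTION_COMMS_AFTER_RAISED),
    }


def overdue_checkpoints(
    checkpoints: Dict[str, datetime], completed: Set[str], now: datetime
) -> List[str]:
    """List checkpoints whose target time has passed and which are not yet done."""
    return [
        name
        for name, due in checkpoints.items()
        if name not in completed and now > due
    ]
```

For a ticket raised at 09:00 and owned at 09:05, the targets fall at 09:10, 09:30, 09:35 and 09:45. Note that the 2nd Level update is measured from ownership, not from the time the ticket was raised, so a late handover shortens the gap before the 2nd communication is due.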


Examples of Priorities and Definitions


Columns: Priority, Description, Time (SLA) for Response and Resolution, Examples, Definition

Priority 1: Critical / Major
Response (SLA): Immediate
Resolution (SLA): <=45 mins* (working hours)
Examples:
  Live, Regular or Virtual Sports Betting unavailable
  Live Screen Prices not updating
  Traders unable to authorise Live Bets
  Studio unable to Publish to all Countries
  Content not loading on the Retail or Online Website
  Retail Website Unavailable
  Result Processor...
  Exchange Server unavailable
  Citrix Access Unavailable for multiple users
  Network Outage
  Loss of Connection to Betradar
  Multiple User Application = 100% of team
  High Profile Events Missing or Prices not Updating
  Duplicated Bets being accepted
  Greek Keno Unavailable
  Inspired Virtual Sports Unavailable
Definition:
  An Incident affecting a critical Business Service has occurred, resulting in the inability to perform key functions of the service and / or key business operational processes.
  All users in one or more countries are impacted and the incident requires immediate attention.
  Impact has occurred due to an unplanned interruption to, or significant reduction in the quality of, a Key Service, i.e. severe degradation in performance.
  End Customers unable to perform key Business Functions. Significant financial impact.
  Damage to business reputation likely to be high.

Priority 2: Severe Impact
Response (SLA): Immediate
Resolution (SLA): 2 hours* (working hours)
Examples:
  Studio Unable to publish to a country Screen System (all except Italy)
  Website Missing Events (ask Traders if high profile events)
  Traders Liability reports
  Bet Acceptance is Slow
  Multiple bets not settling (ask Traders for true value of impact)
  Server Hard Disk Failure (P2 if resilience is in place, P1 if none)
  Multiple User Application = 50% of team
  Unable to pay winnings on SSBT but can pay at the till
Definition:
  Partial Loss of a Major/Critical Service.
  System usable, but if the problem is not resolved promptly a business-critical situation will occur.
  Many Customers unable to use the system over an increasing period.
  Degraded service - slow - loss of resilience.
  Workaround is available.

Priority 3: High Priority
Response (SLA): 30 mins (working hours)
Resolution (SLA): 8 hours* (working hours)
Examples:
  Error on Screen Template
  File Server Low on Disk Space
  Multiple User Application Error <=20% of team
  Unable to create JIRA tasks
  Non-Critical Log Request
Definition:
  Complete Non-critical Service is available, but performance is Impaired.

 

Priority 4: Standard Priority
Response (SLA): 1 hour* (working hours)
Resolution (SLA): 12 hours* (working hours)
Examples:
  Paper Jam
  File Restore
  Ink Toner Replacement
  Single User Bet Settlement
  Mime-sweeper Email release
  Single User Application Error
  Single User Hardware Failure
  Missing Funds in Player Wallet (single user)
Definition:
  No Service Impact.
  Any problem where no Service impact is being incurred and no urgent action is required.
  The problem causes inconvenience or nuisance but has little effect on the Customers.

Priority R1: Emergency Service Request
Response (SLA): Immediate Response
Resolution (SLA): Immediate Response
Examples:
  Emergency Deployment / Access Request
Definition:
  A Service Request is anything new that a user didn't previously have.
  An emergency request is one that must be implemented to resolve an incident, or one that has been approved by the IT Director.

Priority R2: Normal
Response (SLA): 1 day* (business)
Resolution (SLA): <=5 days* (business)
Examples:
  New User Request
  New Hardware Request, e.g., Mobile, Laptop, Desktop
  New Application Request, e.g., Visio
  Access Request
Definition:
  A Service Request is something a user / service didn't have the day before (is something new).
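
The incident rows of the table above (Priorities 1 to 4) can be captured as a small lookup for checking whether a resolution target has been breached. This is a sketch under stated assumptions: `IncidentSla`, `INCIDENT_SLAS` and `resolution_breached` are hypothetical names, "Immediate" is represented as 0 minutes, and the caller is assumed to supply elapsed time already counted in working hours, because the working-hours calendar is not defined in this document. The service request rows (R1, R2) are left out because their targets are expressed in business days.

```python
from typing import Dict, NamedTuple


class IncidentSla(NamedTuple):
    description: str
    response_minutes: int    # 0 stands for "Immediate" (assumption of this sketch)
    resolution_minutes: int  # counted in working hours, per the table


# Values copied from the Priority 1-4 rows of the table.
INCIDENT_SLAS: Dict[int, IncidentSla] = {
    1: IncidentSla("Critical / Major", 0, 45),
    2: IncidentSla("Severe Impact", 0, 2 * 60),
    3: IncidentSla("High Priority", 30, 8 * 60),
    4: IncidentSla("Standard Priority", 60, 12 * 60),
}


def resolution_breached(priority: int, elapsed_working_minutes: int) -> bool:
    """Return True if the elapsed working time exceeds the resolution target.

    Raises KeyError for priorities not in the table (e.g. R1, R2).
    """
    return elapsed_working_minutes > INCIDENT_SLAS[priority].resolution_minutes
```

A P1 open for 46 working minutes is reported as breached, while one open for exactly 45 minutes is not, matching the "<=45 mins" wording of the table.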