
University of California San Francisco

Office of Academic and Administrative Information Systems (OAAIS)

OAAIS PROBLEM MANAGEMENT PROCESS


VERSION 2.1, REV. 12/11/09

UC SF

Office of Academic and Administrative Information Systems (OAAIS)

Problem Management Process

TABLE OF CONTENTS

DOCUMENT VERSION CONTROL
STAKEHOLDER TEAM
1. INTRODUCTION
   1.1. PURPOSE
   1.2. SCOPE
   1.3. DEFINITIONS
2. RESPONSIBILITIES
3. PROCESS DEFINITION
   3.1. PROCESS MAP
   3.2. ACTIVITY DIAGRAMS
4. RACI CHART
5. ENTRY CRITERIA
6. PROCEDURE
7. EXIT CRITERIA
8. PROBLEM MANAGEMENT SERVICE LEVELS
9. PROBLEM MANAGEMENT SERVICE METRICS

UCSF Internal Use Only Problem Mgmt 12-11-09.doc


DOCUMENT VERSION CONTROL


Document Name: OAAIS Problem Management Process
Process Owner: Mimi Sosa

Version Number   Issue Date   Prepared By        Reason for Change
1.0              5/11/07      Terrie Coleman     First draft
2.0              7/09/09      Francine Sneddon   Updated to reflect Remedy 7 changes.
2.1              12/11/09     Francine Sneddon   Updated Stakeholder Team

STAKEHOLDER TEAM
Department                                   Name
Customer Support Services (CSS)              Rebecca Nguyen
Enterprise Information Security (EIS)        Michael Kamerick (Interim)
Enterprise Network Services (ENS)            Jeff Fritz
Information Technology Services (ITS)        Heidi Schmidt
Academic Research Systems (ARS)              Michael Kamerick
Application Services (AS)                    Jane Wong
Business & Resource Management (BRM)         Shahla Raissi

This document contains confidential, proprietary information intended for internal use only and is not to be distributed outside the University of California, San Francisco (UCSF) without an appropriate non-disclosure agreement in force. Its contents may be changed at any time and create neither obligations on UCSF's part nor rights in any third person.

1. INTRODUCTION
1.1. PURPOSE
The objective of Problem Management is to resolve the underlying root cause of incidents and consequently prevent them from recurring. Reactive Problem Management aims to identify the root cause of past incidents and presents proposals for improvement or rectification. Proactive Problem Management aims to prevent incidents from recurring by identifying weaknesses in the infrastructure and making proposals to eliminate them.

1.2. SCOPE
The scope of the Problem Management process includes a standard set of processes, procedures, responsibilities and metrics that are utilized by all OAAIS services, applications, systems and network support teams.

1.3. DEFINITIONS
A problem describes an undesirable situation, indicating the unknown root cause of one or more existing or potential incidents.
A known error is a problem for which the root cause is known and for which a temporary workaround has been identified.
A Request for Change (RFC) proposes a change to eliminate a known error and is addressed by the Change Management process.
The Problem Management process includes Problem Control, Error Control, Proactive Problem Management and Providing Management Reporting.

2. RESPONSIBILITIES
Problem Management Team

Reactive Problem Management
- Identifies and records problems by analyzing incident details
- Approves problem resolution recommendations and establishes resolution priority
- Generates Requests for Change (RFC)

Proactive Problem Management
- Identifies trends and records problems
- Approves problem resolution recommendations and establishes resolution priority

Problem Owner
- Investigates and manages problems based on their priority
- Assigns (or obtains) resources and manages error control activities
- Schedules and facilitates major problem reviews
- Develops recommendations for problem resolution
- Monitors the progress of known errors
- Generates Requests for Change (RFC)

Problem Manager
- Coordinates and guides activities of the Problem Management Team and Problem Owner(s)
- Provides management information and uses it proactively to prevent the occurrence of incidents and problems in both production and development environments
- Escalates the analysis and resolution of cross-functional problems to Unit and OAAIS levels
- Conducts post mortem or Post-Implementation Reviews (PIR) for continuous improvement
- Develops and improves Problem Control and Error Control procedures


3. PROCESS DEFINITION
3.1. PROCESS MAP


3.2. ACTIVITY DIAGRAMS

OAAIS Problem Management Process v1.0: Conduct Problem Control and Conduct Error Control

[Activity diagram. Swimlanes: Problem Management Team, Problem Manager, Problem Owner(s).]

Steps shown, beginning at START:
1 Categorize Incidents
2 Identify High Occurrence of Similar Incidents
3 Identify Problem(s)
4 Match Incidents & Create Problem(s)
5 Review Problem(s)
6 Assign Problem Owner(s)
7 Accept Problem Assignment
8 Assess Problem
9 Investigate & Diagnose Problem
10 Determine Root Cause
11 Document the Known Error
12 Document Possible Solution(s)
13 Present Solution Recommendation
14 Approve?
15 Fix?

Connectors shown: From Step 29; From Steps 23/24; Go to Step 16; Go to Step 25 (the "No" branch of 15 Fix?).


4. RACI CHART
Roles: Problem Management Team, Problem Manager, Problem Owner. The chart also has an Output column.
[RACI codes are listed in source order; the source layout does not preserve which role column each code belongs to.]

Step #   Step Description                                  RACI codes

Conduct Problem Control
1        Categorize Incidents                              R, A
2        Identify High Occurrence of Similar Incidents     R, A
3        Identify Problems                                 R, A
4        Match Incidents & Create Problems                 R, A
5        Review Problems                                   R, A
6        Assign Problem Owners                             R, A
7        Accept Problem Assignment                         R, A

Conduct Error Control
8        Assess Problem                                    A/R
9        Investigate & Diagnose Problem                    A/R
10       Determine Root Cause                              A/R
11       Document the Error                                A/R
12       Document Possible Solutions                       A/R
13       Present Solution Recommendation                   I, I, A/R
14       Approve?                                          R, A/R
15       Fix?                                              R, A/R
16       Determine Priority                                R, A/R
17       Schedule                                          R, A/R
18       Monitor the Error                                 I, A/R
19       RFC Required?                                     A/R
20       Create Change Request                             A/R
         Perform Change Management Process - Hand-off
21       Implement Solution                                A/R
22       Monitor Resolution                                A/R
23       Error Resolved?                                   A/R
24       Permanent Solution?                               A/R
25       Update & Close Problem                            A/R
26       Conduct Review                                    A/R

Proactive Problem Management
27       Collect, Review & Analyze Data - incident, problem, known errors, industry, performance management, security, and user data     A/R
28       Identify Infrastructure Issues - weak, overloaded, vulnerable components     A/R
29       Create Problem & Review Problem                   A/R

Provide Reporting
30       Generate Updated Problem Report                   A/R
31       Distribute Problem Reports, as required           A/R

Responsible: People who do the work, facilitate it and/or organize it
Accountable: The one who ensures that desired outcomes are reached and has yes/no decision making authority
Consulted: People who have critical expertise to contribute before a decision is made
Informed: People who are significantly affected by the activity/decision and must be informed to ensure successful implementation


5. ENTRY CRITERIA
- Incident details, including workarounds and RFCs
- Configuration details (from Configuration Management Database, future)
- Product information including technical details and known errors
- Details about infrastructure behavior, capacity, performance and service levels

6. PROCEDURE
Conduct Problem Control

Step 1: Categorize Incidents
Responsibility: Problem Management Team
The team receives an Incident Report from Remedy showing open and closed incidents. Incidents will be categorized by category, type and item. The team reviews the report and assigns a problem category to each incident, such as:
- Application
- Functional
- Geography
- Database
- Type
- Version
- Operating System
- Hardware
- Network

Step 2: Identify High Occurrence of Similar Incidents
Responsibility: Problem Management Team
Using best judgment and experience, review the categorized incidents looking for similarities and/or high occurrence of the same incident.

Step 3: Identify Problem(s)
Responsibility: Problem Management Team
Identify problem(s) based on high occurrence, critical issues, trends, threatened service levels and/or incidents not linked to an existing problem or known error.

Step 4: Match Incidents & Create Problems
Responsibility: Problem Management Team
Link new incidents to a problem by entering the incident # into the problem worklog. If an incident cannot be linked to an existing problem, the team will create a new problem in Remedy with a link to the incident. The following fields are required:
- Assignment Group
- Summary
- Description
- Requester
- Create Date
- Assign To
- Planned End Date
- Status
- Case Type = Problem

Step 5: Review Problem(s)
Responsibility: Problem Management Team
Remedy generates a Problem Report. The team reviews it and:
- Consolidates similar problems
- Creates new problems, if needed
- Prioritizes problems
- Defines investigation scope
- Establishes target resolution dates
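
Steps 1 and 2 rely on the team's judgment, but the counting part can be sketched in a few lines. The example below is only an illustration: the record layout, the field names and the threshold of three incidents are assumptions, not part of the Remedy schema or of this process.

```python
from collections import Counter

# Hypothetical incident records; the field names are illustrative and are
# not taken from the Remedy schema.
incidents = [
    {"id": "INC001", "category": "Network"},
    {"id": "INC002", "category": "Database"},
    {"id": "INC003", "category": "Network"},
    {"id": "INC004", "category": "Network"},
    {"id": "INC005", "category": "Hardware"},
]


def high_occurrence_categories(records, threshold=3):
    """Return (category, count) pairs with at least `threshold` incidents,
    most frequent first. The threshold is an assumed cut-off."""
    counts = Counter(r["category"] for r in records)
    return [(cat, n) for cat, n in counts.most_common() if n >= threshold]


candidates = high_occurrence_categories(incidents)
print(candidates)
```

A list like this would only be a starting point for Step 3; critical issues and trends still need to be reviewed by the team.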

Step 6: Assign Problem Owner(s)
Responsibility: Problem Manager
Assign one or more owners to work a problem based on:
- Availability
- Expertise
- Complexity of the problem

Step 7: Accept Problem Assignment
Responsibility: Problem Owner(s)
Team and Problem Owner(s) agree on the scope and plan for the assignment.

Conduct Error Control

Step 8: Assess Problem
Responsibility: Problem Owner(s)
Review and document what is known about the problem, including:
- Incident details
- User errors
- Intermittent errors
- Application logs
- Server logs
- Workarounds

Step 9: Investigate and Diagnose Problem
Responsibility: Problem Owner(s)
Investigate and diagnose the problem. Possible resources include:
- Internet search - Google error code
- Establish trouble tickets with vendors
- Discuss with other functional areas
- Duplicate the problem
- Conduct isolation analysis
- Forums/user groups

Step 10: Determine Root Cause
Responsibility: Problem Owner(s)
Conduct a root cause analysis of the problem. Possible tools and methods include:
- Five Whys
- Fishbone Diagram
- Ishikawa Diagram
- Pareto Analysis
- Statistical analysis

Step 11: Document the Known Error
Responsibility: Problem Owner(s)
Update Remedy with information regarding the Known Error, including:
- Error description
- Error code
- Impact/severity/priority
- Root cause
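
Step 10 names Pareto Analysis as one possible method. As a minimal sketch of that technique, the function below ranks suspected causes by frequency and keeps the most frequent ones until their cumulative share reaches a cut-off. The cause labels and the 80% cut-off are assumptions chosen for illustration.

```python
from collections import Counter


def pareto(causes, cutoff=0.8):
    """Return (cause, count, cumulative_share) for the most frequent causes,
    stopping once the cumulative share reaches `cutoff`."""
    counts = Counter(causes)
    total = sum(counts.values())
    vital, cumulative = [], 0
    for cause, n in counts.most_common():
        cumulative += n
        vital.append((cause, n, round(cumulative / total, 2)))
        if cumulative / total >= cutoff:
            break
    return vital


# Hypothetical cause labels attached to ten incidents.
observed = ["config"] * 5 + ["disk"] * 3 + ["code"] + ["training"]
for row in pareto(observed):
    print(row)
```

In this made-up sample, two causes account for eight of the ten incidents, which suggests where investigation effort would be best spent first.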

Step 12: Document Possible Solution(s)
Responsibility: Problem Owner(s)
Analyze, compare and evaluate alternatives, including permanent solutions, temporary solutions or a decision not to fix. Possible resources include:
- Vendor troubleshooting guides and documents
- Internet searches
- Vendor trouble tickets
- Identified workarounds and fixes
- User group recommendations
Draft the following documents based on the severity and type of problem:
- Cost benefit analysis
- Risk analysis
- Impact analysis
- Criteria for success

Step 13: Present Solution Recommendation
Responsibility: Problem Owner(s)
Problem Owner(s) present recommendation(s) for solving the problem to the team. The following items may be included:
- Cost benefit analysis
- Risk analysis
- Impact analysis
- Criteria for success

Step 14: Approve?
Responsibility: Problem Management Team
Evaluate and approve the recommendation(s) using the following criteria:
- Objective analysis of problem
- Strategic direction
- Severity/visibility of problem
- Cost benefit analysis
Yes go to Step 15 / No go to Step 12

Step 15: Fix?
Responsibility: Problem Management Team
Yes go to Step 16 / No go to Step 25

Step 16: Determine Priority
Responsibility: Problem Management Team
Set the priority based on:
- Impact on the business
- Urgency of the problem
- Risk assessment
- Resource constraints

Step 17: Schedule?
Responsibility: Problem Management Team
Estimate level of effort and schedule the work to be completed or determine that the work cannot be scheduled at this time but should be monitored.
Yes go to Step 19 / No go to Step 18
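
Step 16 lists the factors that feed the priority but does not prescribe a scoring scheme. One common way to make such a decision repeatable is an impact/urgency lookup table. The sketch below assumes three-level scales, a five-level priority, and a one-level bump for high risk; all of these are illustrative assumptions, and resource constraints are left to the scheduling decision in Step 17.

```python
# Hypothetical scales: 1 = high, 2 = medium, 3 = low. The procedure names the
# factors but not a scoring scheme, so this matrix is an assumption.
PRIORITY_MATRIX = {
    (1, 1): 1, (1, 2): 2, (1, 3): 3,
    (2, 1): 2, (2, 2): 3, (2, 3): 4,
    (3, 1): 3, (3, 2): 4, (3, 3): 5,
}


def determine_priority(impact, urgency, high_risk=False):
    """Look up a priority (1 = highest) from impact and urgency, and raise it
    by one level when the risk assessment is high."""
    priority = PRIORITY_MATRIX[(impact, urgency)]
    if high_risk:
        priority = max(1, priority - 1)
    return priority


print(determine_priority(impact=2, urgency=3))
print(determine_priority(impact=2, urgency=3, high_risk=True))
```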

Step 18: Monitor the Error
Responsibility: Problem Owner(s), Problem Manager
Monitor the problem using the following tools:
- Log files
- Scripts
- Performance tools
Alert the Problem Manager if the problem recurs or there is a change in the impact of the problem.
Go to Step 16

Step 19: RFC Required?
Responsibility: Problem Owner(s)
Yes go to Step 20 / No go to Step 21

Step 20: Create Request for Change
Responsibility: Problem Owner(s)
Go to Perform Change Management Process

Perform Change Management Process (hand-off)

Step 21: Implement Solution
Responsibility: Problem Owner(s)

Step 22: Monitor Resolution
Responsibility: Problem Owner(s)
Monitor the resolution after implementation to make sure that the problem does not recur. Validate that there is a reduction in occurrences and severity and that other problems do not occur as a result of the fix.

Step 23: Error Resolved?
Responsibility: Problem Owner(s)
Yes go to Step 24 / No go to Step 8

Step 24: Permanent Solution?
Responsibility: Problem Owner(s)
Yes go to Step 25 / No go to Step 8

Step 25: Update and Close Problem
Responsibility: Problem Owner(s)
Update and close Problem, Known Error and associated Incident records in Remedy. Communicate problem resolutions and known errors to support team incident managers.

Step 26: Conduct Review
Responsibility: Problem Manager
Conduct Post-Implementation Review (PIR) to understand what was done well, what was done badly, how to do better next time, and how to prevent recurrence of the failure.
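
The "go to" rules in Steps 14 through 24 can be read as a small routing table. The sketch below transcribes the branches stated in the procedure text. Two points are assumptions rather than statements from the procedure: steps without an explicit "go to" are taken to proceed in numeric order, and Steps 26 and 31 are treated as end points.

```python
# Decision steps and their branches, transcribed from the procedure text.
DECISIONS = {
    14: {"yes": 15, "no": 12},  # Approve?
    15: {"yes": 16, "no": 25},  # Fix?
    17: {"yes": 19, "no": 18},  # Schedule?
    19: {"yes": 20, "no": 21},  # RFC Required?
    23: {"yes": 24, "no": 8},   # Error Resolved?
    24: {"yes": 25, "no": 8},   # Permanent Solution?
}

# Unconditional jumps stated in the procedure.
JUMPS = {18: 16, 29: 6}

# Assumption: these steps end a flow, so nothing follows them.
TERMINAL = {26, 31}


def next_step(step, answer=None):
    """Return the step that follows `step`, or None at the end of a flow."""
    if step in DECISIONS:
        if answer not in ("yes", "no"):
            raise ValueError(f"step {step} needs a 'yes' or 'no' answer")
        return DECISIONS[step][answer]
    if step in JUMPS:
        return JUMPS[step]
    if step in TERMINAL:
        return None
    return step + 1  # assumption: otherwise continue in numeric order


print(next_step(15, "no"))
print(next_step(18))
```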


Proactive Problem Management

Step 27: Collect, Review and Analyze Data
Responsibility: Problem Management Team
Possible data sources include:
- Incident data
- Problem data
- Known errors
- Industry data
- Performance data
- System management data
- Security data
- User data
Analyze data for similarities, repeat occurrences and trends. Review performance and industry data looking for potential problems and best practice solutions.

Step 28: Identify IT Enterprise Issues: People, Process, Technology
Responsibility: Problem Management Team
Identify specific components in the enterprise that are causing problems, such as:
- Failing or outdated equipment
- Insufficient CPU, memory or storage
- Poorly written code
- Incorrect configuration
- Inadequate user training

Step 29: Create and Review Problem
Responsibility: Problem Management Team
Create a Problem Report in Remedy.
Go to Step 6

Provide Management Reporting

Step 30: Generate Updated Problem Report
Responsibility: Problem Manager
Provide reports on open problems, resolved problems, and known errors. Provide reports on problem management service levels and service metrics.

Step 31: Distribute Problem Reports, as required
Responsibility: Problem Manager
Distribute reports on open problems, resolved problems, and known errors to the service desk and incident managers. Distribute reports on problem management service levels and service metrics to management.

7. EXIT CRITERIA
- Known Error database updated
- Request for Change (RFC), if required
- Problem records updated with known errors, solutions and/or workarounds
- Closed problem records once root cause is eliminated
- Management information

8. PROBLEM MANAGEMENT SERVICE LEVELS


- An analysis of incidents and identification of problems will occur at least monthly
- Reporting of problem management activity and performance metrics will occur at least monthly


9. PROBLEM MANAGEMENT SERVICE METRICS


Problem management service metrics will be collected and calculated for each support group, unit, and for OAAIS overall.

Category: Number of problems
Description: Measures problem management workload and effectiveness in resolving open problems.
Measure: Number of open and resolved problems.
Target: Establish baseline of problem management volume. Monitor trends over time.

Category: Average time to close a problem
Description: Measures effectiveness of the problem management process in resolving problems. Reductions indicate better processes, tools and training are being used. Increases indicate an ineffective process or that problem management may be under-resourced.
Measure: Number of days from problem creation to problem close for problems closed in the current period.
Target: Establish problem close baseline. Monitor trends over time.

Category: Number of incidents resolved by Known Errors
Description: Indicator of the time and re-work eliminated by managing and resolving problems. An increase in the number of resolved incidents should improve service levels and customer satisfaction.
Measure: Number of incidents that are closed by solutions registered in the Known Errors database.
Target: Establish incidents with Known Errors baseline and monitor trends over time.
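
The first two measures above can be computed directly from problem records. The following is a minimal sketch under stated assumptions: the record layout, the field names and the sample dates are hypothetical and do not reflect a Remedy export format, and the reporting period is treated as inclusive at both ends.

```python
from datetime import date

# Hypothetical problem records; field names and dates are illustrative.
problems = [
    {"id": "PBM1", "created": date(2009, 10, 1), "closed": date(2009, 11, 10)},
    {"id": "PBM2", "created": date(2009, 11, 2), "closed": date(2009, 11, 12)},
    {"id": "PBM3", "created": date(2009, 11, 5), "closed": None},
    {"id": "PBM4", "created": date(2009, 9, 1), "closed": date(2009, 10, 15)},
]


def count_open_and_resolved(records):
    """Return (open_count, resolved_count) for the given problem records."""
    open_count = sum(1 for r in records if r["closed"] is None)
    return open_count, len(records) - open_count


def average_days_to_close(records, period_start, period_end):
    """Mean days from creation to close for problems closed within the
    period (inclusive), or None when nothing was closed in that period."""
    durations = [
        (r["closed"] - r["created"]).days
        for r in records
        if r["closed"] is not None and period_start <= r["closed"] <= period_end
    ]
    return sum(durations) / len(durations) if durations else None


print(count_open_and_resolved(problems))
print(average_days_to_close(problems, date(2009, 11, 1), date(2009, 11, 30)))
```

Running the same functions per support group and per unit would give the breakdown described at the start of this section.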
