ITIL FOUNDATION
- January 2004 -
Contents
Introduction
IT Service Management
  IT Service Management Background
    Services and Quality
    Quality Assurance
    ISO-9000
    Organisational Maturity
    CMM
  Organisation and Policies
    Vision, objectives and policies
    Planning Horizon
    Culture
    Human Resource Management
    IT Customer Relationship Management
  Processes
    Processes and departments
    IT Service Management
Introduction to ITIL
  Background
    Advantages to the Customer/End User
    Advantages to the IT Organisation
    Potential disadvantages
  Organisations
    OGC (CCTA)
    ITSMF
    EXIN and ISEB
  The ITIL Books
    ITIL (IT Infrastructure Library)
    Business Perspective
    Service Delivery
    Service Support
    IT Infrastructure Management
    Applications Management
    Management and Organisation
    Planning and Implementation
Service Desk
  Introduction
  Objective
  Process Description
  Activities
    Activities
    Incident control
  Roles
  Relationships
  Benefits
    Summary
  Common Problems
  Metrics
  Service Desk Structure - Best Practice
    Structure of a Service Desk
    Centralised Service Desk
    Virtual Service Desk
  Essential Terms
Incident Management
  Introduction
  Objective
  Process Description
  Activities
    Incident detection and recording
    Classification and initial support
    Investigation and diagnosis
Introduction
In recent decades IT developments have changed the way that most businesses operate. The changes are most evident in the business processes of an organisation. An example of a business process is the sales process: Marketing generates leads and passes them to Sales, Sales develops relationships and prepares proposals, and Administration prints and sends the material to the client and ensures that the salesperson has an action to follow up the proposal. All of these business processes rely a great deal on computer-based tools and the underlying technology.
Since the introduction of the PC, LAN, client/server technology and the Internet, organisations can bring their products and services to market faster than ever before. These developments are responsible for the transition from the industrial age to the information age. In the information age, everything has become faster and more dynamic. Traditional hierarchical organisations often find it difficult to respond to rapidly changing markets, and there is therefore a trend towards less hierarchical and more flexible organisations. The emphasis is now on horizontal processes, and decision-making authority is increasingly granted to personnel at a lower level. The operational and tactical processes of IT Service Management were developed against this background, where the process is paramount and the focus moves away from the "silo" departmental or functional structure.
In the 1980s, the quality of the IT services provided to the British government was such that the CCTA (Central Computer and Telecommunications Agency, now the Office of Government Commerce, OGC) was instructed to develop an approach for the efficient and cost-effective use of IT resources by ministries and other British public sector organisations. The aim was to develop a framework that was independent of any vendor or supplier. This resulted in the Information Technology Infrastructure Library (ITIL). ITIL v1 grew from a collection of best practices observed in the IT service industry. The ITIL framework provides a detailed description of a number of important IT practices, with comprehensive checklists, tasks, procedures and responsibilities which can be tailored to any IT organisation. Where possible, these practices have been defined as processes covering the major activities of IT service organisations. The broad subject area
Special note: The most widely known of the other frameworks based on ITIL is the Microsoft Operations Framework. Microsoft does not try to conceal that its proprietary framework has its basis in ITIL. In fact, it praises ITIL as an excellent starting point for organisations with Microsoft environments. What Microsoft has done, however, is extend ITIL and create a series of other processes and specific concepts.
ITIL is often referred to as Best Practice, although the relatively new term of "good practice" is starting to be widely used.
Note: The term "best practice" will often spark heated debate among some IT professionals. If you are not certain then the term "good practice" may be a safer option to use. The ITIL material claims that it is Best Practice.
The broader adoption of ITIL has been hampered by the lack of a basic, but effective introductory textbook. This course is the electronic version of this missing text. The course is beneficial for anyone involved in IT Service Management.
This edition of this Course is based upon The Art of Service's Course material, developed as an introduction to IT Service Management. That work was based on management summaries and descriptions in official ITIL publications. Given the desire for a broad consensus in the ITIL field, new developments, additional material and contributions from ITIL professionals are welcome. They will be discussed by the editors and where appropriate incorporated into new editions. Given the rapid changes in this field, the ITIL books do not always describe the latest developments. This is because ITIL is primarily a collection of best practices taken from a variety of people in a wide cross section of industry. When writing this Course we aimed to
Before you begin: Thank you for making the choice to study with us. This course is prepared and supported by experienced and fully qualified IT Service Managers. You can ask for help and explanations at any time. We have set tests throughout the material to ensure that you are getting the best value for the money you pay. We know that you will enjoy the journey you are about to begin. We hope to hear from you in the future. We like to get the stories about how this learning exercise helped you to make significant improvements in your own IT Service Management challenges.
IT Service Management
This chapter introduces the following subjects:
The section on the provision of services and quality addresses the relationship between the quality experienced by the customer's organisational end users, and quality management by the provider of the IT services.
The section on organisation and policies addresses concepts such as vision, objectives, and policies and discusses issues such as planning, corporate culture and Human Resource Management. This section also discusses the coordination between the business processes of a company and the IT activities.
The section on process management looks at the control of IT service processes.
The quality of a service refers to the extent to which the service fulfils the requirements and expectations of the customer. To be able to provide quality, the supplier should continuously assess how the service is perceived and what the customer expects in the future. What one customer considers normal, another may well consider a special requirement. Eventually a customer will get used to something considered special at the start. The results of the assessment can be used to determine if the service should be modified, if the customer should be provided with more information, or if the price should be changed.
Quality is the totality of characteristics of a product or service that bear on its ability to satisfy stated and implied needs (ISO 8402).
Reasonable costs may be considered as a derived requirement. Once it has been agreed what is to be expected of the service, the next step is to agree on what it may cost. At this stage the service provider has to be aware of the costs they incur, and the current market rates for comparable services.
A customer will be dissatisfied with a service provider who occasionally exceeds expectations but disappoints at other times. Providing a constant quality is one of the most important, but also one of the most difficult, aspects of the service industry. For example, a restaurant will have to purchase fresh ingredients, the chefs will have to work together to provide consistent results, and there should be no major differences in style among the waiting staff. A restaurant will only be awarded three Michelin stars when it manages to provide the same high quality over an extended period. This does not often happen: there are changes among the waiting staff, a successful approach may not last, and chefs leave to open their own restaurants.
Providing a constant high quality also means that the component activities have to be coordinated: the more efficiently the kitchen operates, the more quickly the guests can be served. Therefore, when providing a service, the overall quality is the result of the quality of a number of component processes that together form the overall service. These component processes form a chain: a series of linked activities. Effective coordination of the component processes requires not only high quality at each stage, but also consistent quality.
Quality Assurance
Supplying products or services requires activities. The quality of the product or service depends largely on the way in which these activities are organised. Deming's quality life cycle provides a simple and effective model to address quality. The model assumes that to provide appropriate quality, the following steps must be undertaken repeatedly:
Plan: Who should do what, when, how, using what?
Do: Implementation of the planned activities.
Check: Determine if the activities provided the expected result.
Act: Adjust the plans based on information gathered while checking.
To be able to make use of this life-cycle approach, the activities of supplying products and services must be divided into processes, each with their own plans and opportunities for conducting checks. It must be clear who is responsible in the organisation and what authority they have to change plans and procedures, not only for each of the activities, but also for each of the processes.
Quality management is the responsibility of everyone working in the organisation providing the service. Every employee has to be aware of how their contribution to the organisation affects the quality of the work provided by their colleagues, and eventually the services provided by the organisation as a whole. Quality management also means continuously looking for opportunities to improve the organisation and implementing quality improvement activities.
Quality assurance is a policy matter within the organisation. It is the whole of the measures and procedures which the organisation uses to ensure that the services provided continue to fulfil the expectations of the customer and the relevant agreements. Quality assurance ensures that improvements resulting from quality management are maintained.
A quality system is the organisational structure related to responsibilities, procedures and resources for implementing quality management. The ISO 9000 series of standards is often used to develop, define, assess and improve quality systems.
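The repeated Plan, Do, Check, Act steps can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name run_pdca, the help desk scenario, the waiting-time formula and the "add one member of staff" rule are all hypothetical and do not come from ITIL or from Deming.

```python
from typing import Callable, List, Tuple


def run_pdca(
    plan: float,
    do: Callable[[float], float],
    expected: float,
    adjust: Callable[[float, float], float],
    rounds: int,
) -> List[Tuple[float, float, float]]:
    """Repeat the cycle and return (plan, result, gap) for every round."""
    history = []
    for _ in range(rounds):
        result = do(plan)            # Do: carry out the planned activity
        gap = result - expected      # Check: compare the result with the expectation
        history.append((plan, result, gap))
        plan = adjust(plan, gap)     # Act: revise the plan used in the next round
    return history


# Hypothetical scenario: the plan is the number of help desk staff and the
# measured result is the average waiting time in minutes.
def waiting_time(staff: float) -> float:
    return 60.0 / staff


def add_staff_if_slow(staff: float, gap: float) -> float:
    # Add one person while the waiting time is above the expected value.
    return staff + 1 if gap > 0 else staff


history = run_pdca(plan=2, do=waiting_time, expected=10.0,
                   adjust=add_staff_if_slow, rounds=6)
for staff, minutes, gap in history:
    print(f"staff={staff} waiting={minutes:.1f} gap={gap:.1f}")
```

The point of the sketch is that the plan is never fixed: each round feeds the outcome of the check back into the next plan, which is what "undertaken repeatedly" means in the model.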
Dr. W. Edwards Deming was an American statistician brought to Japan by General Douglas MacArthur after the Second World War to help rebuild the destroyed economy. He had developed theories about the best possible use of expertise and creativity in organisations in the United States in the 1930s, but because of the Depression his ideas were not accepted in the US. However, his optimisation methods were successfully adopted in Japan.
Some of Deming's typical statements:
The customer is the most important part of the production line.
It is not enough to have satisfied customers; the profit comes from returning customers and those who praise your product or service to friends and acquaintances.
The key to quality is to reduce variance.
Break down barriers between departments.
Managers should learn to take responsibility and provide leadership.
Improve constantly.
Institute a vigorous program of education and self-improvement.
Institute training on the job.
The transformation is everybody's job.
ISO-9000
Some organisations require their suppliers to hold an ISO 9001 or ISO 9002 certificate. Such a certificate proves that the supplier has an adequate quality system whose effectiveness is regularly assessed by an independent auditor. ISO is the International Organization for Standardization. A quality system that complies with the ISO standard testifies that the supplier has taken measures to be able to provide the quality required by their customers;
An ISO certificate does not provide an absolute guarantee about the quality of the service provided. However, it does indicate that the supplier takes quality assurance seriously and is prepared to discuss it. The new ISO 9000 series of standards, ISO 9000:2000, puts even greater emphasis than the previous standard on the ability of an organisation to learn from experience and to implement continuous quality improvement.
Organisational Maturity
Experience with improving the quality of IT services has shown that it is rarely sufficient to simply define current practices. The causes of a mismatch between the service provided and the customer's requirements are often related to the way in which the IT organisation is managed. A permanent quality improvement focus demands a certain degree of maturity of the organisation. The European Foundation for Quality Management (EFQM) model can be useful in determining the maturity of an organisation. It identifies the major areas to be considered when managing an organisation. Deming's Quality Life-Cycle is incorporated in the EFQM model. Based on the outcomes from "result areas", actions are taken (strategy, policies). These actions serve to underpin the planning (e.g. the structure of the processes) that should lead to the desired results. The EFQM identifies nine areas. In 1988 fourteen large European companies, with the support of the European Commission, set up the European Foundation for Quality Management.
CMM
In the IT industry, process maturity improvement is best known in the context of the Capability Maturity Model (CMM). The Software Engineering Institute (SEI) of Carnegie Mellon University developed this process improvement method. CMM is concerned with improving the maturity of software creation processes. CMM includes the following levels:
Initial - the processes occur ad hoc.
Repeatable - the processes have been designed such that the service quality should be repeatable.
Defined - the processes have been documented, standardised and integrated.
Managed - the organisation measures the results and consciously uses them to improve the quality of its services.
Optimising - the organisation consciously optimises the design of its processes to improve the quality of its services, or to develop new technology or services.
Maturity models based on the CMM levels of maturity have also been developed for IT Service Management. When assessing the maturity of an organisation, we cannot restrict ourselves to the service provider. The level of maturity of the customer is also important. If there are large differences in maturity between the supplier and the customer, then these will have to be considered to prevent a mismatch in the approach, methods and mutual expectations. Specifically, this affects the communication between the customer and the supplier. It is advisable to bring both organisations to the same level of development, and to operate at that level, or to adjust the communication in line with the lower level.
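The advice to communicate in line with the lower of the two maturity levels can be sketched in Python. This is a minimal illustration, not part of CMM or ITIL: the names MaturityLevel, communication_level and maturity_gap are hypothetical, and the only facts taken from the text are the five level names and their order.

```python
from enum import IntEnum


class MaturityLevel(IntEnum):
    """The five CMM levels listed above, ordered from least to most mature."""
    INITIAL = 1
    REPEATABLE = 2
    DEFINED = 3
    MANAGED = 4
    OPTIMISING = 5


def communication_level(supplier: MaturityLevel,
                        customer: MaturityLevel) -> MaturityLevel:
    # Hypothetical helper: communicate at the lower of the two levels.
    return min(supplier, customer)


def maturity_gap(supplier: MaturityLevel, customer: MaturityLevel) -> int:
    # Hypothetical helper: how many levels separate the two organisations.
    return abs(supplier - customer)


supplier = MaturityLevel.MANAGED
customer = MaturityLevel.REPEATABLE
print(communication_level(supplier, customer).name)
print(maturity_gap(supplier, customer))
```

A large gap is the signal described in the text: either both organisations are brought to the same level, or communication is adjusted to the lower one.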
Planning Horizon
When considering the policies and planning of an IT department, we should be aware of the links between planning for the business as a whole and the application systems and the technical infrastructure used by the business. When planning the network and applications of a business, the IT department will have to stay ahead of the overall planning to ensure that the business has an IT infrastructure within which it can develop and grow. The figure below gives an example of the links between the various plans.
Technical infrastructure has the longest planning horizon, and in its support role it has fewer clear links with the substantive business activities. It takes time to develop a technical infrastructure, and the fact that information systems and the business depend on the technical infrastructure limits the speed at which changes can be implemented. Furthermore, developing a technical infrastructure demands significant investment, and the period over which it can be depreciated has to be considered.
The planning horizon is shorter for applications as they are designed for specific business purposes. Application life cycle planning is primarily based on the business functions to be provided by the system, after which the underlying technology is considered.
Business plans, based on the organisation's strategy, normally cover one calendar or financial year. Budgets, planning and progress reports all fall within this period. In some markets, the planning cycle time has become even shorter as the cycle time for product development has also decreased.
Planning should address four elements:
Time - this is the easiest factor to determine. It is defined by a starting date and ending date, and is often divided into stages.
Quantity - the objectives have to be made measurable to monitor progress. Terms such as "improved" and "quicker" are insufficient for planning purposes.
Quality - the quality of the deliverables (results) should be appropriate for the objective.
Costs and revenues - the deliverables must be in proportion to the expected costs, efforts and revenues.
Differences between the planning horizons occur not only between areas, but also between the various levels of activities and processes (strategic, tactical and operational).
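The four planning elements can be captured in a small record, which also makes the point about measurability concrete. The sketch below is only an illustration: the class PlanObjective, its field names and the example figures are hypothetical and are not taken from ITIL.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PlanObjective:
    description: str
    start: date                     # Time: starting date
    end: date                       # Time: ending date
    target_value: Optional[float]   # Quantity: the measurable target, if any
    unit: Optional[str]             # Quantity: the unit the target is measured in
    quality_criterion: str          # Quality: what makes the deliverable acceptable
    expected_cost: float            # Costs
    expected_revenue: float         # Revenues

    def is_measurable(self) -> bool:
        # A term such as "quicker" without a target and a unit cannot be monitored.
        return self.target_value is not None and self.unit is not None

    def duration_days(self) -> int:
        return (self.end - self.start).days

    def net_benefit(self) -> float:
        return self.expected_revenue - self.expected_cost


# Hypothetical objectives: one vague, one measurable.
vague = PlanObjective("Quicker incident resolution", date(2004, 1, 1),
                      date(2004, 12, 31), None, None,
                      "agreed with the customer", 30000.0, 50000.0)
measurable = PlanObjective("Resolve incidents within a target time",
                           date(2004, 1, 1), date(2004, 12, 31), 4.0, "hours",
                           "agreed with the customer", 30000.0, 50000.0)
print(vague.is_measurable(), measurable.is_measurable())
```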
Culture
Organisations that want to change (for example to improve the quality of their services) will eventually be confronted with the current organisational culture. The organisational culture, or corporate culture, refers to the way in which people deal with each
Processes
When arranging activities into processes, we do not use the existing allocation of tasks, or any existing departmental (functional) divisions. Instead, we look at the objective of each process and its relationships with other processes. This is a conscious choice. By opting for a process structure, we can often show that certain activities in the organisation are uncoordinated, duplicated, neglected, or unnecessary. A process is a logically related series of activities for the benefit of a defined objective. A process is a series of activities carried out to convert an input into an output. We can label the input and output of each of the processes with quality characteristics and standards (that is, what is expected to go into and what is expected to come out of the particular process). These characteristics and standards provide information about the
IT Service Management
IT Service Management is primarily known as the process- and service-focused approach to what was initially known as IT Management. In this chapter we have explained that processes should always have a defined objective. The objective of IT Service Management processes is to contribute to the quality of the IT services. Quality management and process control form part of the organisation and its policies. In a process-focused approach we also have to consider the situation within an organisation (policies, culture, size, etc.). ITIL, the best known approach to IT Service Management, does not dictate how an organisation should be structured. ITIL cleverly describes the relationships between the activities in processes, which are relevant to any organisation. ITIL also provides a framework that allows experiences to be shared between different organisations, as it offers a common language.
Introduction to ITIL
This chapter describes the structure and objectives of the IT Infrastructure Library (ITIL) and the organisations that contribute to maintaining ITIL as the best practice standard for IT Service Management.
Background
ITIL was developed because a growing number of organisations are becoming increasingly dependent on IT to help fulfil their corporate objectives. This increasing dependence has resulted in a growing need for IT services of a quality corresponding to the objectives of the organisation, and which meet the requirements and expectations of the customer. Over the years, the emphasis has shifted from the development of IT applications to the management of IT services. An IT application (sometimes referred to as an information system) only contributes to realising corporate objectives if the system is available to users and, in the event of a fault or necessary modifications, it is supported by maintenance and operations. In the overall life cycle of IT products, the operations phase accounts for about 70 to 80% of the overall time and cost; the rest is spent on product development (or procurement). Effective and efficient IT Service Management is essential to the success of IT applications. This applies to any type of organisation, large or small, public or private, with centralised or decentralised IT services, with internal or outsourced IT services. In all cases, the service has to be reliable, consistent, of a high quality, and at an acceptable cost. IT Service Management addresses the provision and support of IT services tailored to the needs of the organisation. ITIL was developed to disseminate IT Service Management best practices systematically and cohesively. The approach is based on service quality and developing effective and efficient processes. ITIL offers a common framework for all the activities of the IT department, as part of the provision of services, based on the IT infrastructure. These activities are divided into processes, which provide an effective framework for further enhancement of IT Service Management.
Each of these processes covers one or more tasks of the IT department, such as service development, infrastructure management, and supplying and supporting the services. This process approach makes it possible to describe the IT Service Management best practices independently from the actual organisational structure of the department. Many of these best practices are clearly identifiable and are indeed used, to some extent, in most IT organisations. ITIL presents these best practices coherently. The ITIL
Potential disadvantages:
- The introduction can take a long time and significant effort, and requires a change of culture in the organisation. An over-ambitious introduction can lead to frustration because objectives are never met.
- If process structures become an objective in themselves, the service quality may be adversely affected. In that case, procedures become bureaucratic obstacles that are avoided where possible.
- There is no improvement due to a lack of understanding about what processes should provide, what the performance indicators are, and how processes can be controlled.
- Improvement in the provision of services and cost reductions are not visible.
- A successful implementation requires the involvement and commitment of personnel at all levels in the organisation. Leaving the development of the process structures to a specialist department may isolate that department in the organisation, and it may set a direction that is not accepted by other departments.
- If there is insufficient investment in support tools, the processes will not be done justice and the service will not be improved. Additional resources and personnel may be needed if the organisation is already overloaded by routine IT Service Management activities.
ITIL was obviously developed to capitalise on the advantages. It is appropriate to recognise and acknowledge that there can be problems with the adoption of ITIL practices. However, the ITIL Framework itself provides many suggestions for recognising and therefore preventing such problems, and for solving them should they occur.
Organisations
OGC (CCTA)
ITIL was originally a CCTA product. CCTA was the Central Computer and Telecommunications Agency of the British government. On the 1st April 2001, the CCTA was
ITSMF
The Information Technology Service Management Forum (ITSMF), originally known as the Information Technology Infrastructure Management Forum (ITIMF), was set up in the UK in 1991. The Dutch ITSMF (ITSMF Netherlands) was the next chapter, set up in November 1993. In 2001 it had over 500 member organisations, both suppliers and user groups. There are now ITSMF chapters in countries such as South Africa, Belgium, Germany, Austria, Switzerland, the United States, and Australia, which participate in the ITSMF International group.
The itSMF promotes the exchange of information and experiences that enable IT organisations to improve the services they provide. It organises symposiums, congresses, special subject evenings, and other events about current IT Service Management subjects. Working parties also contribute to the development of the subject. The association publishes a newsletter and operates a web site with information about its activities (http://www.itsmf.com).
The certification system is based on the requirements for effectively fulfilling the relevant role within an IT organisation. To date, certificates have been awarded to over 30,000 IT professionals in over 30 countries.
In this chapter, we will introduce the ITIL series of publications using the main elements of the ITIL puzzle. By the end of 2002, the original set of books, each on a specific aspect of IT Service Management, should have been replaced by six new ITIL books; the books on Service Support and Service Delivery have already been replaced. However, many of the best practices to be described in the new books are also included in the current ITIL series. For more information, see www.itil.co.uk
Business Perspective
The ITIL books in the Business Perspective Set describe many issues related to understanding and appreciating IT services as an integrated aspect of managing a business.
Service Support
The ITIL book on Service Support describes how a customer can get access to the appropriate services to support their business. This book covers the following subjects:
- Service Desk
- Incident Management
- Problem Management
- Configuration Management
- Change Management
- Release Management
Service Desk
The Service Desk is the initial point of contact with the IT organisation for users. Previously, the ITIL books referred to it as the Help Desk. The major task of the Help Desk was recording, resolving and monitoring problems. A Service Desk can have a broader role (for example receiving Requests for Change) and it can carry out activities belonging to several processes. The new book on Service Support now distinguishes between the Service Desk (i.e. as a function or organisational unit), and processes such as Incident Management, Configuration Management and Change Management.
Incident Management
The distinction between "incidents" and "problems" is possibly one of the best-known discussion points in the ITIL field. There are clear differences and by understanding both processes the reasons become very clear. Although the difference may be confusing, there is a major advantage in that a distinction is made between the rapid return of the service (the goal of Incident Management) and identifying and remedying the cause of an incident (the goal of Problem Management).
Note that your end users may still refer to what they experience as a "problem". It may not be appropriate to educate your users that what they are experiencing is in fact an "incident". This is a classic example of when the framework needs to be interpreted sensibly.
Configuration Management
Configuration Management addresses the control of a changing IT infrastructure (standardisation and status monitoring), identifying configuration items (inventory, mutual links, verification and registration), collecting and managing documentation about the IT infrastructure and providing information about the IT infrastructure to all other processes.
Change Management
Change Management addresses the controlled implementation of changes to the IT infrastructure. The objective of the process is to determine the required changes, and how they can be implemented with a minimum adverse impact on the IT services, by effective consultation and coordination throughout the organisation. Changes are made in consultation with the status monitoring activities of Configuration Management, the customer organisation, Problem Management and several other processes. Changes are implemented by following a path of definition, planning, building and testing, accepting, implementing and evaluating.
Release Management
The actual implementation of changes is often carried out through Release Management activities. Both hardware and software (central processing, data communications and clients) changes are often planned on the basis of releases. The main objective of Release Management is to control the distribution of hardware and software, including integration, testing and storage.
Release Management ensures that only tested and correct versions of authorised software and hardware are provided. Release Management is closely related to Configuration Management and Change Management activities.
Network Services Management
The Network Services Management process addresses planning and controlling communications networks. These include telephone systems, LAN and WAN networks. The ITIL module for Network Services Management also addresses the long-term communications needs of the organisation. In essence, it describes how the ITIL best practices can be applied in a network environment.
Operations Management
Computer Operations Management (Computer Operations) addresses the management of computer hardware and systems software, including mainframes and midrange systems, but also file servers, to ensure that the agreed Service Levels are provided. The ITIL book of the same title concentrates on production tasks in an environment of large computer systems.
Management of Local Processors
The Management of Local Processors process addresses management operations at decentralised sites. The objective is to support the provision of IT services at the user site. This specifically includes making agreements about the activities of various processes (especially the processes to support IT services) when IT services are provided at multiple locations. A good definition of responsibilities is important to optimise the service.
Computer Installation and Acceptance
Computer Installation and Acceptance primarily concerns guidelines for planning the acceptance, installation and eventual removal of large computer hardware in the IT infrastructure. These guidelines are a development/extension of the activities in the Change Management and Release Management processes.
Applications Management
The ITIL book on Applications Management will address the relationship between management and the software lifecycle. This includes issues such as Software Lifecycle Support and testing IT services. A major issue in Applications Management is effectively responding to changes in the business. Clearly defining the requirements and implementing a solution that meets the needs of the customer organisation is paramount.
Software Lifecycle Support
Software Lifecycle Support aims to define the approach for supporting the entire software lifecycle, in consultation with those responsible for software development. The way in which software is designed, built, tested, introduced, operated, maintained, and eventually decommissioned, is extremely important in IT services. In every stage of the software lifecycle, there have to be agreements between those developing and those operating the IT infrastructure. The selection of Software Lifecycle models can have a significant impact on the IT services.
Testing an IT Service for Operational Use
The objective of testing an IT Service for Operational Use is to ensure that the proper operation of new or modified IT services is tested before they enter operations. A system test, installation test and acceptance test are undertaken to determine if the developed application works, is correctly installed, interfaces with the rest of the IT infrastructure, and offers the "users" the functions agreed with the "customer".
Service Desk
Introduction
Here is a familiar situation... You have just started a complicated job, which requires all your concentration. The phone rings... Someone is having computer difficulty! The printer is not working. You solve the issue, and just when you are back into the job, someone walks into your office asking when his or her expected upgrade will be done. One hour later you still haven't started your job.
Objective
The Service Desk offers "first line" support to users. Users need help if they are not sure how to behave in a specific situation when using IT services or when they need assistance to solve a particular issue involving IT. As well, the Service Desk is the central point of contact where incidents or inaccuracies in IT systems can be reported. The Service Desk is the face of the IT department to the clients. Furthermore, the Service Desk is an important source of management information.
Objectives of the Service Desk:
- To provide a single point of contact for Customers
- To facilitate the restoration of normal operational service with minimal business impact on the Customer within agreed service levels and business priorities
Process Description
The Service Desk is not regarded as a process within ITIL but as a function. As IT has become a greater part of business over the years, the role of the Service Desk has become crucial. Businesses rely on their IT service to stay on top of the market and be competitive. The service provided by the Service Desk tends to be broader than just the IT part of the business, hence the change in name from Helpdesk (which was more IT related) to Service Desk. It plays a vital role in IT Service Management: from a customer point of view the Service Desk is the IT Service Provider, and it therefore plays a critical part in how customers perceive the IT organisation as a whole. Among the activities performed by the Service Desk are Incident recording and Incident control. These used to be part of the Helpdesk process but are now included in the process called Incident Management (covered in a later module).
Activities
The Service Desk has a number of primary responsibilities. These are:
1. Keep users informed by...
2. Providing information, in the form of:
- The status of their incident
- Planned changes in IT Service
- Likely disruptions in the IT Service
- Any changes or additions in the services that are provided or the service levels.
The obvious benefit of proactively providing service information is this: if the customers know beforehand that a specific service (e.g. email) will be unavailable during the lunch hours, you (a) reduce the number of calls received and (b) reduce the impact of the disruption. The end result will be fewer angry or annoyed customers.
3. Provide management information
By logging all calls, the Service Desk can provide information to management about the number of calls per category, the number of calls resulting from a change, and so on. In case of a disruption in the IT Service, the Service Desk is also likely to be the first to know. This makes the Service Desk an excellent first point of contact for users, customers and management alike.
The objective is to produce reports from which management can make decisions and measure performance based on agreed service levels and deliverables.
4. Conduct customer satisfaction analysis and surveys
5. Recording and controlling of incidents
Incident control
The Service Desk is responsible for recording all incidents and then controlling them. Incidents can reach the Service Desk, and be recorded, in different ways:
- Phone
- E-mail
- Internet
- Fax
- Personal visit
Roles
The new Service Desk tends to be more than just the place to lodge calls related to IT. It has a role to provide and improve the service to the business in general. The changing role is that the Service Desk is more customer focused, whereas the traditional Help Desk tended to be more technical in nature. As there are different types of Service Desk models, the skills required by the Service Desk staff must also be carefully analysed. Interpersonal skills are among the most important. Technical skills become more important when the Service Desk becomes more skilled and aims to solve most of the incidents without rerouting them to higher levels of support.
Relationships
Being the single point of contact for IT Service, the Service Desk has a link with all processes within ITIL. With some processes the link is clearer than with others.
The Service Desk is, in fact, an operational aspect of the important process of Incident Management, e.g. incident control. The Service Desk registers and "controls" Incidents. Incidents can be related to Configuration Items. If this link is supported by software, it becomes a powerful aid in identifying weak links in the IT infrastructure. It also allows the Service Desk staff to solve incidents quickly by searching on a Configuration Item, category and/or error code and applying a previously used solution.
Note: A Configuration Item (or C.I.) is discussed in the Configuration Management process section. A C.I. is an item that we want to store information about.
In some cases the Service Desk makes some minor changes itself and so has a link with Change Management and Release Management.
The link between the Service Desk and Service Level Management can be illustrated by the Service Desk monitoring Incident levels and reporting whether the IT service is restored within the limits defined in Service Level Agreements (SLAs). The Service Desk will report to Service Level Management if the IT Service is not restored within the agreed time frames; this relies on escalation procedures being properly defined and adhered to.
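The SLA reporting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Incident` record, its field names, the `sla_breaches` helper and the four-hour limit are all hypothetical and are not taken from ITIL or from any particular Service Desk tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    # Hypothetical record layout; a real Service Desk tool will differ.
    ref: str
    reported: datetime
    restored: Optional[datetime]  # None while the service is still down


def sla_breaches(incidents, limit, now):
    """Return the refs of incidents not restored within `limit` of being reported."""
    breached = []
    for inc in incidents:
        # Open incidents are measured up to the current time.
        end = inc.restored if inc.restored is not None else now
        if end - inc.reported > limit:
            breached.append(inc.ref)
    return breached


incidents = [
    Incident("INC-1", datetime(2004, 1, 5, 9, 0), datetime(2004, 1, 5, 10, 30)),
    Incident("INC-2", datetime(2004, 1, 5, 9, 0), datetime(2004, 1, 5, 15, 0)),
    Incident("INC-3", datetime(2004, 1, 5, 11, 0), None),
]
# Assumed SLA: service restored within four hours of the report.
breaches = sla_breaches(incidents, timedelta(hours=4), datetime(2004, 1, 5, 16, 0))
print(breaches)
```

A list like this is the kind of input the Service Desk could pass to Service Level Management; the actual limits would come from the agreed SLAs.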
Benefits
The benefits from a properly implemented Service Desk flow across Users, Customers, IT Staff and the business as a whole. Note: The difference between a Customer and a User should be explained. A Customer is the person who may often be the representative of the business for the service provided and/or the person who funds the service. The user is the ultimate end user of the service provided.
Benefits for the Customer
- Easily defined metrics for measuring performance
Benefits for IT staff
- Being able to focus more on the job
- Greater efficiency
- Better use of the skills the staff has, resulting in motivated personnel
- Improved team work
Benefits for the business
- Source of useful management information
- Internal cost savings and productivity efficiencies
- Meet customers' business support requirements with responsiveness at all times, when and wherever it is needed
- Better use of available resources
Summary
An effective Service Desk will deliver overall cost reduction and increases in staff morale, service reliability and identification of business opportunities. All this leads to increases in Customer satisfaction ratings, with the associated improvement in perception this brings.
Common Problems
There is no denying that along with any move towards improvement there will be problems and barriers to success. Recognising this in advance goes a long way to solving the problem when (not "if") it comes up.
- Users do not call the Service Desk, but try to go around it to the person they know, the one who helped them so well the last time.
- Not all parties involved are informed about the Service provided and the Service Levels agreed upon, resulting in unrealistic expectations. This is a simple case of communication: make sure that all procedures, Service Levels, etc. are well documented and available to all parties involved, and periodically revisit them and re-educate the staff involved BEFORE issues arise. This is not to say that staff will use this information against users who require service; it simply means that users approach the Service Desk with realistic expectations. Naturally, everyone's problem is critical when it happens. When a user is irate or stressed, it is not appropriate to point out that the agreed service levels allow for a 48 hour response. This highlights the people skills that Service Desk staff need to have.
A well-marketed Service Desk is only half the job. Involve key customers in the process of implementing the Service Desk and reap the rewards from having their support in times of crisis.
Metrics
Metrics are essential in monitoring any IT Service provided. The Service Desk as a service for users is no different in this regard. Day-to-day reports can provide information on Calls and Incidents. For instance:
- How many and what kinds of incidents (classifications) the Service Desk solves at first point of call.
- How long calls last, how many there are and how long the waiting time is before a call is answered.
This information can be extracted from appropriate tools or from PABX records.
If standards are set before the Service Desk starts operating one can monitor the progress of the Service Desk. The crucial factor is in defining what it is that should be measured. It is realistic to start with a very simple set of metrics and possibly this is a better approach as it means that there is no time lost in creating a long series of reports that add no value to a customer. The easiest way to begin to define what metrics are required is to look to the Service Level Agreements (SLA's) that should define the response times, etc. for the Service Desk from the Customer perspective. However, even if all Service Levels are met the most important measurement for any Service organisation is the perception of the Customers of the service provided. Therefore Customer satisfaction should be measured regularly.
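As an illustration of a very simple starting set of metrics, the sketch below summarises one day's calls. It assumes a hypothetical `Call` record whose fields (category, waiting time, duration, solved at first call) would in practice be extracted from the Service Desk tool or PABX records; the names and sample figures are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Call:
    # Hypothetical fields; real values would come from the tool or PABX records.
    category: str
    wait_seconds: int        # time before the call was answered
    duration_seconds: int    # length of the call itself
    solved_first_call: bool  # resolved at first point of call?


def service_desk_metrics(calls):
    """Summarise a set of calls into a few simple metrics."""
    total = len(calls)
    if total == 0:
        return {"calls": 0}
    solved = sum(1 for c in calls if c.solved_first_call)
    return {
        "calls": total,
        "first_call_resolution_pct": 100.0 * solved / total,
        "avg_wait_seconds": sum(c.wait_seconds for c in calls) / total,
        "avg_duration_seconds": sum(c.duration_seconds for c in calls) / total,
    }


calls = [
    Call("printing", 20, 180, True),
    Call("email", 40, 300, False),
    Call("printing", 30, 120, True),
    Call("network", 30, 240, False),
]
metrics = service_desk_metrics(calls)
print(metrics)
```

Which of these figures is worth reporting depends on what the SLAs promise the Customer; the sketch only shows that a small set is cheap to produce.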
The "hybrid" model is also a genuine Service Desk structure that uses a combination of two or more Service Desk structures.
Virtual Service Desk
Hybrid models combine two or more of these particular structures into a customised solution for a particular organisation. This is a genuine structure as the ITIL Framework provides guidelines only for structure and is not a prescriptive solution book.
Interesting websites:
http://openview.hp.com/products/servicedesk/
http://www.interpromusa.com/hdicerti.htm
http://www.interpromusa.com/The Integrated Service Desk.pdf
http://www.itilworld.com/n-america/service-support_helpdesk.htm support/Service
Essential Terms
Escalation: When the time limit for resolving an incident has passed, the incident escalates into a problem (depending on the priority and impact of the incident) and a different level of support (Problem Management) comes into force and this:
- Edits the problem if necessary
- Determines the impact on the service delivery and with that, the priority.
The Service Desk continually informs clients about the progress of their calls.
Incident Management
Introduction
The Incident Management process contains activities that are aimed at restoring an IT service following a disruption. The Service Desk is usually the owner of this process; however, all support groups across an IT organisation will play their part. Disruption of an IT service, questions about the functionality of an application or requests for advice are all regarded as Incidents that are dealt with by this process. Requests for Change are handled in a similar way to Incidents, so they can also fall under Incident Management. A business may decide, however, to describe the handling of RFCs in a special procedure in order to keep Incidents and RFCs separated.
Note: An RFC is a Request for Change and will be dealt with in the Change Management process area. An RFC is the "trigger" that begins the Change Management process.
Objective
The objective of Incident Management is to restore normal operations as quickly as possible with the least possible impact on either the business or the user, and at a cost-effective price. The definition of "how quickly is quickly" should not be subject to interpretation. The timeframes for Incident resolution should be defined in the Service Level Agreements (SLAs) that exist between the IT Department and the customer. The speed of resolution will affect the cost. It is this cost-to-speed ratio that is often forgotten when a user faces problems. Issues that are low priority during negotiations are "somehow" escalated to the status of requiring high levels of attention when the issue occurs. Often support staff will simply respond to user pressure in such situations; the expectation is immediately adjusted, and anything less than an immediate response to this otherwise low priority issue is considered poor service (the IT Support dilemma!).
Process Description
As with every process, there is an input and an output. The main input to this process is Incidents. Incidents mostly come from users, but can also come from other sources such as management information or infrastructure monitoring and detection systems. The outputs of the process are RFCs, resolved and closed Incidents, management information and communication to the customer. This concept is illustrated in the following diagram. The centre diamond shows the activities of Incident Management.
Activities
The activities included in Incident Management are:
- Incident detection and recording
- Classification and initial support
- Investigation and diagnosis
- Resolution and recovery
- Incident closure
- Incident ownership, monitoring, tracking and communication
Impact
'Impact' is a measure of the business criticality of an Incident or Problem. Often this equates to the extent to which an Incident can lead to degradation of agreed service levels. Impact is often measured by the number of people or systems affected. Criteria for assigning impact should be set up in consultation with the business managers and formalised in SLAs. When determining impact, information in the Configuration Management Database (CMDB) should be assessed to detect how many users will suffer as a result of the technical failure of, for example, a hardware component. The Service Desk should have access to tools that enable it rapidly to:
- Assess the impact on users of significant equipment failures
- Identify users affected by equipment failure
- Establish contact to make them aware of the issue
- Give a prognosis
- Alert second-line (specialist) support groups, if appropriate
Urgency
'Urgency' is about the necessary speed in solving an Incident of a certain impact. A high-impact Incident does not, by default, have to be solved immediately. For example, a User having operational difficulties with his workstation (impact 'high') can have the fault registered with urgency 'low' if he is leaving the office for a fortnight's holiday directly after reporting the Incident. Urgency is seen as the degree to which the service is affected (stopped, partially affected, functionally changed). If a user calls with an Incident and they can't work (service stopped), then it is of greater urgency than a user calling to request a functionality change.
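Impact and urgency are assessed separately, but they are usually combined into a single priority that decides the order in which Incidents are handled. The sketch below shows one possible lookup table. The three-level scales and the priority numbers are assumptions made for illustration; in practice the criteria would be agreed with the business and formalised in the SLAs.

```python
# Assumed three-level scales; the text does not prescribe specific levels.
LEVELS = ("high", "medium", "low")

# Assumed lookup table: priority 1 is handled first, 5 last.
PRIORITY = {
    ("high", "high"): 1,
    ("high", "medium"): 2,
    ("high", "low"): 3,
    ("medium", "high"): 2,
    ("medium", "medium"): 3,
    ("medium", "low"): 4,
    ("low", "high"): 3,
    ("low", "medium"): 4,
    ("low", "low"): 5,
}


def priority(impact, urgency):
    """Look up the priority for an (impact, urgency) pair."""
    if impact not in LEVELS or urgency not in LEVELS:
        raise ValueError("impact and urgency must be one of: " + ", ".join(LEVELS))
    return PRIORITY[(impact, urgency)]


# The workstation example: high impact, but the user is away for a fortnight.
print(priority("high", "low"))
# A stopped service affecting many users that must be restored at once.
print(priority("high", "high"))
```

With this table the workstation Incident from the example is not handled ahead of a high-impact, high-urgency Incident, even though both have the same impact.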
Incident closure
For the Incident Management process to be effective, it is necessary that Incident closure is done properly. This step includes:
- Updating Incident details
- Communication with the user about the solution
To ensure that the solution provided meets the user's needs, the user is the only person who can authorise the closure of an Incident. The Incident record in the Service Desk tool should be closed so that accurate reporting can be carried out. An Incident will be closed as soon as the agreed service is restored. In some cases the Incident record is closed but a Problem record is still open (refer to Problem Management for more information about a Problem record).
Roles
The role of Incident Manager in most organisations is assigned to the Service Desk Manager. The Incident Manager role includes responsibility for:
- Monitoring the effectiveness and efficiency of the process
- Controlling the work of the support groups
- Making recommendations for improvement
- Developing and maintaining the Incident Management system
- Reporting to management and other process areas
Support roles:
Relationships
Incident Management has a close relationship with other ITIL processes. Some of these inter-process relationships are described here.
Configuration Management
Every Incident is connected to a Configuration Item (C.I.) stored in the CMDB. An incident will typically involve more than one C.I. The CMDB provides information about C.I.s and the parent/child relationships between them. This helps to determine the cause, solution and routing of an incident by tracing a fault back through the C.I. relationships. For example, if a user cannot access the Internet, looking back through the parent/child relationships of that user's PC could show that a hub that the user connects to (the parent of the child PC) is a potential C.I. that should be investigated.
Problem Management
Incidents with unknown causes are routed to Problem Management where they are processed. Known Errors, work-arounds and quick fixes are given to Incident Management by Problem Management.
Change Management
This process can be the cause of Incidents if a Change is not implemented correctly. Therefore it is very important that Incident Management knows all planned changes so that it can relate Incidents to a Change and notify the Change Management process, so that rollback plans can be implemented if necessary. On the other hand, some Incidents will be solved by a Change, as will be the case when faulty equipment is replaced.
Service Delivery Processes
Incident Management provides information to, and gets information from, all the Service Delivery processes. Service Level Management, for example, is responsible for establishing service levels that relate to work done within the Incident Management process. The Service Desk will then report against these service levels.
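The idea of tracing a fault back through C.I. relationships can be illustrated with a small sketch. The C.I. names and the `PARENT_OF` mapping below are a hypothetical CMDB extract invented for this example; a real CMDB holds far richer relationships than a single parent per item.

```python
# Hypothetical CMDB extract: each C.I. maps to its parent C.I. (None at the top).
PARENT_OF = {
    "PC-042": "HUB-3",
    "HUB-3": "ROUTER-1",
    "ROUTER-1": None,
}


def candidate_cis(ci, parent_of):
    """Return the C.I. and its ancestors, nearest first, as candidates to investigate."""
    chain = []
    seen = set()
    while ci is not None and ci not in seen:
        chain.append(ci)
        seen.add(ci)  # guards against a loop in bad data
        ci = parent_of.get(ci)
    return chain


# A user on PC-042 cannot access the Internet: which C.I.s should be checked?
print(candidate_cis("PC-042", PARENT_OF))
```

For the Internet access example in the text, the hub appears directly after the user's PC in the returned chain, so it is the next C.I. to investigate.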
Benefits
A well implemented Incident Management process will have easily visible benefits. Unlike some other ITIL processes, where benefits may be hard for end users to identify, the benefits of good Incident Management will be felt by them directly.
For customers
- Quick restoration of Service following an Incident
- Incidents are not lost or forgotten
- Up-to-date status of their Incident provided
For the IT organisation
- Duplication of effort is removed (once an Incident is solved, the resolution can easily be found and applied to future incidents if they re-occur, or it can be a starting point for a different but related incident).
- Clear view of the status and priorities of the Incidents
- Possibility to measure performance against SLAs
- Higher user and customer satisfaction
For the business
- Prioritisation: the high impact, high urgency incidents are the ones that jump to the front of the queue, resulting in the least possible impact on the business activities.
- Quicker resolution of Incidents, leading to productivity gains.
- Management information is provided.
Defining benefits is relatively easy. Realising benefits is difficult as during times of incident a lot of end users will not want to be convinced that their incident is not a high
Common Problems
We know there are many benefits of a good Incident Management process. Likewise, there can be some real "show stoppers". If the following major obstacles are not dealt with, the process will be inefficient and ultimately unsuccessful. Critical success factors for successful Incident Management:
- A CMDB needs to be set up before Incident Management is implemented. This makes the determination of impact and urgency a lot faster. The CMDB will ideally be an electronic system, but it can be a manual system. Note: The CMDB is the Configuration Management Database. The CMDB is covered in the Configuration Management process chapter. The CMDB holds information about C.I.s (Configuration Items).
- A knowledge database. This database will hold Known Errors, work-arounds and resolutions. This will help Incidents to be resolved much faster and with less effort.
- An Incident Management tool to record and monitor Incidents easily. (Preferably this tool is part of a complete Service Management tool that integrates the tools from all processes.)
The challenge in the implementation of these tools and databases is not to let the work of setting up the system stand in the way of making progress. Any ITIL process can be started in a very simple form. The biggest challenge facing IT professionals is that it takes "discipline" to use the tools and procedures. The sooner the discipline of logging information and searching for solutions, rather than re-working solutions, can begin, the better. If people start to use the tools and start to see benefits in doing so, then the proper "habits" are formed. It is then a relatively easy task to modify behaviours to use a different tool or introduce new features/functionality as tool development or tool selection progresses.
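To show how simple a first knowledge database can be, the sketch below searches a list of previously used solutions by Configuration Item, category and/or error code. The `KnownSolution` record, its field names and the sample entries are hypothetical and stand in for whatever the chosen tool actually stores.

```python
from dataclasses import dataclass


@dataclass
class KnownSolution:
    # Hypothetical record; field names are assumptions for illustration.
    ci: str
    category: str
    error_code: str
    resolution: str


def find_solutions(knowledge_base, ci=None, category=None, error_code=None):
    """Return the entries that match every criterion that was supplied."""
    matches = []
    for entry in knowledge_base:
        if ci is not None and entry.ci != ci:
            continue
        if category is not None and entry.category != category:
            continue
        if error_code is not None and entry.error_code != error_code:
            continue
        matches.append(entry)
    return matches


kb = [
    KnownSolution("PRN-7", "printing", "QUEUE_HOLD", "Release the print queue"),
    KnownSolution("PC-042", "network", "NO_LINK", "Check the hub connection"),
]
hits = find_solutions(kb, category="printing")
print([h.resolution for h in hits])
```

Even a list this small only pays off if every resolved Incident is logged, which is the discipline the paragraph above describes.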
Metrics
Many metrics can be obtained from this process, some of the more notable and useful are:
Best practice
Incident Management tools:
There are many Service Management tools on the market that now align and provide functionality to support ITIL processes. These tools have many features which assist in automating the process, such as:
- Quick Call Logging (one click to log a call)
- Auto population of data
- Automated alerts from infrastructure devices and applications that automatically generate an Incident record
- Advanced knowledge bases
- Customisable reporting
Incident Life Cycle
The following diagram shows the activities throughout the Incident Management process and the status that each activity can be set at. Throughout the activities the continual issue surrounding ownership, monitoring and tracking must also be considered.
Interesting Websites
http://www.helpdeskinst.com/publications/practices.asp
http://www.itil-service-support-management.com/Pages/Service support/Incident management/Incident_mgmt.htm
http://www1.worldcom.com/au/resources/whitepapers/pdf/WorldCom_White_Paper_On_eCRM.pdf
http://www.itil.co.uk/online_ordering/serv_supp_graphs/incident_mngt.htm
http://tools2manage-it.com/serv_mgt.php
Essential Terms
Incident: "Any event that deviates from the (expected) standard operation of a system." An incident is often simply a user requesting help for something that is not working. For example: "I can't see my network drive", "I can't access the Internet", "I can't send email". It is any situation where something does not work and the specific details are not known.
Work around: Problem Management may identify work-arounds while investigating problems. These should be made known to Incident Management so that they can be passed to the user until the permanent fix is implemented.
Problem Management
Introduction
Problems have a tendency to happen, no matter how well things are running. Even with the most reliable IT, service delivery will be troubled by disruptions that cannot always be avoided. We have learnt that an incident is a deviation from standard operation. This means that users can face many incidents, and a lot of the time they will face the same incident many times.
A user calls with an "incident". The Service Desk captures the call and gives great Incident Management support: "Re-boot your PC and see if that fixes it". It does. The user is happy. The next day the same user calls with the same incident and receives the same great incident support: "Re-boot your PC". On the third day the user does not call again; they just re-boot their PC and start to live with the issue. Then they start to tell other users: "Just re-boot your PC, that'll fix it". All of a sudden we have a plague of PC re-booting users! So how do we avoid the plague? By introducing Problem Management. The first incident that was fixed by re-booting the PC should have been passed to Problem Management.
An example where Problem Management can make a difference: In an organisation a user calls the Service Desk with the complaint that his document is not printing. The Service Desk investigates the incident and sees that the print queue has the status "On Hold". The Service Desk releases the queue, the document is printed and the Incident is closed. A few minutes later another user calls the Service Desk: their document is not printing. As a different person answers the phone, he investigates the Incident, sees that the queue is on hold, releases the queue and closes the Incident. In the meantime users print their documents over and over again, as they think they've done something wrong. The next day users still call with the same problem: the document is still not printing. The Service Desk releases the queue when applicable, the document is printed and the Incident is closed.
If Problem Management were in place a problem would have been identified and recorded. The "Known Error" related to this problem would be found in the configuration of the Printer. The solution, to reconfigure the printer so the queue is automatically released, would be found and implemented. The stream of Incidents regarding this printer would cease. The releasing of the queue by the Service Desk would be used as a workaround to restore the IT service in the event of the printer facing a similar issue in the future.
Objective
The objective of Problem Management is to minimise the total impact of problems on the organisation. Problem Management plays an important role in the detection and repair of problems to prevent their reoccurrence. The following slide says this in a different way but also introduces the crucial element of proactive problem management.
Process Description
The Problem Management process is focused on finding weaknesses in the IT infrastructure and, through the use of Change Management, removing them so that future disruptions do not occur. The process focuses on finding patterns between incidents, problems and known errors. These three terms are key to understanding "root cause analysis". The basic principle is to start with many possibilities and narrow them down to a final root cause.
Note: "Root Cause Analysis" is often used interchangeably with Problem Management. The ITIL Framework doesn't prescribe what a process area should be called, and Root Cause Analysis is fine. However, Root Cause Analysis is typically a reactive exercise. ITIL's Problem Management caters for reactive work, but more importantly recognises the value of proactive problem management. We use Root Cause Analysis interchangeably with Problem Management.
Incidents: An incident is defined as a deviation from the standard expected operation of a service. It is a general description of something that has gone wrong. The exact cause is not known at this stage. For example, users will call a Help Desk and say "I can't print", "I can't access the Internet" or "I can't see my network drive". They expected to be able to do these things yet could not, so these are "incidents".
Problem: A problem is the unknown underlying cause of one or more incidents. This is the second stage of "root cause analysis"/problem management. From the general incidents, more investigation will uncover an underlying cause of these incidents. A network problem is a good example of a problem definition in this case. Users don't call saying "I have a network problem"; they call and say "I can't save to my H: drive" or "I can't print or surf the web". IT staff then piece all these incidents together and identify that we are facing a "network problem". Root cause analysis has taken us closer to finding the root cause, but not completely. A problem is therefore a more specific definition than an incident.
Known Error: A Known Error is the final step in the root cause analysis process. A problem becomes a Known Error when its root cause is known. In our network problem example, this is the point at which the faulty equipment or system has been identified. This is the end of the root cause analysis process. Following the above example, the Known Error would be "Router x is faulty".
From the above we see the progression from the initial general issues being faced through to the final definition of the root cause. The following diagram illustrates this flow.
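The progression from incidents to a problem to a Known Error can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the class and field names (`Incident`, `Problem`, `root_cause`) are hypothetical and are not prescribed by ITIL or by any particular tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Incident:
    # What the user reported; the cause is not known at this stage.
    description: str


@dataclass
class Problem:
    # A problem groups related incidents under one (still unknown) cause.
    summary: str
    incidents: List[Incident] = field(default_factory=list)
    root_cause: Optional[str] = None  # set once diagnosis has succeeded

    @property
    def is_known_error(self) -> bool:
        # A problem becomes a Known Error when its root cause is known.
        return self.root_cause is not None


problem = Problem(
    "Network problem",
    [Incident("I can't save to my H: drive"),
     Incident("I can't print or surf the web")],
)
before_diagnosis = problem.is_known_error   # still only a problem
problem.root_cause = "Router x is faulty"
after_diagnosis = problem.is_known_error    # now a Known Error
```

The point of the sketch is that the same record moves through the stages; only the amount known about the cause changes.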
The following picture summarises this. The center diamond highlights the Problem Management activities which we will look at in the next module.
Activities
The ITIL Problem Management has four primary activities as follows:
- Problem Control
- Error Control
- Proactive Problem Management
- Completion of Major Problem Reviews
Classification of Problems: This activity centres on understanding the impact of the problem on agreed service levels. Classification of problems is similar to Incident classification (impact, urgency, priority).
Investigation and diagnosis of Problems: This is the step where we get to understand what is causing the problem. This step is vastly different from Incident Management investigation, where the focus is "rapid restoration" of service.
Error Control
Error Control is the process in which the Known Errors are researched and corrected. The Request for Change (RFC) comes from this sub-activity and is submitted to Change Management; following approval, the change is actioned.
Proactive Problem Management
The best problems are the ones that never happen! Proactive Problem Management focuses on the analysis of data gathered from other processes, with the goal of defining Problems. These problems are then passed to the Problem Control and Error Control procedures, as if they had happened. The activity includes:
- Trend analysis: using data to highlight potentially weak components.
- Targeting preventative action: trend analysis can lead to identifying general problem areas.
The aim of proactive Problem Management is to redirect efforts away from always being reactive, to proactively preventing incidents occurring in the first place.
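As a rough illustration of trend analysis, the sketch below counts incidents per component and flags components that reach a chosen threshold. The incident log, the identifiers and the threshold value are invented for the example; a real analysis would draw on Incident Management records and the CMDB.

```python
from collections import Counter

# Hypothetical incident log: (incident reference, affected C.I.)
incident_log = [
    ("INC-1", "Printer 9"),
    ("INC-2", "Server 2"),
    ("INC-3", "Printer 9"),
    ("INC-4", "Printer 9"),
    ("INC-5", "Router x"),
]


def weak_components(log, threshold):
    """Return the C.I.s with at least `threshold` recorded incidents."""
    counts = Counter(ci for _, ci in log)
    return sorted(ci for ci, n in counts.items() if n >= threshold)


# Components worth raising as candidate Problems (threshold is an assumption).
candidates = weak_components(incident_log, threshold=3)
```

Each flagged component would then be handed to Problem Control as if a problem had already been reported.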
Completion of Major Problem reviews At the end of a major problem cycle, there should be a review to learn:
1. What things were done right?
2. What things should we have done differently?
Roles
Problem Manager role
The Problem Manager is responsible for:
- Developing and maintaining Problem Control and Error Control
- Assessing the efficiency and effectiveness of Problem Control and Error Control
- Providing management information
- Managing Problem Management personnel
- Obtaining the resources for the required activities
- Developing and improving Problem Control and Error Control systems
- Analysing and evaluating the effectiveness of Proactive Problem Management
Relationships
The Problem Management process has a close connection with the following ITIL processes:
Incident Management: A very close and obvious link, as we have learnt. Problem Management aims to resolve the root cause of the Incidents that are recorded by Incident Management. It is important that Incident Control provides accurate information so that Problem Control can resolve the Known Errors more easily. Problem Management will supply Incident Management with workarounds and quick fixes where possible.
Change Management: If Problem Management finds the solution to a Known Error, it has to submit an RFC for the Change. Change Management is responsible for the implementation of the Change. When it is implemented, Change Management and Problem Management together review the Problem to verify that it is solved by the Change. This is called a Post Implementation Review, after which Problem Management can close the problem record.
Benefits
Problem Management improves IT service quality by resolving the root cause of incidents. This leads to fewer Incidents, benefiting users, customers, the organisation and the IT department. Advantages are:
- Better quality of IT Service Management
- A reliable IT Service results in a better reputation for the IT service
- Ability to learn from the past
- IT staff will be more productive
Common Problems
Common problems for Problem Management include:
1. Incident Management and Problem Management don't have well defined interfaces with each other.
2. Known Errors are not communicated to the Service Desk/Incident Management.
3. No commitment from Management.
4. Unrealistic expectations of the Problem Management process.
The following slide raises these and other points of attention that have to be considered.
Metrics
Successful Problem Management can be measured by:
- Reduction in Incidents because the underlying causes are removed
- The time that is needed to resolve Problems
- The other costs that are incurred in association with the resolution
Within Problem Management there is a lot that can be measured. What is relevant depends on the scope of Problem Management. Some examples are:
- Time spent per organisational unit fixing problems
- Number of RFCs raised
- Ratio of proactive to reactive problem management
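Two of these measures are simple arithmetic, and a small sketch may make them concrete. The function names and the sample figures are our own assumptions; the course material does not define formulas for them.

```python
def proactive_ratio(proactive, reactive):
    """Ratio of proactively raised problems to reactively raised ones.

    Returns None when there were no reactive problems, since the ratio
    is then undefined.
    """
    if reactive == 0:
        return None
    return proactive / reactive


def incident_reduction_pct(before, after):
    """Percentage reduction in incidents between two reporting periods."""
    if before <= 0:
        raise ValueError("the earlier period must contain at least one incident")
    return (before - after) / before * 100


# Hypothetical figures for two consecutive quarters.
ratio = proactive_ratio(proactive=5, reactive=20)
reduction = incident_reduction_pct(before=200, after=150)
```

Tracked period over period, a rising ratio and a positive reduction would both suggest that the process is moving from reactive to proactive work.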
Best practices
Considerations for Problem Management, a high-pressure area for IT.
Compared with the variety, complexity, volume and difficulty of the problems facing Support teams today, those of the "early days" (a decade ago) seem child's play. Why? Increases in user demand have led to vast numbers of PCs; distributed Client/Server networks; multi-site, multi-platform systems; and cheap but complex software packages. All of this comes in addition to the traditional mainframe systems, all of it needs the same high levels of technical support, and all of it must be delivered with a sensitive "customer care" attitude. This increasing workload is threatening the whole stability of IT. If we don't find a lateral solution for managing it, demand will simply continue to grow at its present rate. The result? Support functions will soon dwarf the rest of IT in terms of number of personnel, running costs and quality of expertise. And ignoring the problem, in the vain hope that it will go away, is even worse. Demand must be managed down, resources re-allocated, recurring problems eliminated, and difficulties anticipated and addressed before they become problems. Aggressive, effective and financially viable Problem Management is an integral step in achieving this. Otherwise IT will simply grind to a halt, gridlocked and unable to function.
Note: the following web links provide some additional research material. The Kepner-Tregoe link is an interesting one. KT Analysis is a long-serving tool that can be very useful in carrying out the Problem Management process activities.
Interesting Websites
http://www.itilworld.com/n-america/service-support_probmanage.htm
Essential Terms
Work Around: A "quick-fix" solution to an incident, which will produce an acceptable outcome for a limited period.
Proactive Problem Management: Activities aimed at removal of errors while the errors are in a state of inactivity.
Key success factors in Problem Management:
- Automated registration of incidents with correct classification.
- Setting attainable objectives for the process.
- Depending on the size of the operation, not even treating this as a full time role.
- Good inter-process co-operation (especially with Incident and Change Management).
Change Management
Introduction
As organisations become more dependent on IT services, and as technology changes succeed one another rapidly, the need for proper management and control of change grows. Many problems in the quality of IT Service Support emerge from changes in existing IT systems. The ITIL Change Management process is designed to act as a planning and control process. Proper planning and control ensures that the implementation of change can take place without interrupting operational IT service delivery.
It is Monday July 1st. End of financial year. For many organisations it is the time of the year to close out and get final figures for the previous 12 months. Lots of reports being run, lots of heavier than normal requests on systems, lots of printing, etc. Imagine then a Service Desk where the phones start ringing at the start of the day. "My report is not showing any figures." "I can't find any details of last year." "All figures for last year are lost." Mild panic quickly escalates to all-out concern, as no-one really knows what has happened. The Service Desk staff all individually try to solve each user's incident by investigating the software etc. Nothing seems to be wrong with the system functionality; it's just that the reports are all blank!
Objective
For effective and efficient IT service delivery it is necessary to have the capability to implement many changes correctly. In reality, changes often lead to (implementation) problems. The problems created are subsequently resolved by the implementation of a change, which in turn leads to more problems. Breaking this negative spiral is an important task for Change Management. ITIL's goal statement for Change Management is: "Assuring that standardised methods and procedures are in use for the efficient and timely implementation of all changes, in order to minimise the impact of change related problems on the quality of the IT service delivery". The goal statement also builds an understanding of the "how and why" of the process (how = standardised methods and procedures, why = to minimise impact). Remember: "Not every Change is an improvement but every improvement requires a change".
Process Description
The common trigger for Change Management is a Request for Change (RFC). RFCs come from within the IT organisation as well as from the Customers. Another trigger for change can be the Forward Schedule of Changes (FSC). This schedule is drawn up in advance in agreement with the customer. The FSC documents known change events or agreed windows of change that can be used for unforeseen (non-urgent) changes. Another input for the process is CMDB information about the affected Configuration Items (C.I.s) and the relationships that exist between them. This vital information contributes to the assessment that the Change Management process has to make about the impact (potential or otherwise) of a proposed change. The output of the process includes reports regarding the changes, triggers for Configuration Management to change the CMDB, triggers for Release Management to release, develop or implement new software or hardware, and the Change Advisory Board (CAB) agenda and planned actions.
Note: The CMDB (Configuration Management Data Base) is discussed in the Configuration Management chapter. Change records can be held in the CMDB, which allows them to be "linked" to the affected Configuration Items (C.I.s).
The Scope of Change Management is determined along with defining the scope of Configuration Management. If the Configuration Management process is to track details of hard disks and floppy drives, then replacing a hard disk counts as a change (albeit a "minor change").
Activities
The Change Management process includes the following activities:
- Recording
- Accepting
- Classifying
- Planning
- Co-ordination of activities
- Implementing
- Evaluating
Recording
Although this activity is not carried out by Change Management itself, it is the responsibility of Change Management to make sure all Changes are recorded correctly.
Accepting (Rejecting)
At this stage RFCs will be reviewed and accepted or rejected. Any rejection should always be communicated and explained. A reason for the rejection of an RFC might be that it is incomplete or illogical. Accepted RFCs will then be classified.
Classifying
In this stage the RFC will be categorised and prioritised. The category depends on the impact the change has and the resources needed to carry out the change. The priority is derived from the urgency and the impact of the change, along with knowledge that the Change Management process may have from other process areas and that the change requestor is not aware of.
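Deriving a priority from impact and urgency is often done with a small lookup or formula. The sketch below is one possible scheme and is purely an assumption for illustration: the three-level scale, the scoring rule and the function name are ours, and an organisation would agree its own matrix.

```python
# Hypothetical three-level scale; 1 is the most severe.
LEVELS = {"high": 1, "medium": 2, "low": 3}


def priority(impact, urgency):
    """Combine impact and urgency into a priority from 1 (highest) to 5."""
    return LEVELS[impact] + LEVELS[urgency] - 1


# Example RFCs (invented): a payroll fix due before month end, and a
# cosmetic change with no deadline.
payroll_fix = priority(impact="high", urgency="high")
cosmetic_change = priority(impact="low", urgency="low")
mixed_case = priority(impact="high", urgency="low")
```

Knowledge from other process areas, as mentioned above, would be applied as a manual adjustment on top of whatever the basic scheme produces.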
Planning
The change will be planned and put on the Forward Schedule of Change (FSC), if appropriate. The Change Advisory Board (CAB) meets to review the FSC. The FSC will consist of:
- Planning in time, people, and budget
- Indication of consequences for other changes
- Advice on whether the change should be a go or no-go
Evaluating
Each change (except minor standard changes) should be evaluated to see if it had the desired effect. The effort put into a post change evaluation will be dependent on the size of the change and the impact it had on the organisation (good or bad, what lessons can be learnt).
Roles
Change Manager
The Change Manager is responsible for:
- Processing the RFCs, including filtering, accepting and classifying them.
- Planning, co-ordinating (with all parties) and implementing the Changes.
- Closure of RFCs.
- Authorisation, after advice from the CAB or the CAB/EC.
- Where appropriate, obtaining authority for Changes to proceed (the level of authority required will depend on the impact that the change is expected to have, its cost and its urgency).
- Issuing the FSCs (Forward Schedules of Changes) via the Service Desk.
Relationships
The Change Management process depends on the accuracy of the configuration data to ensure the full impact of making changes is known. There is a very close relationship between Configuration Management, Release Management and Change Management. The following diagram illustrates some of the major process relationships.
Advising the Service Desk of changes is crucial. Changes are nearly always first "discovered" in the Incident Management process, via the Service Desk. The Problem Management process can also submit RFCs to solve Known Errors, and sometimes this can cause a snowball effect if the Configuration Management process is unable to explain what the affected components will be as a result of the change (including hardware, software and SLAs).
Note: SLAs (Service Level Agreements) are discussed in the Service Level Management process. It is possible to store information about an SLA in the Configuration Management Database (CMDB). By doing this, relationships can be created between hardware or software components and the SLA. So when a change is proposed to a component, the linked SLA can be investigated to determine if the change will breach the SLA.
The other processes are also linked to Change Management in the sense that they either request changes (Availability Management) or will be consulted to determine the impact of changes (IT Service Continuity Management, Service Level Management and Capacity Management).
Benefits
Change Management is one of the ITIL processes that is often not well liked. IT Staff have a tendency to think that "it's only a small change, no-one will be affected". It is in these situations that most damage is often done. Discipline is required to adhere to the process. As an IT support person if you are told to implement the change without
Common Problems
Along with the benefits of any process, we have to acknowledge the inherent problems as well. Change Management is a highly visible process, both within the IT Department and among business users and business customers.
Note: The distinction between Customer and User is straightforward. The Customer is the one paying for the service. The User is the ultimate end-user of the service.
Tool Selection: You need an appropriate tool to support the Change Management process. Ideally, a single tool will be able to accommodate the activities of a number of ITIL processes. With Change Management, it is almost essential that the same tool be used as for the Configuration Management Data Base.
Defining the Scope: If the scope of changes controlled by Change Management is too wide (e.g. it includes password resets), the workload will become too high and people will try to bypass the process.
Commitment: IT staff might be reluctant to follow procedures because Change Management will be involved in so many aspects. It is important to make the IT Staff aware of the positive effect the Change Management process will have on the IT Service as a whole. It is also necessary to ensure that the team designing the process does not over-engineer a solution. Start simple and build up complexity if it's (a) necessary and (b) value-adding.
Metrics
The beauty of all ITIL processes is that they can be measured. Measurement allows baseline setting and targets to be set for improvement.
Best practices
The following list of sites provides some excellent background reading on Change Management.
Interesting websites
http://www.itil-service-support-management.com/Pages/Service%20support/Change%20management/Change_mgmt.htm
http://www.itilworld.com/n-america/service-support_changeman.htm
http://www.infra.com.au/TuDelft.htm
http://www.itil.co.uk/online_ordering/serv_supp_graphs/sschange.htm
http://www.microsoft.com/technet/treeview/default.asp?url=/technet/itsolutions/ecommerce/maintain/operate/rmdotcom.asp
http://www.atlsysguild.com/
http://www.guild.demon.co.uk/SpecTemplate8.pdf
http://www.guild.demon.co.uk/SpecTemplate8.rtf..zip
http://www.atlsysguild.com/GuildSite/Robs/Template.html
Essential Terms
Change: Change to a system or service.
CAB: Representative group of stakeholders who assess the change requests. The Change Advisory Board advises the Change Manager on whether the change should be accepted or rejected.
Request for Change (RFC): All requests for modification of the managed infrastructure that are not Service Requests.
Service Request: Fully defined and approved changes, which are individually recorded, but not individually assessed by Change Management. These changes are made routinely.
Configuration Management
Introduction
Through the storage and management of data regarding the IT infrastructure the Configuration Management process gives the IT organisation greater control over all the IT assets. The more dependent on their IT systems organisations become, the more important Configuration Management becomes. It is, therefore, necessary to keep a register of all Configuration Items (C.I.'s) (IT Assets) within the IT infrastructure. Configuration Management aims to provide a "logical model" of the IT infrastructure by identifying, controlling, maintaining and verifying the versions of all C.I.'s.
Objective
The main objectives of the Configuration Management process are to: provide IT Management with greater control over the C.I.s (IT Assets) of the organisation.
Process Description
The Configuration Management process could almost be considered a pivotal process for all other processes (especially the Service Support processes). Configuration Management is considered central and supportive to the other ITIL processes because it provides information about the IT Infrastructure.
Reminder Note:
Service Support processes = Incident, Problem, Change, Configuration, Release
Service Delivery processes = Service Level Management, Availability, Capacity, Financial, Continuity
A major input into the process is from Change Management, either requesting information about items that will be affected or advising the status of changed items. The process starts with the design, population and implementation of the CMDB (Configuration Management Data Base).
It is the responsibility of Configuration Management to maintain the CMDB. Populating the CMDB can be a costly and lengthy exercise depending on the scope of IT infrastructure that is to be managed and the depth of detail about each item required (automated tools can play a large part here). The Outputs of the Process are reports to IT management and also the constant availability of Information that can be supplied from the CMDB to other processes.
Activities
The activities of the Configuration Management process are:
- Planning
- Identification
- Control
- Status accounting
- Verification and Audit
Identification: The identification activity involves the gathering of all C.I. information within the scope of the process. C.I. information is gathered manually and/or by the use of automated tools. At the time of gathering this data each C.I. should be labeled for reference and control purposes.
Note: The labeling of IT Infrastructure can be incorporated into the Security Management process. Labeling techniques include visible labels that carry common contact numbers (e.g. Service Desk) and reference numbers, and even hidden labeling (security paint that shows up under "black lights" and microchip identifiers that are not visible to the human eye). The information gathered will be governed by the scope, C.I. level and attributes decided upon.
Note: The attributes of a C.I. are the "things" about that C.I. that we want to record (e.g. the attributes of a Personal Computer can be hard disk size, processor type, processor speed and Operating System version). Values are the quantifiable measurements of attributes (e.g. the value of a hard disk size can be 3 GB or 8 GB; the value of a processor speed can be 1 GHz or 10 GHz).
Before any information is gathered, control procedures and the Change Management process should be in place, so that once the information has been gathered and loaded into the CMDB, changes to the infrastructure don't make it out of date.
Note: The gathering of data can take several weeks or months.
Control: Before the CMDB is populated, control procedures should be in place. It is vital that changes to the CMDB and the C.I.s within it are only made with the proper authorisation. Procedures need to be set up so that all changes are always documented, for example with authorised RFCs. We can start to see the very strong relationship that Change and Configuration Management share.
Status accounting: Status accounting is the activity that records the current and historical states of a C.I. so that every change to a C.I. is traceable. Status levels can be defined as part of the planning process (e.g. On order, In use, Out of order, Under repair, Retired).
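To make attributes, values and status accounting more tangible, here is a minimal sketch of a C.I. record that keeps its status history and ties each status change to an RFC. The class, its fields, the identifiers and the dates are all hypothetical; a real CMDB tool will have its own schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Tuple

# Status levels taken from the examples in the text above.
STATUSES = ("On order", "In use", "Out of order", "Under repair", "Retired")


@dataclass
class ConfigurationItem:
    ci_id: str
    attributes: Dict[str, str]  # attribute name -> value
    # Each history entry is (date, status, authorising RFC).
    history: List[Tuple[date, str, str]] = field(default_factory=list)

    def set_status(self, when: date, status: str, rfc: str) -> None:
        # Only defined status levels are accepted, and every change is
        # recorded together with the RFC that authorised it.
        if status not in STATUSES:
            raise ValueError(f"unknown status: {status}")
        self.history.append((when, status, rfc))

    @property
    def current_status(self) -> Optional[str]:
        return self.history[-1][1] if self.history else None


pc = ConfigurationItem(
    "PC-0042",  # invented identifier
    {"hard disk size": "8 GB", "processor speed": "1 GHz"},
)
pc.set_status(date(2004, 1, 5), "On order", "RFC 0013")
pc.set_status(date(2004, 1, 19), "In use", "RFC 0013")
```

Because earlier entries are never overwritten, both the current state and the historical states remain available, which is the essence of status accounting.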
Roles
The Configuration Manager will assist in determining the scope and level of detail required in the process, implement procedures for interaction with other processes and take responsibility for the planning and population of the CMDB. The Configuration Librarian is the person who controls access to master copies of software and documentation. Like any librarian the focus is on physical items. These items will be held in the "definitive software library" (DSL). Note: In small organisations the role of the Configuration Manager and the Change manager can be combined.
Relationships
As indicated, the IT infrastructure forms the foundation of the IT organisation. All processes within ITIL therefore have links with Configuration Management or retrieve information from the Configuration Management Database. Change Management and Release Management however, have the closest relationship to Configuration Management and could even be considered as an integral part of it. The flow chart shows the relationships between the 3 processes and how the flows between the processes occur at every stage.
Benefits
Some of the benefits that come from implementing Configuration Management include:
- Ability to provide information to the other processes about C.I.s and the relationships that exist between them
- Contribution to IT Service Continuity planning
- Control of the IT Infrastructure: knowing where a C.I. is and who is responsible for it
- Efficient and effective Problem Management
- Efficient and effective processing of Changes
- Assurance that legal obligations are being met
The following slide restates some of these benefits and introduces some new ones.
Common Problems
Problems that can prevent an effective implementation of Configuration Management are:
Level of detail: The level of detail for the C.I.s is not right. If the level is too deep, too much information is recorded, which will take too much time, money and effort to maintain. However, if the level is not detailed enough, parts of a C.I. can be changed without anyone knowing. This can result in increased incidents and problems due to the difficulty in tracing the faulty component.
Urgent Changes: Emergency changes often happen outside normal hours of operation. There may be no authorised person available to record the changes in the CMDB. This can be combated through a solid procedure of post-change updating. Otherwise the overall reliability of the CMDB is compromised.
Commitment: There needs to be a firm commitment from Management to implement this process, as it can be a high activity process. Discipline is required from IT Staff to ensure that changes to the infrastructure follow the appropriate steps to keep the CMDB accurate.
Interaction with other processes: As Configuration Management relies on Change and Release Management to a large degree, it is wise to implement these processes at the same time. Failing to do so might result in an unreliable CMDB and an ineffective process.
Control: There needs to be a process in place that secures the validity of the CMDB. For example users who can purchase software themselves via the Internet may
Metrics
Key performance indicators
The Configuration Management process has many potential KPIs that can be analysed. To measure the effectiveness of Configuration Management, realistic targets should be set. The targets can be changed over time to ensure improvement of the process.
- Results of audits: number of unauthorised C.I.s, and of C.I.s not in use
- Number of changes that cause incidents or problems due to wrong Configuration information
- RFCs that were not completed successfully because of poor impact assessment, incorrect data in the CMDB, or poor version control
- The time a change takes from start to finish
- Software licences that have been wasted or not put into use
Other indicators could include:
- The number of calls per month that are solved whilst the User is on the phone, using information from the CMDB
- Reduction in Incidents and Problems over time, and the change in impact they have on the business
Best practices
The CMDB
Most organisations already use some sort of CMDB, even if it is only a spreadsheet or paper based. In most cases, however, the CMDB is based on database technologies, which makes gathering information more user friendly. Information that can be gathered from a CMDB includes:
- Information about C.I.s
- List of C.I.s affected by a scheduled Change
- All Requests for Change relating to one C.I.
- The history of a particular C.I.
- List of Change and Problem records associated with a C.I.
- List of C.I.s affected by a Problem
A CMDB also contains information on relationships between Incidents, Problems, Known Errors, Changes, Releases and C.I.s. The CMDB can also serve as a support tool in the creation and ongoing maintenance of legal contracts.
Note: Examples of "relationships" that can be defined are:
- Relies on...
  o SLA "Provision of Banking Services" relies on Server 2
  o SLA "Provision of Banking Services" relies on Printer 9
- Is part of...
  o Hard disk 12 is part of Server 2
- Affects...
  o SLA "Provision of Banking Services" affects Customer 11
  o SLA "Provision of Banking Services" affects Customer 12
- Is linked to...
  o Banking system is linked to Admin system
- Had...
  o Printer 9 had RFC 0013 applied
  o Printer 9 had RFC 0035 applied
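The relationship examples in the note can be made concrete with a small sketch. It is only an illustration, not an ITIL-defined data model: the triple layout (source, relation, target), the function names and the impact rule (a failed part affects the C.I. it is part of, which affects every SLA relying on it) are all assumptions made for this example.

```python
from collections import defaultdict

# Relationship records copied from the note, stored as
# (source, relation, target) triples. The triple layout is an assumption.
RELATIONSHIPS = [
    ('SLA "Provision of Banking Services"', "relies on", "Server 2"),
    ('SLA "Provision of Banking Services"', "relies on", "Printer 9"),
    ("Hard disk 12", "is part of", "Server 2"),
    ('SLA "Provision of Banking Services"', "affects", "Customer 11"),
    ('SLA "Provision of Banking Services"', "affects", "Customer 12"),
    ("Banking system", "is linked to", "Admin system"),
    ("Printer 9", "had", "RFC 0013"),
    ("Printer 9", "had", "RFC 0035"),
]


def build_index(relationships):
    """Index targets by (source, relation) for quick lookups."""
    index = defaultdict(list)
    for source, relation, target in relationships:
        index[(source, relation)].append(target)
    return index


def related(index, source, relation):
    """Return all targets that `source` has `relation` with."""
    return list(index.get((source, relation), []))


def impacted_customers(relationships, failed_ci):
    """Hypothetical impact rule: follow 'is part of' upwards, then find
    the SLAs relying on any impacted C.I. and the customers they affect."""
    impacted = {failed_ci}
    changed = True
    while changed:
        changed = False
        for source, relation, target in relationships:
            if relation == "is part of" and source in impacted and target not in impacted:
                impacted.add(target)
                changed = True
    slas = {s for s, r, t in relationships if r == "relies on" and t in impacted}
    return sorted(t for s, r, t in relationships if r == "affects" and s in slas)


index = build_index(RELATIONSHIPS)
print(related(index, "Printer 9", "had"))                 # ['RFC 0013', 'RFC 0035']
print(impacted_customers(RELATIONSHIPS, "Hard disk 12"))  # ['Customer 11', 'Customer 12']
```

The second query is the kind of question a plain asset list cannot answer: it needs the relationships, not just the items.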
An additional bonus is the use of the CMDB to cover the legal aspects associated with the maintenance of licences and contracts.
The Definitive Software Library (DSL) is a storage place where all software versions are kept secure. New releases are built on copies of the software from the DSL, not from the software that is being used in the production environment. The DSL plays an important part in the Release Management process and is discussed in more detail in that chapter.
Essential Terms
IT infrastructure: All parts of importance to the provision of IT services. These include hardware, software, network and components, documentation, manuals, procedures, air-conditioning, cooling, organisation, etc. Staff, however, are not C.I.s according to ITIL, although many organisations choose to include staff in their CMDB as soon as the organisation is mature enough to handle this.
Configuration item (CI): The IT infrastructure is built from components. Every component of the infrastructure under the management of the IT organisation is called a configuration item (C.I.). A C.I. may be of any complexity or size and may vary from a complete mainframe to, for instance, a PC, a monitor, a keyboard or a floppy disk drive.
Attribute: An attribute is a characteristic that is recorded about a C.I. Examples are: identification number, colour, type, size, manager, user, price, depreciation, supplier, serial number, version, status, etc.
Values: A value is the quantifiable part of an attribute. Examples of values are "red", "10", "E9" and "critical".
Links/Relations: Relationships are normally recorded in a hierarchical structure (such as a parent-child relation). Other relationships are "is linked to" or "is used by". These relationships should be made between C.I.s in the CMDB.
Configuration Control: Ensures that C.I.s are only changed with the appropriate documentation and that C.I.s are tracked from procurement to disposal.
Asset Management: The difference between Asset Management and Configuration Management is that Asset Management has a list of assets, whereas Configuration Management has a database that also holds the relationships between C.I.s.
Configuration baseline
Release Management
Introduction
With the increasing complexity of systems and a greater need for IT organisations to provide a stable environment, the release of new software and hardware into the business must be closely controlled. Quite often, however, a poor release strategy leads to the very thing that others in the IT organisation are working hard to avoid: downtime and loss of infrastructure stability. The "Catch 22" is that there is ever-increasing pressure to have the release sooner, as it will deliver immediate benefits to the organisation. External forces often drive the demand to get the latest hardware or software into production as businesses strive to be first to market or to gain a competitive edge.
This process within ITIL aims to provide a structured approach to the management of releases into the infrastructure, from release planning through to actual installation. The relationships with Change Management and Configuration Management are key for this process, as all three are very closely related. Release Management provides the physical management of software and hardware. Information about the software and hardware components of the IT infrastructure and their relationships with one another is stored in the Configuration Management Database (CMDB). Release Management manages the planned and applied changes to software and hardware in the IT infrastructure.
To support Change Management and Configuration Management, Release Management utilises the Definitive Software Library (DSL) and the Definitive Hardware Store (DHS). These secured stores provide the physical storage location of all software Configuration Items (DSL) and of spare parts for hardware (DHS). Software comes in various forms such as source code, loads, libraries and executables. The different versions of the same software held in the DSL have been through authorisation and quality controls and are used for the construction and implementation of releases. The spare hardware held will depend on a risk assessment (looking at the assets of the organisation and then the threats and vulnerabilities), as well as on third party involvement regarding support contracts (Underpinning Contracts). Changes to the production hardware environment must flow through to the DHS, so that any held spares remain compatible with the latest production hardware.
Objective
Release Management is the process that "protects" the live or production environment. Protection comes in the form of formal procedures and extensive testing of proposed changes to software or hardware within the production environment.
Note: The use of the term "Production Environment" conjures up images of a factory or manufacturing facility. However, it is a generic term applied to all areas of infrastructure in use that contribute towards the realisation of organisational objectives.
Note: The term Production Environment probably also provides an answer to a question that is raised a lot regarding the ITIL framework: can the framework be used in business areas other than IT? The answer is most definitely yes. The framework is not a prescriptive set of processes that lack flexibility. It is a set of generic guidelines that, with the right perspective, can be applied just as easily to manufacturing and engineering disciplines.
Objectives of the Release Management process include:
- To manage, distribute and implement approved hardware and software items.
- To provide physical and secure storage of approved hardware and software items in the Definitive Hardware Store (DHS) and Definitive Software Library (DSL).
- To ensure that only authorised and quality controlled software and hardware versions are used in the test and production environments.
Note: Even the test environment can be subject to the Release Management process. Many businesses have a very high reliance on their test centres and cannot afford uncontrolled actions there simply because the environment is not the front line of the business.
Process Description
The main components controlled under a good Release Management process include:
- In-house developed applications
- Purchased software and custom built software
- Utility applications
- Software provided by suppliers for use on specialist systems
- Hardware and hardware rollouts
- Instructions and user manuals
Release Management manages all software and hardware from purchase or development through testing to the eventual migration into production. The process starts with the planning of a new release, be it for hardware or software, and ends with a well documented, securely stored, implemented new release with the lowest possible impact on the organisation's day-to-day activities. The following diagram illustrates some of the basic before and after situations surrounding the Release Management process.
Activities
The following diagram shows the activities of Release Management and their relationships with the Configuration Management Database (CMDB):
In list format, the activities described in the picture are:
- Planning and describing the Release Policy
- The design, building and configuration of Releases
- Testing and signing off of new Releases
- Planning the rollout of Releases
- Communication, preparation and training
- Release, distribution and installation
Planning and describing the Release Policy
The Release Policy documents how the organisation will approach the release of new hardware and software into the infrastructure. Specified in this policy will be items such as:
- The frequency of releases that is acceptable to the business
- A policy on how to issue emergency releases
- A policy on testing and subsequent release into production
Preparation for any release requires a structured planning approach to increase the chance of success. The use of a formal project management methodology like PRINCE2 will help to define items such as:
- Contents of the Release
- A release schedule
- Resource requirements
- Roles and responsibilities
- Project approach
- Definition of the release components
- Back-up plan
- Quality plan
- Acceptance plan
Note: PRINCE2 is an acronym for PRojects IN Controlled Environments. Like ITIL, PRINCE2 is a methodology or framework, not a tool for Project Management. PRINCE2 is published by the same body that publishes ITIL, the Office of Government Commerce (OGC) in the UK. And like ITIL it is a widely accepted framework for best practice. You can find extra information on PRINCE2 at: http://www.ogc.gov.uk/prince/
The Design, Building and Configuration of Releases
This activity within Release Management can be considered the technical stage of the process. All the actions associated with designing, configuring and building are completed by the relevant staff in a "controlled" manner.
Note: The term "controlled" is not intended to create an image of bureaucracy. Controlled environments are reproducible, and that is the critical issue here.
At the end of this stage a Back-Out Plan should also have been created. Back-Out Plans can be aimed at restoring all services to their state before the change, or at restoring them as close to the pre-change state as is feasible given the nature of the change. The suitability and content of the Back-Out Plan will be assessed during the Change Management process. The output of this activity should be a release complete with installation instructions, a test plan and a Back-Out Plan.
Communication, Preparation and Training It is important to communicate with all parties involved in order to increase the acceptance and success of the release. This might involve several meetings/training sessions with user groups, IT staff and Managers. The timing of any training and/or communication must be planned in accordance with the expected actual release date. The Service Desk is a key area that must be informed about the release, any known issues (or workarounds) that have been established during testing and generally how the new release should be supported. The release plan should be made public in case of a large release so users know what to expect and when.
Roles
The main role within the Release Management process is that of the Release Manager. This person is responsible for defining and maintaining the release policy and for controlling the activities within the process. The Release Manager will have a good technical background and a good knowledge of the latest utilities and support tools. The combination of roles is permissible for certain ITIL processes. In smaller IT organisations, combining the Release Management, Change Management and Configuration Management roles is realistic. Release Management staff will need to receive technical training in development, software maintenance and hardware build skills. Project Management is another essential skill in a Release Management environment.
Relationships
Release Management is very closely linked with Change Management and Configuration Management. Change Management controls all the changes and determines when a new release will be implemented and what changes will be in any release. In most major organisations the Release Management process will have a representative on the Change Advisory Board (CAB).
Note: The CAB is the authorising body for changes to proceed. Can you remember the term given to the group of people who approve Emergency Changes?
Release Management needs to inform Configuration Management about every change to a Configuration Item (C.I.) so that the CMDB can be updated. Configuration Management also needs to make sure that new versions of software and hardware are stored in the DSL or DHS. Release Management will use Configuration Management to get information about C.I.s that may be affected by a new release and, importantly, about their relationships with other C.I.s.
Benefits
Implementation of the ITIL Release Management process provides the following advantages:
- Software is released for testing and production in a controlled manner, reducing the chance of errors.
- The software (source, loads and executables) of the organisation is held in a secure location, the Definitive Software Library (DSL).
- Many concurrent changes can be implemented in the software used in the production environment without adversely affecting the quality of the IT environment.
- Software in remote locations can be managed effectively and economically from a central point.
- The possibility of illegal copies being used is dramatically reduced.
- The impact of new hardware is tested prior to installation in the infrastructure.
- With end users better informed about new releases and involved in testing them, the risk of resistance is reduced significantly.
Common Problems
In order for Release Management to be successful, the following issues need to be taken into consideration as they may cause problems:
- Lack of commitment: End Users might at first be reluctant to be told by this process how to act in case of a new release. The advantages of this process need to be communicated before the process is implemented.
- Urgent fixes: Procedures need to be in place to make sure that urgent fixes are dealt with correctly and don't compromise the accuracy of the CMDB, DSL or DHS.
- Testing: A proper test environment needs to be available in order to assess the impact and reduce the risk of a new release.
- Bypassing of the process: This may cause illegal software or viruses to enter the IT Infrastructure. Regular audits can help to minimise this potential issue.
Metrics
In order to assess the effectiveness of the Release Management process a number of key performance indicators (KPIs) should be monitored. Examples of possible indicators are:
- Releases built and implemented on schedule and within budgeted resources
- Number of Releases that result in a back-out due to unacceptable errors
- Number of Incidents caused by the release
- Outcome of audits of the DSL and the DHS
- Security: all software can be accounted for, etc.
- Proper use of the DSL for software development
- Compliance with all legal restrictions relating to purchased software
- Accurate and timely recording of all build, distribution and implementation activities within the CMDB
Best practices
Different Types of Releases ITIL defines three different release types as described.
Essential Terms
Definitive Software Library: A library which stores, in their definitive, accepted form, all the versions of software configuration items that have been accepted from the developer or supplier. This logical library can be physically present in several locations.
Definitive Hardware Store: A physically secure store of definitive hardware spares. These are spare components and assemblies that are maintained at the same level as the comparative systems within the live environment.
Release: A software CI introduced into the test environment and subsequently into the production environment. In most cases documentation and accompanying hardware are also part of the release.
Release Unit: A Release Unit includes the CIs that can be released together.
Types of Releases:
- Full Release: all components of a release unit are built, tested, distributed and implemented together.
- Delta Release (or partial release): only those CIs in the Release Unit that have actually changed since the last delta or full release.
- Package Release: a combination of full releases and delta releases (helps to reduce the risk of incompatible releases by changing many systems concurrently).
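The three release types can be illustrated with a short sketch. The function names and the example CIs (a payroll unit and an HR unit) are hypothetical and only serve to show how the contents of each release type differ; they are not taken from ITIL.

```python
def full_release(release_unit):
    """Full Release: every CI in the release unit, changed or not."""
    return sorted(release_unit)


def delta_release(release_unit, changed_cis):
    """Delta Release: only the CIs in the unit that changed since the last release."""
    return sorted(ci for ci in release_unit if ci in changed_cis)


def package_release(*releases):
    """Package Release: several full and/or delta releases rolled out together."""
    combined = []
    for release in releases:
        for ci in release:
            if ci not in combined:  # avoid listing a CI twice
                combined.append(ci)
    return combined


# Hypothetical release units and change set, for illustration only.
payroll_unit = {"payroll-app", "payroll-db-schema", "payroll-reports"}
hr_unit = {"hr-app", "hr-forms"}
changed = {"payroll-reports"}

payroll_delta = delta_release(payroll_unit, changed)
hr_full = full_release(hr_unit)
package = package_release(payroll_delta, hr_full)

print(payroll_delta)  # ['payroll-reports']
print(hr_full)        # ['hr-app', 'hr-forms']
print(package)        # ['payroll-reports', 'hr-app', 'hr-forms']
```

Here the package combines a delta release of one unit with a full release of another, so both are tested and rolled out as one.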
Imagine the following situation: The Managing Director (MD) announces to the Chief Information Officer (CIO) that the company is thinking about outsourcing the IT Organisation. Over the last two years there have been numerous major complaints from the business about the current IT services. The customers say IT doesn't do what it should, it is not working properly, etc. The CIO is puzzled to say the least. He had no idea that they were doing so badly. He actually thought they were doing well. The services were up and running most of the time. They resolved incidents quickly and didn't get many complaints from the users that they were aware of. His staff have been putting in an enormous amount of effort to upgrade the server that the payroll system runs on. How could they have done any better? See the problem?
1. The IT Organisation thinks it is delivering services of a high standard, but it has no figures to back that up. The loosely defined "up and running most of the time" didn't take into account the outages during critical times.
2. The effort on upgrading the server is commendable, but of no benefit as the business recently decided to outsource the company payroll activities.
3. There probably is no official procedure in place to ask for the Customers' opinion or to make a complaint, so how could they have known about the perception of the customer regarding their services?
Objective
The Service Level Management process manages the quality of IT service delivery according to written agreements between the users and the IT department, called Service Level Agreements (SLAs). The goal of SLM is to maintain and improve service quality through a constant cycle of agreeing, monitoring, reporting and improving the current levels of service. It is strategically focused on the business and on maintaining the alignment between the business and IT.
Process Description
In order to understand the Service Level Management process it is necessary to understand some basic concepts. They are explained here so that the process is easier to follow.
Service Level Requirements (SLR)
This is a document that contains customer requirements regarding which IT services they want and the availability and performance they need for those services. It is the starting point for setting up the Service Level Agreements.
Service Specifications
The IT organisation draws up the Service Specifications based on the SLR. They translate the customer requirements into "how" the IT organisation is going to provide these services. What are the technical needs? They also show the relationships between the SLAs, the third parties and the IT organisation itself.
Service Level Agreement (SLA)
The SLA is a document that defines agreed service levels between the customer and the provider, i.e. between the business and IT. SLAs should be written in language that the business understands (clear, concise and free of jargon). SLAs should not include detailed procedure diagrams for other processes or content such as technical information that the business will not understand.
Underpinning Contracts (UPCs or UCs)
When an external supplier or third party is involved in the delivery of IT Services, a contract has to be drawn up to ensure that they provide their service within a certain
Activities
The main activities of Service Level Management consist of:
- Composing a Service Catalogue
- Negotiating with clients over the possibilities and price of automation
- Drafting, securing and maintaining the Service Level Agreement (SLA)
This is done through a continuous cycle of the following actions:
- Identifying
- Defining
- Negotiating
- Monitoring
- Reporting
- Reviewing
Identifying Within this activity the IT organisation will need to define the services it provides within a Service Catalogue. The Service Catalogue is like a menu of services that will clarify for IT what is on offer and the components of these services.
Defining
The first time around, this activity delivers the SLR, the Service Specifications and the SQP. On an ongoing basis this activity includes taking the SLRs as well as the content of the Service Catalogue and defining a draft SLA that aligns both into acceptable service levels. During the creation of this document, consideration of the UCs and OLAs is critical as these documents support the SLA. Later on, the needs of the customer and the specifications need to be verified on a regular basis as they might change. The needs of the customer might change due to a change in the business procedures, and the specifications may need changing as a result of the changed requirements or the introduction of advanced technology.
Negotiating
Once the draft SLA is formulated, negotiation is carried out to gain agreement, acceptance and a signature for the following documents:
- Service Level Agreement
- Underpinning Contracts
- Operational Level Agreements
It is critical that the above documents are negotiated and signed off.
Monitoring If service levels cannot be measured and monitored their value is substantially reduced. Why set service levels if you do not know if they are being met? In order to be able to measure the service levels they need to be clear and have to be objective. It is not enough to define how much time a service can be unavailable, one also needs to define when a service is said to be available again. Is it when the IT organisation restored the service or when the users are aware of it? In order to monitor the performance, availability and support service levels other processes such as Capacity, Availability and Incident Management should be in place. These processes will manage and report on service levels reporting back to the Service Level Management process.
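The question of when a service counts as "available again" changes the measured result. The sketch below makes this visible. It is a minimal illustration under stated assumptions: the service hours are taken to be 24x7, the reporting period, outage times and the 99.0% target are invented for the example, and outages are assumed not to overlap.

```python
from datetime import datetime


def availability_percent(period_start, period_end, outages):
    """Availability over a period, given (down_at, up_at) outage pairs.

    Assumes 24x7 agreed service hours and non-overlapping outages.
    """
    agreed_seconds = (period_end - period_start).total_seconds()
    down_seconds = 0.0
    for down_at, up_at in outages:
        start = max(down_at, period_start)  # clip outage to the period
        end = min(up_at, period_end)
        if end > start:
            down_seconds += (end - start).total_seconds()
    return 100.0 * (agreed_seconds - down_seconds) / agreed_seconds


# Hypothetical figures for one ten-day reporting period.
PERIOD_START = datetime(2004, 1, 1)
PERIOD_END = datetime(2004, 1, 11)
FAILED_AT = datetime(2004, 1, 5, 9, 0)
RESTORED_BY_IT = datetime(2004, 1, 5, 11, 0)   # IT restored the service
USERS_INFORMED = datetime(2004, 1, 5, 15, 0)   # users learned it was back
TARGET = 99.0                                  # assumed SLA target in percent

it_view = availability_percent(PERIOD_START, PERIOD_END, [(FAILED_AT, RESTORED_BY_IT)])
user_view = availability_percent(PERIOD_START, PERIOD_END, [(FAILED_AT, USERS_INFORMED)])

print(f"IT view:   {it_view:.2f}% meets target: {it_view >= TARGET}")      # 99.17% True
print(f"User view: {user_view:.2f}% meets target: {user_view >= TARGET}")  # 97.50% False
```

The same outage meets the target under one definition and breaches it under the other, which is why the SLA has to state which definition applies.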
Reporting
The reports should show the service levels that are required alongside the service levels actually measured. Subjects that can be included are:
- Time needed to resolve Incidents
- Down time of the network and any other occasion where the service levels have not been met
- Time needed for a change
- All major disruptions to the IT service, in detail
- Use of the capacity (minimum and maximum)
- Number of interactions with various services
Reviewing
Reviewing the Service with the Customers on a regular basis will help to discover opportunities to improve the IT service that is provided. This can be achieved with the help of a Service Improvement Plan (SIP). Once the Service Level Agreements are documented, it is not the end of the process; it is the start! It is also important to regularly review how the process itself operates and to update it where necessary.
Roles
Service Level Manager Role
The Service Level Manager is responsible for implementing the process and for maintaining or improving the Service Levels by initiating improvement actions. The role requires a position that allows the person to negotiate the service levels with the customers on behalf of the IT organisation. The Service Level Manager oversees the steps that result in the following official documents:
Relationships
Service Level Management is at once the basis for, and the result of, the implementation of Service Management processes. Service Level Management is related to every other module within Service Management. You can't implement Service Level Management to full maturity without the other nine processes and the Service Desk function, due to the holistic approach required for Service Management. The Service Support processes Incident and Problem Management, together with the Service Desk, aim to restore the services as soon as possible when there is a breach of the Service Levels. They provide SLM with valuable information about the customers' perception of the Service Levels. The Service Delivery processes are more focussed on keeping the services running within the parameters defined in the SLAs. They get information from SLM about the required levels, and give information about the actual levels and advice about the impact of new or changed services.
Benefits
Introducing Service Level Management will have the following benefits for the Business and the IT organisation:
- The IT service will be of a higher quality and will cause fewer interruptions. Hence the productivity of the IT customers will improve as well.
- The resources of IT staff will be used more efficiently.
- The IT organisation will provide services that meet the expectations of the customers.
- The service provided can be measured.
- The perception of the IT organisation will improve.
- Cost reduction.
Common Problems
The following issues need to be addressed in order to ensure a successful Service Level Management process:
- The service levels set out in the SLAs must be achievable for the IT Organisation in the first place.
- UCs and/or OLAs must be set up properly, otherwise external suppliers or internal parties may inadvertently cause a breach of the agreed Service Levels.
- The services need to be measurable and objective for IT Customers and the IT organisation.
- There needs to be a commitment to negotiating the Service Levels required and to drawing up the Service Level Agreements. This must be backed by a commitment to conduct regular reviews and not simply let the agreements become outdated.
Note: A useful acronym when thinking about Service Level Agreements (or any contract) is SMART:
- Simple
- Measurable
- Achievable
- Realistic
- Time driven
Metrics
The following questions will help to determine whether the Service Level Management process is effective and efficient:
- Are all services covered by SLAs?
- Do the services within the SLAs have the necessary UCs and/or OLAs?
- Is there an improvement in the Service Levels?
- Are the actual Service Levels measured?
- Is the perception of the IT organisation improving?
Best practices
Interesting websites: Assessment http://www.itil.co.uk/online_ordering/serv_del_graphs/servlevel_mngt.htm
Essentials (terminology)
Service Level Management: The process of negotiating, defining, drafting, securing and revising a demanded and cost justified level of service delivery to the user.
Service Level Requirements: Business demands and requirements for service levels. Examples are downtime, availability percentage and opening hours of the helpdesk. Together with the Service Catalogue they can be input for the SLA negotiations.
Service Catalogue: Overview of all current services as delivered by IT. May have a price list attached.
SIP = Service Improvement Program / Plan: Actions, phases and due dates for the improvement of the services.
SQP = Service Quality Program / Plan: A document that contains all the information for the managers, including the Performance Indicators for the other ITIL processes.
Financial Management
Introduction
Over recent years modern businesses have become more and more dependent on IT to operate their business processes efficiently. As a consequence the number of end users has increased drastically, and so has the total amount of money spent on IT (the IT budgets). All too often customers of IT organisations and senior managers perceive that too much money is spent on IT. This has led to a demand for increasingly higher quality and cost-effectiveness of the services provided. The IT organisation, on the other hand, is often under the impression that it is doing a good job, but finds it very difficult to explain clearly, in business language, what the real costs and benefits of the provided IT Services are. Organisations (the Customers and senior managers) are reluctant to spend money on improving IT services if they don't have a clear picture of the costs involved and the benefits for the business. Financial Management for IT Services can make the costs clear, set up a charging method and give customers an idea of the relation between quality and price. In other words, Financial Management for IT Services promotes the running of IT Services as a business operation.
The slide below shows some common thoughts and remarks often heard in organisations throughout the world.
Objective
The objective of the Financial Management for IT Services process for an in-house IT organisation should be:
- to provide cost-effective stewardship of the IT assets and resources used in providing IT Services.
In a commercial environment there may be additional statements that reflect the profit-making and marketing aims of the organisation, but for any IT Services organisation the objectives should include:
- to be able to account fully for the spend on IT Services and to attribute these costs to the services delivered to the organisation's Customers.
- to assist management decisions on IT investments by providing detailed business cases for Changes to IT Services.
The main focus of this process, therefore, is on understanding the costs involved in delivering IT Services (by attributing the costs to each specific IT Service and Customer). This awareness of costs improves the quality of all decisions made in regards to IT expenditure. Charging (notional or by sending real bills) the costs to the Customers is optional.
Process Description
Financial Management for IT Services consists of the following three sub-processes:
Budgeting (mandatory)
Budgeting is the process of predicting and controlling the spending of money within the organisation. It consists of a periodic negotiation cycle to set budgets (usually annual) and the day-to-day monitoring of current budgets. Budgeting ensures that the correct finance is available for the provision of IT Services and that budgets are not overspent during the budget period. All organisations have a periodic (e.g. annual) round of negotiations between the business departments and the IT organisation, covering expenditure plans and agreed investment programs, which ultimately sets the budgets for IT.
IT Accounting (mandatory)
IT Accounting is the set of processes that enable the IT organisation to account fully for the way its money is spent (particularly the ability to identify costs by Customer, by service and by activity). In this regard it is more important to understand the costs than to know to the last cent how much something costs.
Charging (optional)
Charging is the set of processes required to bill Customers for the services supplied to them. This requires sound IT Accounting and needs to be done in a simple, fair and accurate way.
Within organisations there are two distinct cycles associated with Budgeting, IT Accounting and Charging:
- A planning cycle (annual), where cost projections and workload forecasting form a basis for cost calculations and price setting.
- An operational cycle (monthly or quarterly), where costs are monitored and checked against budgets, bills are issued and revenue is collected.
All these processes are discussed in more detail in the following chapter.
Activities
Each of the three sub-processes of Financial Management for IT Services consists of a set of activities, which are discussed in this chapter.

Budgeting
- Determine the budget method:
o Incremental budgeting: last year's figures are used as the foundation for this year's budget.
o Zero-based budgeting: a fresh start, called the zero base. The purpose and need of every expense must be determined, together with the actual amount.
- Determine the budget period: in most cases this will be a financial (fiscal) year, which can be subdivided into smaller periods.
- Set up the budget: determine all the categories available and estimate the costs for the next budget period. Take into consideration that demand might increase over time. Some costs might need to be estimated.
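The two budget methods can be contrasted with a small sketch. The category names, amounts and the 10% growth figure are hypothetical, and the functions are only an illustration of the idea, not a prescribed ITIL procedure.

```python
def incremental_budget(last_year, growth_rate):
    """Incremental budgeting: start from last year's figures and
    scale them by an assumed growth in demand."""
    return {category: round(amount * (1 + growth_rate), 2)
            for category, amount in last_year.items()}


def zero_based_budget(proposed_expenses):
    """Zero-based budgeting: start from zero and keep only expenses
    whose purpose has been stated (non-empty justification)."""
    return {name: amount
            for name, (amount, purpose) in proposed_expenses.items()
            if purpose}


# Hypothetical figures for illustration only.
last_year = {"Hardware": 100_000.0, "Software": 50_000.0, "People": 200_000.0}
budget = incremental_budget(last_year, growth_rate=0.10)

proposals = {
    "Training": (5_000.0, "Roll-out of a new service desk tool"),
    "Miscellaneous": (1_000.0, ""),  # no stated purpose, so it is dropped
}
justified = zero_based_budget(proposals)
```

Incremental budgeting is quick but carries old assumptions forward; the zero-based variant forces every expense to be justified again, at the price of more effort.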
IT Accounting
IT Accounting aims to provide the information about what the money was spent on. All Configuration Items necessary to deliver an IT Service to the Customer bear a certain cost. These costs together add up to the total costs necessary for IT Service delivery. In order to understand costs we need to discuss costs in a general way.
Direct or Indirect costs
Direct costs are costs that can be assigned to a specific service. For example, the cost of a printer that is used by one department only can be seen as a direct cost. Indirect costs are costs that cannot be related to a specific service, for example the electricity used by the IT department. They are also called shared costs.

Capital versus Operational costs
Capital costs are the costs of purchasing items that will be used over a number of years and need to be depreciated (the depreciation amount is part of the total costs). Operational costs are those resulting from the day-to-day running of the IT Services organisation (e.g. staff costs, electricity, hardware maintenance) and relate to repeating payments whose effects can be measured within a short timeframe (usually less than 12 months).
Fixed or Variable costs
Fixed costs are costs that stay the same regardless of usage. The rent of a building is an example of a fixed cost. Variable costs change with the use of the service. If you take a telephone service as an example, the line rental costs are fixed, as they will be the same each month regardless of how many calls you make. The costs of the calls are variable, as they depend on the number of calls made.

Cost types
Cost types need to be determined (they are also used in the budgeting activity). The main cost types are Hardware, Software, People, Accommodation, Transfer and External Service costs.

Depreciation methods
Capital costs are depreciated over the useful life of the fixed asset (e.g. desktops in three years, the mainframe in ten years). There are three methods of depreciation:
- Straight-line method: an equal amount is written off the value of the asset each year.
- Reducing balance method: a percentage of the capital cost is written off the net book value each year.
- Depreciation by usage: the depreciation is written off to the extent of usage during a period.
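The three depreciation methods can be sketched as follows. This is a minimal illustration with hypothetical figures; it assumes the reducing balance percentage is applied to the remaining net book value each year, and that depreciation by usage is proportional to an expected lifetime usage that is known in advance.

```python
def straight_line(cost, years):
    """Write off an equal amount of the asset value each year."""
    return [cost / years] * years


def reducing_balance(cost, rate, years):
    """Write off a fixed percentage of the net book value each year
    (assumption: the rate applies to the remaining book value)."""
    charges = []
    book_value = cost
    for _ in range(years):
        charge = book_value * rate
        charges.append(charge)
        book_value -= charge
    return charges


def by_usage(cost, expected_total_usage, usage_per_period):
    """Write off the asset in proportion to the usage in each period."""
    return [cost * used / expected_total_usage for used in usage_per_period]


# Hypothetical desktop costing 3000, depreciated over three years.
desktop = straight_line(3000, 3)
# Hypothetical server costing 1000 at 50% per year for two years.
server = reducing_balance(1000, 0.5, 2)
# Hypothetical printer costing 1000, expected to print 100 (thousand) pages.
printer = by_usage(1000, 100, [25, 75])
```

Whichever method is chosen, it should match the method used by the rest of the business so that IT figures can be reconciled with the overall accounts.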
In most cases there already will be an accounting model in place for the rest of the business. It is important to define a Cost Model that complies with the overall business accounting model.
Charging
In a Profit Centre the objective is to recover, through Charging, an amount greater than the costs incurred. For an in-house IT organisation the aim could be to recover the costs in a fair and simple way. But the aim could also be simply to influence the behaviour of the Customers and end users. That is, via Charging the IT organisation can influence the demand for and actual usage of IT Services, and the way the services are provided. Before Charging can take place, a few decisions need to be made regarding the Charging policy, Cost Units and Pricing.

A Charging policy needs to be chosen:
- Communication of information: only the actual costs will be calculated, reported and charged to the customer (plus specific amounts, if this is agreed with the customer).
- Pricing flexibility: establish and charge the prices each year. This method gives the option to discourage excessive use.
- Notional charging: all costs are invoiced, but the customer does not actually have to pay. This method is used to gain experience and eliminate mistakes.

In order to charge the costs to the IT customers, Cost Units (chargeable items) need to be set up. These have to be clear, so that they can be checked and are understood by the Customer. Examples would be the PC they use or the number of print jobs they request.

Pricing moves the budget responsibility from the IT department to the user organisation. A good pricing system gives the client the sense that they get value for their money.
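As an illustration of Cost Units and notional charging, the sketch below builds a simple invoice from usage figures. The `CostUnit` class, the unit prices and the `build_invoice` function are hypothetical names introduced for this example only.

```python
from dataclasses import dataclass


@dataclass
class CostUnit:
    """A chargeable item the Customer can recognise and check."""
    name: str
    unit_price: float


def build_invoice(usage, cost_units, notional=False):
    """Price each used Cost Unit; with notional charging the invoice
    is produced but no payment is actually due."""
    lines = {name: quantity * cost_units[name].unit_price
             for name, quantity in usage.items()}
    total = sum(lines.values())
    return {"lines": lines,
            "total": total,
            "amount_due": 0.0 if notional else total}


# Hypothetical Cost Units and one department's monthly usage.
units = {"PC": CostUnit("PC", 50.0),
         "print job": CostUnit("print job", 0.5)}
invoice = build_invoice({"PC": 10, "print job": 200}, units, notional=True)
```

Running the same function with `notional=False` turns the trial invoice into a real one, which is one way to gain experience before real bills are sent.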
Roles
The IT Finance Manager can be a person from the IT organisation or the Finance department. Alternatively, the tasks associated with this role can be shared between both. The main responsibilities are:
- To oversee the implementation of the Financial Management for IT Services process and its sub-processes (Budgeting, IT Accounting and Charging).
- To assist in setting up the budgets and the accounting plans.
- If the IT Finance Manager is from the IT organisation, to maintain a close relationship with the Finance department.
Relationships
Financial Management for IT Services provides (depending on which pricing system has been chosen within the organisation) important information to Service Level Management about the costing, pricing and charging strategies that have been introduced. In addition, the Financial Management process analyses which level of service delivery is technically and financially realistic for the business. Financial Management for IT Services can, together with Capacity Management and Availability Management, develop pricing strategies. These strategies can achieve an optimal spread of the workload within an organisation, which results in optimal use of resources. It can also use asset and cost information from Configuration Management to analyse different equipment scenarios (different costs for different configurations).
Benefits
The benefits of implementing the Financial Management for IT Services process include:
- Increased confidence in setting and managing budgets.
- A more efficient use of IT resources throughout the organisation.
- Higher satisfaction of Customers, as they know what they are paying for.
- Investment decisions can be made on accurate information.
- Increased professionalism of staff within the IT organisation.

Divided per sub-process, the benefits are:

For Budgeting:
- The ability to estimate the total costs needed to run the IT organisation.
- A reduced risk of spending more money than is available.
- The ability to verify whether the actual costs match the estimated costs.

For IT Accounting:
- Availability of management information on the costs of providing IT Services.
- IT and Business managers make better decisions, which ensures that the IT Services organisation runs in a cost-effective manner.
- The ability to account accurately for all the expenses made by the IT organisation.
- The ability to demonstrate under- or over-consumption of services in financial terms.
- Maximising the value for money of the provided IT Services.
- The possibility to determine the costs of NOT making a specific investment.
- A basis on which to implement Charging.

For Charging:
- A sound business method of balancing the shape and quantity of IT Services with the needs and resources of the Customer.
- The ability to recover IT costs in a fair manner.
- The ability to influence the demand for the provided IT Services, and hence the behaviour of the Customer.
Common Problems
In order for the process to work effectively and efficiently, the following issues should be addressed, as they are typical causes of problems in this process area:
- Cost Models that are used for IT Accounting are too detailed, creating too much administrative overhead.
- Not enough commitment from senior IT and Business management.
- Financial Management for IT Services is not in alignment with the way the overall organisation manages its finances.
- Charging policies are not well communicated to customers, possibly causing unwanted behaviour (e.g. actions by users or customers to try to avoid incurring charges).
Metrics
Key performance indicators to report on the process are:
- Accurate cost-benefit analysis of the services provided
- Customers consider the charging methods reasonable
- The IT organisation meets its financial targets
- The use of the services by the customer changes
- Timely reporting to Service Level Management
Note: Accountancy is a respected profession around the world. It is very unlikely that this process can be properly implemented without some specific expertise in accounting.
Best practices
Essentials (terminology)
Financial Management for IT Services: The implementation of Financial Management for IT Services is the foundation for an independent IT organisation, which is not only aware of costs but is also oriented towards future investments.
Budgeting: The process of predicting and controlling the spending of money within the organisation. It consists of a periodic negotiation cycle to set budgets and the day-to-day monitoring of the current budgets.
IT Accounting: The set of processes that enables the IT organisation to account fully for the way its money is spent.
Availability Management
Introduction
Organisations are increasingly dependent on IT services. When these are unavailable, in most cases the business stops as well. There is also an increasing demand for availability of IT services 24 hours a day, 7 days a week. It is therefore vital for the IT organisation to manage and control the availability of the IT Services. This is done by defining the requirements of the business regarding the availability of the IT services and then matching them with the capabilities of the IT organisation.
Objective
The objective of Availability Management is to get a clear picture of business requirements regarding IT Services availability and then optimise infrastructure capabilities to align with these needs. Or one can put it this way: To ensure the highest availability possible of the IT services as required by the business to reach its goals.
Process Description
Availability Management depends on many inputs in order to function well. Among the inputs are:
- The availability requirements of the business
- Information regarding reliability, maintainability, recoverability and serviceability of the CIs
- Information from the other processes: Incidents, Problems, SLAs and achieved service levels

The outputs of the process are:
- Recommendations regarding the IT infrastructure to ensure its resilience
- Reports about the availability of the services
- Procedures to ensure that availability and recovery are dealt with for every new or improved IT service
- Plans to improve the availability of the IT services
Availability: Availability is calculated as A = (ST - DT) / ST x 100, where A = Availability (as a percentage), ST = agreed Service Time and DT = Down Time. Availability is defined as: Availability of an IT Service or component to perform its required function at a stated instant or over a stated period (ITIL Service Delivery Book, OGC, 2001).

Reliability: The reliability of components of the infrastructure. In this case the Mean Time Between Failures (MTBF) can be used as a measuring tool. Reliability is defined as: freedom from operational failure (ITIL Service Delivery Book, OGC, 2001). Resilience is a key aspect of reliability. Resilience is defined as: the ability of an IT component to continue to operate even though one or more of its sub-components has failed.

Maintainability: The capability to maintain or restore a service or component of the infrastructure at a certain level, so that the required functionality can be delivered. Some services or components of the infrastructure are easier than others to maintain and/or restore to service in the event of a failure. For example, suppose an application requires daily housekeeping to ensure its operation, and only a highly qualified Database Administrator can do this. This application is not easy to maintain. The maintainability of C.I.s within the infrastructure is an important consideration, as the speed of recovery and the ease of maintenance will impact the uptime and hence the availability of services. Operational Level Agreements (OLAs) within the Service Level Management process tie in here. Note: Remember that C.I. = Configuration Item.

Serviceability: Serviceability refers to the agreements that are held with third parties providing services to the IT organisation. These contracts define how these external parties will perform to ensure the availability of the services they interface with. For example, how will they ensure resilience, and how will they maintain the infrastructure they are responsible for? Underpinning Contracts within Service Level Management tie in here.

Security: This is divided into confidentiality, integrity and availability (CIA). It can be desirable, for security reasons (where access might endanger availability), not to make certain components of the infrastructure available, logically or physically.
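The availability formula A = (ST - DT) / ST x 100 can be worked through in a few lines. The figures below are hypothetical; the only assumption is that Service Time and Down Time are expressed in the same unit (hours here).

```python
def availability(agreed_service_time, down_time):
    """Availability as a percentage: A = (ST - DT) / ST x 100."""
    if agreed_service_time <= 0:
        raise ValueError("agreed service time must be positive")
    return (agreed_service_time - down_time) / agreed_service_time * 100


# Hypothetical month: 200 agreed service hours, 4 hours of downtime.
monthly = availability(agreed_service_time=200, down_time=4)
print(f"Availability: {monthly:.1f}%")
```

Note that only downtime inside the agreed Service Time counts; an outage outside the agreed hours does not reduce the figure.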
Activities
The activities within the process can be divided into three main activities, which are discussed in detail in the remainder of this chapter:
- Planning
- Improving
- Measuring and reporting
Planning
Planning involves the following activities:

Determine the Availability Requirements
It is important not only to find out the requirements but also to find out if and how the IT organisation can meet them. The Service Level Management process maintains contact with the business and will be able to provide the availability expectations to the Availability Management process. The business may have unrealistic expectations with respect to availability without understanding what this means in real terms. For example, they may want 99.9% availability yet not realise that, for their organisation's infrastructure, this will cost five times more than providing 98% availability. It is the responsibility of Service Level Management and the Availability Management process to manage expectations.

Design
When considering the design of the infrastructure, the IT organisation can design either for availability or for recovery.

Design for Availability
When the business cannot afford any downtime for a particular service or services, designing the infrastructure for availability should be the approach. In this instance the IT organisation will need to build resilience into the infrastructure and ensure that preventative maintenance can be performed to keep services in operation. In many cases building extra availability into the infrastructure is an expensive task that must be justified by business need. Designing for availability is considered a proactive approach to avoiding downtime in IT services.

Design for Recovery
When the business can tolerate some downtime of services, or the cost of building additional resilience into the infrastructure cannot be justified, designing for recovery is the appropriate approach. In this approach the infrastructure is designed so that, in the event of a service failure, recovery will be as fast as possible. Spare parts, for example, will assist in the speedy recovery of infrastructure components that fail. Designing for recovery can be seen as a more reactive management of availability. Processes such as Incident Management need to be in place to recover as soon as possible in case of a service interruption.

Other Considerations

Security issues
Define the security areas and the impact they might have on the availability of services. Make sure it is clear who has access to what and where.

Maintenance management
Agree on a maintenance window, known to the customers, in which the IT organisation can carry out maintenance and repairs. This reduces the impact of maintenance and repairs on the IT service.
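To help manage expectations such as the 99.9% versus 98% example above, an availability target can be translated into the downtime it still permits. The sketch below is only illustrative and assumes a service agreed for 24 hours a day, 365 days a year; the function name is hypothetical.

```python
def allowed_downtime_hours(availability_pct, service_hours=24 * 365):
    """Hours of downtime per year that a given availability target permits,
    assuming the agreed service time is `service_hours` per year."""
    return service_hours * (1 - availability_pct / 100)


for target in (98.0, 99.9):
    hours = allowed_downtime_hours(target)
    print(f"{target}% availability allows about {hours:.2f} hours of downtime per year")
```

Under this assumption 98% leaves roughly a week of downtime per year, while 99.9% leaves less than nine hours, which gives some feeling for why the higher target costs so much more to deliver.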
Measuring and reporting
This involves reporting on the availability of each service, the down times and the recovery times. These reports will often go to the Service Level Management process to be used in reporting comparisons (planned versus actual) of service levels back to the customer. It is also important to measure and report on the customers' perception of the availability of the IT service. There are many ways to identify (un)availability and potential problems. The following are a few mentioned by the OGC:
- CFIA: Component Failure Impact Assessment can be used to predict and evaluate the impact on IT Services arising from component failures within the IT infrastructure.
- FTA: Fault Tree Analysis is a technique that can be used to determine the chain of events that causes a disruption to IT services.
- CRAMM: CCTA Risk Analysis and Management Methodology can be used to identify new risks and provide appropriate countermeasures associated with any change to the business availability requirements and revised IT infrastructure design.
- SOA: Systems Outage Analysis is a technique designed to provide a structured approach to identifying the underlying causes of service interruption to the user.
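The idea behind a Component Failure Impact Assessment can be sketched with a simple dependency table. This is a heavily simplified illustration: the services, components and function names are hypothetical, and a real CFIA would also take redundancy and alternative components into account.

```python
# Hypothetical mapping of each IT service to the components it depends on.
dependencies = {
    "email": {"mail server", "core switch"},
    "payroll": {"db server", "core switch"},
}


def impacted_services(component, dependencies):
    """Services that would be affected if the given component failed."""
    return sorted(service
                  for service, components in dependencies.items()
                  if component in components)


def single_points_of_failure(dependencies):
    """Components whose failure would affect every service in the table."""
    all_components = set().union(*dependencies.values())
    return sorted(component for component in all_components
                  if len(impacted_services(component, dependencies)) == len(dependencies))


print(impacted_services("core switch", dependencies))
print(single_points_of_failure(dependencies))
```

Components that appear against many services are natural candidates for extra resilience or for faster recovery arrangements.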
Roles
The Availability Manager
The Availability Manager has a guiding role and a general, yet sound, knowledge of the IT infrastructure. They assemble and analyse data from processes such as Problem Management, Change Management, Service Desk and Capacity Management to assist in management and planning with regard to availability. Using the results of this analysis, they steer other Service Management processes in order to guarantee the agreed availability, thus helping to prevent problems. For example, they may attend Change Advisory Board meetings within Change Management.
Relationships
The introduction of Availability Management without the other processes in place is likely to fail, as without their support it cannot deliver the agreed availability. Incident Management and Problem Management provide a key input to ensure that the appropriate corrective actions are being progressed. The measurement and reporting of IT availability ensures that the level of availability delivered meets the Service Level Agreement (SLA). Availability Management supports the Service Level Management process by providing measurements and reporting to support service reviews.
Benefits
The main benefit is optimal use of the capability of the IT infrastructure, delivering availability of the IT services in accordance with the agreed requirements of the customers. Other benefits include:
- Constant striving to improve availability
- Higher Customer satisfaction
- Corrective action is undertaken in case of a disruption
- Higher availability of the IT service
Common Problems
As with every process, there are some issues that need to be addressed, as they can make or break the success of the process. For Availability Management they are:
- Unclear requirements from the business regarding the availability expected of the IT service
- No official contract is drawn up to specify the agreed availability of each service (risk of conflict and arguments at a later stage)
- Commitment to the process
The business and the IT organisation must share a common understanding on the definition of availability and the definition of downtime.
Metrics
By reporting on the following items the effectiveness and efficiency of the process can be measured:
The total downtime per service
Note: Check the AV Essentials section for defined availability terms (eg. MTBF, MTTR, etc.)
Best practices
Interesting websites:
Assessment: http://www.itil.co.uk/online_ordering/serv_del_graphs/avail_mngt.htm

Essentials (terminology)

Down time: The total period during which an IT service is not operational within the agreed service times.
Mean Time Between Failures (MTBF): The average period between the first moment the service is fully operational and the moment that this service is no longer operational.
Mean Time To Repair (MTTR): The average period between the commencement of an incident and its solution.
Mean Time to Restore Services (MTRS): The average period between the commencement of an incident and the restoration of the service delivery.
Fault, Failure: The moment at which a functional unit no longer provides the required function.
High Availability: A characteristic of the IT Service that masks the effects of IT component failure to the user.
Continuous Operation: A characteristic of the IT operation that masks the effects of planned downtime to the user.
Continuous Availability: A characteristic of the IT Service that minimises or masks the effects of all failures and planned downtime to the user.
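The terms above can be illustrated by deriving MTBF and MTRS from a simple outage log. This is a sketch under stated assumptions: the log format and function name are hypothetical, outages do not overlap and fall inside the reporting period, and MTBF is approximated as total uptime divided by the number of failures.

```python
def mean_times(outages, period_start, period_end):
    """Compute down time, MTBF and MTRS (all in hours) from a list of
    (failed_at, restored_at) pairs within the reporting period."""
    if not outages:
        raise ValueError("at least one outage is needed to compute mean times")
    down_time = sum(restored - failed for failed, restored in outages)
    up_time = (period_end - period_start) - down_time
    failures = len(outages)
    return {
        "down_time": down_time,
        "MTBF": up_time / failures,    # average operational period per failure
        "MTRS": down_time / failures,  # average time to restore the service
    }


# Hypothetical 30-day month (720 hours) with two outages.
stats = mean_times([(100, 102), (300, 306)], period_start=0, period_end=720)
print(stats)
```

A high MTBF points to good reliability, while a low MTRS points to good maintainability and recovery procedures.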
Capacity Management
Introduction
The Capacity Management process is designed to ensure that the capacity of the IT infrastructure is aligned to business needs. The main purpose of Capacity Management is to understand and maintain the required level of service delivery (via the appropriate capacity) at an acceptable cost. By gathering business and technical capacity data, this process plans for and delivers the cost-justified capacity requirements of the business. The Capacity Plan is the core document that describes how this will take place over the coming period.
Objective
The main objective of Capacity Management is to understand the business's capacity requirements and deliver against them, both in the present and in the future. Capacity Management is also responsible for understanding the potential advantages new technology could have and assessing its suitability for the organisation.
Process Description
The Capacity Management process breaks down into three sub-processes listed below:
Business Capacity Management
This sub-process focuses on the long term. It is responsible for ensuring that future business requirements are taken into consideration, then planned and implemented as necessary.

Service Capacity Management
This sub-process is responsible for ensuring that the performance of all current IT services falls within the parameters detailed as targets within SLAs.

Resource Capacity Management
This sub-process is responsible for the management of the individual components within the infrastructure. Resource Capacity Management has more of a technical focus.
Activities
Each of the sub-processes mentioned above involves, to a greater or lesser degree, the following activities:
Iterative Activities
The following iterative activities take place within Capacity Management:
- Monitoring: checking whether all the service levels are being met.
- Analysis: the data collected by monitoring needs to be analysed and predictions made for the future.
- Tuning: implementing the outcomes of the two previous steps to ensure optimal use of the infrastructure, now and in the future.
- Implementation: actually implementing new or changed capacity through Change Management.

Storage of Capacity Management Data
The Capacity Database (CDB) is the cornerstone of the process. It forms the basis of the reports for this process and contains technical, business and other information relevant to Capacity Management. The information contained here also provides the other processes with the data necessary for their analysis.

Demand Management
Demand Management is responsible for the management of workload in the infrastructure, in order to make better use of the current capacity rather than increasing capacity. User behaviour is influenced to shift workload, for example to another time of the day, to relieve capacity shortages.

Application Sizing
Application Sizing relates to the assessment of the capacity requirements of applications during their planning and development. The capacity requirements of a new application are understood, so that the infrastructure can be tuned as necessary to meet the new requirements.

Modeling
By simulation or with the assistance of mathematical models, modeling allows for the prediction of future capacity requirements. The results can be used as an input into the Capacity Plan.

Capacity Plan
This plan is drafted on the basis of data from the CDB (Capacity Database), financial data, business data, technical data, etc. The plan is future oriented and covers a period of at least 12 months.

Reporting
Reporting entails the reporting of capacity performance over any given period. Reporting could, for example, be (but is not limited to) against the capacity metrics in SLAs.
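As a minimal example of the mathematical modeling mentioned above, the sketch below fits a straight line to historical utilisation figures and extrapolates it. The data is hypothetical and a linear trend is only one possible model, chosen here for simplicity; real workloads may call for simulation or more refined techniques.

```python
def linear_trend(values):
    """Least-squares straight line through equally spaced observations.
    Returns (slope, intercept) with the first observation at x = 0."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def forecast(values, periods_ahead):
    """Extrapolate the fitted trend a number of periods past the last one."""
    slope, intercept = linear_trend(values)
    return intercept + slope * (len(values) - 1 + periods_ahead)


# Hypothetical disk utilisation (%) over the last four months.
utilisation = [50, 55, 60, 65]
predicted = forecast(utilisation, periods_ahead=2)
print(f"Expected utilisation in two months: {predicted:.1f}%")
```

A forecast like this could feed the Capacity Plan, for instance to flag when a resource is expected to cross an agreed threshold.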
Roles
The Capacity Manager
The main responsibilities of the Capacity Manager are:
- To develop and maintain the Capacity Plan
- To manage the process
- To make sure the Capacity Database is up to date

To do this, the manager must be involved in evaluating all changes, to establish their effect on capacity and performance. This should happen both when changes are proposed and after they are implemented. The manager pays particular attention to the cumulative effect of changes over a period of time. The cumulative effect of single changes can often cause degraded response times, file storage problems and excess demand for processing capacity. Other roles within Capacity Management are those of the network manager, the application manager and the system manager. They are responsible for translating the business requirements into the capacity required to meet them, and for optimising performance.
Relationships
Capacity Management is part of Service Delivery and is directly related to the business requirements. It is not simply about the performance of the system's components, individually or collectively.

Service Desk, Incident Management and Problem Management
These processes provide Capacity Management with information about incidents and problems related to capacity. Capacity Management supports these processes in resolving incidents and problems, and also provides them with information about capacity performance.

Change Management and Release Management
Capacity Management activities will raise Requests for Change (RFCs) in order to ensure that the appropriate capacity is available. These are subject to the Change Management process. Implementation may affect several Configuration Items (C.I.s), including hardware, software and documentation, and will require effective Release Management.

Availability Management
The link between Capacity Management and Availability Management is strong, as the availability that is needed requires a certain amount of capacity within the Configuration Items. Without enough capacity, you will never have enough availability. Furthermore, the values measured by Capacity Management are of importance to Availability Management in relation to availability and reliability.

Service Level Management
Both Capacity Management and Availability Management need to provide the Service Level Manager with input for effective SLA negotiations. Capacity Management informs Service Level Management about the service levels that can be provided to the client.

Financial Management
The drafted Capacity Plan delivers important input for Financial Management, which can draft a very accurate investment plan on this basis. Capacity Management gets information in return about the available budget.

IT Service Continuity Management
Capacity Management provides ITSCM with information about the minimum capacity needed for recovery. It is important to consider the impact (in terms of capacity needed) of changes to the IT services on the ITSCM procedures.
Benefits
Implementation of Capacity Management offers the following benefits:
- A current overview of the capacity in place
- The possibility to plan capacity in advance
- Being able to estimate the impact of new applications or modifications
- Cost savings
- Better service that is in tune with the requirements of the business
Common Problems
Common problems that can be encountered while the process is already implemented include:
Note: This final point is important. All too often end users and customers are interviewed at length about their expected capacity requirements, only to demand more as soon as the new application goes live. It is up to the IT professionals to have built in the ability for the application to scale to match any new requirements.
Metrics
Best practices
Why a Capacity Plan?
With cheap hardware prices, capacity planning may seem unimportant; you can always upgrade later. A simple guess of the capacity requirements should be sufficient, right? Why give this subject any more thought?

There are two main issues that make capacity planning critical. The first is the rate of technical change. We now measure progress in "Internet years", each equivalent to about 90 days of a calendar year. The second is that, with Internet/Intranet at the helm, today's systems are primarily being developed within a 3-tier architecture. This rapid change, coupled with the increased complexity of 3-tier architecture, is causing system designers to pay closer attention to capacity.

Five years ago, a designer could roll out a new system with a rough estimate of capacity and performance. The system could then be tuned, or more capacity added, before all of the users had been converted to the new system. The process was reasonable because the systems were typically not mission-critical. Today, there is no time for this approach. Once systems are in place they become an integral part of the overall design. Taking a system down for upgrades becomes increasingly expensive in both time and resources. In addition, the added complexity of the environment typically requires more care, due to the interdependency between various application components.

Capacity planning is driven purely by financial considerations. Proper capacity planning can significantly reduce the overall cost of ownership of a system. Although formal capacity planning takes time, internal and external staff resources, and software and hardware tools, the potential losses incurred without capacity planning are staggering. Lost productivity of end users in critical business functions, overpaying for network equipment or services, and the costs of upgrading systems already in production more than justify the cost of capacity planning.
Essentials (terminology)
Capacity Plan: A plan that is drafted on the basis of data from the CDB, financial data, business data, technical data, etc. The plan is future oriented and looks forward for a period of at least two years.
Performance Management: The monitoring of results, signaling of trends, analysis of information and tuning (e.g. by spreading workloads).
Workload Management: The identification and registration of the use of resources by each workload, and the detection of peaks and patterns.
Resource Management: The optimisation of the use of resources.
There are still quite a few managers who see IT Service Continuity Management (ITSCM) as a luxury for which they do not have to allocate any resources. However, statistics show that disasters occur regularly. Causes of such disasters are events like fire, lightning, flood, burglary, vandalism, power failure or even terrorist attacks. Thinking about, and actually establishing, a Business Continuity Plan could have saved affected companies a lot of trouble or even their business itself. As businesses become increasingly dependent on IT, the impact of the unavailability of IT Services has increased drastically. Every time the availability or performance of a service is reduced, the users cannot continue with their normal work. This trend towards dependency on IT will continue and will increasingly influence users, managers and decision-makers. That is why it is important that the impact of a total or partial loss of the IT Services is estimated and Continuity Plans are established to ensure that the business will always be able to continue.
Objective
The objective of the ITSCM process is to support the overall Business Continuity Management (BCM) process by ensuring that the required IT technical and services facilities can be recovered within required and agreed business time-scales.
Process Description
ITSCM is concerned with managing an organisation's ability to continue to provide a predetermined and agreed level of IT services to support the minimum business requirements, following an interruption to the business. This includes:
o Ensuring business survival by reducing the impact of a disaster or major failure.
o Reducing the vulnerability and risk to the business by effective risk analysis and risk management.
o Preventing the loss of Customer and User confidence.
o Producing IT recovery plans that are integrated with and fully support the organisation's overall Business Continuity Management (BCM) Plan.
ITSCM should be closely aligned with and driven by the overall BCM process, as a sub-set of this process. BCM manages risks to ensure that the organisation can continue to operate at a specified minimum level in case of a disaster. ITSCM is focused on the IT Services and ensures that the minimum of IT Services can be provided in case of disaster. One won't work without the other. If the BCM process has a solid plan to evacuate part of the business process and continue to work in a separate building, but there is no IT infrastructure ready, the plan is of no use. The same applies if there are plans that enable the IT organisation to provide the IT Service elsewhere, but the business process can't be continued because there is no contingency plan in place for it.
The process can be divided into 4 stages, which will be described in detail in the next chapter:
o Initiation
o Requirements and strategy
o Implementation
o Operational Management
Activities
Each of the stages has its own activities, which will be described in more detail throughout this chapter.
Initiation
Initiate BCM
The initiation process covers the whole of the organisation. The policies around BCM and ITSCM are defined, the scope of the process and the terms of reference determined, resources allocated and a project plan established.
The impact of a disaster on the business will be investigated. Questions that can be asked are: Can the business still operate in case of a disaster? For how long can it survive? Does it rely on the IT services to be able to operate? How much the organisation stands to lose as a result of a disaster or other service disruption, and the speed of escalation of these losses, will be assessed by:
o Identifying the critical business processes
Risk Assessment
This activity analyses the likelihood that a disaster or other serious service disruption will actually occur. This is an assessment of the level of threat and the extent to which an organisation is vulnerable to that threat. Risk Assessment consists of two parts:
o Risk Analysis is focused on identifying the risks by analysing the vulnerabilities of and the threats to all critical assets.
o Risk Management is about identifying countermeasures to keep those risks under control. These can either be actions to reduce the impact or likelihood of the risk, or the development of plans (Recovery Plans), which detail how to respond when the risk materialises.
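The two parts of Risk Assessment can be sketched as follows. The scoring scheme (risk score = likelihood times impact on 1-to-5 scales), the threshold used to choose a countermeasure and the example risks are all assumptions chosen for illustration; the text above does not prescribe a scoring method.

```python
from dataclasses import dataclass

@dataclass
class Risk:
    asset: str
    threat: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    impact: int      # assumed scale: 1 (minor) to 5 (critical)

    @property
    def score(self) -> int:
        # Assumed scoring rule: likelihood multiplied by impact.
        return self.likelihood * self.impact

def countermeasure(risk: Risk, threshold: int = 12) -> str:
    """Risk Management step: choose a (hypothetical) response for one risk."""
    if risk.score >= threshold:
        return "reduce risk"    # act on the likelihood or the impact
    return "recovery plan"      # plan how to respond if the risk materialises

# Risk Analysis step: list the threats to critical assets (invented examples).
risks = [
    Risk("data centre", "fire", likelihood=2, impact=5),
    Risk("mail server", "power failure", likelihood=4, impact=4),
]
for r in sorted(risks, key=lambda r: r.score, reverse=True):
    print(r.asset, r.threat, r.score, countermeasure(r))
```

A real assessment would normally combine both kinds of countermeasure for the same risk; the single threshold here only keeps the example short.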
An appropriate strategy needs to be developed which contains an optimal balance of risk reduction and recovery options. The actual balance will very much depend on the nature of the business and its dependency on IT Services (e.g. a stockbroker will focus on risk reduction, while the local bakery can probably afford the time involved in recovering a failed system). In the case of a Recovery Plan, decisions have to be made on how to recover. The options are:
Several plans need to be set up in order to be able to implement the ITSCM process. These plans address issues like emergency procedures, damage assessment, what to do with data, recovery plans, etc.
The risk reduction measures need to be implemented. In most cases this will be done with the help of the Availability Management process. Also, stand-by procedures will have to be put in place. For
The Recovery Plan (or Continuity Plan) has to be set up. The plan should cover at least the following subjects:
o How it is going to be updated
o Routing list to specify which section has to go to which group
o Recovery initiation
o Specialist sections to cover the actions and responsibilities of these sections individually. Sections are Administration, IT infrastructure, personnel, security, recovery sites and restoration.
Testing is a critical part of the overall ITSCM process and is the only way of ensuring that the selected strategy, stand-by arrangements, logistics, business recovery plans and procedures will work in practice.
These are essential issues that need to be taken care of in order for the ITSCM process to be successful. This ensures that all staff are aware of the implications of Business Continuity and of IT Service Continuity and consider these as part of their normal working routine and budget.
It is necessary to review and audit the plans on a regular basis to make sure they are still up to date.
Testing
Testing on a regular basis not only checks the effectiveness of the plan but also ensures that people know what will happen, where the plan is and what is in it.
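One way to record the outcome of a regular test is to compare the recovery time measured for each service with the time-scale agreed with the business. The service names and figures below are invented for illustration, and treating an untested service as a failure is an assumption of this sketch.

```python
# Agreed recovery time-scales in hours (hypothetical values).
agreed_hours = {"payroll": 24, "web_shop": 4, "email": 8}

# Recovery times measured during a test, in hours (hypothetical values).
measured_hours = {"payroll": 20, "web_shop": 6, "email": 8}

def failed_services(agreed, measured):
    """Return services whose measured recovery exceeded the agreed time-scale."""
    return sorted(
        name for name, limit in agreed.items()
        # A service missing from the test results counts as failed (assumption).
        if measured.get(name, float("inf")) > limit
    )

print(failed_services(agreed_hours, measured_hours))
```

Services returned by `failed_services` would be candidates for a review of the recovery plan or of the stand-by arrangements.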
Change Management
Assurance
The quality of the process is verified to assure that the business requirements can be met and that the operational management processes are working satisfactorily.
Roles
A distinction can be made between roles and responsibilities in and outside crisis times. Different levels within this process can be defined, starting with the board, followed by senior management, management, team leaders and their team members. It is vital to document the responsibilities of each and every role. The main responsibilities of the ITSCM manager include:
o Develop and manage the ITSCM Plan to ensure that, at all times, the recovery objectives of the business can be achieved.
o Ensure that all IT Service areas are prepared and able to respond to an invocation of the Continuity Plans.
o Maintain a comprehensive IT testing schedule.
o Communicate and maintain awareness of ITSCM objectives within the business areas supported and the IT Service areas.
o Manage the IT Service delivery during times of crisis.
Relationships
ITSCM has a close relationship with all the other ITIL processes and the business in general. The relationship with some of the processes is described in more detail.
Service Level Management: Provides the ITSCM process with information about the required service levels.
Availability Management: Has more of a supportive role and helps the ITSCM process to prevent and reduce the risk of disasters by delivering and implementing risk reduction measures.
Service Desk in combination with Incident Management provides the ITSCM process with historical data (statistics).
Benefits
ITSCM supports the BCM process and delivers the required IT Infrastructure and Services to enable the business to continue to operate following a service disruption. The main benefits of implementing the ITSCM process are:
o Management of risk and the consequent reduction of the impact of failure.
o Potentially lower insurance premiums.
o Fulfillment of mandatory or regulatory requirements.
o Improved relationships between the business and IT through IT becoming more business focused, and more aware of business impacts and priorities.
o Increased Customer confidence, possible competitive advantage and increased organisational credibility.
In case of an actual disaster the process has the following benefits:
o Reduced business disruption, with an ability to recover services efficiently in business priority order.
o Recovery will take place in less time.
o More stable IT infrastructure and a higher availability of the IT Services.
Common Problems
A few of the problems one can encounter while implementing the ITSCM process are: Not enough resources to set up and run the process properly.
Metrics
Best practices
Interesting websites:
http://www.globalcontinuity.com/
http://www.microsoft.com/technet/itsolutions/idc/oag/oagc20.asp
http://www.iccmforum.com/iccm.asp?r=Tutorial&s=Benchmarks&t=ZiffDavis
http://www.disasterrecoveryworld.com/
White papers
http://www.interpromusa.com/IT Service Continuity Mgmt.pdf
Assessment
http://www.itil.co.uk/online_ordering/serv_del_graphs/itserv_cont.htm
Essentials (terminology)
Business Continuity Management (BCM): BCM is concerned with managing risks to ensure that at all times an organisation can continue operating to, at least, a pre-determined minimum level. The BCM process involves reducing the risks to an acceptable level and planning for the recovery of business processes should a risk materialise and a disruption to the business occur.
IT Disaster: The unavailability of IT Service provision for a longer period of time, which makes it necessary to switch to an alternative system and for which the actions to be taken are not part of a daily routine.
Business Recovery Plans: Documents describing the roles, responsibilities and actions necessary to resume business processes following a business disruption.
Disaster Recovery Planning: A series of processes that focus only upon the recovery process, principally in response to physical disasters, and that are contained within BCM.
Security Management
Introduction
Everyone has heard about the impact a virus can have on a business. Names such as the Kournikova virus, Nimda and the Trojan horse ring bells about the vulnerability of our businesses and their reliance on IT services.
The following example of a different nature occurred recently in The Netherlands. A national event was to take place that would attract a lot of attention: a live chat session on the Internet with Prince Willem Alexander and his fiancée Maxima. The main telecom provider in the Netherlands provided it, and they bragged about how they could ensure the availability and the high performance of the event. A group of activists thought this was the time to show the country how vulnerable even the big companies are by hacking into the systems, causing the servers to go down and so interrupting the live chat session for a period of time. In this case no harm was done, but if they had had bad intentions they could easily have caused a lot of damage.
In both cases there is a risk of information being damaged or misused due to a breach in security or a lack of security.
Objective
The objective of Security Management is twofold:
o To ensure compliance with external requirements, such as legislation regarding privacy, insurance policies and the SLAs.
o To create a secure environment, regardless of the external requirements.
Process Description
The process of Security Management is a flexible one and needs to be reviewed continuously to ensure that it is still up to date. It should therefore plan, do, check and act in a continuous cycle. The activities of Security Management are undertaken either by the process itself or by other processes under the control of Security Management.
Activities
Roles
In most cases there will be only the Security Manager; however, in very large organisations more people may be involved in the process. The Security Manager is responsible for implementing and maintaining the process. The Security Manager has close ties with the Business Information Security Officer.
Relationships
The Security Management process has links with all the ITIL processes. Each process carries out one or more of the activities of Security Management. Although the responsibility for these actions remains within the separate processes, Security Management provides the input for the activities.
Service Level Management: Provides information about the required service levels and receives input about the achieved levels.
Configuration Management: The CMDB contains the information about the C.I.s. Every C.I. should be classified to indicate the required availability, integrity and confidentiality, which will determine the level of security that is required.
Incident and Problem Management: Incident Management records incidents regarding a breach of security levels, and the cause is investigated and resolved by Problem Management.
Change Management: Implements the changes that ensure or enhance security. Conversely, the security issues need to be addressed for every change. In most cases the Security Manager will be part of the CAB.
Availability Management: Is supported by Security Management, in that the measures to increase security result in a higher availability of the IT services.
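The classification of C.I.s described under Configuration Management might be recorded along the following lines. The three-level scale, the example C.I.s and the rule that the strictest of the three ratings sets the required security level are assumptions made for this sketch; the text only states that the classification determines the level of security.

```python
from dataclasses import dataclass
from enum import IntEnum

class Level(IntEnum):
    # Assumed three-level classification scale.
    LOW = 1
    MEDIUM = 2
    HIGH = 3

@dataclass
class ConfigurationItem:
    name: str
    availability: Level
    integrity: Level
    confidentiality: Level

    def required_security(self) -> Level:
        # Assumed rule: the strictest rating determines the security level.
        return max(self.availability, self.integrity, self.confidentiality)

# Hypothetical C.I.s as they might be classified in the CMDB.
hr_db = ConfigurationItem("HR database", Level.MEDIUM, Level.HIGH, Level.HIGH)
kiosk = ConfigurationItem("Lobby kiosk", Level.LOW, Level.LOW, Level.LOW)
print(hr_db.name, hr_db.required_security().name)
print(kiosk.name, kiosk.required_security().name)
```

Keeping the three ratings separate, rather than storing only the derived level, lets other processes such as Availability Management read the rating that concerns them.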
Benefits
Benefits of implementing Security Management are:
o Information that is vital to the business is kept secure.
o Higher availability of the information.
o Quality of information that goes outside the business is increased.
Common Problems
The following issues may cause problems:
o Commitment: Extra rules and regulations are more likely to generate resistance than to appeal to end users.
o Attitude: Most security issues are caused by human error. Quite often this is due to complacency.
o Verification: It needs to be possible to check whether the security measures are working and whether they are there for the right reasons.
o Changes: Over time the security aspect of changes might not get as much attention as is needed.
o Awareness: As with every other process, it is important to communicate with the organisation to gain the cooperation of the business.
Metrics
The metrics of this process are similar to those of Service Level Management but focus on the security aspect.
o Is the security aspect of the services covered in the SLAs?
o Do the services within the SLAs have the necessary security aspects covered in the UPCs and/or OLAs?
o Is there an improvement in the security levels?
o Are the actual security levels measured?
o Is the perception of the IT organisation improving?
Best practices
http://www.securitymanagement.com/
http://www.ismanet.com/
To outsource or not to outsource? The challenge of meeting security requirements, and so preventing such disastrous impacts, is becoming more and more burdensome for organisations. Outsourcing Security Management could offer a solution. The key questions faced by any organisation are:
o Should they keep the responsibilities of information security in-house?
o Should they develop and train their own IT staff?
o Should they develop their own security policies?
o Or should they contract such services to an outsourcing specialist who uses the latest available technology, tools and expertise to offer the most efficient service?
The decision to outsource Security Management needs to be weighed carefully, as this highly debatable decision has both pros and cons.
Essentials (terminology)
Confidentiality: Ensuring that information is accessible only to those authorised to have access.
Integrity: Safeguarding the accuracy and completeness of information and processing methods.
Availability: Ensuring that authorised users have access to information and associated assets when required.
Verifiability: Being able to verify that the information is used correctly and that the security measures are effective.
Conclusion
In recent years, IT Service Management has developed into a field in its own right. Organisations are now so dependent on the automation of large parts of their business processes that the quality of IT services and the synchronisation of these services with the needs of the organisation are now essential to their survival. This introduction to IT Service Management aims to provide a thorough introduction to the field. It not only provides a convenient introduction to the books in the IT Infrastructure Library (ITIL), but also serves as the first step to prepare for the Foundation Certificate exam in IT Service Management. It contains a wealth of practical experience collected by the editorial board. It is based on the latest edition of the ITIL books on Service Support and Service Delivery. This course aims to provide an effective introduction to the dynamic area of IT Service Management, and will be useful even for those not preparing for the exam. However, it does not pretend to have the answers to all the questions that arise in a field as multifaceted as IT Service Management. Instead, it aims to encourage discussion and to compare the best practices with the learner's own experience. We expect that this course will fulfill a clear need, and it deserves not just to be read and studied, but also to be used wisely in practice.
Gerard Blokdijk, Managing Director, The Art of Service
Further reading
There is a wide variety of topics available as additional reading in this area. Searching with any internet search engine for topics like "ITIL" and "IT Service Management" will return good results. Also, visit the ITIL Bookstore: https://secure3.websitecomplete.com/itsmdirect/shop/showDept.asp?dept=22
ITIL web sites
OGC/CCTA http://www.ogc.gov.uk
EXIN http://www.exin-exams.com
ITSMF http://www.itsmf.com
ITIL http://www.itil.co.uk