
MARKOV REGENERATIVE PROCESS IN SHARPE

by

Wei Xie
Department of Electrical and Computer Engineering
Duke University

Date:

Approved:

Dr. Kishor S. Trivedi, Supervisor
Dr. Allen M. Dewey
Dr. Xiaobai Sun

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in the Department of Electrical and Computer Engineering in the Graduate School of Duke University

1999

Abstract
Markov regenerative processes (MRGPs) constitute a more general class of stochastic processes than traditional Markov processes. Markovian dependency, the first-order dependency, is the simplest and most important dependency in stochastic processes. The past history of a Markov chain is summarized in the current state, and the behavior of the system thereafter depends only on the current state. The sojourn time of a homogeneous continuous time Markov chain (CTMC) is exponentially distributed. However, non-exponentially distributed transitions abound in real-life systems. Semi-Markov processes (SMPs) incorporate generally distributed sojourn times, but lack the ability to capture local behavior during the intervals between successive regeneration points. MRGPs provide a natural generalization of semi-Markov processes with local behavior accounted for. This thesis is devoted to studying a class of MRGPs and to implementing the numerical MRGP solver integrated as a cornerstone in the versatile reliability, availability and performance modeling software package SHARPE. In the class of MRGPs we study, at most one generally distributed transition is allowed in any state. This restriction, however, ensures that the subordinated stochastic process is a CTMC, which is amenable to automated numerical analysis. We set forth by providing the theoretical background of MRGPs and attempting to clarify the evolving threads from conventional stochastic processes to MRGPs. We then present important theorems fundamental to a steady-state solver. Following the mathematical overview of MRGPs, we turn to the issues in implementing the solver. After outlining the important common data structures in SHARPE, we describe the key modules of the solver. The MRGP syntax in the SHARPE language is addressed and its usage is illustrated by examples.

MRGPs are the underlying stochastic processes of a class of stochastic Petri nets (SPNs), namely Markov regenerative SPNs (MRSPNs). The connection between MRGPs and MRSPNs makes this study very useful, given that stochastic Petri nets are widely used as powerful modeling tools in many fields of engineering. We illustrate the benefits and power of MRGP analysis by providing practical applications of performance and availability evaluation arising from computer and telecommunication systems.

Acknowledgements
I would like to express my deepest gratitude to my advisor, Professor Kishor S. Trivedi, for his insightful advice and unreserved support. Every discussion with him makes me think, clarifies confusions and inspires new ideas. I would also like to thank him for being so patient and considerate. He always encourages us to communicate with each other and to learn from each other, and he himself is always willing to listen and tell you his opinion. I feel most fortunate to work under his supervision.

I thank my committee members, Professor Allen Dewey and Professor Xiaobai Sun, for agreeing to serve on the committee. I would like to thank Dr. Hoon Choi and Dr. Vidyadhar G. Kulkarni for their enthusiasm in helping me when I met technical difficulties. I learned much from their papers and books.

I wish to thank my officemates, both former and present, and other colleagues in the ECE department, Yonghuan Cao, Dr. Hairong Sun, Dong Chen, Xinyu Zang, Srinivasan Ramani, Christophe Hirel, Xuemei Lou, Nishith D. Shah and Liang Yin, for the wonderful working environment they created and the generous help they gave me. Without their unselfish support, I would never have completed this thesis. My special thanks to Yonghuan, Hairong and Dong for invaluable discussions and inspiration.

I thank my parents and brother for teaching me to be independent and to choose my own way. Without their endless love and encouragement, I would not have been able to pursue my advanced degree at Duke.

Financial support from Alcatel is acknowledged via a Fellowship to CACC (Center for Advanced Computing and Communication). Financial support from the Department of Defense is also acknowledged via an enhancement project to CACC.

Contents
Abstract . . . ii
Acknowledgements . . . iv
List of Tables . . . vii
List of Figures . . . viii

1 Introduction . . . 1
  1.1 Reliability, Availability and Performance Modeling . . . 2
  1.2 Markov Regenerative Process . . . 2
  1.3 SHARPE . . . 3

2 Background . . . 4
  2.1 Stochastic Processes and Markov Chains . . . 4
  2.2 Stochastic Petri Nets . . . 6

3 Markov Regenerative Process and Steady State Solver . . . 9
  3.1 Markov Regenerative Process . . . 9
    3.1.1 Renewal Process . . . 10
    3.1.2 Markov Renewal Process . . . 11
    3.1.3 Regenerative Process . . . 12
    3.1.4 Semi-Markov Process . . . 12
    3.1.5 Markov Regenerative Process . . . 13
    3.1.6 Summary of Stochastic Processes . . . 22
  3.2 Implementation of MRGP Steady State Solver . . . 23
    3.2.1 User Interface . . . 24
    3.2.2 Data Structures in SHARPE . . . 25
    3.2.3 Input Part . . . 28
    3.2.4 Solver Kernel . . . 30

4 Applications of MRGP Solver . . . 39
  4.1 Application 1: M/G/1 Breakdown Queue . . . 39
  4.2 Application 2: Detection and Restoration Times in Communication Networks Error Recovery . . . 44
    4.2.1 System Description . . . 44
    4.2.2 The MRGP Model . . . 44
    4.2.3 Reward Function for Loss Rates . . . 47
    4.2.4 Numerical Results . . . 48
  4.3 Application 3: Parallel System with a Single Repairman . . . 50
  4.4 Application 4: Warm Spare with Single Repairman . . . 55
  4.5 Application 5: Vacation Queue . . . 58
  4.6 Application 6: Bulk Service System . . . 62
  4.7 Application 7: Software Rejuvenation . . . 65

5 Conclusions and Future Work . . . 69
  5.1 Conclusions . . . 69
  5.2 Future Work . . . 71

Bibliography . . . 73

List of Tables
2.1 Summary of SPNs and The Underlying Marking Processes . . . 8
4.1 M/D/1/3 Loss Probability vs. Service Time and Server Failure Rate . . . 43
4.2 No Queueing Closed-Form Solution: Loss Rates vs. Detection Times (D) and Restoration Times (R); N = 1 . . . 48
4.3 No Queueing Closed-Form Solution: Loss Rates vs. Detection Times (D) and Restoration Times (R); N = 8 . . . 48
4.4 No Queueing SHARPE Results: Loss Rates vs. Detection Times (D) and Restoration Times (R); N = 1 . . . 50
4.5 No Queueing SHARPE Results: Loss Rates vs. Detection Times (D) and Restoration Times (R); N = 8 . . . 50

List of Figures
2.1 GI/G/1/4 Queue SPN Model . . . 6
2.2 Reachability Graph of GI/G/1/4 Queue SPN Model; All possible markings are represented by ordered pairs (i, j), which means there are i tokens in the source, and j tokens in the system . . . 7
3.1 A Typical Sample Path of a Renewal Process . . . 10
3.2 A Typical Sample Path of a Regenerative Process . . . 12
3.3 A Typical Sample Path of a Semi-Markov Process . . . 13
3.4 A Typical Sample Path of a Markov Regenerative Process . . . 14
3.5 MRSPN of M/G/1/3 Breakdown Queue; This is a single server queueing network, with Poisson arrival process, arrival rate λ, First Come First Serve (FCFS) scheduling policy, and generally distributed service times. The server may fail with a failure rate of γ, and the repair time is exponentially distributed with rate τ; Thick black bars stand for GEN transitions, and white bars for EXP transitions . . . 15
3.6 MRGP of M/G/1/3 Breakdown Queue; Solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions . . . 17
3.7 Summary of Stochastic Processes; Arrows mean generalization; Dotted lines connect the continuous parameter stochastic processes with their embedded discrete parameter stochastic processes . . . 22
3.8 SHARPE MRGP Specification for M/G/1/3 Breakdown Queue . . . 29
3.9 The subordinated CTMC of state 110 of M/G/1/3 Breakdown Queue; Solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions . . . 32
3.10 The subordinated CTMC of state 210 of M/G/1/3 Breakdown Queue; Solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions . . . 33
4.1 Erlang Distribution; 1-stage, 2-stage, 10-stage, 50-stage Erlang Distribution and the Deterministic Distribution . . . 41
4.2 1:N Protection Switching System; N active channels and one spare channel . . . 45
4.3 CTMC with zero detection and restoration times . . . 45
4.4 MRGP with deterministic detection and restoration times; Solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions . . . 46
4.5 SHARPE MRGP Specification File of 1:N Switching System . . . 49
4.6 MRSPN Model for Parallel System with Single Repairman Problem; Thick black bars represent GEN transitions, thick white bars represent EXP transitions, thin bars represent immediate transitions . . . 51
4.7 Underlying MRGP of the Parallel System with Single Repairman Problem; Solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions . . . 52
4.8 SHARPE MRGP Specification of Parallel System with A Single Repairman . . . 53
4.9 MRSPN Model for Warm Spare System with Single Repairman Problem; Thick black bars represent GEN transitions, thick white bars represent EXP transitions, thin bars represent immediate transitions . . . 56
4.10 Reachability Graph of Warm Spare System with Single Repairman MRSPN Model; Ovals represent tangible states, and rectangles represent vanishing states. Markings are labeled by a 5-tuple (#(Active up), #(Spare up), #(Active down), #(Spare down), #(Repair)) . . . 57
4.11 Underlying MRGP of Warm Spare System with Single Repairman Problem; Solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions . . . 57
4.12 SHARPE MRGP Specification of Warm Standby System with A Single Repairman . . . 58
4.13 MRSPN Model for Vacation Queue; Thick black bars represent GEN transitions, thick white bars represent EXP transitions, thin bars represent immediate transitions . . . 59
4.14 Underlying MRGP of MRSPN Model for Vacation Queue; Each marking is denoted by a 4-tuple (#(pSource), #(pWait), #(pActive), #(pVacation)); Solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions . . . 59
4.15 SHARPE MRGP Specification of Vacation System . . . 60
4.16 MRGP Model of the Bulk Service System; Queue length is N; Solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions . . . 63
4.17 SHARPE MRGP Specification of Bulk System . . . 64
4.18 MRGP Model of the Software Rejuvenation Analysis . . . 66
4.19 MRGP Code for Software Rejuvenation . . . 68

Chapter 1 Introduction
The most straightforward way to study a system is to test and observe an existing one. The obvious disadvantage of this approach is that it is either impossible when, for example, the system exists only in the designer's mind in the early stage of a design, or prohibitively costly when the complexity of the system exceeds a certain level, or even dangerous if testing such systems endangers life. It is for this reason that system designers and analysts turn to modeling and simulation. Generally, modeling, especially analytical modeling, tends to shed more insight into the system. The systems we are referring to, be it a communication system in engineering, a dynamic price system in economics, or a population system in sociology, can be very complicated, so a modeler has to deal with great complexity. However, the purpose of studying a system, especially in engineering, is normally to evaluate or predict system performance in terms of measures of interest. Fortunately, with regard to the accuracy of prediction or evaluation, only a limited number of factors or elements in the systems under study are critical or dominant. Thus, extracting those factors of great importance from real systems enables a modeler to work at a level of abstraction without losing accuracy. It has long been observed that the behaviors of many real systems of interest are not deterministic but probabilistic in nature. Deterministic models by their nature are not sufficient to capture randomness. Stochastic models are more appropriate for modeling practical systems, and they pervade the fields of science and engineering. This thesis is dedicated to studying one particular stochastic model, the

Markov regenerative process (MRGP), which plays an increasingly important role especially in reliability, availability and performance analyses. In the following sections, we first briefly introduce basic concepts of reliability, availability and performance modeling. We then point out the necessity of introducing the Markov regenerative process and of integrating a solver into the existing reliability, availability and performance software package SHARPE (Symbolic Hierarchical Automated Reliability and Performability Evaluator). We outline the organization of this thesis at the end of this introduction.

1.1 Reliability, Availability and Performance Modeling

The past two decades have witnessed an explosive growth in technologies. It is fair to say that, with today's technology, realization of functions is no longer the only thing that designers need to worry about. People are paying more and more attention to system reliability, availability and performability. To answer questions from users like "How reliable is the machine?", "Will this device fail in the next ten years?", or "Given that the system works, how well does it work?", and to give guidelines for achieving high quality, reliability and performance in these systems, there is a need to systematically study the methodologies of evaluating system reliability and performance.

1.2 Markov Regenerative Process

The Markov regenerative process is a type of stochastic process, which is a probabilistic model of a system evolving randomly with time. An independent process has the simplest form of the joint distribution, but the assumption of independence is likely to be violated in the real world. The Markov process has the most important and simplest type of dependence, first-order dependence, known as Markov dependence. The Markov regenerative process, the focus of this thesis, is a more general class of stochastic process which has semi-Markov processes, as well as discrete and continuous time Markov chains, as special cases.

1.3 SHARPE

Symbolic Hierarchical Automated Reliability and Performability Evaluator (SHARPE) is a software package capable of solving both non-state-space (combinatorial) and state-space models. The first version was written in the C programming language and released in 1986. The model types incorporated in the current version include reliability block diagrams, fault trees, reliability graphs, Markov chains, semi-Markov chains, single-chain product-form queueing networks, multiple-chain product-form queueing networks, generalized stochastic Petri nets (GSPN) and Markov regenerative processes (MRGP). For most of the models listed above, both transient and steady-state analyses are available. As its name implies, SHARPE is also capable of handling hierarchical models and giving semi-symbolic (or semi-numeric) results. This thesis is organized as follows: in Chapter 2, we introduce some basic reliability and performance models; in Chapter 3 we concentrate on the theory of the Markov regenerative process and the implementation of the steady-state solver in SHARPE. In Chapter 4 several examples are given. Chapter 5 contains the conclusions and future work.

Chapter 2 Background
In this chapter, we give the definitions of stochastic processes and of Markov chains. Then we introduce the stochastic Petri net (SPN), which is a more powerful modeling paradigm than Markov chains. Several types of stochastic Petri nets are described, and a brief summary of these Petri nets and their underlying stochastic processes is given at the end of this chapter.

2.1 Stochastic Processes and Markov Chains

Definition (Stochastic Process). A stochastic process is a family of random variables {X(t) | t ∈ T}, defined on a given probability space, indexed by the parameter t, where t varies over an index set T. The set of all possible values that the random variables can take is called the state space. If the state space of a stochastic process is discrete, it is called a discrete-state process, or usually a chain. If the state space is continuous, then the stochastic process is a continuous-state process. We classify stochastic processes into discrete-parameter processes or continuous-parameter processes according to whether the parameter set T is discrete or continuous, respectively. If the finite dimensional joint distribution of a stochastic process {X(t) | t ∈ T} satisfies

Fn(x1, x2, . . . , xn) = P{X(t1) ≤ x1, X(t2) ≤ x2, . . . , X(tn) ≤ xn} = ∏ from i=1 to n of P{X(ti) ≤ xi}

for 0 ≤ t1 ≤ t2 ≤ . . . ≤ tn, the stochastic process is called an independent process.

Although it is easy to study, most real-life processes do have some dependencies among these random variables. The most important and most common one is the first-order dependency, which is known as Markov dependency.

Definition (Markov Process). A stochastic process {X(t) | t ∈ T} is called a Markov process if for any t0 < t1 < t2 < . . . < tn < t, the conditional distribution of X(t) for given values of X(t0), . . . , X(tn) depends only on X(tn); that is,

P{X(t) ≤ x | X(tn) = xn, X(tn−1) = xn−1, . . . , X(t0) = x0} = P{X(t) ≤ x | X(tn) = xn}.  (2.1)

The definition says that the next state of a Markov process may depend only on the current state. No information about the prior sequence of states visited can affect the next transition. If the state space is discrete, we call such a stochastic process a Markov chain. We will encounter both continuous time Markov chains (CTMCs), in which the parameter t is continuous, and discrete time Markov chains (DTMCs), in which the parameter t is discrete, in MRGP studies. The transition behavior is characterized by transition rates for a CTMC and by transition probabilities for a DTMC. In many practical problems, the time origin does not matter, i.e., only the time elapsed decides the chain's behavior. Such Markov chains are time homogeneous, which means

P{X(t) ≤ x | X(tn) = xn} = P{X(t − tn) ≤ x | X(0) = xn}.

It is shown in [25] that the sojourn time of a homogeneous continuous time Markov chain is exponentially distributed (the memoryless property). The probability that the process is still in state i at time t > tn, given that it was in state i at time tn, depends only on state i, not on how much time it has already spent in state i.
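The memoryless property can be checked numerically. The following sketch (illustrative only; all names are our own) samples exponential sojourn times and compares the conditional survival probability P{X > s + t | X > s} with the unconditional P{X > t}:

```python
import random

def memoryless_check(rate=0.5, s=1.0, t=2.0, n=200_000, seed=42):
    """Empirical check of P{X > s+t | X > s} = P{X > t} for exponential
    sojourn times X with the given rate (the memoryless property)."""
    rng = random.Random(seed)
    samples = [rng.expovariate(rate) for _ in range(n)]
    # Unconditional survival probability P{X > t}.
    p_uncond = sum(x > t for x in samples) / n
    # Conditional survival probability P{X > s+t | X > s}.
    survivors = [x for x in samples if x > s]
    p_cond = sum(x > s + t for x in survivors) / len(survivors)
    return p_uncond, p_cond

p_uncond, p_cond = memoryless_check()
```

Both estimates approach e^(−rate·t); for rate 0.5 and t = 2 that is about 0.368, regardless of the elapsed time s.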

An intuitive extension of the Markov process is to let transition rates from one state to another depend not only on the current state, but also on the duration the process has spent in that state. Note that the Markov property still holds at the instants of entry to and exit from states. Such a process is called a semi-Markov process. We will describe semi-Markov processes and several other types of stochastic processes in Chapter 3.

2.2 Stochastic Petri Nets

In this section we will introduce a powerful modeling paradigm, stochastic Petri nets [20, 25].


Figure 2.1: GI/G/1/4 Queue SPN Model

We illustrate the structure of a Petri net by an example. Figure 2.1 shows an SPN model of a GI/G/1/4 queue. Following Kendall's notation [27], a GI/G/1/4 queue has a single server with a queue of length 3, generally distributed independent inter-arrival times and generally distributed service times. The basic components of a Petri net include places, transitions, arcs and tokens. Queue is the only place (represented by a circle) in Figure 2.1, and there are three tokens (represented by black dots) in place Queue. arrival and service are transitions (represented by bars) for job arrival and job service completion, respectively. Arcs run from places to transitions, or from transitions to places. Tokens move from one place to another via the connecting arcs when a transition fires. The state of the system being modeled is identified by a marking, which is a vector of the numbers of tokens in each place.

Figure 2.2: Reachability Graph of GI/G/1/4 Queue SPN Model; All possible markings are represented by ordered pairs (i, j), which means there are i tokens in the source, and j tokens in the system.

In the original Petri net, transitions are untimed, while stochastic Petri nets (SPNs) allow timed transitions. Given the initial marking and the SPN, we can find the reachability set, that is, all possible markings of the SPN, and the unique reachability graph. Figure 2.2 shows the reachability graph of the GI/G/1/4 queue. Markings are shown as (# Queue tokens). The reachability set of the GI/G/1/4 queue is {4, 3, 2, 1, 0}, as shown. It is the type of transition firing times that mainly differentiates the types of Petri nets. The most frequently used transitions have exponentially distributed firing times (EXP). The Generalized Stochastic Petri Net (GSPN) [19] also allows zero firing time transitions, which result in vanishing markings in the reachability graph. The underlying stochastic process of an SPN or GSPN is a CTMC [20, 19]. The Extended Stochastic Petri Net (ESPN) [5] includes transitions with generally distributed firing times (GEN) with some restrictions, and hence the underlying stochastic process of an ESPN is a semi-Markov process [5]. The Deterministic Stochastic Petri Net (DSPN) [18, 13] introduced deterministic firing time transitions (DET) with certain restrictions; its underlying stochastic process is the Markov regenerative process (MRGP) [3], introduced in the next chapter. In [4], an MRSPN is defined as a stochastic Petri net in which the marking process is a Markov regenerative process. We only consider MRSPNs with at most one generally distributed timed transition enabled in a marking, so that the subordinated stochastic process of the underlying MRGP is a CTMC, which has been studied extensively. Table 2.1 is a summary of stochastic Petri nets and their underlying marking

processes.

Table 2.1: Summary of SPNs and The Underlying Marking Processes

SPN Type and Structural Restrictions: Underlying Marking Process
- SPN (only exponential transitions): Continuous Time Markov Chain
- GSPN (exponential and immediate transitions): Continuous Time Markov Chain
- ESPN (general distributions with restrictions): Semi-Markov Chain
- DSPN (deterministic transitions competitively enabled with exponential transitions, but not concurrently enabled with other transitions) [5]: Semi-Markov Chain
- DSPN (deterministic transitions competitively or concurrently enabled with exponential transitions, but not concurrently enabled with another deterministic transition) [3]: Markov Regenerative Process with subordinated CTMC
- MRSPN (general transitions concurrently and/or competitively enabled with exponential transitions, but not concurrently enabled with another general transition) [4]: Markov Regenerative Process with subordinated CTMC
- DSPN (deterministic transitions concurrently enabled with other deterministic transitions with restrictions, concurrently and/or competitively enabled with exponential transitions) [13]: Markov Regenerative Process with subordinated Markov Regenerative Process

In the next chapter, we concentrate on the MRGP with subordinated CTMC. Then the SHARPE MRGP steady-state solver is described.
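The reachability set construction described in this chapter is mechanical and easy to automate. The sketch below is illustrative only (it is not SHARPE code, and the function names are our own); it encodes a marking of the GI/G/1/4 queue simply as the number of jobs in the system and builds the reachability set and graph by breadth-first search from the initial marking:

```python
from collections import deque

def reachability(initial, transitions):
    """Breadth-first search over markings. `transitions` maps a transition
    name to a firing function that returns the next marking, or None if the
    transition is disabled in the given marking."""
    seen = {initial}
    edges = []
    frontier = deque([initial])
    while frontier:
        m = frontier.popleft()
        for name, fire in transitions.items():
            nxt = fire(m)
            if nxt is None:
                continue
            edges.append((m, name, nxt))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen, edges

# GI/G/1/4 queue: the marking is the number of jobs in the system (at most 4).
transitions = {
    "arrival": lambda m: m + 1 if m < 4 else None,
    "service": lambda m: m - 1 if m > 0 else None,
}
states, edges = reachability(3, transitions)  # three initial tokens in place Queue
```

Starting from the initial marking 3, the search recovers the reachability set {0, 1, 2, 3, 4} of Figure 2.2, with `arrival` and `service` edges between adjacent markings.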

Chapter 3 Markov Regenerative Process and Steady State Solver


In this chapter, we introduce the Markov regenerative process (MRGP) and the algorithm used in the steady-state solver in SHARPE. We first introduce the renewal process, Markov renewal process, regenerative process and semi-Markov process; then we define Markov regenerative processes. The relationships between the Markov regenerative process and other stochastic processes are addressed and illustrated by examples. Theorems are stated regarding the transient and the steady-state analysis of Markov regenerative processes. After giving the theoretical background of MRGPs, we present the implementation of the MRGP steady-state solver in SHARPE. First, the structure and features of the software package SHARPE are outlined; then we focus on the input part and the solver kernel. The former reads the user's MRGP specification file, checks that it is legal, and stores it in the internal data structures, while the latter is responsible for solving MRGP models using the related theorems.

3.1 Markov Regenerative Process

Before defining the Markov regenerative process, we need to introduce some fundamental stochastic processes.

3.1.1 Renewal Process

Definition (Renewal Sequence and Renewal Process). {Sn, n ≥ 0} is said to be a renewal sequence and {N(t), t ≥ 0} a renewal process generated by {Xn, n ≥ 1} if {Xn, n ≥ 1} is a sequence of independent and identically distributed non-negative random variables, where

Sn = X1 + X2 + . . . + Xn, n ≥ 1,  (3.1)
S0 = 0,  (3.2)
N(t) = sup{n ≥ 0 : Sn ≤ t}.  (3.3)

The random variable Xn is the time interval between successive arrivals (the (n−1)th and the nth). Since {Xn, n ≥ 1} are independent and identically distributed, probabilistically exactly the same process repeats at each time epoch Sn. Note that Sn is the absolute time of the nth arrival, assuming the time origin is at zero (S0 = 0). N(t) is the total number of arrivals by time t. The Poisson process is a special case of the renewal process, with exponentially distributed inter-arrival times.

Figure 3.1: A Typical Sample Path of a Renewal Process


A typical sample path of a renewal process is shown in Figure 3.1. Clearly, a renewal process N(t) is a continuous time, discrete state stochastic process. The inter-arrival times Xn = Sn − Sn−1, n > 0, are independent and identically distributed, and N(t) is a non-decreasing function. Note that {Xn, n ≥ 1} is an independent process.
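As a numerical illustration (a standalone sketch; the function name and the inter-arrival distribution are chosen arbitrarily), we can simulate a renewal process and observe the elementary renewal theorem: N(t)/t approaches 1/E[Xn] for large t.

```python
import random

def renewal_count(t, interarrival, seed=1):
    """Count N(t): the number of renewals by time t, for a renewal process
    generated by i.i.d. inter-arrival times drawn by interarrival(rng)."""
    rng = random.Random(seed)
    s, n = 0.0, 0
    while True:
        s += interarrival(rng)   # S_n = X_1 + ... + X_n
        if s > t:
            return n
        n += 1

# Uniform(0.5, 1.5) inter-arrival times have mean 1, so N(t)/t tends to 1.
t = 10_000.0
rate = renewal_count(t, lambda rng: rng.uniform(0.5, 1.5)) / t
```

Any other non-negative inter-arrival distribution can be substituted; with exponential inter-arrival times this reduces to simulating a Poisson process.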

3.1.2 Markov Renewal Process

Definition (Markov Renewal Sequence and Markov Renewal Process). {(Yn, Sn), n ≥ 0} is said to be a Markov renewal sequence with state space I if for all n ≥ 0 and i, j ∈ I,

P{Yn+1 = j, Sn+1 − Sn ≤ x | Yn = i, Sn, Yn−1, Sn−1, . . . , Y0, S0}
  = P{Yn+1 = j, Sn+1 − Sn ≤ x | Yn = i}
  = P{Y1 = j, S1 ≤ x | Y0 = i}.  (3.4)

The vector-valued stochastic process N(t) = (Nj(t), j ∈ I) is defined as a Markov renewal process, where

Nj(t) = Σ from n=1 to N(t) of Zj(n),  (3.5)
Zj(n) = 1 if Yn = j, 0 otherwise,  (3.6)
N(t) = sup{n ≥ 0 : Sn ≤ t}.  (3.7)

Nj(t) is the number of times state j is visited by time t, and N(t) is the total number of state changes by time t. The Markov renewal process is a more general class of stochastic process than the renewal process. The processes over each interval are not independent, but have a first-order dependency. This is a limited form of Markov dependency, i.e., the future evolution of the stochastic process depends only on the current state of the process at the Markov renewal points. In a Markov renewal process, we are interested only in the state changes at the time epochs Sn.
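To make the definition concrete, a Markov renewal sequence can be simulated directly. The sketch below is illustrative only; for simplicity the holding time depends only on the current state, a special case of the definition (in general it may depend jointly on Yn and Yn+1):

```python
import random

def simulate_markov_renewal(P, sojourn, y0, steps, seed=7):
    """Simulate a Markov renewal sequence {(Y_n, S_n)}: the next state is
    drawn from row P[y] of the embedded DTMC, and the holding time from the
    state-dependent distribution sojourn[y]."""
    rng = random.Random(seed)
    y, s = y0, 0.0
    seq = [(y, s)]
    for _ in range(steps):
        s += sojourn[y](rng)            # holding time in the current state
        r, acc = rng.random(), 0.0      # sample next state from row P[y]
        for j, p in enumerate(P[y]):
            acc += p
            if r < acc:
                y = j
                break
        seq.append((y, s))
    return seq

P = [[0.2, 0.8], [0.6, 0.4]]
sojourn = [lambda rng: rng.expovariate(1.0),   # state 0: exponential
           lambda rng: rng.uniform(1.0, 2.0)]  # state 1: uniform
seq = simulate_markov_renewal(P, sojourn, 0, 1000)
```

The resulting sequence of epochs Sn is increasing, and the state sequence Yn is a DTMC; everything between the epochs is deliberately left unspecified, as in the definition.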

3.1.3 Regenerative Process

Definition (Regenerative Process). {Z(t), t ≥ 0} is called a regenerative process if there is a non-negative random variable S1 such that {Z(t + S1), t ≥ 0} is stochastically identical to {Z(t), t ≥ 0}, and {Z(t + S1), t ≥ 0} is independent of {Z(t), 0 ≤ t < S1}.
Figure 3.2: A Typical Sample Path of a Regenerative Process

Figure 3.2 is a typical sample path of a regenerative process. Similar to the renewal process, the time intervals between the renewal points are independent and identically distributed, but the process Z(t) is not necessarily discrete or non-decreasing. It can be any arbitrary stochastic process, provided the segments of Z(t) over each interval are independent and identically distributed processes.

3.1.4 Semi-Markov Process

Definition (Semi-Markov Process). Given a Markov renewal sequence {(Yn, Sn)} with state space I, the stochastic process {Z(t), t ≥ 0} is called a semi-Markov process with state space I if Z(t) = Yn for t ∈ [Sn, Sn+1).


Figure 3.3: A Typical Sample Path of a Semi-Markov Process

Figure 3.3 gives a typical sample path of a semi-Markov process. Obviously, the sample path is piecewise constant and right continuous, and jumps happen only at the Markov renewal points Sn. The inter-arrival times Sn − Sn−1 are generally distributed. A CTMC is a special case of an SMP, with exponentially distributed inter-arrival times.
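A classical result for SMPs, used implicitly by steady-state solvers, is that the steady-state probabilities are the stationary probabilities of the DTMC embedded at the renewal points, weighted by the mean sojourn time in each state. A minimal sketch (the two-state example and all names are our own, not SHARPE code):

```python
def smp_steady_state(P, mean_sojourn, iters=2000):
    """Steady-state probabilities of a semi-Markov process: the stationary
    vector of the embedded DTMC P (found here by power iteration, so P is
    assumed irreducible and aperiodic), weighted by the mean sojourn times."""
    n = len(P)
    v = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
    w = [v[j] * mean_sojourn[j] for j in range(n)]
    total = sum(w)
    return [x / total for x in w]

# Two states: from 0 always jump to 1; from 1 the next renewal enters 0 or
# re-enters 1 with equal probability. The embedded stationary vector is
# (1/3, 2/3); with mean sojourns (3, 1) the SMP spends 60% of time in state 0.
P = [[0.0, 1.0], [0.5, 0.5]]
pi = smp_steady_state(P, [3.0, 1.0])
```

This is the quantity a semi-Markov steady-state solver returns; the MRGP analysis described in this chapter generalizes it by also accounting for the local behavior between renewal points.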

3.1.5 Markov Regenerative Process

After introducing the fundamental stochastic processes, we can proceed to the Markov regenerative process.

Definition (Markov Regenerative Process). A stochastic process {Z(t), t ≥ 0} is called a Markov regenerative process (MRGP) if there exists a Markov renewal sequence {(Yn, Sn), n ≥ 0} of random variables such that all the conditional finite dimensional distributions of {Z(Sn + t), t ≥ 0} given {Z(u), 0 ≤ u ≤ Sn, Yn = i} are the same as those of {Z(t), t ≥ 0} given Y0 = i.

As the definition says, the Markov regenerative process is also constructed from the Markov renewal process, and in most applications Yn is simply defined as Z(Sn+) or Z(Sn−). Figure 3.4 is a typical MRGP sample path. Each Sn is a Markov regenerative point because the stochastic process evolution from that point on is independent of the history before it.

Figure 3.4: A Typical Sample Path of a Markov Regenerative Process

For a semi-Markov process, no state change occurs between successive Markov regenerative points, that is,

P{Z(t) = i, Sn ≤ t < Sn+1 | Z(Sn) = i} = 1.

The sample paths of semi-Markov chains are piecewise constant and right continuous, and Sn is when the nth jump occurs. But for a Markov regenerative process, the stochastic process between Sn and Sn+1 can be any continuous-time stochastic process, such as a CTMC, a semi-Markov process, or even another Markov regenerative process. Hence the sample paths are no longer piecewise constant: local behaviors exist between two consecutive Markov regenerative points, and the jumps do not necessarily have to be at the Sn. This is clear by comparing Figure 3.4 with Figure 3.3.

Like semi-Markov processes, the Markov regenerative process allows non-exponentially distributed firing time transitions, but it is an even more general process. If a transition t is the only transition out of state i, t is called an exclusive transition. A transition t is said to be competitive with respect to another transition t′ if both t and t′ can occur in state i and the firing of t disables t′. If the firing of t does not disable t′, t is said to be concurrent with t′. Semi-Markov chains do not allow concurrent EXP or GEN transitions, while the MRGP just defined can have all these types of transitions.

The MRGP above, without any restrictions, is the most general type of MRGP, with very complicated subordinated stochastic processes (the so-called local behaviors, or the marking process between two consecutive Markov renewal points), and hence is extremely hard to solve. But in many problems of interest, at most one GEN transition is enabled in a state, and the firing time distribution of the GEN transition may depend on the state at the time it is enabled, but cannot change until it fires or is disabled. The subordinated stochastic process of this type of MRGP is a CTMC, which is solvable. For example, the underlying stochastic processes of MRSPNs belong to this class of MRGP.

Figure 3.5: MRSPN of M/G/1/3 Breakdown Queue; This is a single server queueing network, with Poisson arrival process, arrival rate λ, First Come First Serve (FCFS) scheduling policy, and generally distributed service times. The server may fail with a failure rate of γ, and the repair time is exponentially distributed with rate τ; Thick black bars stand for GEN transitions, and white bars for EXP transitions

Figure 3.5 shows the MRSPN model of an M/G/1/3 breakdown queue. The job arrival process is Poisson with rate λ, and the service time is generally distributed. The queue length is 3. The server is subject to failure at rate γ, and the server repair rate is τ. We assume that the queue stops receiving new jobs when the server is down. In this model there is only one general distribution, hence the underlying MRGP (Figure 3.6) of this model satisfies the restriction, that is, at most one GEN transition is enabled in any state. In Figure 3.6, all possible markings are represented by 3-tuples (i, j, k), where i, j

and k are the numbers of tokens in places Queue, Up and Down, respectively. Initially, the system is in state 010, which means that the queue is empty and the server is up. If a new job arrives, the system jumps to state 110, and the server begins to process the job (the GEN transition service is enabled). If before the job is nished, the server goes down, the system enters state 101. This exponential transition (from state 110 to state 101, represented by solid thin line) is a competitive one, because ring of this transition disables the GEN transition. After the server is repaired, the system is brought back to state 110, and the service of the job is restarted. Another new job can arrive at rate , without interfering the service of the previous job, hence this transition is a concurrent one (from state 110 to state 210, represented by dashed thin line). Similarly, transition from state 210 to 310 is concurrent too. Dene to be the set of all tangible states of the MRGP of interest. The matrix K(t) = [Kij (t)] is called the (global) kernel of the MRGP where the conditional probability Kij (t) = P {Y1 = j, S1 t|Y0 = i}, i, j . The matrix E(t) = [Eij (t)] is called the local kernel where the conditional probability Eij (t) = P {M (t) = j, S1 > t|Y0 = i}, i, j . As their names imply, global kernel K describes the process behavior immediately after the next Markov regenerative point (S1 ), while the local kernel E is for the behavior between two Markov regenerative points. It was proven [4] that the transition probability matrix V (t) satises the following generalized Markov renewal equation: V (t) = E(t) + K V (t) where V (t) = [Vij (t)] Vij (t) = P {Z(t) = j|Z(0) = Y0 = i}, i, j , 16

010 110 210 310


Figure 3.6: MRGP of M/G/1/3 Breakdown Queue; Solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions Kiu Vuj (t) =
t 0

001 g 101 g g 301 201

Vuj (t x)dKiu (x)

Note that {Y_n, n ≥ 0} is a discrete-time Markov chain (DTMC) with one-step transition probability matrix K(∞). {Y_n, n ≥ 0} is called the embedded Markov chain (EMC) of the MRGP. It is useful to first study the transient behavior before studying the steady-state solution of MRGP models. First, several definitions are given, following the development in [4].

Transition rates and probabilities:

If transition t is of EXP type with firing rate λ_t(m_i), then the transition rate λ(m_i, m_k) = λ_t(m_i), assuming m_k is the state directly connected from m_i by transition t. Otherwise, λ(m_i, m_k) = 0.

If transition t is a GEN transition, then the branching probability ∆(i, k) = P{next state is m_k | current state is m_i, t fires} = 1 if, immediately after the firing of t, the process is in state m_k. Otherwise, ∆(i, k) = 0.

Define Ω^E(m) to be the set of states connected (not necessarily directly) by a competitive EXP transition from state m, and Ω^G(m) to be the set of states reachable (not necessarily directly) from state m by the firing of a GEN transition. Define G(m) to be the set of GEN transitions enabled in state m. We only consider the case of MRSPNs, i.e., G(m) has at most one element. Define E(m) to be the set of EXP transitions enabled in state m.

If, given Y_0 = m, no GEN transition is enabled (G(m) = ∅), the time to the next Markov regenerative point S_1 is exponentially distributed with rate λ_m, where

λ_m = Σ_{t∈E(m)} λ_t(m).

The transition probability to state n at time S_1 is λ(m, n)/λ_m, where λ(m, n) is the transition rate from state m to state n. Also, there is no local behavior in this case, which means that Z(t) = Y_0, 0 ≤ t < S_1. This is exactly the same as the pure Markov chain case.

If, given Y_0 = m, G(m) = {g}, i.e., the GEN transition g is enabled in state m, the next Markov regenerative point occurs when the GEN transition fires or is disabled by the firing of a competitive EXP transition. Now we define Ω(m) to be the set of all tangible states reachable (not necessarily directly) from state m before the next Markov regenerative point (before the next EMC transition occurs), i.e., during [0, S_1). For example, the Ω(m), Ω^E(m) and Ω^G(m) of several states in Figure 3.6 are given below:

Ω(010) = {010}, Ω^E(010) = {110, 001}, Ω^G(010) = ∅;
Ω(110) = {110, 210, 310}, Ω^E(110) = {101, 201, 301}, Ω^G(110) = {010, 110, 210};
Ω(210) = {210, 310}, Ω^E(210) = {201, 301}, Ω^G(210) = {110, 210};
Ω(310) = {310}, Ω^E(310) = {301}, Ω^G(310) = {210};
Ω(001) = {001}, Ω^E(001) = {010}, Ω^G(001) = ∅.

Under the assumptions above, the stochastic process before S_1 is a CTMC with initial state m, which is defined as the subordinated CTMC. This CTMC consists of all states of Ω(m) and all states of Ω^E(m). All edges from states in Ω(m) are included, but those from states in Ω^E(m) are not, so the Ω^E(m) states are absorbing. More precisely, the entries of the CTMC infinitesimal generator matrix are

[Q(m)]_{nn′} = λ(n, n′) if n ∈ Ω(m) and n′ ≠ n,
[Q(m)]_{nn} = −Σ_{n′≠n} λ(n, n′),
[Q(m)]_{nn′} = 0 if n ∉ Ω(m).

As we will see in the following sections, the techniques for solving the subordinated CTMC are mature and suitable for automated computer analysis. Note that we also assume that Ω(m) and Ω^E(m) are mutually exclusive. If this assumption does not hold, we can adopt a modified infinitesimal generator matrix of the subordinated CTMC. The following are some important theorems about the local kernel E(t) and the global kernel K(t). The proofs can be found in [4].
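To make the generator construction concrete, here is a hypothetical Python sketch (not SHARPE code; the rates lam = 1 and gamma = 0.5 are made-up example values, and the names `states`, `edges` and `Q` are ours) that assembles Q(110) for the model of Figure 3.6. Rows for Ω(110) = {110, 210, 310} carry the concurrent arrival and competitive failure rates, while the Ω^E(110) states {101, 201, 301} keep zero rows, which makes them absorbing.

```python
import numpy as np

# Subordinated CTMC of state 110: Omega(110) first, then the Omega^E(110) states.
states = ["110", "210", "310", "101", "201", "301"]
idx = {s: i for i, s in enumerate(states)}
lam, gamma = 1.0, 0.5                     # example arrival and failure rates

# EXP edges out of Omega(110) states only (edges out of Omega^E are dropped).
edges = [("110", "210", lam), ("210", "310", lam),       # concurrent arrivals
         ("110", "101", gamma), ("210", "201", gamma),   # competitive failures
         ("310", "301", gamma)]

Q = np.zeros((len(states), len(states)))
for src, dst, rate in edges:              # [Q(m)]_{nn'} = lambda(n, n')
    Q[idx[src], idx[dst]] = rate
for i in range(len(states)):              # diagonal = -(row sum); zero rows stay zero
    Q[i, i] = -Q[i].sum()

print(Q[idx["110"], idx["110"]])          # -> -1.5, i.e. -(lam + gamma)
```

Every row of a valid generator sums to zero, and the absorbing rows are entirely zero, which is exactly what the rules above prescribe.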

THEOREM 1. The local kernel E(t) = [E_{mn}(t)] (m, n ∈ Ω) is given by:

if G(m) = ∅, E_{mn}(t) = δ_{mn} e^{−λ_m t}, where δ_{mn} = 1 if m = n and 0 otherwise;

if G(m) = {g}, E_{mn}(t) = [e^{Q(m)t}]_{mn} (1 − F_g(t)) if n ∈ Ω(m), and 0 otherwise.

THEOREM 2. The global kernel K(t) = [K_{mn}(t)] (m, n ∈ Ω) is given by:

if G(m) = ∅, K_{mn}(t) = (λ(m, n)/λ_m)(1 − e^{−λ_m t}) if λ_m > 0, and 0 otherwise;

if G(m) = {g},

1. if n ∈ Ω^E(m) but n ∉ Ω^G(m),
K_{mn}(t) = [e^{Q(m)t}]_{mn} (1 − F_g(t)) + ∫_0^t [e^{Q(m)x}]_{mn} dF_g(x);

2. if n ∉ Ω^E(m) but n ∈ Ω^G(m),
K_{mn}(t) = Σ_{m′∈Ω(m)} ∫_0^t [e^{Q(m)x}]_{mm′} dF_g(x) ∆(m′, n);

3. if n ∈ Ω^E(m) and n ∈ Ω^G(m),
K_{mn}(t) = [e^{Q(m)t}]_{mn} (1 − F_g(t)) + ∫_0^t [e^{Q(m)x}]_{mn} dF_g(x) + Σ_{m′∈Ω(m)} ∫_0^t [e^{Q(m)x}]_{mm′} dF_g(x) ∆(m′, n);

4. if n ∉ Ω^E(m) and n ∉ Ω^G(m), K_{mn}(t) = 0.

THEOREM 3. The one-step transition probability matrix P = [P_{mn}] (m, n ∈ Ω) of the EMC is given by:

if G(m) = ∅, P_{mn} = λ(m, n)/λ_m if λ_m > 0, and 0 otherwise;

if G(m) = {g},

1. if n ∈ Ω^E(m) but n ∉ Ω^G(m),
P_{mn} = ∫_0^∞ [e^{Q(m)x}]_{mn} dF_g(x);

2. if n ∉ Ω^E(m) but n ∈ Ω^G(m),
P_{mn} = Σ_{m′∈Ω(m)} ∫_0^∞ [e^{Q(m)x}]_{mm′} dF_g(x) ∆(m′, n);

3. if n ∈ Ω^E(m) and n ∈ Ω^G(m),
P_{mn} = ∫_0^∞ [e^{Q(m)x}]_{mn} dF_g(x) + Σ_{m′∈Ω(m)} ∫_0^∞ [e^{Q(m)x}]_{mm′} dF_g(x) ∆(m′, n);

4. if n ∉ Ω^E(m) and n ∉ Ω^G(m), P_{mn} = 0.

After obtaining the local kernel E(t) and the one-step transition probability matrix P, we are ready to determine the steady-state probabilities. We assume that the limiting probability distributions exist, i.e., the EMC is finite and ergodic (irreducible, aperiodic and positive recurrent). We define

α_m = E[S_1 | Y_0 = m];    (3.8)

α_{mn} = ∫_0^∞ E_{mn}(t) dt;    (3.9)

π = πP, where Σ_i π_i = 1;    (3.10)

φ_k = π_k α_k / Σ_r π_r α_r.    (3.11)

Given Y_0 = m, α_m is the expected time interval spent before the next EMC transition, α_{mn} is the expected time spent by the MRGP in state n during [0, S_1), and π is the steady-state probability vector of the EMC.

THEOREM 4. The limiting probability vector p = [p_j] of the state probabilities of the MRGP is given by

p_j = lim_{t→∞} P{Z(t) = j | Y_0 = m} = (Σ_k π_k α_{kj}) / (Σ_k π_k α_k) = Σ_k φ_k (α_{kj} / α_k).

The theorem can be interpreted as follows:

p_j = Σ_k (fraction of time spent in EMC state k) × (time spent in MRGP state j in each visit to EMC state k) / (time spent in each visit to EMC state k)
    = Σ_k φ_k α_{kj} / α_k.    (3.12)

The proof of this theorem can be found in [12].
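To make Theorem 4 concrete, here is a small hypothetical Python sketch (not SHARPE code; the helper name `mrgp_steady_state` and the numbers are ours) that applies it to the simplest possible MRGP: an alternating process whose state A has a deterministic sojourn time d (a GEN transition with no competing EXP transitions, so there is no local behavior) and whose state B is exponential with rate mu.

```python
import numpy as np

def mrgp_steady_state(P, alpha_state, alpha_local):
    """Solve pi = pi P, then apply Theorem 4:
    p_j = sum_k pi_k alpha_kj / sum_k pi_k alpha_k."""
    n = P.shape[0]
    A = (np.eye(n) - P).T
    A[-1, :] = 1.0                     # replace one equation by normalization
    b = np.zeros(n); b[-1] = 1.0
    pi = np.linalg.solve(A, b)
    return (pi @ alpha_local) / (pi @ alpha_state)

# Two-state MRGP: state 0 has DET sojourn d, state 1 is EXP(mu).
d, mu = 2.0, 1.0
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])            # EMC alternates A -> B -> A
alpha_state = np.array([d, 1.0 / mu]) # alpha_m = E[S1 | Y0 = m]
alpha_local = np.diag(alpha_state)    # no local behavior: alpha_mn is diagonal

p = mrgp_steady_state(P, alpha_state, alpha_local)
print(p)   # p = [2/3, 1/3] up to rounding, since p_A = d / (d + 1/mu)
```

The answer agrees with the renewal-reward intuition: the process spends fraction d/(d + 1/mu) of each cycle in state A.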

3.1.6 Summary of Stochastic Processes



Figure 3.7: Summary of Stochastic Processes. Arrows mean generalization; dotted lines connect the continuous-parameter stochastic processes with their embedded discrete-parameter stochastic processes.

Figure 3.7 is a brief summary of the stochastic processes discussed in this thesis. The Poisson process is a special case of both the renewal process and the CTMC: if the inter-arrival time distribution of a renewal process is exponential with parameter λ, it is known as a Poisson process. Its underlying Markov chain is a pure birth process whose state space is the non-negative integers; the transition rate out of each state is λ. The Markov renewal process N(t) = (N_j(t), j ∈ I) is based on a state space I ⊆ {0, 1, 2, …}, while the renewal process N(t) has only a single state. Although N(t) is vector-valued, every N_j(t) is discrete and nondecreasing, similar to the renewal process N(t) (also known as the renewal counting process). The Markov renewal process, as its name suggests, has the Markov property at the renewal points, i.e., all past history is summarized in the current state and the future of the process depends only on the current state. In Markov renewal processes we observe the system only at the Markov renewal points S_n. The regenerative process is also a natural generalization of the renewal process: if independent, identically distributed stochastic processes evolve over the renewal intervals, we end up with a regenerative process. Only exponentially distributed firing times are allowed in (homogeneous) Markov chains; the semi-Markov process (SMP) removes this restriction. The semi-Markov process is more general than the homogeneous CTMC, the Markov renewal process and the (homogeneous) DTMC, but in an SMP no state change occurs between two consecutive Markov renewal points. If we lift this restriction and allow local state changes, the process is known as a Markov regenerative process (MRGP). Several other diagrams of the relationships among stochastic processes can be found in [17, 16, 11]. We will introduce the algorithm for solving steady-state MRGP probabilities used in SHARPE in the next section.

3.2 Implementation of MRGP Steady State Solver

SHARPE, the Symbolic Hierarchical Automated Reliability and Performance Evaluator, is a versatile tool that solves stochastic models of reliability, performance and performability. First finished in 1986, SHARPE contains approximately 36,000 lines of C code in 30 files, organized in nearly 400 functions. SHARPE integrates several model types: combinatorial reliability models, such as reliability block diagrams, fault trees and reliability graphs; directed acyclic task precedence graphs; Markov and semi-Markov models, including Markov reward models; product-form queueing networks; and generalized stochastic Petri nets. As its name suggests, SHARPE is a hierarchical modeler in which users can decompose large models into several sub-models, ask SHARPE to solve them individually, and assemble the outputs together. The SHARPE code can be divided into three large parts: the input part, the solver kernel and the output part. Although the models integrated into SHARPE are quite different from each other, from model specifications to solution techniques, they bear many common features from a user's perspective. Furthermore, many of the basic underlying data structures and subroutines used to solve these models are shared by all of them. It is therefore worthwhile to look at the uniform user interface and the common data structures.

3.2.1 User Interface

Users specify their models and ask for results in the SHARPE language, which is an interpretive language. Two modes of operation are available in SHARPE: batch mode (input from files) and interactive mode (input from a terminal). The syntax of all SHARPE models is similar. MRGP is a typical SHARPE model, and we will use it as an example to give an overview of SHARPE. The following is the MRGP model specification syntax:


mrgp system name {(param list)}
  section 1: transitions and transition distributions
    <nodename1 edgetype nodename2 dist>
    ...
  section 2: rewards (optional)
    {reward}
    <name expression>
    ...
end
asking for results
  {expr prob(system name, nodename{;arg list})}
  ...
  {expr exrss(system name{;arg list})}
end

The first line tells SHARPE that the definition of an MRGP model called system name begins, and the following lines specify the transitions of the model. Comment lines begin with an asterisk (*). If the keyword reward is found, reward values are read until the end line. When the MRGP model is fully described, the user can ask for results using the commands expr, prob and exrss (the same as for irreducible Markov and semi-Markov models); the SHARPE MRGP kernel is then called, and the result is obtained after the model is successfully solved. A complete reference to SHARPE can be found in [25]. In the next section, we examine the SHARPE data structures.

3.2.2 Data Structures in SHARPE

As described previously, SHARPE is written in the C programming language. Although C is not an object-oriented programming (OOP) language, it enjoys the advantages of high speed, efficiency, simplicity and flexibility compared with OOP languages such as C++ and Java. Several global variables and common structures in SHARPE are crucial to understanding the MRGP solver implementation: the system information structure, the Markov node structure, the Markov edge structure, the exponomial structure, and the expression structure.

System Information Structure

All the system information, i.e., the information for each sub-model, is stored in the structure system_infoT, which consists of the system name, the system type, and pointers for the different models, such as pointers to Markov chain nodes, Markov chain edges, Markov chain initial state probabilities, reward rates, etc.

Markov Node Structure and Edge Structure

For each model, several data structures are designed to contain the related model information. For example, the structure mnode_infoT holds the information of a Markov, semi-Markov or MRGP chain state, a reliability graph node or a queueing network station. It has the name of the node, a list of its immediate successors and predecessors, its steady-state probability, its total rate out (λ_m), its transient probability, its reward rate value, etc. For Markov chains, there is also an edge structure edgeT for each edge of the chain. It has the starting and ending node indices of the edge, the distribution type, pointers to the cumulative distribution function (CDF) for semi-Markov and MRGP models, the transition rate for Markov chains, etc. There are pointers in system_infoT pointing to the arrays of the node and edge structures.


Exponomial Structure

The basic distribution SHARPE processes is the exponential polynomial (exponomial) distribution, which is a sum of a t^k e^{bt} terms, where k is a non-negative integer and a and b may be real or complex numbers. Many useful distributions can be represented or approximated by exponomials, such as the exponential distribution, Erlang distributions, deterministic distributions, etc. Of course, an exponomial distribution must satisfy the requirements of a CDF: for instance, as t → ∞ the CDF is less than or equal to one, and at t = 0 the CDF is 0. The most important reason for using exponomials is that they are suitable for automated computation. For example, convolution, integration and differentiation of exponomials are considerably simpler than for other general distributions. Also, semi-symbolic analysis becomes feasible with exponomials. The structure exponodeT contains a single term of an exponomial. It has a, b and k declared in proper types, and a pointer to the next exponomial term. The whole exponomial is stored as a singly, circularly linked list with a head node; the head node is identified by a k value of −1. All the nodes are sorted first by decreasing value of k, then by decreasing value of the real part of b.

Expression Structure

This structure, named eeT, is used to store the raw expression structure read from the user files, such as user-defined functions, system names, node names and parameters when asking for results. Each eeT object represents one token of an expression. It consists of token arguments, values, the component name, the symbol index, a pointer to the next token, etc. After an expression is read into the expression structure, appropriate functions can interpret and validate it.
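The exponomial representation can be made concrete with a hypothetical Python sketch (SHARPE itself uses a C linked list of exponodeT terms; the names `evaluate` and `integrate_0_inf` are ours). Each term a t^k e^{bt} with Re(b) < 0 integrates in closed form over [0, ∞) as a · k!/(−b)^{k+1}, which is the kind of symbolic operation used later when integrating local kernels.

```python
from math import factorial, exp

# An exponomial is modeled here as a list of (a, k, b) terms,
# representing the function sum of a * t**k * exp(b*t).

def evaluate(terms, t):
    """Evaluate the exponomial at time t."""
    return sum(a * t**k * exp(b * t) for a, k, b in terms)

def integrate_0_inf(terms):
    """Term-wise integral over [0, inf); requires b < 0 in every term
    (a constant term with b == 0 would diverge)."""
    return sum(a * factorial(k) / (-b) ** (k + 1) for a, k, b in terms)

# Example: E(t) = e^{-2t}, the local kernel entry of an EXP(2) state;
# its integral is alpha = 1/2.
E = [(1.0, 0, -2.0)]
print(integrate_0_inf(E))   # -> 0.5
```

The term-wise rule is exactly why the exponomial class is closed under the integrations the solver needs.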


Global Variables

A group of global variables keeps important system information shared by all SHARPE functions. system_infoT *system_info is the pointer to all the subsystem information, and int num_systems keeps track of the total number of subsystems. SHARPE also maintains a symbol table for the user-defined variables and functions. Among these global variables there are pointers to the stacks of free exponomial nodes, expression nodes, etc.

3.2.3 Input Part

As stated before, SHARPE is an interpretive language, and the input part can work in either interactive or batch mode. Since the syntax of the two modes is essentially the same, we will use only the batch mode to illustrate how it works. After initialization and file opening, SHARPE enters its main loop: it reads one line at a time, looks for the keywords on the line, and calls the corresponding subroutines. Consider the M/D/1/3 breakdown queue MRGP model specification file (Figure 3.8). The keyword mrgp followed by the system name ttt tells SHARPE that ttt is an MRGP model. Then a function devoted to reading Markov-like system specifications is called. This function first reads all the edges into an edge array. In the subsequent lines, the MRGP's edges are specified in the form presented previously:
nodename1 edgetype nodename2 distribution

The nodename1 is the starting node and nodename2 is the destination node, as in Markov and semi-Markov models, and an edgetype appears between the two node names.

format 8
bind
n 50
lambda 1
tau 1/3600
gamma 1/1000/3600
D 0.01
uD n/D
end

mrgp ttt
010 @ 001  exp(gamma)
001 @ 010  exp(tau)
110 @ 101  exp(gamma)
101 @ 110  exp(tau)
210 @ 201  exp(gamma)
201 @ 210  exp(tau)
310 @ 301  exp(gamma)
301 @ 310  exp(tau)
010 @ 110  exp(lambda)
110 @ 010  Erlang(n, uD)
110 - 210  exp(lambda)
210 @ 110  Erlang(n, uD)
210 - 310  exp(lambda)
310 @ 210  Erlang(n, uD)
reward
010 1
110 1
210 1
* 001 1
* 101 1
* 201 1
* 301 1
* 310 1
end

echo Steady State Probabilities:
expr prob(ttt,010), prob(ttt,001)
expr prob(ttt,110), prob(ttt,101)
expr prob(ttt,210), prob(ttt,201)
expr prob(ttt,301), prob(ttt,310)
echo Steady State Throughput:
expr exrss(ttt)
end

Figure 3.8: SHARPE MRGP Specification for M/G/1/3 Breakdown Queue


The symbol @ stands for a Markov regenerative edge (GEN or competitive EXP), and - for edges (concurrent EXP) that are not regenerative. The last part of an edge is the distribution. By distribution we mean the conditional CDF F(t) of the firing time of the edge: given that at time 0 the process is at nodename1 and all other transitions are disabled, F(t) is the probability that this transition fires by time t. After each distribution is read, SHARPE verifies its correctness. There are several built-in distributions in SHARPE, including zero, inf, prob(p), exp(λ), gen and Erlang [25]. When the input of all edges is finished, a function is called to topologically sort the states. This function processes duplicate edges, finds the absorbing states, and determines whether the chain is irreducible, phase-type or acyclic. If the keyword reward is specified, SHARPE continues by reading the reward rates specified by the user. At this point the system is fully defined and the user can ask for results.
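The edge-line format is simple enough to illustrate with a hypothetical Python sketch (SHARPE does this in C; the function name `parse_edge` is ours). Each line splits into the two node names, the edge-type token, and the distribution text:

```python
def parse_edge(line):
    """Parse 'nodename1 edgetype nodename2 distribution' into a tuple.
    '@' marks a regenerative edge (GEN or competitive EXP); '-' marks a
    non-regenerative (concurrent EXP) edge."""
    src, edgetype, dst, dist = line.split(maxsplit=3)
    if edgetype not in ("@", "-"):
        raise ValueError("unknown edge type: " + edgetype)
    return src, dst, edgetype == "@", dist

print(parse_edge("110 - 210 exp(lambda)"))  # -> ('110', '210', False, 'exp(lambda)')
```

Using `maxsplit=3` keeps a distribution such as Erlang(n, uD), which contains a space, intact as a single token.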

3.2.4 Solver Kernel

We will concentrate on the MRGP steady-state solver kernel. The current version of the solver gives steady-state solutions of MRGP models. The user writes

expr prob(mrgp name, nodename {;param list})

to get the steady-state probability of node nodename of the MRGP model named mrgp name, and

expr exrss(mrgp name {;param list})

to calculate the expected steady-state reward rate. For example, the following two lines solve the MRGP model and print the steady-state probability of state 010 and the expected steady-state reward rate:

expr prob(ttt, 010)
expr exrss(ttt)

First, the rebinding subroutine is invoked, assigning proper values to all user-defined variables. Then the MRGP kernel proceeds in the following sequence:

1. Sorting Edges. Determine Ω(m), Ω^E(m), Ω^G(m) and λ_m for each state m, and copy the GEN distributions from the edgeT structures into the mnode_infoT structures.

2. Computation of e^{Q(m)t}. Construct the subordinated CTMCs and put the solutions into the e^{Q(m)t} array.

3. Computation of α_{mn}. Symbolically integrate the local kernel E_{mn}(t).

4. Computation of the One-Step Transition Matrix P. Generate the one-step transition probability matrix P of the EMC.

5. Computation of π. Solve the linear equations π(I − P) = 0 by SOR.

6. Computation of α. Sum the α_{mn} over all n to get α_m = E[S_1 | Y_0 = m].

7. Computation of φ. Use φ_k = π_k α_k / Σ_r π_r α_r.

8. Computation of p_j. Calculate the steady-state probabilities of all states using Theorem 4.

In the following we describe the above modules in detail.
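Steps 2 and 4 can be sketched numerically for a DET GEN transition, since then dF_g is a point mass at D and ∫_0^∞ [e^{Q(m)x}]_{mn} dF_g(x) collapses to [e^{Q(m)D}]_{mn}. The following hypothetical Python fragment is an illustration only (SHARPE works symbolically with exponomials; the helper `expm` and the rates are ours) for the subordinated CTMC of state 110:

```python
import numpy as np

def expm(M, terms=60):
    """Matrix exponential by a plain Taylor series; adequate for the
    small, mild matrices in this illustration."""
    out, acc = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        acc = acc @ M / k
        out = out + acc
    return out

# Subordinated CTMC of state 110: Omega = {110,210,310}, absorbing {101,201,301}.
lam, gamma, D = 1.0, 0.5, 1.0             # example rates and DET firing time
Q = np.zeros((6, 6))
Q[0, 1] = Q[1, 2] = lam                   # concurrent arrivals 110->210->310
Q[0, 3] = Q[1, 4] = Q[2, 5] = gamma       # competitive failures
for i in range(6):
    Q[i, i] = -Q[i].sum()

H = expm(Q * D)                           # [e^{Q D}]_{mn}
# EMC row for m = 110: the GEN branching Delta maps 110->010, 210->110,
# 310->210; the absorbing states 101, 201, 301 are reached directly.
row = {"010": H[0, 0], "110": H[0, 1], "210": H[0, 2],
       "101": H[0, 3], "201": H[0, 4], "301": H[0, 5]}
print(abs(sum(row.values()) - 1.0) < 1e-9)   # -> True: a proper EMC row
```

The entry row["010"] equals e^{-(lam+gamma)D}, the probability that neither an arrival nor a failure preempts the service.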

31

110 210 310 301


Figure 3.9: The subordinated CTMC of state 110 of the M/G/1/3 Breakdown Queue; solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions.

Sorting Edges

At this point, most of the information about the MRGP is still in the edgeT structures. We need to sort it into the mnode_infoT structures, finding the states connected by immediate concurrent EXP transitions, competitive EXP transitions and the GEN transition. We also compute the total rate out λ_m for the EXP transitions and copy the general distribution from edgeT into mnode_infoT for convenient later reference. Obtaining only the immediate successor states in Ω(m), Ω^E(m) and Ω^G(m) is not enough: the subsequent operations need the complete sets of successor states, so a recursive function is called to accomplish this. We use the M/D/1/3 example (Figure 3.5) to illustrate; the MRGP (reachability graph) of the M/D/1/3 queue is shown in Figure 3.6. Assume we need to find Ω(110). We start the search from state 110 (see Figure 3.9). State 210 is directly connected to it by a concurrent EXP edge, so state 210 is an immediate member of Ω(110). Then we call the function recursively: state 210 is checked for concurrent EXP edges (see Figure 3.10).

Figure 3.10: The subordinated CTMC of state 210 of the M/G/1/3 Breakdown Queue; solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions.

Since there is one such edge, from state 210 to state 310, state 310 is included in Ω(110) by manipulating the appropriate linked list. There are no concurrent EXP edges out of state 310, so program control returns to state 210; since there are no additional concurrent EXP edges out of state 210, control returns to state 110. Now Ω(110) is determined to be {210, 310}. Note that any state is a member of its own Ω, i.e., m ∈ Ω(m); for simplicity, we do not include state m itself in the linked list for Ω(m). SHARPE does not allow duplicate states in the Ω's. Ω^E is found at the same time.

Computation of e^{Q(m)t}

Now that all the preparation work is done, we can begin to solve the model. As introduced previously, the subordinated stochastic process is a CTMC whose infinitesimal generator matrix is denoted by Q(m). If the GEN transition is not enabled in state m, we do not need to compute e^{Q(m)t} in our algorithm. If the GEN transition is enabled but neither competitive nor concurrent EXP transitions are, all the entries of Q(m) are zero, so [e^{Q(m)t}]_{mn} = 1 for n = m and 0 for all other n. If none of the above conditions holds, SHARPE constructs a real CTMC submodel for state m of the MRGP model. As described before, all Ω(m) and Ω^E(m) nodes are included, as are all EXP edges (both concurrent and competitive) starting from Ω(m) nodes. The initial state of the CTMC is m. After this subordinated CTMC is solved according to its type (acyclic, phase-type or irreducible), the CDF is copied into the corresponding exponomial eQ[m][n]. This computation uses the semi-symbolic, phase-type CTMC algorithm in SHARPE [23]; the results are in the form of exponomials. Note that an index translation is necessary because a node may have different indices in the MRGP model and in the CTMC submodel.

Computation of α_{mn}

The computation of α_{mn} follows its definition,

α_{mn} = ∫_0^∞ E_{mn}(t) dt,

where E_{mn}(t) is given in Theorem 1. When G(m) = ∅, E_{mn}(t) = δ_{mn} e^{−λ_m t}, so α_{mn} = δ_{mn}/λ_m. When G(m) = {g}, E_{mn}(t) = [e^{Q(m)t}]_{mn} (1 − F_g(t)) for n ∈ Ω(m) and 0 for n ∉ Ω(m), so SHARPE symbolically integrates it, subtracts the value at 0 from the value at ∞, and puts the result into α_{mn}. For the algorithm for symbolic integration of exponomials, refer to [25].

Computation of the One-Step Transition Probability Matrix P

The next step is to compute the one-step transition probability matrix of the EMC. SHARPE uses sparse matrix operations, which are proven to be space- and time-efficient. Matrices are stored in Compressed Sparse Row Format (CSRF) [24, 8], also called row-pointer column-index format.

There are three arrays, ip[], jp[] and nonzero[], associated with each sparse matrix. All the nonzero entries of the matrix are stored in the array nonzero[], from the first row (0) to the last row (n − 1). ip[] is the row pointer: row i starts at element ip[i] of nonzero[], and the total number of nonzero elements is in ip[n]. jp[], the column index, has the same length as nonzero[]; for any entry nonzero[i], its column index is found at jp[i]. The algorithms for large sparse matrix operations have been studied extensively, and the CSRF storage format saves much computation time and memory space.

Each element P_{mn} of the one-step transition probability matrix P is the probability that, given that the current state is m, the next transition leads to state n. The algorithm used in SHARPE loops from the first node to the last, checking node m's outgoing probabilities and putting them in row m. P_{mn} is calculated as stated in Theorem 3:

When G(m) = ∅, P_{mn} = λ(m, n)/λ_m if λ_m ≠ 0, and 0 otherwise. When G(m) = {g},

P_{mn} = ∫_0^∞ [e^{Q(m)x}]_{mn} dF_g(x)   if n ∈ Ω^E(m) but n ∉ Ω^G(m);

P_{mn} = Σ_{m′∈Ω(m)} ∫_0^∞ [e^{Q(m)x}]_{mm′} dF_g(x) ∆(m′, n)   if n ∈ Ω^G(m) but n ∉ Ω^E(m);

P_{mn} = Σ_{m′∈Ω(m)} ∫_0^∞ [e^{Q(m)x}]_{mm′} dF_g(x) ∆(m′, n) + ∫_0^∞ [e^{Q(m)x}]_{mn} dF_g(x)   if n ∈ Ω^G(m) and n ∈ Ω^E(m);

P_{mn} = 0   if n ∉ Ω^G(m) and n ∉ Ω^E(m).

SHARPE has a function designed to do this type of special symbolic integration, ∫_0^∞ f(x) dg(x), where both f(x) and g(x) are exponomials. Note that if g(x) is an exponomial, so is dg(x)/dx; hence the integrand f(x) dg(x)/dx is also an exponomial and, as stated, the integration can be done symbolically. In some cases the summation over all Ω(m) states is required, which is why a recursive algorithm was used at the beginning to determine the complete Ω(m) for every state. A counter points to the next available place in the array nonzero[], and the diagonal elements are not included, for the reason discussed next.

Computation of π

The EMC steady-state probability vector π satisfies the equation π = πP. SHARPE has a Successive Over-Relaxation (SOR) [10, 21] subroutine designed to solve linear equations of the form Ax = 0, where the matrix A is in CSRF format without diagonal entries. The subroutine computes the diagonal elements as −Σ_{j≠i} A_{ij}. The reasons are:

for steady-state solutions of irreducible DTMCs, we need to solve (I − P)^T π^T = 0; the sum of each row of P is one, so the sum of each row of (I − P) is zero and the diagonal entries can be computed as −Σ_{j≠i} A_{ij}. It is also advantageous to keep the diagonal elements separate because of the intrinsic properties of the SOR algorithm.

for steady-state solutions of irreducible CTMCs, we need to solve Q^T π^T = 0, where Q is the infinitesimal generator matrix; as we know, its diagonal elements are −Σ_{j≠i} λ(i, j).

After transposing the matrix in nonzero[], we end up with (I − P)^T and feed it into the SOR subroutine. The steady-state vector π is the output once the specified SOR tolerance is met.

Computation of α

As defined earlier, α_m is the expected sojourn time in state m of the EMC, i.e.,

α_m = E[S_1 | Y_0 = m]    (3.13)
    = Σ_{n∈Ω(m)} ∫_0^∞ E_{mn}(t) dt    (3.14)
    = Σ_{n∈Ω(m)} α_{mn}.    (3.15)

Since we have all the α_{mn} ready, the computation of α_m is straightforward.

Computation of φ

φ_k = π_k α_k / Σ_r π_r α_r. The only thing to note is that the summation in the denominator runs over all MRGP states, not just over Ω(m). To understand the meaning of φ better, consider a large number N of EMC state changes. Nπ_k is approximately the mean number of visits to EMC state k in steady state during the N state changes. Since α_k is the mean sojourn time of each visit to EMC state k, Nπ_k α_k is approximately the mean time the MRGP spends in state k in steady state. Hence Σ_r Nπ_r α_r is the total time spent in the N state changes, and φ_k is the fraction of that time that the MRGP spends in EMC state k.

Computation of p_j

Now we are ready to determine the steady-state probability of each state:

p_j = lim_{t→∞} P{Z(t) = j | Y_0 = m}    (3.16)
    = Σ_k π_k α_{kj} / Σ_k π_k α_k    (3.17)
    = Σ_k φ_k α_{kj} / α_k.    (3.18)

The results are stored in each state's mnode_infoT structure for future reference.
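As a recap of the π computation, the SOR sweep over a CSRF-stored matrix can be sketched as follows. This is a hypothetical Python illustration with a toy 3-state EMC whose exact steady state is (0.25, 0.5, 0.25); SHARPE's C subroutine differs in detail, and the variable names are ours.

```python
import numpy as np

# Toy aperiodic 3-state EMC.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
A = (np.eye(3) - P).T                 # solve A x = 0, where x = pi^T

# CSRF storage of the off-diagonal entries of A (row pointer / column index).
n = 3
ip, jp, nz = [0], [], []
for i in range(n):
    for j in range(n):
        if i != j and A[i, j] != 0.0:
            jp.append(j); nz.append(A[i, j])
    ip.append(len(nz))
diag = 1.0 - np.diag(P)               # diagonal of (I - P)^T, kept separately

x = np.ones(n)
for _ in range(200):                  # SOR sweeps with omega = 1 (Gauss-Seidel)
    for i in range(n):
        s = sum(nz[k] * x[jp[k]] for k in range(ip[i], ip[i + 1]))
        x[i] = -s / diag[i]
    x /= x.sum()                      # normalize: probabilities sum to one

print(x)   # converges to the exact steady-state vector (0.25, 0.5, 0.25)
```

Keeping the diagonal in a separate array, as the text explains, lets each sweep divide by diag[i] without searching the compressed row for the diagonal entry.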


Chapter 4 Applications of MRGP Solver


As we stated, systems that can be modeled by MRGPs and MRSPNs are abundant in the real world; there are many examples in the literature [7, 14, 22, 1, 26, 6]. In this chapter, we study several MRGP model applications, which will be solved by the SHARPE MRGP steady-state solver described in the previous chapter. Our goals are:

To illustrate the necessity of MRGP modeling by comparing the results of MRGP models with those of other models.

To demonstrate the techniques of constructing and solving MRGP models using SHARPE.

To validate our SHARPE MRGP solver by comparing against known closed-form solutions.

We first examine the M/G/1 breakdown queue described in the previous chapter; the throughput, loss rate and parametric sensitivities of this model are studied. Then we focus on the second application, detection and restoration times in communication network error recovery. Several other examples are given after it. All the SHARPE model specification codes are listed, and theoretical closed-form solutions and SHARPE numerical results are compared.

4.1 Application 1: M/G/1 Breakdown Queue

The M/G/1 breakdown queue was used as an example in the description of MRSPNs and MRGPs in the last chapter. A number of practical problems can be modeled as M/G/1 queues. For instance, consider an ATM switch processing fixed-length cells. The interarrival time of cells is exponentially distributed with rate λ. Since all the cells have the same length, it takes the switch a definite amount of time D to process a single cell. Assuming the queue length is 3 (the queue length is usually much larger in real life) and a First Come First Served (FCFS) scheduling policy, this is a typical M/D/1/3 queueing problem in which the ATM switch is the server with deterministic service time, the arrival process is Poisson with parameter λ and the queue length is 3. Also in this model, the ATM switch is subject to failure: the inter-failure time is exponentially distributed with rate γ, and the ATM switch recovery time is exponentially distributed with rate τ. Note that when the switch is repaired after a failure, the processing of the unfinished job is started over, i.e., it still takes the ATM switch time D to process it.

To specify the model, the first problem is how to generate the DET distribution in SHARPE, since, as we have seen, SHARPE provides only exponomial distributions. However, it is recognized that the Erlang distribution serves as a good approximation of the deterministic distribution [27, 13]. The n-stage Erlang distribution is the convolution of n mutually independent exponential distributions, each of which has rate μ. Let X be the random variable of service time; the pdf (probability density function) and CDF (cumulative distribution function) of an n-stage Erlang distribution are given as:
$$f_X(t) = \frac{\mu^n t^{n-1}}{(n-1)!}\, e^{-\mu t}$$
and
$$F_X(t) = 1 - e^{-\mu t} \sum_{i=0}^{n-1} \frac{(\mu t)^i}{i!}.$$
The mean is $E[X] = n/\mu$ and the variance is $\sigma^2 = n/\mu^2$.
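The CDF above is easy to evaluate numerically. The following Python sketch (function name is ours, purely illustrative) shows how the approximation sharpens toward a unit step as n grows while the mean n/μ is held at D:

```python
import math

def erlang_cdf(t, n, mu):
    """F(t) = 1 - exp(-mu*t) * sum_{i=0}^{n-1} (mu*t)^i / i!  (n-stage Erlang)."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-mu * t) * sum((mu * t) ** i / math.factorial(i)
                                         for i in range(n))

# Keep the mean n/mu fixed at D = 1 and increase n: the CDF approaches the
# unit step of the deterministic distribution at t = D.
D = 1.0
below = [erlang_cdf(0.5 * D, n, n / D) for n in (1, 10, 50)]  # mass before D
above = [erlang_cdf(1.5 * D, n, n / D) for n in (1, 10, 50)]  # mass by 1.5*D
```

As n increases, `below` shrinks toward 0 and `above` approaches 1, which is exactly the behavior plotted in Figure 4.1.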

Figure 4.1: Erlang Distribution; 1-stage, 2-stage, 10-stage, 50-stage Erlang Distribution and the Deterministic Distribution

Figure 4.1 shows how a deterministic distribution can be approximated by Erlang distributions with different numbers of stages. We notice that increasing the number of stages provides higher accuracy, given that the mean $E[X] = n/\mu$ is held constant. A 50-stage Erlang distribution serves as a good approximation of the deterministic one in this example. In SHARPE, Erlang is one of the built-in distributions, and the user can specify an n-stage Erlang distribution with parameter mu by Erlang(n, mu).

Two performance metrics are of importance in this model (Figure 3.5 and Figure 3.6): throughput and loss rate. In this example, the steady-state throughput is given by:
$$E[T] = \lambda\,[p(010) + p(110) + p(210)].$$
So the reward rates are assigned as $r_{010} = r_{110} = r_{210} = \lambda$ and 0 for the other states. Letting λ = 1, we can get the normalized throughput (probability of acceptance of a job) directly from the reward function.

We use the following parameters to obtain numerical results: λ = 1 second⁻¹, normalized arrival rate; D = 0.01 second, service time for a single job; γ = 10⁻³ hour⁻¹, server failure rate; τ = 1 hour⁻¹, server repair rate; a 50-stage Erlang distribution is used to approximate the deterministic service time distribution. The SHARPE specification of the M/D/1/3 model is listed in Figure 3.8. We obtain the steady-state probabilities of all states as below:
* Steady State Probabilities:
prob(ttt,010):  9.89015575e-01
prob(ttt,001):  9.89015575e-04
-------------------------------------------
prob(ttt,110):  9.93418788e-03
prob(ttt,101):  9.93418788e-06
-------------------------------------------
prob(ttt,210):  5.10580732e-05
prob(ttt,201):  5.10580732e-08
-------------------------------------------
prob(ttt,301):  0.00000000e+00
prob(ttt,310):  1.78052566e-07

* Steady State Throughput:
-------------------------------------------
exrss(ttt):     9.99000821e-01

Therefore the probability of acceptance of a job (normalized throughput) is approximately 0.999000821. The loss probability can be obtained by assigning reward rates of 1 to states 001, 101, 201, 310 and 301, because in these states either the queue is full or the server is down (and hence no new jobs will be accepted), so any newly arriving job is dropped. With the same parameters listed above, the loss probability is 9.99179052e-04.

We tabulate the results in Table 4.1. As we can see in the table, the service time D is the dominant factor deciding the loss probability when the server failure rate is small; in this case, the job losses are mainly due to the full queue. When γ is large enough, the server goes down frequently, hence new jobs cannot enter the queue even if it is not full. When D is small, the loss probability is low because jobs can hardly stay in the queue before they are processed. When the service time becomes longer, a higher server failure rate gives a higher loss probability. When D is considerably larger, the loss probability curve saturates, that is, the loss probability is close to 1 and few jobs can enter the crowded queue.

Table 4.1: M/D/1/3 Loss Probability vs. Service Time and Server Failure Rate

Loss Prob.   γ=1e-3/hr    γ=1e-1/hr    γ=1/hr       γ=10/hr
D = 1e-2 s   9.9918e-04   9.0909e-02   5.0000e-01   9.0909e-01
D = 1e-1 s   1.1909e-03   9.1084e-02   5.0010e-01   9.0911e-01
D = 1 s      1.7892e-01   2.5283e-01   5.8908e-01   9.2534e-01
D = 10 s     8.2791e-01   8.4390e-01   9.1540e-01   9.8494e-01
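Since every state carries reward λ in exactly one of the two reward assignments (throughput or loss), the two expected rewards must sum to one when λ = 1. A quick Python check against the probabilities listed above (a sanity check of ours, not part of SHARPE):

```python
# Steady-state probabilities from the SHARPE output above.
p = {
    "010": 9.89015575e-01, "001": 9.89015575e-04,
    "110": 9.93418788e-03, "101": 9.93418788e-06,
    "210": 5.10580732e-05, "201": 5.10580732e-08,
    "301": 0.00000000e+00, "310": 1.78052566e-07,
}
lam = 1.0
# Throughput reward: an arrival is accepted (queue not full, server up).
throughput = lam * (p["010"] + p["110"] + p["210"])
# Loss reward: queue full or server down.
loss = lam * (p["001"] + p["101"] + p["201"] + p["310"] + p["301"])
```

Up to rounding in the printed probabilities, `throughput + loss` is 1.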


4.2 Application 2: Detection and Restoration Times in Communication Networks Error Recovery

4.2.1 System Description

Many network switch models assume detections and restorations are perfect and instantaneous for simplicity. This assumption is not guaranteed in real-life systems. Logothetis and Trivedi [15, 16] developed Markov Regenerative Reward Models (MRRMs) which include the detection and restoration effects in network recovery. They studied the 1:N protection switching system, in which N transmission channels share one common protection channel, and each channel may have a queue to store incoming data (Figure 4.2). When a channel failure occurs, it takes site A a deterministic time D to detect it; once a failure is detected, the restoration system redirects data of that channel to the spare channel and places the failed channel in repair. The restoration time is deterministic with value R. We will apply our MRGP model to this system and determine the loss rates for different detection and restoration times. Closed-form solutions are available in [15, 16], and we will compare our SHARPE numerical results with them.

4.2.2 The MRGP Model

As we stated, the fault detection time is defined as the time it takes the transmitting end (site A) to detect the failure. When the channel failure is detected, the spare channel is put into use and data goes to the corresponding switches. The time involved in this operation is called the restoration time. When the failed channel is repaired, services carried by the spare channel are switched back to their original channel. If a second failure occurs before the repair of the first failed channel is finished, the system fails. Because we usually expect the failure rate to be much smaller than

Figure 4.2: 1:N Protection Switching System; N active channels and one spare channel

the repair rate, the system does not go down frequently. When the detection and restoration times are negligible, the system can be characterized by the continuous-time Markov chain shown in Figure 4.3. State S2 represents the system status in which all N + 1 channels are functioning properly, state S1 represents the state in which N channels are functioning and one channel has failed, and state S0 means that two channels have failed (and hence the system is down).

Figure 4.3: CTMC with zero detection and restoration times

Because in states S2 and S1 there are N channels working, the transition rates from S2 to S1 and from S1 to S0 are both equal to Nλ, assuming that the spare channel does not fail. Suppose there is only one repair facility, so the repair rates, from state S0 to S1 and from state S1 to S2, are both μ. Figure 4.4 shows the MRGP model of the 1:N protection switching system with deterministic detection and restoration times.

Figure 4.4: MRGP with deterministic detection and restoration times; solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions

The system is in state S2 if all N channels are functioning properly. If any of these N channels fails, the system enters state D2, and it takes an exact amount of time D to detect the failure, represented by the GEN transition from state D2 to R2 with deterministic firing time D. The restoration then begins: the system stays in state R2, and after a finite restoration time R the failed channel is replaced by the spare one and the repair of the failed channel begins. The system is now in state S1. The channel repair time is assumed to be exponentially distributed with rate μ. If another channel failure occurs before the repair is finished, the system jumps to state D1 to detect it. If the first failed channel is repaired before this failure is detected, the system returns to state D2, without stopping the detection of the second failure. This arc (from state D1 to D2) is a concurrent EXP arc with respect to the DET detection transition; this is why the underlying stochastic process is an MRGP and not a semi-Markov process. If the second failure is detected before the repair of the first one, there is no spare channel left to switch to, so the system fails, leading it to state S0. Note that in this case we assume that the second failure does not occur during the first one's detection and restoration process. This assumption is reasonable because the failure rates are negligibly small compared with the detection and repair rates.


4.2.3 Reward Function for Loss Rates

The main purpose of this model is to study how the detection and restoration times affect the loss rate. We consider a simple case first. Assume site A is accepting data for all channels at the rate B. When the queue length of each channel at site A is zero (no queue is present), the reward rate values for the MRGP states are: $r_{D2} = r_{R2} = r_{D1} = r_{S0} = B$ and 0 for the other states. Denoting $p_m$ as the steady-state probability of state m, the expected steady-state reward rate is:
$$r = B\,(p_{D2} + p_{R2} + p_{D1} + p_{S0}).$$

When the queue length is greater than 0, the reward functions are given in [15, 16]:
$$r_{D2} = B\left[r_v + (1 - r_v)\int F_W(x)\,dF_D(x)\right] \qquad (4.1)$$
$$r_{R2} = B\sum_{i=0}^{L+1}\;\sum_{j=L-i+2}^{\infty}\int A(j, r)\,dF_R(r)\; l_i \qquad (4.2)$$
where $r_v$ is the probability that a packet arrives but the queue is full,
$$F_W(t) = \frac{\displaystyle\sum_{i=0}^{L} (-1)^i\, e^{\lambda(t - i\mu)}\, \frac{[\lambda(t - i\mu)]^i}{i!}\, u(t - i\mu)}{\displaystyle\sum_{i=0}^{L} (-1)^i\, e^{i\lambda\mu}\, \frac{(i\lambda\mu)^i}{i!}}, \quad \text{if } 0 \le t \le L\mu,$$
and 1 otherwise. $F_R(r)$ and $F_D(r)$ are the restoration time distribution and detection time distribution, respectively, $A(j, r)$ is the probability of j arrivals in time r, μ is the mean service time, and $l_i$ ($i = 0, 1, \ldots, L+1$) is the steady-state queue length pmf (probability mass function) at each arrival epoch. The expected steady-state reward rate is then:
$$r = B\,r_v\,(p_{S2} + p_{S1}) + r_{D2}\,p_{D2} + r_{R2}\,p_{R2} + B\,p_{S0} + B\,p_{D1}.$$

Table 4.2: No Queueing Closed-Form Solution: Loss Rates vs. Detection Times (D) and Restoration Times (R); N = 1

Loss Rate   R=20ms      R=60ms      R=100ms     R=1s
D=50ms      2.04e-11    3.15e-11    4.27e-11    2.92e-10
D=100ms     3.43e-11    4.54e-11    5.65e-11    3.06e-10
D=1s        2.84e-10    2.95e-10    3.06e-10    5.56e-10

Table 4.3: No Queueing Closed-Form Solution: Loss Rates vs. Detection Times (D) and Restoration Times (R); N = 8

Loss Rate   R=20ms      R=60ms      R=100ms     R=1s
D=50ms      2.19e-10    3.08e-10    3.97e-10    2.40e-09
D=100ms     3.30e-10    4.19e-10    5.08e-10    2.51e-09
D=1s        2.33e-09    2.42e-09    2.50e-09    4.51e-09

4.2.4 Numerical Results

Closed-form solutions and numerical results are given in [15, 16]. We will use SHARPE to solve this problem again and see how well the SHARPE MRGP numerical solution agrees with the closed-form one. The parameters chosen are: B = 1 hour⁻¹, used to get normalized loss rates; N = 1 or N = 8, one or eight active channels; λ = 10⁻⁶ hour⁻¹, failure rate of a single channel; μ = 1 hour⁻¹, repair rate. The SHARPE MRGP code is listed in Figure 4.5.

format 8
bind n 50
bind N 8
bind S 3600
bind D 1/S
bind R 1/S
bind uD n/D
bind uR n/R
bind lambda 0.000001
bind mu 1

mrgp demitris
S2 @ D2 exp(N*lambda)
D2 @ R2 Erlang(n,uD)
R2 @ S1 Erlang(n,uR)
S1 @ D1 exp(N*lambda)
D1 @ S0 Erlang(n,uD)
S0 @ S1 exp(mu)
D1 - D2 exp(mu)
S1 @ S2 exp(mu)
reward
D2 1
R2 1
D1 1
S0 1
end

echo Steady State Probabilities:
expr prob(demitris, S2), prob(demitris, S1)
expr prob(demitris, S0), prob(demitris, D2)
expr prob(demitris, D1), prob(demitris, R2)
echo Steady State Loss Rate:
expr exrss(demitris)
end

Figure 4.5: SHARPE MRGP Specification File of the 1:N Switching System


Table 4.4: No Queueing SHARPE Results: Loss Rates vs. Detection Times (D) and Restoration Times (R); N = 1

Loss Rate   D=50ms          D=100ms         D=1s
R=20ms      2.044016e-11    3.432295e-11    2.842498e-10
R=60ms      3.154338e-11    4.542617e-11    2.953530e-10
R=100ms     4.264836e-11    5.653114e-11    3.064580e-10
R=1s        2.925752e-10    3.064580e-10    5.563848e-10

Table 4.5: No Queueing SHARPE Results: Loss Rates vs. Detection Times (D) and Restoration Times (R); N = 8

Loss Rate   D=50ms          D=100ms         D=1s
R=20ms      2.195197e-10    3.305812e-10    2.329982e-09
R=60ms      3.083448e-10    4.194064e-10    2.418807e-09
R=100ms     3.971840e-10    5.082455e-10    2.507646e-09
R=1s        2.396585e-09    2.507646e-09    4.507047e-09
Table 4.2 and Table 4.3 show the loss rates versus detection and restoration times from the closed-form solutions; the SHARPE MRGP numerical results are given in Table 4.4 and Table 4.5. By comparing Table 4.2 with Table 4.4, and Table 4.3 with Table 4.5, we can see that the SHARPE MRGP results agree very well with the closed-form solutions. This also shows that the Erlang distribution is a good substitute for the deterministic distribution. In this problem, the variance of the 50-stage Erlang is $\sigma^2 = D^2/n \approx 1.55 \times 10^{-9}$.

4.3 Application 3: Parallel System with a Single Repairman

Consider a system consisting of two parallel components A and B. The failure rates of A and B are $\lambda_A$ and $\lambda_B$, respectively. There is a single repairman with a First Come First Served (FCFS) scheduling policy for repair. As soon as either one of A

or B fails, the repairman begins to repair it if not busy. If the other component fails before the repair work on the first one is completed, the second component has to wait until the repairman is free. It takes the repairman exactly $D_A$ and $D_B$ time units to repair components A and B, respectively. The MRSPN model is shown in Figure 4.6. Places A_up and B_up stand for the conditions that A and B are up, respectively, and A_down and B_down stand for components A and B not functioning, respectively. If there is a token in the place Repair_idle, the repairman is available. If either A or B fails now, the corresponding immediate transition is enabled, a token is deposited into place Repair_A or Repair_B, and the corresponding DET transition is enabled. If the other component fails before the repair is done, a token will stay in place A_down or B_down until the repairman comes back to the place Repair_idle. After a repair is done, one token is put back into the place A_up or B_up and one token into the place Repair_idle.

Figure 4.6: MRSPN Model for Parallel System with Single Repairman Problem; Thick black bars represent GEN transitions, thick white bars represent EXP transitions, thin bars represent immediate transitions

Figure 4.7 is the underlying MRGP of the MRSPN model. The system is in state 1 if both A and B are up (and the repairman is free). Component A can fail at the rate $\lambda_A$, taking the system to state 2. It will take the repairman $D_A$ time units to repair the component and bring the system back to state 1. If component B goes down

Figure 4.7: Underlying MRGP of the Parallel System with Single Repairman Problem; solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions

during the repair time of component A, the system jumps to state 4, while the repair action on component A continues. Once A is repaired and B is still down, the marking process is in state 3. B begins to be repaired with a repair duration $D_B$, but A can fail again in the meantime; this last condition is encoded in state 5. As shown in Figure 4.7, the EXP transitions from state 2 to state 4 and from state 3 to state 5 are concurrent EXP transitions, since the firings of these two transitions do not disable the enabled GEN transition Rep_A or Rep_B. A theoretical closed-form solution is given in [2]; we compute the numerical results using SHARPE in this thesis. The model specification file is listed in Figure 4.8. To get the system availability, we assign reward rate 1 to states 1, 2 and 3, because in these three states at least one of A and B is functioning properly. By using the parameters listed in [2] ($\lambda_A = \lambda_B = 0.01$, $D_A = D_B = 5$), we get:
* Steady State Probabilities:
prob(repair, 1):  9.04877585e-01
prob(repair, 2):  4.63465668e-02



format 8
bind
  n 25
  DA 5
  DB 5
  lambdaA 1/100
  lambdaB 1/100
  uDA n/DA
  uDB n/DB
end

mrgp repair
1 @ 2 exp(lambdaA)
2 @ 1 Erlang(n, uDA)
1 @ 3 exp(lambdaB)
3 @ 1 Erlang(n, uDB)
2 - 4 exp(lambdaB)
4 @ 3 Erlang(n, uDA)
3 - 5 exp(lambdaA)
5 @ 2 Erlang(n, uDB)
reward
1 1
2 1
3 1
end

echo Steady State Probabilities:
expr prob(repair, 1), prob(repair, 2)
expr prob(repair, 3), prob(repair, 4)
expr prob(repair, 5)
echo Steady State Availability:
expr exrss(repair)
end

Figure 4.8: SHARPE MRGP Specification of Parallel System with a Single Repairman


-------------------------------------------
prob(repair, 3):  4.63465668e-02
prob(repair, 4):  1.21464079e-03
-------------------------------------------
prob(repair, 5):  1.21464079e-03

* Steady State Availability:
-------------------------------------------
exrss(repair):    9.97570718e-01
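Since availability here is just the probability mass on states 1-3, the reported exrss value can be re-derived from the state probabilities. A quick Python check (ours, with the values copied from the output above):

```python
# Steady-state probabilities reported by SHARPE for the MRGP model.
p = {1: 9.04877585e-01, 2: 4.63465668e-02, 3: 4.63465668e-02,
     4: 1.21464079e-03, 5: 1.21464079e-03}
# Reward 1 in states 1, 2 and 3 (at least one component up), so the expected
# steady-state reward is the availability.
availability = p[1] + p[2] + p[3]
```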

The system steady-state availability is 0.99757 (unavailability 0.00243), and the closed-form solution in [2] gives approximately 0.99760. If we change the model to an SMP, i.e., when a second failure occurs during the repair of the first one the repair of the first starts over (the timer is reset), the two concurrent EXP transitions (from state 2 to 4, and from state 3 to 5) become competitive EXP transitions. With the same parameters chosen, we get:

* Steady State Probabilities:
prob(repair, 1):  9.02886382e-01
prob(repair, 2):  4.62445801e-02
prob(repair, 3):  4.62445801e-02
-------------------------------------------
prob(repair, 4):  2.31222901e-03
prob(repair, 5):  2.31222901e-03

* Steady State Availability:
-------------------------------------------
exrss(repair):    9.95375542e-01


The system unavailability is now 0.00462, almost twice as large as in the previous case. For mission-critical systems, this is not a negligible error.

4.4 Application 4: Warm Spare with Single Repairman

The component A of the system of interest has a warm spare A'; A fails at the rate λ and the spare fails at a lower rate γ. Similar to the parallel system discussed in the previous section, there is a shared single repairman with a First Come First Served (FCFS) scheduling policy. If either the active or the spare component fails, the repair begins immediately, and the spare is switched to active instantaneously if necessary. If a second unit fails before the first repair is completed, it has to wait until the repairman finishes working on the first unit. The MRSPN model is shown in Figure 4.9. At the beginning both the active and spare components are available, so there is one token in each of the places Active_up and Spare_up. If the active component fails (depositing a token in the place Active_down), the immediate transition switch fires, so that one token is put back in the place Active_up and one token is deposited in the place Repair. This means that the spare is switched to active and the failed unit is being repaired. It takes the repairman exactly R units of time to bring the failed component back to the place Spare_up. If the active component goes down before the repair is finished, the immediate transition will not fire because the place Spare_up is empty, so the repair for it cannot begin until the first one is put back into service. Figure 4.10 is the reachability graph of the MRSPN model. Its underlying MRGP, shown in Figure 4.11, has the five tangible states of the reachability graph. State 1 (state 11000 in the reachability graph) is the initial state where both the active and


Figure 4.9: MRSPN Model for Warm Spare System with Single Repairman Problem; Thick black bars represent GEN transitions, thick white bars represent EXP transitions, thin bars represent immediate transitions

spare components are in good status. The system goes to state 2 (10001) if the active component fails and the spare is up. If another failure occurs in state 2, the system jumps to state 4 (00101), without stopping the repair of the first failed component; this EXP transition is a concurrent one. The transition from state 1 to state 3 (10010) means that the spare component fails while the active one is still working; the failed spare is then put into repair. If the active component fails before the spare is brought back to work, the new state is state 5 (00110). This transition also does not interfere with the ongoing repair work, hence it is another concurrent EXP transition. The SHARPE MRGP specification file is shown in Figure 4.12. The parameters are chosen the same as in [2]: a 25-stage Erlang for the deterministic repair time, λ = 0.01, γ = 0.001, R = 5. The SHARPE result is given as:
* Steady State Probabilities:
prob(warm, 1):  9.45343077e-01
prob(warm, 2):  4.86550658e-02

-------------------------------------------


Figure 4.10: Reachability Graph of Warm Spare System with Single Repairman MRSPN Model; Ovals represent tangible states, and rectangles represent vanishing states. Markings are labeled by a 5-tuple (#(Active_up), #(Spare_up), #(Active_down), #(Spare_down), #(Repair)).

Figure 4.11: Underlying MRGP of Warm Spare System with Single Repairman Problem; Solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions

prob(warm, 3):  4.60600228e-03
prob(warm, 4):  1.27514144e-03
-------------------------------------------
prob(warm, 5):  1.20713111e-04

* Steady State Availability:
-------------------------------------------
exrss(warm):    9.98604145e-01
The analytical availability of [2] is approximately 0.99860.


format 8
bind
  n 25
  R 5
  lambda 0.01
  gamma 0.001
  uR n/R
end

mrgp warm
1 @ 2 exp(lambda)
2 - 4 exp(lambda)
4 @ 2 Erlang(n, uR)
2 @ 1 Erlang(n, uR)
1 @ 3 exp(gamma)
3 @ 1 Erlang(n, uR)
3 - 5 exp(lambda)
5 @ 2 Erlang(n, uR)
reward
1 1
2 1
3 1
end

echo Steady State Probabilities:
expr prob(warm, 1), prob(warm, 2)
expr prob(warm, 3), prob(warm, 4)
expr prob(warm, 5)
echo Steady State Availability:
expr exrss(warm)
end

Figure 4.12: SHARPE MRGP Specification of Warm Standby System with a Single Repairman

4.5 Application 5: Vacation Queue

Suppose we have a server and three external clients, and each client issues jobs at a constant rate λ. The service time of an external job is exactly F time units, and after the server finishes a job, it takes a vacation and processes its local jobs for exactly G time units. The MRSPN model is shown in Figure 4.13 [17, 4]. The place pSource holds the potential jobs, each of which can fire at rate λ and enter place pWait. If the server comes back from vacation, i.e., transition vacation fires and a token is moved from place pVacation to place pActive, the service of a job begins. After the time duration F elapses, the transition service fires (the job is finished), a token is put back into the place pSource, and the server takes a vacation, that is, a token jumps from place pActive to place pVacation.

58

Figure 4.13: MRSPN Model for Vacation Queue; Thick black bars represent GEN transitions, thick white bars represent EXP transitions, thin bars represent immediate transitions

The underlying MRGP of the MRSPN model is given in Figure 4.14. Similar to
Figure 4.14: Underlying MRGP of MRSPN Model for Vacation Queue; Each marking is denoted by a 4-tuple (#(pSource), #(pWait), #(pActive), #(pVacation)); Solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions

our previous examples, there are several concurrent EXP transitions in this MRGP, because the arrivals of external jobs do not stop the server from processing its local (DET transition vacation) or external (DET transition service) jobs. Assume that λ = 1 minute⁻¹ (normalized job arrival rate), the service time of an external job is F = 2 minutes, and the vacation lasts G = 4 minutes. In order to get the throughput of the external jobs we use λ · (Expected # of jobs in pSource). The reward assignment is listed in the SHARPE MRGP specification file in Figure 4.15. The SHARPE output is:
* Steady State Probabilities:


format 8
bind
  n 50
  lambda 1
  F 2
  G 4
  uF n/F
  uG n/G
end

mrgp vacation
3001 - 2101 exp(3*lambda)
2101 - 1201 exp(2*lambda)
1201 - 0301 exp(lambda)
3010 @ 2110 exp(3*lambda)
2110 - 1210 exp(2*lambda)
1210 - 0310 exp(lambda)
3001 @ 3010 Erlang(n, uG)
2101 @ 2110 Erlang(n, uG)
1201 @ 1210 Erlang(n, uG)
0301 @ 0310 Erlang(n, uG)
2110 @ 3001 Erlang(n, uF)
1210 @ 2101 Erlang(n, uF)
0310 @ 1201 Erlang(n, uF)
reward
3001 3
3010 3
2101 2
2110 2
1201 1
1210 1
* 2101 1
* 2110 1
* 1201 2
* 1210 2
* 0301 3
* 0310 3
end

echo Steady State Probabilities:
expr prob(vacation, 3001), prob(vacation, 3010)
expr prob(vacation, 2101), prob(vacation, 2110)
expr prob(vacation, 1201), prob(vacation, 1210)
expr prob(vacation, 0301), prob(vacation, 0310)
echo Steady State Throughput:
expr exrss(vacation)
end

Figure 4.15: SHARPE MRGP Specification of Vacation System


prob(vacation, 3001):  2.16767670e-09
prob(vacation, 3010):  0.00000000e+00
-------------------------------------------
prob(vacation, 2101):  2.52005529e-04
prob(vacation, 2110):  1.48759904e-07
-------------------------------------------
prob(vacation, 1201):  1.63212411e-01
prob(vacation, 1210):  3.07388496e-03
-------------------------------------------
prob(vacation, 0301):  5.03202248e-01
prob(vacation, 0310):  3.30259300e-01

* Steady State Throughput:
-------------------------------------------
exrss(vacation):       1.66790611e-01

The normalized throughput with respect to λ is 0.1667906 jobs/minute. The expected waiting time for each external job is given by Little's formula [27]:
$$E[\text{QueueLength}] = \text{ArrivalRate} \times E[\text{ResponseTime}].$$
The arrival rate here is just the throughput, and E[QueueLength] can be found by defining reward rates $r_{2101} = r_{2110} = 1$, $r_{1201} = r_{1210} = 2$, $r_{0301} = r_{0310} = 3$ and 0 for other states. With the same parameters listed above, we get an expected steady-state queue length of 2.833209388 jobs, hence the expected response time for each external job is 16.98662507 minutes.
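The arithmetic can be replayed directly (a sketch of ours; the numbers are the SHARPE results quoted above):

```python
# Little's formula: E[QueueLength] = Throughput * E[ResponseTime], so the
# response time is recovered by dividing the two expected steady-state rewards.
throughput = 1.66790611e-01    # jobs/minute, exrss with throughput rewards
queue_len  = 2.833209388       # jobs, exrss with queue-length rewards
response   = queue_len / throughput   # minutes
```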

If we instead assume that the vacation time is exponentially distributed with mean G, the throughput is 0.4809439895 jobs/minute, the mean queue length is 2.519056010 jobs, and the mean response time is 5.2377326 minutes, which is much smaller than in the deterministic-vacation case.

4.6 Application 6: Bulk Service System

In many realistic systems such as ATM switches, a fixed overhead is associated with each job, which has a variable size. As a result, the system throughput varies dramatically over time with the variable job traffic. A common method to maximize the performance and efficiency of the system is to set up a buffer, accumulating a certain number of jobs before processing them all with a single overhead cost. A timer is also used to ensure a relatively low waiting time, especially when the arrival rate is low. In addition, the queue is emptied when a high priority job arrives. An ATM example is given in [28]. The MRGP model is given in Figure 4.16. When the first job arrives, the timer is started, i.e., the DET transition with duration D is enabled. When a high priority job arrives, the queue becomes full, or the timer expires, all the jobs in the queue are processed and the timer is disabled. Assume the arrival processes of normal jobs and high priority jobs are Poisson processes with rates λ and μ, respectively. The timer fires after a preset time D has elapsed from the instant it is started. The MRGP code shown in Figure 4.17 assumes: N = 8, λ = 1 job/minute, μ = 10⁻¹ job/minute and timer length D = 50 minutes [28]. The mean queue length is given by reward rates $r_i = i$, $i = 1, \ldots, 7$. The SHARPE output is:

Figure 4.16: MRGP Model of the Bulk Service System; Queue length is N; Solid thin arcs are competitive EXP transitions, dashed thin arcs are concurrent EXP transitions and solid thick arcs are GEN transitions
* Steady State Probabilities:
prob(bulk, 0):  1.70403666e-01
prob(bulk, 1):  1.54912411e-01
-------------------------------------------
prob(bulk, 2):  1.40829468e-01
prob(bulk, 3):  1.28026781e-01
-------------------------------------------
prob(bulk, 4):  1.16387975e-01
prob(bulk, 5):  1.05807263e-01
-------------------------------------------


format 8
bind
  n 50
  S 10000
  lambda 1*S
  mu 0.1*S
  D 50/S
  uD n/D
end

mrgp bulk
0 @ 1 exp(lambda)
1 - 2 exp(lambda)
2 - 3 exp(lambda)
3 - 4 exp(lambda)
4 - 5 exp(lambda)
5 - 6 exp(lambda)
6 - 7 exp(lambda)
7 @ 0 exp(lambda)
1 @ 0 Erlang(n, uD)
2 @ 0 Erlang(n, uD)
3 @ 0 Erlang(n, uD)
4 @ 0 Erlang(n, uD)
5 @ 0 Erlang(n, uD)
6 @ 0 Erlang(n, uD)
7 @ 0 Erlang(n, uD)
1 @ 0 exp(mu)
2 @ 0 exp(mu)
3 @ 0 exp(mu)
4 @ 0 exp(mu)
5 @ 0 exp(mu)
6 @ 0 exp(mu)
7 @ 0 exp(mu)
reward
1 1
2 2
3 3
4 4
5 5
6 6
7 7
end

echo Steady State Probabilities:
expr prob(bulk, 0), prob(bulk, 1)
expr prob(bulk, 2), prob(bulk, 3)
expr prob(bulk, 4), prob(bulk, 5)
expr prob(bulk, 6), prob(bulk, 7)
echo Steady State Queue Length:
expr exrss(bulk)
end

Figure 4.17: SHARPE MRGP Specification of Bulk System


prob(bulk, 6):  9.61884191e-02
prob(bulk, 7):  8.74440163e-02

* Steady State Queue Length:
-------------------------------------------
exrss(bulk):    3.00447853e+00
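With reward rate i in state i, the expected steady-state reward is just the weighted sum of the state probabilities; a quick Python check of ours against the listed output:

```python
# Steady-state probabilities reported by SHARPE for the bulk service model,
# for queue lengths 0 through 7.
p = [1.70403666e-01, 1.54912411e-01, 1.40829468e-01, 1.28026781e-01,
     1.16387975e-01, 1.05807263e-01, 9.61884191e-02, 8.74440163e-02]
# Reward rate i in state i, so the expected reward is the mean queue length.
mean_queue_len = sum(i * pi for i, pi in enumerate(p))
```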

Hence, the mean response time for normal jobs is:
$$E[R] = \frac{E[\text{QueueLength}]}{\text{ArrivalRate}} = \frac{3.00447853}{1} = 3.00447853 \text{ minutes}.$$
This mean response time agrees well with Wang's results [28].

4.7 Application 7: Software Rejuvenation

A gradual performance degradation is often observed during the execution of a software application, until eventually the system hangs or crashes. This phenomenon is called software aging. For a system with high availability requirements, such as a banking system, this is not acceptable. Software rejuvenation, one of the preventive techniques, can be used to minimize the system unavailability. In [7], Garg et al. presented a quantitative analysis of software rejuvenation using MRSPNs and MRGPs. Figure 4.18 shows the MRGP model of the system. We assume that the rejuvenation is performed periodically with a fixed time interval and that all other transitions are exponentially distributed. Initially the software is in a robust state, denoted by 1; this implies that the failure probability of the system is zero. As it runs, it may age at rate $\lambda_1$ and enter state 2, where the system can fail with rate $\lambda_2$. If the system goes down from state 2, which is an unexpected failure, the repair rate is $\lambda_4$.

When the system is in the robust state (state 1), the rejuvenation can start and the system is brought down for restart; this down state is denoted by 3. The restart rate from state 3 is λ3. Similarly, the rejuvenation can be performed when the system is in state 2, by entering state 5. The restart rate from state 5 is λ5.


Figure 4.18: MRGP Model of the Software Rejuvenation Analysis

The local kernel E(t) is given by:

E11(t) = e^(-λ1 t),  0 < t < δ
E12(t) = λ1 (e^(-λ2 t) - e^(-λ1 t)) / (λ1 - λ2),  0 < t < δ
E33(t) = e^(-λ3 t)
E44(t) = e^(-λ4 t)
E55(t) = e^(-λ5 t)

and all other Eij(t) = 0, where δ denotes the fixed rejuvenation interval. The one-step transition probabilities of the EMC are:

P13 = e^(-λ1 δ)
P14 = 1 + λ2 e^(-λ1 δ) / (λ1 - λ2) - λ1 e^(-λ2 δ) / (λ1 - λ2)
P15 = λ1 (e^(-λ2 δ) - e^(-λ1 δ)) / (λ1 - λ2)
P24 = 1 - e^(-λ2 δ)
P25 = e^(-λ2 δ)

and all other Pij = 0. The parameters we use are:

λ1 = 1/240 hour⁻¹
λ2 = 1/2160 hour⁻¹
λ3 = 6 hour⁻¹
λ4 = 2 hour⁻¹
λ5 = 6 hour⁻¹
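These closed-form expressions can be checked numerically. The sketch below is a verification aid, not part of SHARPE; it assumes the rejuvenation interval δ = 336 hours used in the specification of Figure 4.19. It integrates the forward equations of the subordinated CTMC on states {1, 2} with a simple Euler scheme, confirms the kernel entries at t = δ, and confirms that each row of the EMC sums to one:

```python
import math

lam1, lam2 = 1 / 240, 1 / 2160   # aging and failure rates (per hour)
delta = 336.0                    # assumed rejuvenation interval (hours)

# Closed-form local kernel entries at t = delta
E11 = math.exp(-lam1 * delta)
E12 = lam1 * (math.exp(-lam2 * delta) - math.exp(-lam1 * delta)) / (lam1 - lam2)

# Euler integration of p1' = -lam1*p1, p2' = lam1*p1 - lam2*p2
p1, p2, h = 1.0, 0.0, 0.001
for _ in range(int(delta / h)):
    p1, p2 = p1 + h * (-lam1 * p1), p2 + h * (lam1 * p1 - lam2 * p2)
assert abs(p1 - E11) < 1e-4 and abs(p2 - E12) < 1e-4

# Each row of the EMC transition probability matrix must sum to one
P13 = math.exp(-lam1 * delta)
P14 = 1 + lam2 * math.exp(-lam1 * delta) / (lam1 - lam2) \
        - lam1 * math.exp(-lam2 * delta) / (lam1 - lam2)
P15 = lam1 * (math.exp(-lam2 * delta) - math.exp(-lam1 * delta)) / (lam1 - lam2)
assert abs(P13 + P14 + P15 - 1.0) < 1e-12

P24 = 1 - math.exp(-lam2 * delta)
P25 = math.exp(-lam2 * delta)
assert abs(P24 + P25 - 1.0) < 1e-12
```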

Clearly, the system is down if it is in state 3, 4 or 5. Hence, in order to determine the steady-state unavailability, we assign a reward rate of 1 to states 3, 4 and 5, and 0 to all the other states. The SHARPE MRGP model specification code is listed in Figure 4.19. Note that a scalar S is used to lower the variance of the Erlang distribution. The closed-form solution of the system unavailability is 5.777358 × 10⁻⁴ and the SHARPE output is shown below:
* Steady State Probabilities:
-------------------------------------------
prob(garg, 1): 5.49044657e-01
prob(garg, 2): 4.50376424e-01
prob(garg, 3): 1.28135581e-04

format 8
bind
S 10000
n 50
D 336/S
uD n/D
lambda1 S/240
lambda2 S/2160
lambda3 6*S
lambda4 2*S
lambda5 6*S
end

mrgp garg
1 - 2 exp(lambda1)
1 @ 3 Erlang(n,uD)
2 @ 4 exp(lambda2)
4 @ 1 exp(lambda4)
2 @ 5 Erlang(n,uD)
3 @ 1 exp(lambda3)
5 @ 1 exp(lambda5)
reward
3 1
4 1
5 1
end

echo Steady State Probabilities:
expr prob(garg, 1), prob(garg, 2)
expr prob(garg, 3), prob(garg, 4)
expr prob(garg, 5)

echo Steady State Unavailability:
expr exrss(garg)
end

Figure 4.19: MRGP Code for Software Rejuvenation


prob(garg, 4): 1.04253802e-04
prob(garg, 5): 3.46529744e-04

* Steady State Unavailability:
-------------------------------------------
exrss(garg): 5.78919127e-04

As we can see, the closed-form and numerical results match fairly well.
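The quality of this match rests on how closely the n-stage Erlang distribution in Figure 4.19 tracks the deterministic rejuvenation interval: an Erlang(n, n/D) random variable has mean D and variance D²/n, so its coefficient of variation 1/√n shrinks as the number of stages grows. A short stand-alone check of these moments (values as in the figure, before scaling by S):

```python
import math

D, n = 336.0, 50     # deterministic interval (hours) and Erlang stages
rate = n / D         # per-stage rate, as in Erlang(n, n/D)

mean = n / rate      # Erlang mean: n / rate
var = n / rate ** 2  # Erlang variance: n / rate^2

assert abs(mean - D) < 1e-9                # mean matches D
assert abs(var - D * D / n) < 1e-9         # variance is D^2 / n
# coefficient of variation 1/sqrt(n) -> 0 as n grows,
# i.e. the Erlang approaches the deterministic distribution
assert abs(math.sqrt(var) / mean - 1 / math.sqrt(n)) < 1e-12
```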


Chapter 5 Conclusions and Future Work


5.1 Conclusions

In this thesis we studied Markov regenerative processes (MRGPs) and Markov regenerative stochastic Petri nets (MRSPNs). We first introduced the background knowledge of stochastic processes and stochastic Petri nets (SPNs) in Chapter 2, and pointed out the limitations of traditional Markov and SPN models to motivate the development of MRGP and MRSPN models. In Chapter 3 we presented a brief overview of Markov renewal/regenerative theory and concentrated on one type of Markov regenerative process, in which at most one generally distributed transition is enabled in each state. The subordinated stochastic process of this type of MRGP is a continuous-time Markov chain, which is readily solved by the existing routines in the SHARPE (Symbolic Hierarchical Automated Reliability and Performance Evaluator) package. To obtain the steady-state solution of an MRGP model we need to calculate the local kernel E(t) and the one-step transition probability matrix P of the embedded Markov chain (EMC). Several important theorems about the computation of E(t) and P were stated. In order to introduce the MRGP solver kernel, we outlined the structure of SHARPE and described the important modules in implementing the MRGP steady-state solver. Chapter 4 was dedicated to MRGP applications. We demonstrated the necessity of MRGP models and the usage of our solver with the following examples:

M/G/1 Breakdown Queue. This is the example used in Chapter 3 to illustrate MRSPN and MRGP. In spite

of its simplicity, a number of real system models fall into the M/G/1 breakdown queue class, such as an ATM switch processing fixed-length ATM cells.

Detection and Restoration Times in Communication Network Error Recovery. In this example, we studied a 1:N protection switching system with deterministically distributed detection and restoration times. Closed-form steady-state loss rates and SHARPE numerical results were compared.

Parallel System with a Single Repairman. A parallel system consists of two identical components, and a single repairman is responsible for maintaining it. The repair time is deterministically distributed, while the failure times of the components are exponentially distributed with constant rates. The MRSPN model and the underlying MRGP were shown, and the closed-form availability was compared with the SHARPE output.

Warm Spare with a Single Repairman. In this example we consider a system with a warm spare, where the failure rate of the spare unit is smaller than that of the active unit. Again there is only one repairperson and the repair time is deterministic. The system availability was studied, and a comparison of the theoretical closed-form result and the SHARPE solution was given.

Vacation Queue. A server is shared by three clients, each of which issues jobs at a constant rate. The service time of a single job is fixed. Every time the server completes a job, it takes a vacation for a definite amount of time. MRSPN and MRGP models were given and solved.

Bulk Service System.

In this example, all the jobs in the queue are processed when the given threshold is reached, the timer fires, or a high-priority job arrives. The firing times of the clock are deterministically distributed. The MRSPN and MRGP models were studied, and the SHARPE solutions were compared with closed-form ones.

Software Rejuvenation. To enhance the system availability, software rejuvenation is performed periodically with a fixed time interval. The system may age at a constant rate and can then fail at another constant rate. System failures and rejuvenation both bring the system down for restart. The restart rates do not change with time. The MRSPN and MRGP models were described and solved. The closed-form solutions matched the SHARPE numerical results well.

5.2 Future Work

Although we have successfully developed an MRGP steady-state solver, integrated it into the SHARPE software package and applied it to several problems, many issues remain to be addressed:

1. Deterministic Distribution Implementation. The deterministic distribution is a very important and frequently encountered general distribution. As described earlier, SHARPE supports built-in exponomial operations, and the Erlang distribution is suggested as an approximation to the deterministic distribution. Although the Erlang approximation is good enough in many cases, for the analysis of critical systems the accuracy is not always satisfactory, or the Erlang exponomials will have numerous terms, so that processing takes a large amount of memory and a significantly long time. It would be very helpful and interesting to study algorithms for implementing the deterministic distribution directly in SHARPE.
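To illustrate the accuracy concern, the sketch below (stdlib-only Python, with parameter values borrowed from the rejuvenation example in Chapter 4) evaluates the CDF of the Erlang(n, n/D) approximation and shows that even with n = 50 stages it spreads noticeable probability mass over roughly ±20% around the deterministic value D, where the exact CDF would jump from 0 to 1:

```python
import math

def erlang_cdf(t, n, rate):
    """P(Erlang(n, rate) <= t) = 1 - sum_{k=0}^{n-1} e^(-rt) (rt)^k / k!"""
    x = rate * t
    term, acc = math.exp(-x), 0.0
    for k in range(n):
        acc += term
        term *= x / (k + 1)   # next Poisson term, computed iteratively
    return 1.0 - acc

D, n = 336.0, 50          # deterministic value and number of stages
rate = n / D              # stage rate chosen so that the mean equals D

# A deterministic distribution has CDF 0 below D and 1 above D;
# the 50-stage Erlang still leaves visible mass outside [0.8D, 1.2D].
assert erlang_cdf(0.8 * D, n, rate) < 0.15
assert 0.4 < erlang_cdf(D, n, rate) < 0.6
assert erlang_cdf(1.2 * D, n, rate) > 0.85
```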

2. Transient Analysis. The steady-state solution is not sufficient to characterize a system's behavior in many cases. For example, in reliability analysis the system's underlying MRGP is not irreducible, and hence steady-state analysis is not very useful; the cumulative distribution function (CDF) is needed to evaluate the system's reliability. Although transient analysis has been studied and formalized in the theorems stated in [4, 9], algorithm design and implementation in SHARPE are not straightforward.

3. MRSPN. Users are required to specify the MRGP in a SHARPE input file, including all the MRGP states, Markov regenerative transitions and concurrent transitions. When the system becomes large, the number of states increases dramatically, and it would be extremely hard, if not impossible, for the user to hand-derive the MRGP from an MRSPN or some other higher-level specification. This problem gave rise to the Stochastic Petri Net Package (SPNP), a software tool to construct Petri nets, generate the reachability graph and find the solution. We expect to integrate an MRSPN solver into SPNP and make the generation of the reachability graph automatic.

4. General MRSPN. The current MRSPN support incorporated in SHARPE has several structural restrictions. For example, at most one general distribution can be enabled in each state, so that the process subordinated to the underlying MRGP is a CTMC, which is solvable. In some applications this requirement cannot be satisfied. It would be interesting and helpful to develop the theory of more general MRSPNs and MRGPs to handle these problems.


Bibliography
[1] A. Bobbio, V. G. Kulkarni, A. Puliafito, M. Telek, and K. Trivedi. Preemptive repeat identical transitions in Markov regenerative stochastic Petri nets. In Proc. of Petri Nets and Performance Models (PNPM'95), pages 113-123, Durham, NC, 1995.

[2] G. Bolch, S. Greiner, H. de Meer, and K. Trivedi. Queueing Networks and Markov Chains, chapter 13. John Wiley, New York, U.S.A., 1998.

[3] H. Choi, V. G. Kulkarni, and K. S. Trivedi. Transient analysis of deterministic and stochastic Petri nets. In Proceedings of the 14th International Conference on Application and Theory of Petri Nets, Chicago, U.S.A., Jun. 21-25, 1993.

[4] H. Choi, V. G. Kulkarni, and K. S. Trivedi. Markov regenerative stochastic Petri nets. Performance Evaluation, 20:335-357, 1994.

[5] Joanne Bechta Dugan, K. S. Trivedi, R. M. Geist, and V. F. Nicola. Extended stochastic Petri nets: Applications and analysis. In E. Gelenbe, editor, Performance '84, pages 507-519, Amsterdam, 1984. Elsevier Science Publishers B. V. (North-Holland).

[6] R. Fricks, M. Telek, A. Puliafito, and K. Trivedi. Markov renewal theory applied to performability evaluation. In K. Bagchi and G. Zobrist, editors, State-of-the-Art in Performance Modeling and Simulation. Modeling and Simulation of Advanced Computer Systems: Applications and Systems, pages 193-236. Gordon and Breach Publishers, Newark, NJ, 1998.

[7] S. Garg, A. Puliafito, M. Telek, and K. S. Trivedi. Analysis of software rejuvenation using Markov regenerative stochastic Petri nets. In Sixth Intl. Symposium on Software Reliability Engineering, pages 180-187, Toulouse, France, 1995.

[8] Alan George and Joseph W. Liu. Computer Solution of Large Sparse Positive Definite Systems. Prentice-Hall, New Jersey, 1981.

[9] R. German, D. Logothetis, and K. S. Trivedi. Transient analysis of Markov regenerative stochastic Petri nets: A comparison of approaches. In Proc. 6th IEEE Workshop on Petri Nets and Performance Models (PNPM'95), pages 103-112, Durham, North Carolina, U.S.A., 1995.

[10] David Kincaid and Ward Cheney. Numerical Analysis. Brooks/Cole Publishing Company, 1996.

[11] Leonard Kleinrock. Queueing Systems, volume 1. John Wiley and Sons, 1975.

[12] V. Kulkarni. Modeling and Analysis of Stochastic Systems. Chapman-Hall, 1995.

[13] C. Lindemann. Performance Modelling with Deterministic and Stochastic Petri Nets. John Wiley and Sons, 1998.

[14] D. Logothetis and K. Trivedi. Time-dependent behavior of redundant systems with deterministic repair. In W. J. Stewart, editor, 2nd International Workshop on the Numerical Solution of Markov Chains, pages 135-150. Kluwer Academic Publishers, 1995.

[15] D. Logothetis and Kishor Trivedi. The effect of detection and restoration times for error recovery in communication networks. Journal of Network and Systems Management, 5(2):173-195, June 1997.

[16] Dimitris Logothetis. Transient Analysis of Communication Networks. PhD thesis, Duke University, 1994.

[17] Varsha Arvind Mainkar. Solutions of Large and Non-Markovian Performance Models. PhD thesis, Duke University, 1994.

[18] M. A. Marsan and G. Chiola. On Petri nets with deterministic and exponentially distributed firing times, volume 266, pages 132-145. Springer-Verlag, 1987.

[19] M. Ajmone Marsan, G. Balbo, and G. Conte. A class of generalized stochastic Petri nets for the performance evaluation of multiprocessor systems. ACM Transactions on Computer Systems, pages 93-122, May 1984.

[20] Michael K. Molloy. Performance analysis using stochastic Petri nets. IEEE Transactions on Computers, C-31(9):913-917, September 1982.

[21] C. Pozrikidis. Numerical Computation in Science and Engineering. Oxford University Press, 1998.

[22] A. Puliafito, M. Scarpa, and K. Trivedi. Petri nets with k simultaneously enabled generally distributed timed transitions. Performance Evaluation, 32(1):1-34, February 1998.

[23] A. V. Ramesh and K. S. Trivedi. Semi-numerical transient analysis of Markov models. In Proc. 33rd ACM Southeast Conference, Clemson, South Carolina, U.S.A., March 1995.

[24] Yousef Saad. Iterative Methods for Sparse Linear Systems. PWS Publishing Company, New York, 1996.

[25] R. A. Sahner, K. S. Trivedi, and A. Puliafito. Performance and Reliability Analysis of Computer Systems: An Example-Based Approach Using the SHARPE Software Package. Kluwer Academic Publishers, 1995.

[26] M. Telek, A. Bobbio, L. Jereb, and K. Trivedi. Steady state analysis of Markov regenerative SPN with age memory policy. In Proc. of the International Conference on Performance Tools and MMB '95, pages 165-179, Heidelberg, Germany, 1995.

[27] K. S. Trivedi. Probability and Statistics with Reliability, Queueing, and Computer Science Applications. Prentice-Hall, Englewood Cliffs, New Jersey, U.S.A., 1982.

[28] Chang-Yu Wang. Non-Markovian Models for the Analysis of Computers and Networks. PhD thesis, Duke University, 1995.