Distributed Deadlock

Distributed Deadlock Detection
17 April 2012
References:
Mukesh Singhal, Niranjan Shivaratri. Advanced concepts in
operating systems. Tata McGraw Hill publication.
17 April 2012
Introduction
Deadlock: a situation where a process is blocked, waiting on an event that will
never occur.
In Distributed Systems, a process can request and release resources in any order If the sequence of allocation is not controlled, deadlocks can occur Assumptions:
System has only reusable resources Only exclusive access to resources Only one copy of each resource
States of a process: running or blocked

Running state: process has all the resources Blocked state: waiting on one or more resource
3 17 April 2012
Introduction
Deadlock can arise if four conditions hold simultaneously. Mutual exclusion: only one process at a time can use a resource. Hold and wait: a process holding resource(s) is waiting to acquire additional
resources held by other processes.

No preemption: a resource can be released only voluntarily by the process
holding it upon its task completion.

Circular wait: there exists a set {P0, P1, , Pn} of waiting processes such
that P0 is waiting for a resource that is held by P1, P1 is waiting for a resource that is held by P2, , Pn1 is waiting for a resource that is held by Pn, and Pn is waiting for a resource that is held by P0.
17 April 2012
Types of Deadlocks
Resource Deadlocks
A process needs multiple resources for an activity. Deadlock occurs if each process in a set request resources held by another
process in the same set, and it must receive all the requested resources to move further.
Communication Deadlocks
Processes wait to communicate with other processes in a set. Each process in the set is waiting on another processs message, and no
process in the set initiates a message until it receives a message for which it is waiting.
17 April 2012
Graph Models
Resource allocation graph To represent the state of process-resource interaction in DS Nodes of a graph processes and resources. Edges of a graph pending requests or assignment of resources. A pending request is represented by a request edge directed from the node of a
requesting process to the node of the requested resource

A resource assignment is represented by an assignment edge directed from the
node of an assigned resource to the node of the assigned process.
17 April 2012
Graph Models
System is deadlocked if its resource allocation graph contains a directed cycle or
a knot
A knot in a directed graph is a collection of vertices and edges with the
property that every vertex in the knot has outgoing edges, and all outgoing edges from vertices in the knot terminate at other vertices in the knot. Thus it is impossible to leave the knot while following the directions of the edges.
17 April 2012
Resource Allocation Graph

Process
Resource Type with 4 instances
Pi requests instance of Rj
Pi Request edge
Rj
The sequence of Processs resource utilization
Pi is holding an instance of Rj
Pi
Assignment edge
Pi releases an instance of Rj
8
Pi
17 April 2012
Resource-allocation graph
Can a deadlock happen?

9 17 April 2012
Resource-allocation graph
Can a deadlock happen? There are two cycles found.
10
17 April 2012
Resource Allocation Graph With A Cycle But No Deadlock

If graph contains no cycles no
deadlock.
If graph contains a cycle if only one instance per
resource type, then deadlock. if several instances per resource type, possibility of deadlock.
11
17 April 2012
Graph Models
In DS, the system state can be modeled by a directed graph called a wait-for-
graph(WFG).
Wait-for Graphs (WFG): P1 -> P2 implies P1 is waiting for a resource from P2. WFG in databases Transaction-wait-for Graphs (TWF) Nodes are transactions and T1T2 indicates that T1 is blocked and
waiting for T2 to release some resource
12
17 April 2012
Graph Models
Wait-for Graphs (WFG): P1 -> P2 implies P1 is waiting for a resource from P2.
P1
R1
R2
P2
13
17 April 2012
AND, OR Models
AND Model A process/transaction can simultaneously request for multiple resources. Remains blocked until it is granted all of the requested resources. A cycle is sufficient to declare a deadlock with this model OR Model A process/transaction can simultaneously request for multiple resources. Remains blocked till any one of the requested resource is granted. A cycle is a necessary condition A knot is a sufficient condition
14
17 April 2012
AND, OR Models
AND Model Presence of a cycle.
P1 P2 P1
P1
P1
P1
15
17 April 2012
AND, OR Models
OR Model Presence of a knot.
P1 P2 P5
P4
P3
P6
16
17 April 2012
Deadlock Handling Strategies

There are three strategies for handling deadlocks, viz., deadlock prevention,
deadlock avoidance, and deadlock detection.

1.
Deadlock Prevention:
Resources are granted to requesting process in such a way that granting a
request for a resource never leads a deadlock

Can be achieved by either having process acquire all the needed resources
simultaneously before it begins execution, or by preempting a process that holds the needed resource
17
17 April 2012

Many drawbacks Decreases the system concurrency Set of processes can become deadlocked in resource acquiring phase In many systems, future resource requirements are unpredictable
Site S1
P1
Site S2
P2
Site S3
R3
Site S4
R4
18
17 April 2012

2.
Deadlock Avoidance:
Before allocation, check for possible deadlocks. A resource is granted to a process if the resulting global system state is safe. A state is safe if there exists at least one sequence of execution for all
processes such that all of them can run to completion

Difficult in DS because,
it needs global state info in each site requiring huge storage and communication
requirements The process of checking for a safe global state must be mutually exclusive Due to the large number of processes and resources, it will be computationally expensive to check for a safe state
19
17 April 2012

Deadlock Detection:
Resources are granted to requesting processes without any check Periodically the status of resource allocation and pending requests is
examined Find cycles in WFG

Two favorable conditions:
Once a cycle is formed in the WFG, it persists until it is detected and broken Cycle detection can proceed concurrently with the normal activities of a system
20
17 April 2012
Issues in Deadlock detection and resolution

Two basic issues : Detection of existing deadlocks Resolution of detected deadlocks Detection of deadlocks involves: Maintenance of the WFG Search of the WFG for the presence of cycles(or knots) Correct deadlock detection algorithm must satisfy the following two conditions: 1. ProgressNo undetected deadlocks Algorithm must detect all existing deadlocks in finite time
2.
SafetyNo false deadlocks

Algorithm should not report deadlocks which are non-existent
21
17 April 2012
Issues in Deadlock detection and resolution

Resolution
Breaking existing dependencies in the system WFG Roll back one or more processes and assign their resources to other
deadlocked processes
22
17 April 2012
Control organizations for Distributed Deadlocks

1.
Centralized Control
A control site constructs global wait-for graphs (WFGs) and checks for
directed cycles.
WFG can be maintained continuously (or) built on-demand by requesting
WFGs from individual sites.

Simple and easy to implement Have a single point of failure
Communication links near the control sites are likely to be congested
23
17 April 2012

2.
Distributed Control
Responsibility for detecting a global deadlock is shared equally among all
sites WFG is spread over different sites Deadlock detection is initiated only when the waiting process is suspected to be a part of deadlock cycle Any site can initiate the deadlock detection process
Not vulnerable to single point failure Difficult to design due to the lack of globally shared memory
Several sites may initiate detection for the same deadlock
24
17 April 2012

3.
Hierarchical Control
Sites are arranged in a hierarchy A site checks for cycles involving only its descendent sites Exploits access patterns local to a cluster of sites to efficiently detect
deadlocks Tend to get best of both the centralized and distributed control
25
17 April 2012
Centralized Algorithms
The completely centralized algorithm
Control site maintains the WFG of the entire system and checks it for the
existence of the deadlock Request resource and release resource messages sent by each site to the control site Simple and easy to implement
Highly inefficient Large delays in user response, large communication overhead, congestion of
communication links near the control site Poor reliability
26
17 April 2012
Solution:
Can be mitigated partially by having each site maintain its WFG locally and
transmitting it periodically Due to inherent communication delays and the lack of synchronized clocks, the designated site may get inconsistent view
Example:
R1 and R2 are stored at sites S1 and S2 Transactions T1 and T2 are started at sites S3 and S4 respectively:
27
17 April 2012
T1 lock R1 unlock R1 lock R2 unlock R2 Step1: R1<- T1 report status : T2->T1 Step 2: unlock R1 : R1<- T2 Step 3: R2<-T2 report Status : T1->T2 T2 lock R1 unlock R1 lock R2 unlock R2
T2 R1
T1
28
17 April 2012
Ho-Ramamoorthy 2-phase Algorithm
Each site maintains a status table of all processes initiated at that site: includes
all resources locked & all resources being waited on.

Controller requests (periodically) the status table from each site. Controller then constructs WFG from these tables, searches for cycle(s). If no cycles, no deadlocks.
29
17 April 2012
Otherwise, (cycle exists): Request for state tables again.
Construct WFG based only on common transactions in the 2 tables.

If the same cycle is detected again, system is in deadlock. Later proved: cycles in 2 consecutive reports need not result in a deadlock.
Hence, this algorithm detects false deadlocks
30
17 April 2012
Ho-Ramamoorthy 1-phase Algorithm
Each site maintains 2 status tables: resource status table and process status
table.
Resource table: transactions that have locked or are waiting for resources. Process table: resources locked by or waited on by transactions. Controller periodically collects these tables from each site. Constructs a WFG from transactions common to both the tables. No cycle, no deadlocks. A cycle means a deadlock.
31 17 April 2012
Does not detect false deadlocks:
E.g. Resource table at S1: R1 is waited upon by P2 (R1 P2) Process table at S2: P2 is waiting for R1 (P2R1) then, the edge P2R1 reflects current system state.
32
17 April 2012
Distributed Algorithms
All sites collectively cooperate to detect a cycle in the state graph that is likely to
be distributed over several sites of the system

Initiated when,
A process is forced to wait and, it can be initiated either by the local site of the process or by the site where the process
waits
Distributed Computing@M.Tech I SVNIT
4/17/2012
Classification of Distributed Algorithms

Path-pushing: resource dependency information disseminated through
designated paths (in the graph).

Edge-chasing: special messages or probes circulated along edges of WFG.
Deadlock exists if the probe is received back by the initiator.

Diffusion computation: queries on status sent to process in WFG.
Global state detection: get a snapshot of the distributed system.
4/17/2012
Edge-Chasing Algorithm
Chandy-Misra-Haass Algorithm:
A probe(i, j, k) is used by a deadlock detection process Pi. This probe is sent
by the home site of Pj to Pk. This probe message is circulated via the edges of the graph. Probe returning to Pi implies deadlock detection. Terms used:
Pj is dependent on Pk, if a sequence of Pj, Pi1,.., Pim, Pk exists. Pj is locally dependent on Pk, if above condition + Pj,Pk on same site. Each process maintains an array dependenti: dependenti(j) is true if Pi knows that Pj
is dependent on it. (initially set to false for all i & j).
4/17/2012
Chandy-Misra-Haass Algorithm
Sending the probe: if Pi is locally dependent on itself then deadlock. else for all Pj and Pk such that (a) Pi is locally dependent upon Pj, and (b) Pj is waiting on Pk, and (c ) Pj and Pk are on different sites, send probe(i,j,k) to the home site of Pk. Receiving the probe: if (d) Pk is blocked, and (e) dependentk(i) is false, and (f) Pk has not replied to all requests of Pj, then begin dependentk(i) := true; if k = i then Pi is deadlocked else ...
Distributed Computing@M.Tech I SVNIT 4/17/2012
Chandy-Misra-Haass Algorithm
Receiving the probe: . else for all Pm and Pn such that (a) Pk is locally dependent upon Pm, and (b) Pm is waiting on Pn, and (c) Pm and Pn are on different sites, send probe(i,m,n) to the home site of Pn. end.
4/17/2012
Example
P2
P1
P3
probe(1, 9, 1)
Site S1
probe(1, 3, 4)
P9 P10
P8
P4 probe(1, 6, 8) probe(1, 7, 10) P6 P7 Site S2 P5
Site S3
Classification of Distributed Algorithms

Path-pushing: resource dependency information disseminated through
designated paths (in the graph).

Edge-chasing: special messages or probes circulated along edges of WFG.
Deadlock exists if the probe is received back by the initiator.

Diffusion computation: queries on status sent to process in WFG.
Global state detection: get a snapshot of the distributed system.
4/17/2012
A Diffusion Computation Based Algorithm

Deadlock Detection computation is diffused through WFG of the system
Messages takes two forms Query message (i,j,k)

i = initiator of check j = immediate sender k = immediate recipient
Reply message (i,j,k)

Process can be in two states: Blocked state Active state Initiation of diffusion computation taken by blocked process
4/17/2012

Dependent set of process:
Set of processes from whom the process is waiting to receive message
Engaging query:
The first query message received for the deadlock detection initiated by Pi.
waitk(i):
indicates that it has been continuously blocked since it received the last engaging
query from process Pi
4/17/2012

Deadlock detection is initiated by a blocked process by sending query message
to all processes in its dependent set

For Blocked process, on receiving the query message,
If it is engaging query, it sends it to all processes in its dependent set, If not, then it sends reply message to the sender of query message Process should be continuously in the blocked stage since it has received engaging
query
A blocked process sends reply message in response to engaging query only after
it has received a reply to every query message it sent out for this engaging query.
4/17/2012
Diffusion-based Algorithm
Initiation by a blocked process Pi: send query(i,i,j) to all processes Pj in the dependent set DSi of Pi; num(i) := |DSi|; waiti(i) := true; Blocked process Pk receiving query(i,j,k): if this is engaging query for process Pk /* first query from Pi */ then send query(i,k,m) to all Pm in DSk; numk(i) := |DSk|; waitk(i) := true; else if waitk(i) then send a reply(i,k,j) to Pj.
Distributed Computing@M.Tech I SVNIT 4/17/2012
Diffusion-based Algorithm
Process Pk receiving reply(i,j,k)
if waitk(i) then numk(i) := numk(i) - 1; if numk(i) = 0 then if i = k then declare a deadlock. else send reply(i, k, m) to Pm, which sent the engaging query.
4/17/2012
On receipt of query(i,j,k) by m
if not blocked then discard the query if blocked if this is an engaging query propagate query(i,j,k) else
if not continuously blocked since engagement then discard the query else send reply(i,k,j) to j
The black dashed arrows indicate the engaging process for each engaged process. This information is needed to route the reply when the number of other processes for which the engaged process is waiting goes to zero.
Num(A)=2 Num(A)=2 Q(A,C,D)
D
Q(A,C,E)
A
Num(A)=1
F
Q(A,F,A) Num(A)=1
On receipt of reply(i,j,k) by k
if this is not the last reply
then just decrement the awaited reply count if this is the last reply then
if i=k report a deadlock else send reply(i,k,m)
to the engaging process m
Knot detected !
Hierarchical deadlock detection algorithms

Sites are arranged in hierarchy A site is responsible for detection deadlocks involving only its children sites Takes advantage of access patterns that are localized to a cluster of sites
51
4/17/2012
Hierarchical Deadlock Detection

Follows Ho-Ramamoorthys 1-phase algorithm. More than 1 control site organized in hierarchical manner. Each control site applies 1-phase algorithm to detect (intracluster) deadlocks. Central site collects info from control sites, applies 1-phase algorithm to detect intercluster deadlocks.
Control site Central Site Control site

Control site
4/17/2012
Hierarchical Deadlock Detection

Master Control Node Level 1 Control Node
Level 2 Control Node

Level 3 Control Node
Persistence & Resolution

Deadlock persistence:
Average time a deadlock exists before it is resolved.
Implication of persistence:
Resources unavailable for this period: affects utilization Processes wait for this period unproductively: affects response time.
Deadlock resolution:
Aborting at least one process/request involved in the deadlock. Efficient resolution of deadlock requires knowledge of all processes and resources. If every process detects a deadlock and tries to resolve it independently -> highly
inefficient ! Several processes might be aborted.
4/17/2012
Deadlock Resolution
Priorities for processes/transactions can be useful for resolution.
Highest priority process initiates and detects deadlock (initiations by lower priority
ones are suppressed). When deadlock is detected, lowest priority process(es) can be aborted to resolve the deadlock.
After identifying the processes/requests to be aborted,
All resources held by the victims must be released. State of released resources restored
to previous states. Released resources granted to deadlocked processes. All deadlock detection information concerning the victims must be removed at all the sites.
4/17/2012

Distributed Deadlock

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Distributed Deadlock

Uploaded by

Copyright:

Available Formats

Distributed Deadlock Detection

operating systems. Tata McGraw Hill publication.

States of a process: running or blocked

resources held by other processes.

holding it upon its task completion.

requesting process to the node of the requested resource

node of an assigned resource to the node of the assigned process.

Resource Allocation Graph

Resource Type with 4 instances

The sequence of Processs resource utilization

Can a deadlock happen?

Can a deadlock happen? There are two cycles found.

Resource Allocation Graph With A Cycle But No Deadlock

waiting for T2 to release some resource

Deadlock Handling Strategies

deadlock avoidance, and deadlock detection.

request for a resource never leads a deadlock

Deadlock Handling Strategies

Deadlock Handling Strategies

processes such that all of them can run to completion

Deadlock Handling Strategies

examined Find cycles in WFG

Issues in Deadlock detection and resolution

SafetyNo false deadlocks

Issues in Deadlock detection and resolution

Control organizations for Distributed Deadlocks

WFGs from individual sites.

Control organizations for Distributed Deadlocks

Several sites may initiate detection for the same deadlock

Control organizations for Distributed Deadlocks

communication links near the control site Poor reliability

all resources locked & all resources being waited on.

Construct WFG based only on common transactions in the 2 tables.

Hence, this algorithm detects false deadlocks

be distributed over several sites of the system

Distributed Computing@M.Tech I SVNIT

Classification of Distributed Algorithms

designated paths (in the graph).

Deadlock exists if the probe is received back by the initiator.

Distributed Computing@M.Tech I SVNIT

A probe(i, j, k) is used by a deadlock detection process Pi. This probe is sent

is dependent on it. (initially set to false for all i & j).

Distributed Computing@M.Tech I SVNIT

Distributed Computing@M.Tech I SVNIT

P4 probe(1, 6, 8) probe(1, 7, 10) P6 P7 Site S2 P5

Classification of Distributed Algorithms

designated paths (in the graph).

Deadlock exists if the probe is received back by the initiator.

Distributed Computing@M.Tech I SVNIT

A Diffusion Computation Based Algorithm

Messages takes two forms Query message (i,j,k)

Reply message (i,j,k)

Distributed Computing@M.Tech I SVNIT

A Diffusion Computation Based Algorithm

query from process Pi

Distributed Computing@M.Tech I SVNIT

A Diffusion Computation Based Algorithm

to all processes in its dependent set

Distributed Computing@M.Tech I SVNIT

Distributed Computing@M.Tech I SVNIT

to the engaging process m

Hierarchical deadlock detection algorithms

Distributed Computing@M.Tech I SVNIT