[Figure: Node1/Instance1 through Noden/Instancen joined by the cluster interconnect. Each instance has a cache, a GRD master with GES and GCS, and the LMON, LMD0, LMSx, LCK0, and DIAG processes. Global resources are coordinated by Global Enqueue Services (GES) and Global Cache Services (GCS) through the Global Resource Directory (GRD).]
Oracle University and Mazz Soluciones SRL use only. THESE eKIT MATERIALS ARE FOR YOUR USE IN THIS CLASSROOM ONLY. COPYING eKIT MATERIALS FROM THIS COMPUTER IS STRICTLY PROHIBITED
Oracle Grid Infrastructure 11g: Manage Clusterware and ASM D - 12
The scenario described in the slide assumes that the data block has been changed, or dirtied,
by the first instance. Furthermore, only one copy of the block exists clusterwide, and the
content of the block is represented by its SCN.
1. The second instance attempting to modify the block submits a request to the GCS.
2. The GCS transmits the request to the holder. In this case, the first instance is the holder.
3. The first instance receives the message and sends the block to the second instance.
The first instance retains the dirty buffer for recovery purposes. This dirty image of the
block is also called a past image of the block. A past image block cannot be modified
further.
4. On receipt of the block, the second instance informs the GCS that it holds the block.
Note: The data block is not written to disk before the resource is granted to the second
instance.
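The four steps above can be sketched as a small simulation. This is only an illustrative model, not Oracle code: the class and function names (Buffer, Instance, GlobalCacheService, modify) are hypothetical, and the SCN values are borrowed from the slide purely as sample data.

```python
from dataclasses import dataclass, field


@dataclass
class Buffer:
    scn: int    # content version of the block
    role: str   # "current" or "past_image"


@dataclass
class Instance:
    name: str
    cache: dict = field(default_factory=dict)  # block id -> Buffer


class GlobalCacheService:
    """Toy stand-in for the GCS (hypothetical): tracks the current holder."""

    def __init__(self):
        self.holder = {}      # block id -> Instance holding the current version
        self.disk_writes = 0  # never incremented here: the transfer needs no disk I/O

    def request_for_modify(self, block_id, requester):
        # Steps 1-2: the requester asks the GCS, which forwards to the holder.
        holder = self.holder[block_id]
        # Step 3: the holder ships the block and keeps its dirty buffer
        # as a past image for recovery purposes.
        shipped = Buffer(scn=holder.cache[block_id].scn, role="current")
        holder.cache[block_id].role = "past_image"
        requester.cache[block_id] = shipped
        # Step 4: the requester informs the GCS that it now holds the block.
        self.holder[block_id] = requester
        return shipped


def modify(instance, block_id, new_scn):
    """Change a block; only the current version may be modified."""
    buf = instance.cache[block_id]
    if buf.role != "current":
        raise PermissionError("a past image cannot be modified further")
    buf.scn = new_scn


# Sample run: instance 1 has dirtied the block, instance 2 wants to modify it.
gcs = GlobalCacheService()
inst1, inst2 = Instance("instance1"), Instance("instance2")
inst1.cache["blk"] = Buffer(scn=1009, role="current")
gcs.holder["blk"] = inst1

gcs.request_for_modify("blk", inst2)
modify(inst2, "blk", 1010)
```

After the run, instance 1 still caches the block as a past image at its old SCN, instance 2 holds the current version, and the disk write counter is untouched, which mirrors the note that the block is not written to disk before the resource is granted.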
Copyright 2012, Oracle and/or its affiliates. All rights reserved.
Global Cache Coordination: Example
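[Figure: Node1/Instance1 and Node2/Instance2, each with a cache and the LMON, LMD0, LMSx, LCK0, and DIAG processes, exchange messages 1 to 4 through the GCS across the cluster. Block versions are labeled with SCNs 1008 and 1009. No disk I/O takes place. Callouts: "Block mastered by instance 1"; "Which instance masters the block?"; "Instance 2 has the current version of the block."]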
The scenario described in the slide illustrates how an instance can perform a checkpoint at
any time or replace buffers in the cache as a response to free buffer requests. Because
multiple versions of the same data block with different changes can exist in the caches of
instances in the cluster, a write protocol managed by the GCS ensures that only the most
current version of the data is written to disk. It must also ensure that all previous versions are
purged from the other caches. A write request for a data block can originate in any instance
that has the current or past image of the block. In this scenario, assume that the first instance
holding a past image buffer requests that the Oracle server write the buffer to disk:
1. The first instance sends a write request to the GCS.
2. The GCS forwards the request to the second instance, which is the holder of the current
version of the block.
3. The second instance receives the write request and writes the block to disk.
4. The second instance records the completion of the write operation with the GCS.
5. After receipt of the notification, the GCS orders all past image holders to discard their
past images. These past images are no longer needed for recovery.
Note: In this case, only one I/O is performed to write the most current version of the block to
disk.
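The write protocol above can be sketched in a few lines. This is a simplified model under stated assumptions, not the actual GCS implementation: WriteCoordinator and its attributes are hypothetical names, and a single block with one current holder is assumed.

```python
class WriteCoordinator:
    """Toy model of the GCS write protocol for one block (hypothetical)."""

    def __init__(self, current_holder, current_scn, past_images):
        self.current_holder = current_holder  # name of the instance with the current version
        self.current_scn = current_scn
        self.past_images = dict(past_images)  # instance name -> SCN of its past image
        self.disk_scn = None
        self.disk_writes = 0
        self.log = []

    def request_write(self, requester):
        # Step 1: an instance holding a current or past image sends a write request.
        if requester != self.current_holder and requester not in self.past_images:
            raise ValueError(f"{requester} holds no image of the block")
        self.log.append(f"{requester} -> GCS: write request")
        # Step 2: the GCS forwards the request to the holder of the current version.
        self.log.append(f"GCS -> {self.current_holder}: write the block")
        # Step 3: the holder writes the most current version (the only disk I/O).
        self.disk_scn = self.current_scn
        self.disk_writes += 1
        # Step 4: the holder records completion of the write with the GCS.
        self.log.append(f"{self.current_holder} -> GCS: write complete")
        # Step 5: the GCS orders all past image holders to discard their images.
        discarded = sorted(self.past_images)
        self.past_images.clear()
        return discarded


# Sample run: instance 1 holds a past image and needs to free the buffer.
coord = WriteCoordinator("instance2", 1010, {"instance1": 1009})
discarded = coord.request_write("instance1")
```

Note that the requester's own older image is never written: the version that reaches disk is always the one held by the current holder, so a single I/O is enough.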
Write to Disk Coordination: Example
[Figure: Node1/Instance1 and Node2/Instance2, each with a cache and the LMON, LMD0, LMSx, LCK0, and DIAG processes, exchange messages 1 to 5 through the GCS across the cluster. Block images are labeled with SCNs 1009 and 1010. Only one disk I/O takes place. Callouts: "Need to make room in my cache."; "Who has the current version of that block?"; "Instance 2 owns it."; "Instance 2, flush the block to disk."; "Block flushed, make room".]
When one instance departs the cluster, the GRD portion of that instance needs to be
redistributed to the surviving nodes. Similarly, when a new instance enters the cluster, the
GRD portions of the existing instances must be redistributed to create the GRD portion of the
new instance.
Instead of remastering all resources across all nodes, RAC uses an algorithm called lazy
remastering to remaster only a minimal number of resources during reconfiguration. This is
illustrated in the slide. For each instance, a subset of the GRD being mastered is shown along
with the names of the instances to which the resources are currently granted. When the
second instance fails, its resources are remastered on the surviving instances. As the
resources are remastered, they are cleared of any reference to the failed instance.
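A minimal sketch of the idea follows, assuming the GRD can be modeled as a mapping from master instance to the resources it masters. The function name lazy_remaster is hypothetical, the round-robin choice of the new master is an assumption (the text does not say how Oracle picks it), and the grant lists are one reading of the slide.

```python
def lazy_remaster(grd, failed, survivors):
    """Return a new GRD in which only the failed instance's resources move.

    grd maps master instance -> {resource name: set of instances granted it}.
    """
    new_grd = {inst: {} for inst in survivors}
    for master, resources in grd.items():
        if master == failed:
            continue
        # Resources with a surviving master keep that master; only the
        # references to the failed instance are cleared from the grants.
        for res, granted in resources.items():
            new_grd.setdefault(master, {})[res] = granted - {failed}
    # Only the orphaned resources are remastered. Round-robin placement
    # is an assumption made for this example.
    orphaned = sorted(grd.get(failed, {}).items())
    for i, (res, granted) in enumerate(orphaned):
        target = survivors[i % len(survivors)]
        new_grd[target][res] = granted - {failed}
    return new_grd


# Sample data: three instances, instance 2 fails.
grd = {
    1: {"R1": {1, 2, 3}, "R2": {1, 3}},
    2: {"R3": {2, 3}, "R4": {1, 2}},
    3: {"R5": {2}, "R6": {1, 2, 3}},
}
after = lazy_remaster(grd, failed=2, survivors=[1, 3])
```

In this run R1, R2, R5, and R6 stay on their original masters, and only R3 and R4 are given a new master, which is the point of remastering lazily rather than redistributing everything.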
Dynamic Reconfiguration
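[Figure: GRD subsets before and after instance 2 fails.
Before reconfiguration:
  Instance1 (Node1) masters R1 (granted to 1, 2, 3) and R2 (granted to 1, 3)
  Instance2 (Node2) masters R3 (granted to 2, 3) and R4 (granted to 1, 2)
  Instance3 (Node3) masters R5 (granted to 2) and R6 (granted to 1, 2, 3)
After reconfiguration remastering:
  Instance1 (Node1) masters R1 (granted to 1, 3) and R2 (granted to 1, 3)
  Instance3 (Node3) masters R5 (no remaining grants) and R6 (granted to 1, 3)
  R3 (granted to 3) and R4 (granted to 1) are remastered on the surviving instances]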
In addition to dynamic resource reconfiguration, the GCS, which is tightly integrated with the
buffer cache, enables the database to automatically adapt and migrate resources in the GRD.
This is called dynamic remastering. The basic idea is to master a buffer cache resource on
the instance where it is accessed most. To determine whether dynamic remastering
is necessary, the GCS keeps track of the number of GCS requests on a per-instance
and per-object basis. If one instance accesses blocks from an object much more heavily
than the other instances do, the GCS can decide to dynamically migrate
all of that object's resources to the instance that accesses the object most.
The upper part of the graphic shows the situation where the same object has master
resources spread over different instances. In that case, each time an instance needs to read a
block from that object whose master is on the other instance, the reading instance must send
a message to the resource's master to ask permission to use the block.
The lower part of the graphic shows the situation after dynamic remastering has occurred. In
this case, blocks from the object have affinity to the reading instance, which no longer needs
to send GCS messages across the interconnect to ask for access permissions.
Note: The system automatically moves mastership of undo segment objects to the instance
that owns the undo segments.
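The per-instance, per-object bookkeeping described above can be sketched as follows. This is a toy illustration only: AffinityTracker is a hypothetical name, and the ratio and min_requests thresholds are assumptions made for the example, since the text does not state the policy the GCS actually applies.

```python
from collections import Counter, defaultdict


class AffinityTracker:
    """Toy model of object affinity tracking (hypothetical, not Oracle code)."""

    def __init__(self, ratio=2.0, min_requests=10):
        self.requests = defaultdict(Counter)  # object -> Counter of requests per instance
        self.master = {}                      # object -> instance mastering its resources
        self.ratio = ratio                    # assumed dominance factor
        self.min_requests = min_requests      # assumed minimum activity

    def record_request(self, obj, instance):
        # Count one GCS request for this object from this instance.
        self.requests[obj][instance] += 1

    def maybe_remaster(self, obj):
        """Move mastership to the dominant instance; return it, or None."""
        ranked = self.requests[obj].most_common(2)
        if not ranked:
            return None
        top, top_count = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        dominant = top_count >= self.min_requests and top_count >= self.ratio * runner_up
        if dominant and self.master.get(obj) != top:
            self.master[obj] = top
            return top
        return None


# Sample run: instance 2 reads the object far more often than instance 1.
tracker = AffinityTracker()
tracker.master["ORDERS"] = "instance1"
for _ in range(40):
    tracker.record_request("ORDERS", "instance2")
for _ in range(5):
    tracker.record_request("ORDERS", "instance1")

moved_to = tracker.maybe_remaster("ORDERS")
second_call = tracker.maybe_remaster("ORDERS")
```

Once the object is mastered on the instance that reads it most, a second check makes no further change, which corresponds to the lower part of the graphic where the reading instance no longer sends messages to a remote master.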
Object Affinity and Dynamic Remastering
[Figure, upper part: Node1/Instance1 and Node2/Instance2 share an object. Labels: "Read from disk"; "GCS message to master". Messages are sent to the remote node when reading into cache. Lower part: Node1/Instance1 and Node2/Instance2.]