
Oracle DBA Interview Questions and Answers RAC

Oracle RAC Interview Questions and Answers


How does OCSSD start first if the voting disk and OCR reside in ASM
disk groups?
You might wonder how CSSD, which is required to start the clustered ASM
instance, can be started if voting disks are stored in ASM?
This sounds like a chicken-and-egg problem:
without access to the voting disks there is no CSS, hence the node
cannot join the cluster.
But without being part of the cluster, CSSD cannot start the ASM
instance.
To solve this problem the ASM disk headers have new metadata in 11.2:
you can use kfed to read the header of an ASM disk containing a voting
disk.
The kfdhdb.vfstart and kfdhdb.vfend fields tell CSS where to find the
voting file. This does not require the ASM instance to be up.
Once the voting disks are located, CSS can access them and join the
cluster.
What is gsdctl in RAC? List the gsdctl commands in Oracle RAC.
GSDCTL stands for Global Services Daemon Control. We can use gsdctl
commands to start, stop, and obtain the status of the GSD service on
any platform.
The options for gsdctl are:
$ gsdctl start -- To start the GSD service
$ gsdctl stop -- To stop the GSD service
$ gsdctl stat -- To obtain the status of the GSD service
Log file location for gsdctl:
$ORACLE_HOME/srvm/log/gsdaemon_node_name.log
What is RAC?
RAC stands for Real Application Clusters.
It is a clustering solution from Oracle Corporation that ensures high
availability of databases by providing instance failover and media
failover features.
Oracle RAC is a cluster database with a shared cache architecture that
overcomes the limitations of traditional shared-nothing and shared-disk
approaches to provide a highly scalable and available database solution
for all the business applications.
Oracle RAC provides the foundation for enterprise grid computing.
What is Oracle RAC One Node?
Oracle RAC One Node is a single instance running on one node of the
cluster while the second node is in cold standby mode. If the instance
fails for some reason, RAC One Node detects it and restarts the
instance on the same node, or relocates the instance to the second node
in case of a failure or fault on the first node. The benefit of this
feature is that it provides a cold failover solution and automates
instance relocation, which for a planned online relocation avoids
downtime, without manual intervention. Oracle introduced this feature
with the release of 11gR2 (available with Enterprise Edition).
What are the advantages of RAC (Real Application Clusters)?
Reliability - if one node fails, the database won't fail.
Availability - nodes can be added or replaced without having to shut
down the database.
Scalability - more nodes can be added to the cluster as the workload
increases
What is Cache Fusion?
Oracle RAC is composed of two or more instances. When a block of data
is read from datafile by an instance within the cluster and another
instance is in need of the same block, it is easy to get the block
image from the instance which has the block in its SGA rather than
reading from the disk. To enable inter-instance communication Oracle
RAC makes use of interconnects. The Global Cache Service (GCS) and the
Global Enqueue Service (GES) coordinate Cache Fusion.
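The lookup order described above (local cache, then another instance's cache over the interconnect, then disk) can be sketched as a toy model. The `Instance` class and `read_block` function are hypothetical names used only for illustration. They are not Oracle APIs, and the real protocol also handles locking and block versions.

```python
from typing import Dict, List, Tuple


class Instance:
    """Toy model of one RAC instance with its own buffer cache (part of the SGA)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: Dict[int, str] = {}


def read_block(requester: Instance, block_id: int,
               instances: List[Instance], disk: Dict[int, str]) -> Tuple[str, str]:
    """Return (block image, where it came from) for the requesting instance."""
    # 1. The block is already in the requester's own cache.
    if block_id in requester.cache:
        return requester.cache[block_id], "local cache"
    # 2. Another instance holds the block: ship the image over the interconnect.
    for other in instances:
        if other is not requester and block_id in other.cache:
            image = other.cache[block_id]
            requester.cache[block_id] = image
            return image, f"interconnect from {other.name}"
    # 3. Nobody has it cached: read it from the datafile on shared disk.
    image = disk[block_id]
    requester.cache[block_id] = image
    return image, "disk"
```

In this sketch the first read of a block by any instance goes to disk, and a later read of the same block by a second instance is served from the first instance's cache.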
What command would you use to check the availability of the RAC system?
Until 11.1:
crs_stat -t -v (the -t and -v options are optional)
From 11.2:
crsctl check cluster -all
How do we verify that RAC instances are running?
SQL> select * from V$ACTIVE_INSTANCES;
The query gives the instance number in the INST_NUMBER column and
host_name:instance_name in the INST_NAME column.

How can you connect to a specific node in a RAC environment?
In tnsnames.ora, ensure that INSTANCE_NAME is specified in the
CONNECT_DATA section of the entry.
Which is the "MASTER NODE" in RAC?
The node with the lowest node number becomes the master node, and
dynamic remastering of the resources takes place.
To find the master node for a particular resource, query the
MASTER_NODE column of v$ges_resource.
To find which node is the master node, search the ocssd.log file for
"master node number".
When the current master node fails, the surviving node with the lowest
node number becomes the master node.
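The selection rule stated above can be sketched in a few lines. The function name `pick_master` and the node numbers are assumptions made for illustration. The sketch only models the "lowest surviving node number" rule from the text, not the actual clusterware election.

```python
from typing import Iterable


def pick_master(active_nodes: Iterable[int]) -> int:
    """Return the master node: the lowest node number among the active nodes."""
    nodes = list(active_nodes)
    if not nodes:
        raise ValueError("no active nodes in the cluster")
    return min(nodes)


# Hypothetical three-node cluster: node 1 is master until it fails.
active = {1, 2, 3}
first_master = pick_master(active)
active.discard(first_master)  # the master node fails
next_master = pick_master(active)
```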
What components in RAC must reside in shared storage? (Hint: SC is DeaR)
SPFILEs, control files, data files and redo log files must reside on
cluster-aware shared storage.
Give a few examples of solutions that support cluster storage.
ASM (Automatic Storage Management),
Raw disk devices,
Network File System (NFS),
OCFS2 and
OCFS (Oracle Cluster File System).
What are Oracle Cluster Components?
1.Cluster Interconnect (HAIP)
2.Shared Storage (OCR/Voting Disk)
3.Clusterware software
4.Oracle Kernel Components
What are Oracle RAC Components?
VIP, Node apps etc.
What are Oracle Kernel Components?
Basically, the Oracle kernel needs to be relinked with the RAC option
switched on when you convert to RAC. That is the difference, as it
enables the RAC background processes such as LMON, LCK, LMD and LMS.
How to turn on RAC?
# link the oracle libraries
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk rac_on
# rebuild oracle
$ cd $ORACLE_HOME/bin
$ relink oracle
Disk architecture in RAC?
SAN (Storage Area Network) - generally using fibre to connect to the
SAN
NAS (Network Attached Storage) - generally using a network to connect
to the NAS using either NFS or iSCSI
What is Oracle Clusterware?
The Clusterware software allows nodes to communicate with each other
and forms the cluster that makes the nodes work as a single logical
server.
The software is run by the Cluster Ready Services (CRS) using the
Oracle Cluster Registry (OCR), which records and maintains the cluster
and node membership information, and the voting disk, which acts as a
tiebreaker during communication failures. Heartbeat information
travels across the interconnect and to the voting disk while the
cluster is running.
Real Application Clusters
Oracle's Real Application Clusters (RAC) option supports the
transparent deployment of a single database across a cluster of
servers, providing fault tolerance from hardware failures or planned
outages. Oracle RAC running on clusters provides Oracle's highest level
of capability in terms of availability, scalability, and low-cost
computing.
One database is opened by multiple instances, so the database remains
highly available if an instance crashes.
Cluster software: Oracle Clusterware or products like Veritas Volume
Manager are required to provide the cluster support and allow each node
to know which nodes belong to the cluster and are available. Oracle
Clusterware also needs to know which nodes have failed so that it can
eject them from the cluster and errors on those nodes can be cleared.
What are the Oracle Clusterware key components?
Oracle Clusterware has two key components: the Oracle Cluster Registry
(OCR) and the voting disk.
What is Voting Disk and OCR?
Voting Disk
Oracle RAC uses the voting disk to manage cluster membership by way of
a health check and arbitrates cluster ownership among the instances in
case of network failures. The voting disk must reside on shared disk.

A node must be able to access more than half of the voting disks at any
time.
For example, if you have 3 voting disks configured, then a node must be
able to access at least two of the voting disks at any time. If a node
cannot access the minimum required number of voting disks it is
evicted, or removed, from the cluster.
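The "more than half" rule above is simple integer arithmetic. A minimal sketch follows, with hypothetical function names chosen for illustration. It models only the counting rule from the text, not how CSS actually probes the disks.

```python
def min_accessible_voting_disks(total: int) -> int:
    """Smallest number of voting disks that is strictly more than half of total."""
    if total < 1:
        raise ValueError("a cluster needs at least one voting disk")
    return total // 2 + 1


def node_stays_in_cluster(accessible: int, total: int) -> bool:
    """True if the node can still access more than half of the voting disks."""
    return accessible >= min_accessible_voting_disks(total)
```

With 3 voting disks configured, `min_accessible_voting_disks(3)` is 2, which matches the example in the text: a node that can reach only 1 of the 3 disks is evicted.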
Oracle Cluster Registry (OCR) (Hint: NISA)
The cluster registry holds all information about nodes, instances,
services and ASM storage, if used. It also contains state information,
i.e. whether they are available and up, or similar.
The OCR must reside on shared disk that is accessible by all of the
nodes in your cluster.
What are the administrative tasks involved with voting disk?
Following administrative tasks are performed with the voting disk :
1) Backing up voting disks
2) Recovering Voting disks
3) Adding voting disks
4) Deleting voting disks
5) Moving voting disks
Can you add a voting disk online? Do you need a voting disk backup?
Yes. As per the documentation, if you have multiple voting disks you
can add one online. If you have only one voting disk and it is lost,
the cluster will be down; you then need to start CRS in exclusive mode
and add the voting disk using
crsctl add css votedisk <path>
What is the Oracle recommendation for backing up the voting disk?
Oracle recommends using the dd command to back up the voting disk
with a minimum block size of 4KB.
How do we back up voting disks?
1) Oracle recommends that you back up your voting disk after the
initial cluster creation and after completing any node addition or
deletion procedure.
2) First, as the root user, stop Oracle Clusterware (with the crsctl stop
crs command) on all nodes. Then determine the current voting disk by
issuing the following command:
crsctl query css votedisk
3) Then issue the dd or ocopy command to back up a voting disk, as
appropriate.
Give the syntax for backing up voting disks:
On Linux or UNIX systems:
dd if=voting_disk_name of=backup_file_name
where
voting_disk_name is the name of the active voting disk
backup_file_name is the name of the file to which we want to back up
the voting disk contents
On Windows systems, use the ocopy command:
ocopy voting_disk_name backup_file_name
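As an illustration of the Linux/UNIX dd syntax above combined with the 4KB minimum block size, here is a small sketch that only builds the command line and does not run it. The function name and the two paths are hypothetical placeholders, not values from any real cluster.

```python
import shlex
from typing import List


def build_dd_backup_command(voting_disk: str, backup_file: str,
                            block_size: str = "4k") -> List[str]:
    """Build (but do not execute) a dd command that copies a voting disk to a file."""
    return ["dd", f"if={voting_disk}", f"of={backup_file}", f"bs={block_size}"]


# Hypothetical device and backup paths, for illustration only.
cmd = build_dd_backup_command("/dev/raw/raw1", "/backup/votedisk1.bak")
command_line = shlex.join(cmd)
```

Keeping the command as a list makes it straightforward to pass to `subprocess.run` later, once Oracle Clusterware has been stopped as described in the steps above.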
How do we verify an existing current backup of OCR?
We can verify the current backup of OCR using the following command :
ocrconfig -showbackup
You have lost the OCR disk. What is your next step?
From 11gR2 onwards, the CRSD stack will be down but OHASD will still be
up and running. You can add the OCR back by restoring the automatic
backup or importing the manual backup.
In 10g, the whole cluster stack will be down because CSSD is unable to
maintain the integrity.

What are the major RAC wait events?
In a RAC environment the buffer cache is global across all instances in
the cluster and hence the processing differs. The most common wait
events related to this are gc cr request and gc buffer busy.
GC CR REQUEST: the time it takes to retrieve the data from the remote
cache.
Reason: RAC traffic using a slow connection, or inefficient queries
(poorly tuned queries increase the number of data blocks requested
by an Oracle session. The more blocks requested, the more often a block
will need to be read from a remote instance via the interconnect).
GC BUFFER BUSY: the time the remote instance locally spends
accessing the requested data block.
What do you do if you see GC CR BLOCK LOST in the Top 5 Timed Events in
an AWR report?
This is most likely due to a fault in the interconnect network.
Check netstat -s.
If you see "fragments dropped" or "packet reassemblies failed", work
with your system administrator to find the fault in the network.
How do you troubleshoot a node reboot?
Please check the following Metalink notes:
Note 265769.1 Troubleshooting CRS Reboots
Note 559365.1 Using Diagwait as a diagnostic to get more information
for diagnosing Oracle Clusterware Node evictions.

Srvctl cannot start an instance and I get the errors PRKP-1001 and
CRS-0215, but sqlplus can start it on both nodes. How do you identify
the problem?
Set the environment variable SRVM_TRACE to true and start the
instance with srvctl. You will now get a detailed error stack.
What are Oracle database background processes specific to RAC?
Oracle RAC is composed of two or more database instances. They are
composed of memory structures and background processes, the same as a
single-instance database. Oracle RAC instances use two services,
GES (Global Enqueue Service) and GCS (Global Cache Service), that enable
Cache Fusion. Oracle RAC instances include the following background
processes:
ACMS - Atomic Controlfile to Memory Service
GTX0-j - Global Transaction Process
LMON - Global Enqueue Service Monitor
LMD - Global Enqueue Service Daemon
LMS - Global Cache Service Process
LCK0 - Instance Enqueue Process
RMSn - Oracle RAC Management Processes
RSMN - Remote Slave Monitor
To ensure that each Oracle RAC database instance obtains the block that
it needs to satisfy a query or transaction, Oracle RAC instances use
two processes, the Global Cache Service (GCS) and the Global Enqueue
Service (GES). The GCS and GES maintain records of the statuses of each
data file and each cached block using a Global Resource Directory
(GRD). The GRD contents are distributed across all of the active
instances.
What is GRD? (Hint: Cash for Data)
GRD stands for Global Resource Directory. The GES and GCS maintain
records of the status of each data file and each cached block using the
Global Resource Directory. This supports Cache Fusion and helps
maintain data integrity.
What is ACMS?
ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC
environment, ACMS is an agent that ensures a distributed SGA memory
update, i.e. SGA updates are globally committed on success or globally
aborted in the event of a failure.
What is SCAN listener?
A SCAN listener is a listener, additional to the node listeners, that
listens for incoming database connection requests from clients that
arrive through the SCAN IP. It has endpoints configured to the node
listeners and routes each connection request to a particular node
listener.
The SCAN IP can be disabled if not required. However, the SCAN IP is
mandatory during the RAC installation. Enabling/disabling the SCAN IP is
mostly used in Oracle Apps environments by the concurrent manager (a
kind of job scheduler in Oracle Apps).
Steps to disable the SCAN IP:
i. Do not use the SCAN IP at the client end.
ii. Stop the SCAN listener
srvctl stop scan_listener
iii. Stop SCAN
srvctl stop scan (this will stop the SCAN VIPs)
iv. Disable SCAN and disable the SCAN listener
srvctl disable scan
srvctl disable scan_listener
What is an interconnect network?
An interconnect network is a private network that connects all of the
servers in a cluster. The interconnect network uses a switch/multiple
switches that only the nodes in the cluster can access.
What is the use of cluster interconnect?
Cluster interconnect is used by the Cache fusion for inter instance
communication.
How can we configure the cluster interconnect?
Configure User Datagram Protocol (UDP) on Gigabit Ethernet for the
cluster interconnect.
On UNIX and Linux systems, Oracle Clusterware uses the UDP and RDS
(Reliable Datagram Sockets) protocols.
Windows clusters use the TCP protocol.
What is the purpose of Private Interconnect?
Clusterware uses the private interconnect for cluster synchronization
(network heartbeat) and daemon communication between the clustered
nodes. This communication is based on the TCP protocol.
RAC uses the interconnect for cache fusion (UDP) and inter-process
communication (TCP). Cache Fusion is the remote memory mapping of
Oracle buffers, shared between the caches of participating nodes in the
cluster.
What is a virtual IP address or VIP?
A virtual IP address or VIP is an alternate IP address that the client
connections use instead of the standard public IP address. To configure
VIP address, we need to reserve a spare IP address for each node, and
the IP addresses must use the same subnet as the public network.
What is the use of VIP?
If a node fails, then the node's VIP address fails over to another node,
on which the VIP address can accept TCP connections but cannot
accept Oracle connections.
Why do we have a Virtual IP (VIP) in Oracle RAC?
Without using VIPs or FAN, clients connected to a node that died will
often wait for a TCP timeout period (which can be up to 10 min) before
getting an error. As a result, you don't really have a good HA solution
without using VIPs.
When a node fails, the VIP associated with it is automatically failed
over to some other node and new node re-arps the world indicating a new
MAC address for the IP. Subsequent packets sent to the VIP go to the
new node, which will send error RST packets back to the clients. This
results in the clients getting errors immediately.
Give situations under which VIP address failover happens.
VIP address failover happens when the node on which the VIP address
runs fails, when all interfaces for the VIP address fail, or when all
interfaces for the VIP address are disconnected from the network.
What is the significance of VIP address failover?
When a VIP address failover happens, clients that attempt to connect to
the VIP address receive a rapid connection refused error. They don't
have to wait for TCP connection timeout messages.
What is the use of a service in Oracle RAC environment?
Applications should use the services feature to connect to the Oracle
database. Services enable us to define rules and characteristics to
control how users and applications connect to database instances.
What are the characteristics controlled by Oracle services feature?
The characteristics include a unique name, workload balancing, failover
options, and high availability.
What enables the load balancing of applications in RAC?
Oracle Net Services enable the load balancing of application
connections across all of the instances in an Oracle RAC database.
What are the types of connection load-balancing?
Connection workload management is one of the key aspects when you have
RAC instances, as you want to distribute the connections to specific
nodes/instances or to those that have less load.
There are two types of connection load-balancing:
1.Client Side load balancing (also called as connect time load
balancing)
2.Server side load balancing (also called as Listener connection load
balancing)

What is the difference between server-side and client-side connection
load balancing?
Client-side load balancing happens at the client, which spreads its
connection requests across the listeners. In server-side load
balancing, the listener uses the load-balancing advisory to redirect
connections to the instance providing the best service.
Client-side load balancing: Oracle's client-side load balancing feature
enables clients to randomize their connection requests among all the
available listeners.
A tnsnames.ora entry that contains entries for all nodes and uses
LOAD_BALANCE=on (the default depends on the list type, so set it
explicitly) will use connect-time (client-side) load balancing.
Sample client-side TNS entry:
finance =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = myrac2-vip)(PORT = 2042))
    (ADDRESS = (PROTOCOL = TCP)(HOST = myrac1-vip)(PORT = 2042))
    (ADDRESS = (PROTOCOL = TCP)(HOST = myrac3-vip)(PORT = 2042))
    (LOAD_BALANCE = yes)
    (FAILOVER = on)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = FINANCE)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5))
    )
  )
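What the client does with an address list like the one above can be sketched as follows. This is a simplified model, not Oracle Net code: with load balancing on, the client tries the addresses in random order, and with connect-time failover it moves on to the next address when one is unreachable. The function names and the `is_up` callback are hypothetical.

```python
import random
from typing import Callable, List, Optional, Sequence


def connection_order(addresses: Sequence[str], load_balance: bool = True,
                     rng: Optional[random.Random] = None) -> List[str]:
    """Return the order in which a client would try the listener addresses."""
    order = list(addresses)
    if load_balance:
        # Client-side load balancing: pick addresses in random order.
        (rng or random).shuffle(order)
    return order


def connect(addresses: Sequence[str], is_up: Callable[[str], bool],
            load_balance: bool = True,
            rng: Optional[random.Random] = None) -> Optional[str]:
    """Return the first reachable address, or None if every address fails."""
    for host in connection_order(addresses, load_balance, rng):
        if is_up(host):  # connect-time failover: skip unreachable hosts
            return host
    return None
```

With load balancing off, the addresses are simply tried in the order they are listed, so the first node in the entry receives every new connection while it is up.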
Server-side load balancing: This improves connection performance
by balancing the number of active connections among multiple instances
and dispatchers. In a single-instance environment (shared servers), the
listener selects the least loaded dispatcher to handle the incoming
client requests. In a RAC environment, PMON is aware of the load on all
instances and dispatchers, and depending on the load information the
connection is redirected to the least loaded node.
In a RAC environment, the *.remote_listener parameter, which refers to a
tnsnames.ora entry containing the addresses of all nodes, needs to be
set so that PMON can send load updates to the listeners.
Sample settings for one instance of the RAC cluster:
local_listener = LISTENER_MYRAC1
remote_listener = LISTENERS_MYRACDB
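The "least loaded node" decision in server-side load balancing can be sketched as picking the minimum from the load figures the listener has received. The instance names and load values below are made up for illustration, and the real listener uses richer metrics than a single number per instance.

```python
from typing import Dict


def pick_instance(load_by_instance: Dict[str, float]) -> str:
    """Return the name of the instance reporting the lowest load."""
    if not load_by_instance:
        raise ValueError("no instances have registered with the listener")
    return min(load_by_instance, key=lambda name: load_by_instance[name])


# Hypothetical load figures, as if registered with the listeners by each instance.
registered_load = {"MYRAC1": 0.8, "MYRAC2": 0.3, "MYRAC3": 0.5}
target = pick_instance(registered_load)
```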

What are the administrative tools used for Oracle RAC environments?
An Oracle RAC cluster can be administered as a single image using the
tools below:
OEM (Enterprise Manager),
SQL*Plus,
Server Control (SRVCTL),
Cluster Verification Utility (CLUVFY),
DBCA,
NETCA
Name some Oracle Clusterware tools and their uses.
OIFCFG - Allocating and deallocating network interfaces.
OCRCONFIG - Command-line tool for managing the Oracle Cluster Registry.
OCRDUMP - Dumps the OCR contents, for example to identify the
interconnect being used.
CVU - Cluster Verification Utility, to check the status of CRS resources.
What is the difference between CRSCTL and SRVCTL?
CRSCTL manages clusterware-related operations:
Starting and stopping Oracle Clusterware
Enabling and disabling Oracle Clusterware daemons
Registering cluster resources
SRVCTL (Server Control utility) manages Oracle resource-related
operations:
srvctl command target [options]
commands: enable|disable|start|stop|relocate|status|add|remove|modify|
getenv|setenv|unsetenv|config
targets: instance|database|service (serv)|scan|scan_listener|srvpool|
server|vip|nodeapps|asm|listener|diskgroup|home|ons|eons|filesystem|
gns|oc4j (from Oracle 11g R2)
How do we remove ASM from an Oracle RAC environment?
We need to stop and delete the instance on the node first, in
interactive or silent mode. After that ASM can be removed using the
srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify if ASM has been removed by issuing the following command:
srvctl config asm -n node_name
How do we verify that an instance has been removed from OCR after
deleting an instance?
Issue the following srvctl command:
srvctl config database -d database_name
Then check the registered resources:
cd CRS_HOME/bin
./crs_stat
What are the modes of deleting instances from Oracle Real Application
Clusters databases?
We can delete instances in silent mode or interactive mode using
DBCA (Database Configuration Assistant).
What are the background processes that exist in 11gR2 and what do they do?
Process Name: Functionality
crsd: The CRS daemon (crsd) manages cluster resources based on
configuration information that is stored in Oracle Cluster Registry
(OCR) for each resource. This includes start, stop, monitor, and
failover operations. The crsd process generates events when the status
of a resource changes.
cssd: Cluster Synchronization Service (CSS) manages the cluster
configuration by controlling which nodes are members of the cluster and
by notifying members when a node joins or leaves the cluster. If you
are using certified third-party clusterware, then the CSS processes
interface with your clusterware to manage node membership information.
CSS has three separate processes: the CSS daemon (ocssd), the CSS Agent
(cssdagent), and the CSS Monitor (cssdmonitor). The cssdagent process
monitors the cluster and provides input/output fencing. This service
was formerly provided by the Oracle Process Monitor daemon (oprocd), also
known as OraFenceService on Windows. A cssdagent failure results in
Oracle Clusterware restarting the node.
diskmon: Disk Monitor daemon (diskmon) monitors and performs
input/output fencing for Oracle Exadata Storage Server. As Exadata
storage can be added to any Oracle RAC node at any point in time, the
diskmon daemon is always started when ocssd is started.
evmd: Event Manager (EVM) is a background process that publishes
Oracle Clusterware events.
mdnsd: Multicast domain name service (mDNS) allows DNS requests.
The mDNS process is a background process on Linux and UNIX, and a
service on Windows.
gnsd: Oracle Grid Naming Service (GNS) is a gateway between the
cluster mDNS and external DNS servers. The GNS process performs name
resolution within the cluster.
ons: Oracle Notification Service (ONS) is a publish-and-subscribe
service for communicating Fast Application Notification (FAN) events.
oraagent: Extends clusterware to support Oracle-specific
requirements and complex resources. It runs server callout scripts when
FAN events occur. This process was known as RACG in Oracle Clusterware
11g Release 1 (11.1).
orarootagent: Oracle root agent (orarootagent) is a specialized
oraagent process that helps CRSD manage resources owned by root, such
as the network and the Grid virtual IP address.
oclskd: Cluster kill daemon (oclskd) handles instance/node
eviction requests that have been escalated to CSS.
gipcd: Grid IPC daemon (gipcd) is a helper daemon for the
communications infrastructure.
ctssd: Cluster Time Synchronization Service daemon (ctssd) manages
time synchronization between nodes, rather than depending on NTP.
Under which user or owner will each process start?
Component | Name of the Process | Owner
Oracle High Availability Service | ohasd | init, root
Cluster Ready Services (CRS) | crsd | root
Cluster Synchronization Service (CSS) | ocssd, cssdmonitor, cssdagent | grid owner
Event Manager (EVM) | evmd, evmlogger | grid owner
Cluster Time Synchronization Service (CTSS) | octssd | root
Oracle Notification Service (ONS) | ons, eons | grid owner
Oracle Agent | oraagent | grid owner
Oracle Root Agent | orarootagent | root
Grid Naming Service (GNS) | gnsd | root
Grid Plug and Play (GPnP) | gpnpd | grid owner
Multicast domain name service (mDNS) | mdnsd | grid owner
What is the major difference between 10g and 11g RAC?
There is not much difference between 10g and 11gR1 RAC, but there is
a significant difference in 11gR2.
Up to 11gR1 (including 10g), the following were managed by Oracle CRS:
Databases
Instances
Applications
Node Monitoring
Event Services
High Availability
From 11gR2 onwards it is a complete HA stack, managing and providing the
following resources like other cluster software such as VCS:
Databases
Instances
Applications
Cluster Management
Node Management
Event Services
High Availability
Network Management (provides DNS/GNS/mDNS services on behalf of
other traditional services), SCAN (Single Client Access Name) and HAIP
Storage Management (with the help of ASM and the new ACFS file system)
Time synchronization (rather than depending on traditional NTP)
Removal of the OS-dependent hang checker etc.; it is managed with its own
additional monitor process
What is hangcheck timer?
The hangcheck timer regularly checks the health of the system. If the
system hangs or stops, the node is restarted automatically.
There are 2 key parameters for this module:
-> hangcheck_tick: this parameter defines the period of time between
checks of system health. The default value is 60 seconds; Oracle
recommends setting it to 30 seconds.
-> hangcheck_margin: this defines the maximum hang delay that should be
tolerated before hangcheck-timer resets the RAC node.
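How the two parameters interact can be sketched as follows. This is a simplified model of the idea and not the kernel module's code: a timer is armed every hangcheck_tick seconds, and if it fires later than expected by more than hangcheck_margin, the node is reset. The function names are hypothetical, and the margin of 180 seconds is only an example value, since the text above does not give one.

```python
def should_reset(expected_wakeup: float, actual_wakeup: float,
                 hangcheck_margin: float) -> bool:
    """True if the timer fired later than expected by more than the margin."""
    return (actual_wakeup - expected_wakeup) > hangcheck_margin


def guaranteed_reset_after_seconds(hangcheck_tick: float,
                                   hangcheck_margin: float) -> float:
    """A hang longer than tick + margin is always caught in this model,
    even if it began right after the timer was armed."""
    return hangcheck_tick + hangcheck_margin


# Example values: tick of 30 seconds as recommended above, margin assumed to be 180.
tick, margin = 30.0, 180.0
small_delay = should_reset(expected_wakeup=30.0, actual_wakeup=45.0,
                           hangcheck_margin=margin)
long_hang = should_reset(expected_wakeup=30.0, actual_wakeup=300.0,
                         hangcheck_margin=margin)
```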
State the initialization parameters that must have the same value for
every instance in an Oracle RAC database.
Some initialization parameters are critical at database creation
time and must have the same values. Their values must be specified in
the SPFILE or PFILE for every instance. The parameters that must be
identical on every instance are listed below:
ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
COMPATIBLE
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
INSTANCE_TYPE (RDBMS or ASM)
PARALLEL_MAX_SERVERS
REMOTE_LOGIN_PASSWORDFILE
UNDO_MANAGEMENT


Why does a VIP avoid the TCP/IP timeout problem?
With a virtual IP we avoid the TCP/IP timeout problem because Oracle
Notification Service (ONS) maintains communication between the nodes
and the listeners.
What is voting disk?
The voting disk is a file that sits in the shared storage area and must be
accessible by all nodes in the cluster. All nodes in the cluster
register their heartbeat information in the voting disk to
confirm that they are all operational. If the heartbeat information of a
node is not available in the voting disk, that node will be evicted from
the cluster. The CSS (Cluster Synchronization Service) daemon in the
clusterware maintains the heartbeat of all nodes to the voting disk.
When a node is not able to send its heartbeat to the voting disk, it
will reboot itself, thus helping to avoid the split-brain syndrome.
For high availability, Oracle recommends that you have an odd number
of voting disks, with a minimum of three.
The voting disk is a file that resides on shared storage and manages
cluster membership. It reassigns cluster ownership between the
nodes in case of failure.
The voting disk files are used by Oracle Clusterware to determine which
nodes are currently members of the cluster. The voting disk files are
also used in concert with other cluster components such as CRS to
maintain the cluster's integrity.
Oracle Database 11g Release 2 provides the ability to store the voting
disks in ASM along with the OCR. Oracle Clusterware can access the OCR
and the voting disks present in ASM even if the ASM instance is down.
As a result CSS can continue to maintain the Oracle cluster even if the
ASM instance has failed.
How many voting disks are you maintaining?
By default Oracle will create 3 voting disk files in ASM.
Oracle expects that you will configure at least 3 voting disks for
redundancy purposes. You should always configure an odd number of
voting disks >= 3. This is because the loss of half or more of your
voting disks will cause the entire cluster to fail.
You should plan on allocating 280MB for each voting disk file. For
example, if you are using ASM and external redundancy then you will
need to allocate 280MB of disk for the voting disk. If you are using
ASM and normal redundancy you will need 560MB.
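Why do we need to keep an odd number of voting disks?
Oracle expects that you will configure at least 3 voting disks for
redundancy purposes. You should always configure an odd number of
voting disks >= 3. A node must be able to access more than half of the
voting disks, so the loss of half or more of them will cause the entire
cluster to fail. An even number adds no extra protection: 4 voting
disks tolerate the loss of only 1, the same as 3.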
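The reasoning behind the odd-number recommendation can be checked with a short calculation. It assumes only the rule that a node must access more than half of the voting disks. The function name is a hypothetical one chosen for this sketch.

```python
def failures_tolerated(total: int) -> int:
    """How many voting disks can be lost while more than half remain accessible."""
    if total < 1:
        raise ValueError("a cluster needs at least one voting disk")
    needed = total // 2 + 1  # strictly more than half
    return total - needed


# Tolerance for 1 to 6 voting disks.
tolerance = {n: failures_tolerated(n) for n in range(1, 7)}
```

In this model 3 and 4 voting disks both tolerate a single loss, and 5 and 6 both tolerate two, so the extra even disk adds cost without adding resilience.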

What are Oracle RAC software components?
Oracle RAC is composed of two or more database instances. They are
composed of memory structures and background processes, the same as a
single-instance database. Oracle RAC instances use two services,
GES (Global Enqueue Service) and GCS (Global Cache Service), that enable
Cache Fusion. Oracle RAC instances include the following background
processes:
ACMS - Atomic Controlfile to Memory Service
GTX0-j - Global Transaction Process
LMON - Global Enqueue Service Monitor
LMD - Global Enqueue Service Daemon
LMS - Global Cache Service Process
LCK0 - Instance Enqueue Process
RMSn - Oracle RAC Management Processes
RSMN - Remote Slave Monitor


What is Cache Fusion?
Cache Fusion is the transfer of data blocks between instances over the
private interconnect. Oracle RAC is composed of two or more instances.
When one instance has read a block from a datafile and another instance
needs the same block, it is faster to ship the block image from the SGA
of the instance that holds it than to read it from disk again. To
enable this inter-instance communication, Oracle RAC uses the
interconnect. The Global Cache Service (GCS), through the LMS
processes, coordinates the block transfers, while the Global Enqueue
Service (GES) manages the associated enqueues.
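The idea of serving a block from another instance's cache instead of
from disk can be illustrated with a toy Python model. This is only a
sketch and not Oracle's implementation: the names Instance and
ToyCluster are hypothetical, and the plain dictionary stands in very
loosely for the Global Resource Directory. Locking, block modes, and
writes are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    cache: dict = field(default_factory=dict)  # block id -> block image


class ToyCluster:
    """Toy model: a directory records which instance caches each block."""

    def __init__(self, disk):
        self.disk = disk            # block id -> block image on shared storage
        self.directory = {}         # block id -> Instance that first cached it
        self.disk_reads = 0
        self.interconnect_transfers = 0

    def read(self, instance, block_id):
        # Local cache hit: no disk read and no transfer needed.
        if block_id in instance.cache:
            return instance.cache[block_id]
        holder = self.directory.get(block_id)
        if holder is not None:
            # Another instance holds the block: copy it across the "interconnect".
            image = holder.cache[block_id]
            self.interconnect_transfers += 1
        else:
            # Nobody has it cached yet: read from shared disk and record the holder.
            image = self.disk[block_id]
            self.disk_reads += 1
            self.directory[block_id] = instance
        instance.cache[block_id] = image
        return image


cluster = ToyCluster(disk={7: "row data"})
rac1, rac2 = Instance("rac1"), Instance("rac2")
cluster.read(rac1, 7)   # first access goes to disk
cluster.read(rac2, 7)   # second instance gets the image from rac1's cache
print(cluster.disk_reads, cluster.interconnect_transfers)
```

In this model the second instance never touches the disk, which is the
saving Cache Fusion is meant to provide.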
What is SCAN? (11gR2 feature)
Single Client Access Name (SCAN) is a new Oracle Real Application
Clusters (RAC) 11g Release 2 feature that provides a single name for
clients to access an Oracle Database running in a cluster. The benefit
is that clients using SCAN do not need to change their configuration
if you add or remove nodes in the cluster.
SCAN provides a single domain name via DNS, allowing end users to
address a RAC cluster as if it were a single host. SCAN works by
replacing a list of hostnames or IP addresses with one name that
resolves to the SCAN virtual IP addresses (VIPs).
SCAN is meant to provide a single name for all Oracle clients to
connect to the cluster database, irrespective of the number of nodes
and their location. Before SCAN, address records had to be added to or
removed from every client's tnsnames.ora whenever a node was added to
or deleted from the cluster.
SCAN eliminates the need to change TNSNAMES entries when nodes are
added to or removed from the cluster. RAC instances register with the
SCAN listeners as remote listeners. Oracle recommends assigning 3
addresses to SCAN, which creates 3 SCAN listeners even if the cluster
has dozens of nodes. SCAN is a domain name registered to at least one
and up to three IP addresses, either in DNS (Domain Name Service) or
GNS (Grid Naming Service). The SCAN must resolve to at least one
address on the public network. For high availability and scalability,
Oracle recommends configuring the SCAN to resolve to three addresses.
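One way to check how many addresses a SCAN name resolves to is a short
Python sketch using the standard socket module. The SCAN name
rac-scan.example.com and the port 1521 are assumptions for
illustration; substitute the values for your cluster. The resolver
parameter is a hypothetical hook so the function can be exercised
without DNS access.

```python
import socket


def resolve_scan(scan_name, port=1521, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses a name resolves to."""
    results = resolver(scan_name, port, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address.
    return sorted({sockaddr[0] for *_, sockaddr in results})


if __name__ == "__main__":
    # Hypothetical SCAN name; this lookup fails unless the name exists in DNS.
    try:
        print(resolve_scan("rac-scan.example.com"))
    except socket.gaierror as exc:
        print(f"could not resolve SCAN name: {exc}")
```

A SCAN configured as Oracle recommends would be expected to return
three addresses here.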
What are SCAN components in a cluster?
1.SCAN Name
2.SCAN IPs (3)
3.SCAN Listeners (3)
What is FAN?
Fast Application Notification (FAN) covers events related to
instances, services, and nodes. It is a notification mechanism that
Oracle RAC uses to notify other processes about configuration and
service-level information, including service status changes such as UP
or DOWN events. Applications can respond to FAN events and take
immediate action.
What is TAF?
TAF (Transparent Application Failover) is a configuration that allows
session failover between different nodes of a RAC database cluster.
If a communication link failure occurs after a connection is
established, the connection fails over to another active node. Any
disrupted transactions are rolled back, and session properties and
server-side program variables are lost. In some cases, if the
statement executing at the time of the failover is a SELECT statement,
that statement may be automatically re-executed on the new connection
with the cursor positioned on the row on which it was positioned prior
to the failover.
After an Oracle RAC node crashes, usually from a hardware failure, all
new application transactions are automatically rerouted to a specified
backup node. The challenge in rerouting is to not lose work that was
"in flight" at the exact moment of the crash. One of the requirements
of continuous availability is the ability to resume in-flight
application work on another server with as little interruption as
possible. Oracle's answer to application failover is an Oracle Net
mechanism called Transparent Application Failover. TAF allows the DBA
to configure the type and method of failover for each Oracle Net
client.
TAF can fail over at either the session level or, for queries, the
statement level (SELECT). Uncommitted transactions are rolled back and
must be resubmitted by the application.
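The type and method of failover are set in the FAILOVER_MODE clause of
the client's connect descriptor. The Python sketch below only assembles
such a descriptor as a string. The helper name taf_connect_descriptor,
the host rac-scan.example.com, the service name orcl, and the RETRIES
and DELAY values are all assumptions chosen for illustration.

```python
def taf_connect_descriptor(host, service, failover_type="SELECT",
                           method="BASIC", retries=20, delay=15, port=1521):
    """Build an Oracle Net connect descriptor with a FAILOVER_MODE clause."""
    if failover_type not in {"SESSION", "SELECT", "NONE"}:
        raise ValueError(f"unsupported TAF type: {failover_type}")
    if method not in {"BASIC", "PRECONNECT"}:
        raise ValueError(f"unsupported TAF method: {method}")
    return (
        "(DESCRIPTION="
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))"
        "(CONNECT_DATA="
        f"(SERVICE_NAME={service})"
        f"(FAILOVER_MODE=(TYPE={failover_type})(METHOD={method})"
        f"(RETRIES={retries})(DELAY={delay}))))"
    )


# Hypothetical host and service names.
print(taf_connect_descriptor("rac-scan.example.com", "orcl"))
```

TYPE=SESSION re-establishes only the session, while TYPE=SELECT also
lets open queries continue after failover.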
What are the requirements for Oracle Clusterware?
1. External shared disk to store the Oracle Clusterware files (Voting
Disk and Oracle Cluster Registry - OCR)
2. Two network cards on each clusterware node (and three sets of IP
addresses):
Network Card 1 (with IP address set 1) for the public network
Network Card 2 (with IP address set 2) for the private network (for
inter-node communication between RAC nodes, used by Clusterware and
the RAC database)
IP address set 3 for the Virtual IP (VIP) (used as the virtual IP
address for client connections and for connection failover)
3. Storage option for OCR and Voting Disk - RAW, OCFS2 (Oracle Cluster
File System), NFS, ..
What enables load balancing of applications in RAC?
Oracle Net Services enables the load balancing of application
connections across all of the instances in an Oracle RAC database.
How to find location of OCR file when CRS is down?
When CRS is down:
Look in the ocr.loc file. The location of this file depends on the OS:
On Linux: /etc/oracle/ocr.loc
On Solaris: /var/opt/oracle/ocr.loc
When CRS is up:
Set the ASM or CRS environment, then run the command below:
ocrcheck
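When CRS is down, the ocr.loc file can also be read with a small
script. The Python sketch below assumes the file holds simple key=value
lines such as ocrconfig_loc; the sample content is hypothetical and
your file may list different keys.

```python
def parse_ocr_loc(text):
    """Parse key=value lines from ocr.loc content into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical file content; on Linux the real file is /etc/oracle/ocr.loc.
sample = """ocrconfig_loc=+DATA
local_only=FALSE
"""
print(parse_ocr_loc(sample).get("ocrconfig_loc"))
```

To read the real file, pass the text of /etc/oracle/ocr.loc (or
/var/opt/oracle/ocr.loc on Solaris) to parse_ocr_loc.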
In a 2-node RAC, how many NICs are you using?
2 network cards on each clusterware node:
Network Card 1 (with IP address set 1) for the public network
Network Card 2 (with IP address set 2) for the private network (for
inter-node communication between RAC nodes, used by Clusterware and
the RAC database)
In a 2-node RAC, how many IPs are you using?
6, in 3 sets of IP addresses:
## eth1-Public: 2
## eth0-Private: 2
## VIP: 2
How to find IP information in RAC?
Check the /etc/hosts file on each node, for example:
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1        localhost.localdomain   localhost
## Public Node names
192.168.10.11    node1-pub.hingu.net     node1-pub
192.168.10.22    node2-pub.hingu.net     node2-pub
## Private Network (Interconnect)
192.168.0.11     node1-prv               node1-prv
192.168.0.22     node2-prv               node2-prv
## Private Network (Network Area storage)
192.168.1.11     node1-nas               node1-nas
192.168.1.22     node2-nas               node2-nas
192.168.1.33     nas-server              nas-server
## Virtual IPs
192.168.10.111   node1-vip.hingu.net     node1-vip
192.168.10.222   node2-vip.hingu.net     node2-vip

What is the difference between the RAC IP addresses?
Public IP address is the normal IP address typically used by DBAs and
system administrators to manage storage, system and database. It is
the node's address on the public network.
Private IP address is used only for internal cluster processing
(Cache Fusion), also known as the interconnect. It belongs to a
private network used only by the cluster nodes.
VIP is used by database applications to enable failover when one
cluster node fails. The purpose of the VIP is to let client
connections fail over to surviving nodes when a node fails.

Can an application developer access the private IP?
No. The private IP address is used only for internal cluster
processing (Cache Fusion), also known as the interconnect.
