Introduction
This document clarifies several sections of the main MOS Document 1084360.1 (bare metal restore).
The notes explain what to expect during the execution of certain commands and provide workarounds
for problems that may be reported.
The following commands are executed from the RDBMS Oracle Home. If the software owners differ, log in
as the correct owner.
2. Stop the VIP resources for the failed database server and delete
The image file used to reimage the compute node can be downloaded from Edelivery, an external,
password-protected site. The images can also be provided by Oracle Support.
The most common methods to reimage the compute node are an ISO image file or a USB flash
image.
Oracle Support can provide an already generated ISO image file and details of where the file can be
obtained, normally the external FTP site ftp.oracle.com.
To generate the USB image, Oracle Support will provide the required files, and the process is
completed by executing the following steps:
Insert a blank USB flash drive into a working database server in the cluster.
#cd dl360
Note: After 11.2.2.2.X, to make sure dualboot is forced to no, modify the file
makeImageMedia.sh. Search for a line like ‘dualboot=’ and set it to no, like this:
dualboot=no
#./makeImageMedia.sh
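The dualboot edit described in the note can also be scripted before makeImageMedia.sh is run. The following is only a sketch: it assumes the setting sits on a line of its own that starts with `dualboot=`, which may not match the script shipped with a given image version, and the function name `force_dualboot_no` is hypothetical. A backup copy is kept so the change can be reviewed or reverted.

```shell
# Sketch only: force dualboot=no in a copy of makeImageMedia.sh.
# Assumption: the setting is on its own line beginning with "dualboot=".
force_dualboot_no() {
  script="$1"
  # Keep a backup of the original script next to it.
  cp -p "$script" "$script.bak" || return 1
  # Rewrite any line that starts with dualboot= so that it reads dualboot=no.
  sed 's/^\([[:space:]]*dualboot=\).*/\1no/' "$script.bak" > "$script" || return 1
  # Show the resulting line(s) so the edit can be checked by hand.
  grep -n 'dualboot=' "$script"
}

# Usage (from the directory that contains the script):
# force_dualboot_no ./makeImageMedia.sh
```

If grep prints nothing, the assumed line format did not match and the file should be edited manually as described in the note.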
Connect to the ILOM via the web interface. Go to the Remote Control tab, then the Host Control tab.
From Next Boot Device, select CDROM. The next time the server is rebooted, it will use the attached
ISO image. This setting is valid for one boot only, after which the default BIOS boot order applies again.
Reboot the server and let the process pick up the ISO image and start the reimage process.
1. Insert the USB flash drive into the USB port on the replacement database server.
2. Log in to the console through the service processor, or by using the KVM switch to monitor
progress.
3. Power on the database server using either the service processor interface or by physically
pressing the power button.
4. Press F2 during BIOS and select BIOS Setup to configure boot order, or press F8 and select the
one-time boot selection menu to choose the USB flash drive.
5. Configure the BIOS boot order if the motherboard was replaced. The boot order should be USB
flash drive, then RAID controller.
6. Allow the system to boot. As the system boots, it detects the CELLUSBINSTALL media. The
imaging process has two phases. Let each phase complete before proceeding to the next step.
The first phase of the imaging process identifies any BIOS or firmware that is out of date, and
upgrades the components to the expected level for the image. If any components need to be
upgraded or downgraded, then the system automatically reboots.
The second phase of the imaging process installs the factory image on the replacement
database server.
The first boot will ask for all the IPs, NTP, ILOM, etc. To get the IP information, the file
/opt/oracle.cellos/cell.conf from a surviving node can be used.
In addition, all IPs are normally registered in DNS, so the nslookup command can be used to discover
the IPs assigned to the node.
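As a rough sketch, the DNS lookups can be done in one pass for all the names that belong to the node. The host names in the usage comment are hypothetical placeholders, and the `RESOLVER` variable and `lookup_all` function are introduced here only for illustration (they are not part of any Oracle tooling).

```shell
# Sketch only: resolve a list of host names with nslookup.
# RESOLVER is an illustrative variable so another lookup tool can be swapped in.
RESOLVER="${RESOLVER:-nslookup}"

lookup_all() {
  for name in "$@"; do
    echo "== $name"
    # Report a failed lookup but keep going with the remaining names.
    "$RESOLVER" "$name" || echo "lookup failed for $name" >&2
  done
}

# Usage with hypothetical names (management, VIP and ILOM names of the node):
# lookup_all dbnode03 dbnode03-vip dbnode03-ilom
```

The addresses returned can then be compared with the values in cell.conf from a surviving node before they are typed in at first boot.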
If mistakes are made, ipconf can be used to modify the settings. Since 11.2.2.X, ipconf can be used
with the option -nocodes.
Section Prepare Replacement Database Server for The Cluster
• Before copying the cellinit.ora and cellip.ora files, the target directory must be created:
#mkdir -p /etc/oracle/cell/network-config
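After the two files have been copied from a surviving node, a quick check that both are present and not empty can save a failed step later. This is a minimal sketch; the function name `check_cell_config` and the `CELL_CONF_DIR` variable are hypothetical, while the directory and file names are the ones given above.

```shell
# Sketch only: confirm cellinit.ora and cellip.ora exist and are non-empty.
CELL_CONF_DIR="${CELL_CONF_DIR:-/etc/oracle/cell/network-config}"

check_cell_config() {
  dir="${1:-$CELL_CONF_DIR}"
  rc=0
  for f in cellinit.ora cellip.ora; do
    if [ -s "$dir/$f" ]; then
      echo "present: $dir/$f"
    else
      echo "missing or empty: $dir/$f"
      rc=1
    fi
  done
  # Non-zero return status means at least one file still has to be copied.
  return $rc
}

# Usage:
# check_cell_config
```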
Set ORACLE_HOME (OH) to the Grid Infrastructure (GI) home
MOS note 1298957.1 describes how to create a cron job that removes older files.
Known Issues
#cluvfy stage -pre nodeadd -n <lost node> -fixup -fixupdir <directory> returns
ERROR:
sclcbdb03
ERROR:
sclcbdb03
Solution:
Set the environment variable IGNORE_PREADDNODE_CHECKS="Y". This does not prevent the error from
cluvfy stage -pre nodeadd, but it does prevent the failure when addNode.sh is executed.
The cause is bug 11719563, when Voting Disks are in ASM
was not correctly filtered out if it was an ASM path. This resulted in
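For the workaround in this issue, the variable has to be exported in the same shell session that later runs addNode.sh. The sketch below only prints the command instead of running it. The node and VIP names are hypothetical, the `addnode_cmd` helper is illustrative, and the argument form shown is the usual 11.2 silent-mode syntax, so the exact arguments should be taken from the main MOS document.

```shell
# Sketch only: export the workaround variable, then build the addNode.sh call.
# GRID_HOME default matches the GI home path seen in the error output of this note.
GRID_HOME="${GRID_HOME:-/u01/app/11.2.0/grid}"

# Must be set in the session that runs addNode.sh.
export IGNORE_PREADDNODE_CHECKS=Y

# Prints the command line for review; it does not execute addNode.sh.
addnode_cmd() {
  printf '%s/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={%s}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={%s}"\n' \
    "$GRID_HOME" "$1" "$2"
}

# Usage with hypothetical node and VIP names:
# addnode_cmd dbnode03 dbnode03-vip
```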
2. When cloning GI, the execution of addNode.sh fails with permission problems
Errors:
dmorldb08:
PRCF-2023 : The following contents are not transferred as they are non-readable.
Directories:
Files:
1) /u01/app/11.2.0/grid/ccr/hosts/dmorldb07.us.oracle.com/log/collector.log
2) /u01/app/11.2.0/grid/ccr/hosts/dmorldb07.us.oracle.com/log/upgrade.log
3) /u01/app/11.2.0/grid/ccr/hosts/dmorldb07.us.oracle.com/log/sched.log
4) /u01/app/11.2.0/grid/ccr/sysman/admin/scripts/ias/weblogic_j2eeserver/JDBCDataSource.pl
5) /u01/app/11.2.0/grid/ccr/sysman/admin/scripts/ias/weblogic_j2eeserver/JDBCMultiDataSource.pl
6) /u01/app/11.2.0/grid/ccr/sysman/admin/scripts/ias/weblogic_j2eeserver/JMSConnectionFactory.pl
7) /u01/app/11.2.0/grid/ccr/sysman/admin/scripts/ias/weblogic_j2eeserver/JMSQueue.pl
8) /u01/app/11.2.0/grid/ccr/sysman/admin/scripts/ias/weblogic_j2eeserver/JMSTopic.pl
9) /u01/app/11.2.0/grid/ccr/sysman/admin/scripts/ias/weblogic_j2eeserver/JoltConnectionPool.pl
10) /u01/app/11.2.0/grid/ccr/sysman/admin/scripts/ias/weblogic_j2eeserver/ResourceConfig.pl
11) /u01/app/11.2.0/grid/ccr/sysman/admin/scripts/ias/weblogic_j2eeserver/Server.pl
12) /u01/app/11.2.0/grid/ccr/sysman/admin/scripts/ias/weblogic_j2eeserver/ServerConfigUtil.pm
13) /u01/app/11.2.0/grid/ccr/sysman/admin/scripts/ias/weblogic_j2eeserver/StartupShutdownClasses.pl
14) /u01/app/11.2.0/grid/ccr/sysman/admin/scripts/ias/weblogic_j2eeserver/VirtualHost.pl
15) /u01/app/11.2.0/grid/ccr/sysman/admin/scripts/ias/weblogic_j2eeserver/WorkManager.pl
----------------------------------------------------------------------------------
Solution:
Set the correct permissions on the files listed above: enable read and execute (rx) for owner, group
and others, then retry the command.
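When many files are reported, the paths can be saved to a text file (one per line) and fixed in a loop. This is a minimal sketch; the function name `fix_perms` and the list file name are hypothetical, and it needs to be run as a user allowed to change the mode of the files.

```shell
# Sketch only: add read and execute for owner, group and others
# to every file named in a list (one path per line).
fix_perms() {
  list="$1"
  while IFS= read -r f; do
    # Skip blank lines in the list.
    [ -n "$f" ] || continue
    if [ -e "$f" ]; then
      chmod a+rx "$f"
    else
      echo "missing: $f" >&2
    fi
  done < "$list"
}

# Usage with a hypothetical list file built from the PRCF-2023 output:
# fix_perms /tmp/prcf2023_files.txt
```

Once the loop has finished, addNode.sh can be retried.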
3. Another set of files with incorrect permissions is also reported by addNode.sh
Solution:
DIAGNOSTIC
Run rds-ping from the node being restored to the other compute nodes and validate connectivity:
SOLUTION
• Validate rds-ping against all the compute nodes and storage cells
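The validation can be looped over all the private addresses. The following is a sketch under assumptions: the addresses in the usage comment are hypothetical, the `check_rds` function and `RDS_PING` variable are illustrative names, and `-c 3` limits each test to three packets.

```shell
# Sketch only: run rds-ping against a list of addresses and report each result.
# RDS_PING is an illustrative variable so the command path can be overridden.
RDS_PING="${RDS_PING:-rds-ping}"

check_rds() {
  rc=0
  for ip in "$@"; do
    if "$RDS_PING" -c 3 "$ip" >/dev/null 2>&1; then
      echo "OK   $ip"
    else
      echo "FAIL $ip"
      rc=1
    fi
  done
  # Non-zero return status means at least one address did not answer.
  return $rc
}

# Usage with hypothetical InfiniBand addresses of compute nodes and storage cells:
# check_rds 192.168.10.1 192.168.10.2 192.168.10.5
```

The storage cell addresses for the list can be taken from the cellip.ora file copied earlier.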