
Problem

SQL backups fail with status code 24, socket write failed

Error
08:44:18.821 [3796.284] <16> bpcd main: the server is not allowed to write to files on the client

Solution
Issue: SQL backups fail with status code 24, socket write failed

Logs: <install_path>\NetBackup\logs\bpcd log on the client shows that the server is not allowed access and then terminates.
08:44:18.821 [3796.284] <16> bpcd main: the server is not allowed to write to files on the client
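To spot this entry in a large bpcd log, the line can be matched programmatically. The following is a minimal sketch in Python, not part of the technote. It assumes every log line has the same shape as the excerpt above, and the field names (`pid`, `tid`, `severity`, `source`) are labels chosen for illustration, not documented NetBackup terms.

```python
import re

# Assumed line shape, taken from the single excerpt above:
# HH:MM:SS.mmm [number.number] <severity> source: message
LOG_LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<pid>\d+)\.(?P<tid>\d+)\] "
    r"<(?P<severity>\d+)> "
    r"(?P<source>[^:]+): (?P<message>.*)$"
)

DENIED = "the server is not allowed to write to files on the client"


def find_write_denials(lines):
    """Return the parsed fields of every line that carries the denial message."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and DENIED in match.group("message"):
            hits.append(match.groupdict())
    return hits


sample = [
    "08:44:18.821 [3796.284] <16> bpcd main: "
    "the server is not allowed to write to files on the client",
]
for entry in find_write_denials(sample):
    print(entry["time"], entry["severity"], entry["source"])
```

In practice the `lines` argument would be the open bpcd log file on the client rather than the one-element sample list.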

Solution: Make sure the 'Allow server-directed restores' box is checked on the NetBackup client. This can be found under the Backup, Archive, and Restore console > File > NetBackup Client Properties > General tab. See Figures 1 and 2 below.

Figure 1

Figure 2

Problem
BUG REPORT: Restores to Linux clients fail with NetBackup Status Code 24 (socket write failed).

Solution
Bug:
1078466 - Disaster Recovery of 4 TB all-in-one via NBU fails during restore with socket errors
1202564 - During restore on Linux, tar is memory leaking in 6.5 and 6.5.1.

Symptom(s): When restoring data back to a Linux client, the restore fails with NetBackup Status Code 24, socket write failed. The output of the ps -elf command from both the beginning and the end of the restore shows an increase in the memory used by tar.

From the start of the restore (memory used is the value 1166):
0 S root 15348 15347 0 78 0 /usr/openv/netbackup/bin/tar.bin 1166 14:07 ? 00:00:00

From the end of the restore (memory used is the value 6457):

0 T root 15348 15347 25 75 0 /usr/openv/netbackup/bin/tar.bin 6457 utrace 14:07 ? 00:00:27

This memory is increasing, and not being properly released.

Log Files: N/A

Workaround: If the environment is affected by this issue, and the files cannot be restored, contact Symantec support for additional assistance with the restore issues, and reference this technote.
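The growth of tar.bin between the two ps listings can be quantified by comparing the SZ column. The sketch below is illustrative Python, not from the technote. It assumes the standard Linux `ps -elf` column order, and the two sample rows are hypothetical reconstructions in that order: the `-` placeholders in the ADDR and WCHAN columns are assumptions, while the remaining values come from the listings above.

```python
# Assumed header of `ps -elf` on Linux; verify against the real output.
PS_HEADER = "F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD"


def parse_ps_line(line, header=PS_HEADER):
    """Map one `ps -elf` row to a dict keyed by the header's column names."""
    columns = header.split()
    # CMD may contain spaces, so split at most len(columns) - 1 times.
    fields = line.split(None, len(columns) - 1)
    return dict(zip(columns, fields))


def sz_growth(before_line, after_line):
    """Return (SZ before, SZ after, growth ratio) for the same process."""
    before = int(parse_ps_line(before_line)["SZ"])
    after = int(parse_ps_line(after_line)["SZ"])
    return before, after, after / before


# Hypothetical rows in standard column order (see the assumptions above).
start = ("0 S root 15348 15347 0 78 0 - 1166 - 14:07 ? 00:00:00 "
         "/usr/openv/netbackup/bin/tar.bin")
end = ("0 T root 15348 15347 25 75 0 - 6457 utrace 14:07 ? 00:00:27 "
       "/usr/openv/netbackup/bin/tar.bin")

before, after, ratio = sz_growth(start, end)
print(f"SZ grew from {before} to {after} ({ratio:.1f}x)")
```

Sampling the same PID at intervals during a restore and feeding consecutive rows to `sz_growth` would show whether the size keeps climbing.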

ETA of Fix: Symantec Corporation has acknowledged that the above-mentioned issue is present in the current version(s) of the product(s) mentioned at the end of this article. Symantec Corporation is committed to product quality and satisfied customers. This issue is currently being considered by Symantec Corporation to be addressed in a forthcoming Maintenance Pack, Release Update, or version of the product. The fix for this issue is expected to be released in the second quarter of 2008. Please note that Symantec Corporation reserves the right to remove any fix from the targeted release if it does not pass quality assurance tests or introduces new risks to overall code stability. Symantec's plans are subject to change and any action taken by you based on the above information or your reliance upon the above information is made at your own risk. Please refer to the maintenance pack readme or contact NetBackup Enterprise Support to confirm this issue (ET1078466 and ET1202564) was included in the maintenance pack.

Problem
STATUS CODE 24: When using Veritas NetBackup for Microsoft SQL Server to restore the master database for Microsoft SQL Server from an alternate client, the restore fails with a NetBackup Status Code 24 (socket write failed).

Error
socket write failed

Solution
Overview: When using Veritas NetBackup for Microsoft SQL Server to restore the master database for Microsoft SQL Server from an alternate client, the restore fails with a NetBackup Status Code 24 (socket write failed).

Troubleshooting: Enable the dbbackup log file on the Microsoft SQL server.

Log files: The following is observed in the dbbackup log (bold added for clarity):
16:13:32 [1984,2500] <16> CODBCaccess::LogODBCerr: DBMS MSG - ODBC message. ODBC return code <-1>, SQL State <37000>, Message Text <[Microsoft][ODBC SQL Server Driver][SQL Server]The backup of the system database on device VNBU01984-2500 cannot be restored because it was created by a different version of the server (134218112) than this server (134217922).>

Resolution: Bring the destination Microsoft SQL Server to the same MS SQL service pack level as the original MS SQL server that created the backup image. Once this is completed, re-run the restore.
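The two integers in the ODBC message can be turned into readable version strings, which makes the mismatch easier to compare against known service pack builds. The following Python sketch is an illustration only. It assumes the integer packs the major version in the top byte, the minor version in the next byte, and the build number in the low 16 bits; the technote does not state this layout, so treat it as an assumption to verify.

```python
def decode_sql_version(packed):
    """Split a packed version integer into (major, minor, build).

    Assumed layout (not stated in the technote): major in bits 24-31,
    minor in bits 16-23, build in bits 0-15.
    """
    major = (packed >> 24) & 0xFF
    minor = (packed >> 16) & 0xFF
    build = packed & 0xFFFF
    return major, minor, build


def format_version(packed):
    """Render the packed integer as major.minor.build."""
    major, minor, build = decode_sql_version(packed)
    return f"{major}.{minor:02d}.{build}"


backup_version = 134218112   # server that created the backup (from the log)
target_version = 134217922   # server receiving the restore (from the log)

print("backup created by:", format_version(backup_version))
print("restore target:   ", format_version(target_version))
```

Under that assumption the backup was written by a newer build than the destination server is running, which is consistent with the resolution of matching the service pack level before retrying the restore.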

Problem
STATUS CODE 24: The error "socket write failed" appears during backups performed with Veritas NetBackup (tm).

Error
EXIT STATUS 24: socket write failed

Solution
Overview: The transmission control protocol (TCP) network parameter tcp_ip_abort_interval may cause this error if it has been tuned incorrectly. The tcp_ip_abort_interval is the total retransmission timeout value for a TCP connection in milliseconds. For a given TCP connection, if TCP has been retransmitting for tcp_ip_abort_interval period of time and it has not received any acknowledgment from the other endpoint during this period, TCP closes this connection. By default, the tcp_ip_abort_interval parameter is 480000 milliseconds (8 minutes).

Troubleshooting: To obtain the current tcp_ip_abort_interval parameter value, the following command can be run. This is an operating system command and will be found in one of the system directories, depending on the platform. For example, /usr/sbin/ndd can be found on Solaris systems.
# ndd -get /dev/tcp tcp_ip_abort_interval

When tuning the tcp_ip_abort_interval, the following TCP network parameter values must also be taken into consideration:

tcp_rexmit_interval_initial: The initial retransmission timeout (RTO) value for a TCP connection in milliseconds. The default value is 3000 milliseconds (3 seconds).
tcp_rexmit_interval_min: The minimum retransmission timeout (RTO) value in milliseconds. The default value is 400 milliseconds.
tcp_rexmit_interval_max: The maximum retransmission timeout (RTO) value in milliseconds. The default value is 60000 milliseconds (60 seconds).

To obtain the above current TCP parameter values, the following commands can be run:
# ndd -get /dev/tcp tcp_rexmit_interval_initial
# ndd -get /dev/tcp tcp_rexmit_interval_min
# ndd -get /dev/tcp tcp_rexmit_interval_max
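The four ndd queries can be collected in one pass. The sketch below is a hedged Python illustration rather than a supported tool. It assumes that `ndd -get` prints only the numeric value, and the `run` parameter and the `fake_ndd` stand-in are hypothetical helpers added so the example runs on systems without ndd.

```python
import subprocess

TCP_PARAMS = (
    "tcp_ip_abort_interval",
    "tcp_rexmit_interval_initial",
    "tcp_rexmit_interval_min",
    "tcp_rexmit_interval_max",
)


def ndd_get(param, run=subprocess.check_output):
    """Run `ndd -get /dev/tcp <param>` and return the value in milliseconds.

    Assumes the command prints just the number, as on Solaris.
    """
    output = run(["ndd", "-get", "/dev/tcp", param])
    if isinstance(output, bytes):
        output = output.decode()
    return int(output.strip())


def read_tcp_params(run=subprocess.check_output):
    """Return a dict of the four timer values discussed above."""
    return {name: ndd_get(name, run) for name in TCP_PARAMS}


def fake_ndd(argv):
    """Hypothetical stand-in that returns the default values quoted above."""
    defaults = {
        "tcp_ip_abort_interval": 480000,
        "tcp_rexmit_interval_initial": 3000,
        "tcp_rexmit_interval_min": 400,
        "tcp_rexmit_interval_max": 60000,
    }
    return f"{defaults[argv[-1]]}\n".encode()


print(read_tcp_params(run=fake_ndd))
```

On a real Solaris host, calling `read_tcp_params()` with no argument would invoke ndd itself, which typically requires sufficient privileges.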

Log Files: N/A

Resolution: If the tcp_ip_abort_interval timer value is reduced to a value less than the tcp_rexmit_interval_max timer value or any other tcp_rexmit variable (shown above), connections can be aborted. This is because the tcp_ip_abort_interval timer expires before the tcp_rexmit_interval_max (or other tcp_rexmit variable) timer is reached. When the tcp_ip_abort_interval timer value is reached, the TCP connection is closed (RESET signal). The TCP connection reset is presented in the bpbkar log file as an "Errno = 32: Broken pipe" error message, followed by an "Exit status = 24: socket write failed" error message.

If the tcp_ip_abort_interval parameter value must be reduced, the value should be at least four times greater than the tcp_rexmit_interval_max parameter value, as recommended by Sun Microsystems. In addition, Sun Microsystems recommends that the tcp_rexmit_interval_max value be at least eight times the value of tcp_rexmit_interval_min.

It is important to note that the inetd process needs to be restarted after modifying these parameters. If this does not occur, the current tcp_rexmit parameter values will be retained. The Sun Microsystems default TCP parameter values are adequate for the majority of servers and applications currently in use.

The default TCP parameter values should not be modified without adequate research and should follow Sun Microsystems recommendations.
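The two ratio recommendations quoted above can be checked mechanically before a tuned value is applied. The following Python sketch is illustrative; the function name is hypothetical, and it reads "at least four times" and "at least eight times" as greater-than-or-equal comparisons, which is an interpretation of the wording.

```python
def check_tcp_tuning(abort_interval, rexmit_max, rexmit_min):
    """Return a list of violated recommendations (empty if none).

    All values are in milliseconds.
    """
    problems = []
    if abort_interval < 4 * rexmit_max:
        problems.append(
            "tcp_ip_abort_interval should be at least 4 x "
            "tcp_rexmit_interval_max"
        )
    if rexmit_max < 8 * rexmit_min:
        problems.append(
            "tcp_rexmit_interval_max should be at least 8 x "
            "tcp_rexmit_interval_min"
        )
    return problems


# Default values quoted in this technote: no violations expected.
print(check_tcp_tuning(480000, 60000, 400))

# An abort interval tuned below the maximum retransmission timeout.
print(check_tcp_tuning(30000, 60000, 400))
```

With the defaults, 480000 is eight times 60000 and 60000 is 150 times 400, so both recommendations hold; the second call reports the abort interval as too small.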

Problem
Status Code 24 (socket write failed) error occurs during a restore to a Windows 2000 or Windows 2003 Client server when restoring from a multi-stream backup image.

Error
Status Code 24 (socket write failed)

Solution
Status Code 24 (socket write failed) occurred when attempting to restore multiple volumes on a single Windows 2000 or 2003 Client server that were backed up through a multi-stream backup job (i.e. each 'stream' was written to a separate tape). When performing restores, both volumes (on multiple tapes) are selected for restore and both tapes are in the Robotic Library. The first restore job mounts the tape and begins to write data to the Windows 2000 or 2003 destination client with no problems. When the second restore job mounts the second tape and begins to write data, both restore jobs run for a few minutes and then one of the jobs fails with Status Code 24; the other job continues to run and completes successfully. The job that fails is not always the same job (sometimes the first job fails and sometimes the second job fails). The failing jobs never fail at the same location on the tape.

Solution: The above issue is related to the system resources on the destination Client where the multiple restore jobs are run. To allocate more resources for the restore jobs on Windows 2000 and 2003 Client servers, make the following system configuration change.

1. Click Start | Control Panel | System to open the System Properties dialog box (Figure 1)

Figure 1

2. Select the Advanced tab and under Performance click Settings to open the Performance Options dialog box (Figure 2)

Figure 2

3. Edit the system Performance Options to optimize for Background services
a. On Windows 2000 servers, select Application Response | Optimize performance for and select the Background services option
b. On Windows 2003 servers, select the Advanced tab and change the Adjust for the best performance of: option to Background services
4. Click OK to close both dialog boxes and to save the changes made
5. Reboot the server and start the multiple restore jobs again
