
Problem

What is a semaphore and a semaphore timeout?


In a multitasking environment there is often a requirement to synchronize the execution
of various tasks, or to ensure that one process has completed before another begins. This
requirement is met by a software switch known as a semaphore or a flag. It works in much
the same way as a railway signal, allowing only one train on the track at a time. A
semaphore timeout is where the railway signal has been set in one state too long, perhaps
because the train has broken down.

Example of a semaphore timeout in Notes/Domino

An example of this in Notes/Domino is when the indexer needs to completely rebuild an
index: it locks a semaphore so that other tasks cannot use the index until it is rebuilt. If a
user task now tries to open that index while it is being rebuilt, it will have to wait for the
indexer to finish the rebuild and then unlock the semaphore. As a result, the user task is
stuck until that semaphore is unlocked. While it is stuck waiting for the semaphore, it
keeps track of how long it has been waiting. If it is stuck for more than 30 seconds, this is
considered a semaphore timeout and, in debug mode, a message is logged to the
console. The task will continue to wait for the semaphore, timing out every 30 seconds,
until the semaphore is unlocked or the task is ended. For most operations, a task might
only wait a few microseconds and hence not time out. With a complicated view on a
large database, the task may have to wait several minutes for the index semaphore.
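
The wait-and-time-out behaviour described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Domino's implementation: the function name, the message text and the short interval used in the demonstration are assumptions, while the 30-second default mirrors the interval mentioned in the text.

```python
import threading

def wait_for_semaphore(sem, timeout_s=30.0, log=None):
    """Block until `sem` is acquired; record one message per timeout."""
    timeouts = 0
    # acquire() returns False when the interval elapses without success,
    # so the task simply logs the timeout and keeps waiting.
    while not sem.acquire(timeout=timeout_s):
        timeouts += 1
        if log is not None:
            log.append(f"semaphore timeout #{timeouts}, still waiting")
    return timeouts

index_sem = threading.Semaphore(1)
index_sem.acquire()                                  # the "indexer" locks it
releaser = threading.Timer(0.3, index_sem.release)   # rebuild finishes later
releaser.start()

messages = []
timeouts = wait_for_semaphore(index_sem, timeout_s=0.05, log=messages)
releaser.join()
```

The waiting task never gives up: each timeout only produces a message, which matches the description of a task that keeps timing out until the semaphore is unlocked.
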

If an important semaphore is locked by a task and is never unlocked, all tasks can be
stopped waiting for that semaphore. This can happen in several different ways. The most
common is where a task locks the semaphore and then crashes. It can also happen if a
task locks the semaphore and then goes into an endless loop, or gets an error and fails
to unlock it.

Semaphore deadlock can occur when two tasks try to lock two different semaphores in a
different order. For example, Task A locks Semaphore 1 and then tries to lock Semaphore
2. In the meantime, Task B has already locked Semaphore 2 and is now trying to lock
Semaphore 1. Task A is stuck waiting for Semaphore 2 and Task B is waiting for
Semaphore 1: a deadlock.

Reasons for semaphore timeouts

When you receive semaphore timeout messages, the messages are usually the result of
one of the following:
1. A heavy load on the server is causing processes to be delayed from releasing
semaphores.
2. A process has crashed while holding a semaphore, causing other processes to block
when trying to acquire the semaphore.
3. Deadly embrace: semaphore contention where two tasks are waiting on each other and
neither task is able to break the loop. In the simplest case, thread A is trying to get a
semaphore which is owned by B, while B is trying to get a different semaphore which is
owned by A. More complex combinations are also possible: A wants a semaphore owned
by B, who wants a semaphore owned by C, who wants a semaphore owned by A, etc.
4. If a process fails to set a semaphore during execution, another process dependent
on the semaphore will be blocked awaiting the semaphore.
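
The deadly embrace in the list above, including the longer chains, amounts to a cycle in a "who is waiting on whom" relation. The following Python sketch shows one way to detect such a cycle. The `waits_for` mapping is a hypothetical structure invented for this illustration; Domino does not expose the information in this form.

```python
def find_deadlock(waits_for):
    """Return the tasks forming a wait cycle, or None if there is none.

    `waits_for` maps each blocked task to the task that owns the
    semaphore it is waiting for (hypothetical input for illustration).
    """
    for start in waits_for:
        seen = []
        task = start
        # Follow the chain of owners until it ends or repeats.
        while task in waits_for and task not in seen:
            seen.append(task)
            task = waits_for[task]
        if task in seen:
            return seen[seen.index(task):]
    return None

two_way = find_deadlock({"A": "B", "B": "A"})
three_way = find_deadlock({"A": "B", "B": "C", "C": "A"})
no_cycle = find_deadlock({"A": "B", "B": "C"})
```

In the last case C owns a semaphore but is not itself waiting, so A and B are merely delayed rather than deadlocked.
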

Determining if a semaphore timeout has occurred

If an issue is a semaphore issue, it will be reported in two ways:
1. Sem.Timeouts
At the server console, you will see the Sem.Timeouts statistic. To view this statistic, type
the following:
sh stat sem.timeouts
If the problem has occurred, you will see something similar to the following, depending
upon the nature of the semaphore timeout:
Sem.Timeouts = 430D:58 0A13:42 030B:28 0116:26 0A12:21
NOTE: The statistic Sem.Timeouts will not appear in the Statrep if the customer is not
experiencing semaphore timeouts.
The first number is the semaphore ID. The ID tells us what the semaphore is used for. For
example, 0x030B is the collection semaphore used by NIF. The second number is a
decimal number which shows the number of times the semaphore timeout occurred. It is
not unusual in a heavily used server to get semaphore timeouts.
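
As a small illustration of how the ID:count pairs can be read, the Python sketch below splits a Sem.Timeouts line into a dictionary. The function name is an assumption, and the input is simply the sample statistic shown in this document.

```python
def parse_sem_timeouts(line):
    """Split a 'Sem.Timeouts = ID:count ...' line into {semaphore_id: count}."""
    _, _, values = line.partition("=")
    result = {}
    for pair in values.split():
        sem_id, _, count = pair.partition(":")
        result[sem_id.upper()] = int(count)   # the count is a decimal number
    return result

stats = parse_sem_timeouts(
    "Sem.Timeouts = 430D:58 0A13:42 030B:28 0116:26 0A12:21"
)
worst = max(stats, key=stats.get)   # semaphore ID with the most timeouts
```

Sorting by count in this way points at the semaphore to look up first in the table of semaphore IDs.
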
The following table is a list of some of the semaphores you may encounter:

Semaphore  Function
0244       NSF per-database semaphore
410        OS File system semaphore
0266       NSF per-database full-text semaphore
4245       NSF database opening semaphore
030B       NIF collection semaphore
430D       NAMELookup semaphore
0A13       Log commit semaphore
0116       console semaphore
0A12       Buffered log package semaphore
030B       Collection semaphore
4117       Handle table free chain consistency semaphore
0A0B       Session table semaphore
4113       Handle table movement semaphore
33d5       New to Domino 4.5, AdminP's semaphore for ACL modification
4253       internal control semaphore
5708       BSAFE semaphore (RSA encryption stuff)
0255       NSF B-Tree semaphore
0294       Directory Manager Queue semaphore
1120       Transfer queue lock semaphore

2. Error: "Session semaphore held for [n] seconds"
The second way of determining whether there is a semaphore issue is when you see the
following message at the server console:
"Session semaphore held for [n] seconds."
Note: This error does not print to the log file.

Troubleshooting Semaphore Timeouts

For Domino releases 4.5.4 and higher, 4.6.1 and higher, and 5.x, the following parameters
can be added to the NOTES.INI file:
NOTE: Before enabling any debug parameters, it is imperative that you discuss them
with a Lotus Notes Support Analyst. There may be issues surrounding their
use, or special precautions that must be considered. These debug parameters may require
a large amount of disk space, depending on when the server encounters problems; the
longer the server stays up, the larger the debug files will be. These files can grow large
enough to cause disk space shortages.
Debug_Capture_Timeout=1
Debug_Show_Timeout=1
** In R5.0.9 and later releases, setting "Debug_Capture_Timeout=10" will include time
and date information in the SEMDEBUG.TXT. This can be extremely useful in many
cases. See the following technote for additional information:
"How to Turn on Semaphore Debugging Parameters in the NOTES.INI for Domino"
(#160983)
After adding these parameters, if semaphore timeouts occur, they will be captured in a file
called SEMDEBUG.TXT in the Domino program directory:
Output of SEMDEBUG.TXT:
THREAD [02AB:0123] WAITING FOR RWSEM 0x030B Collection semaphore
(@0BE65A20)
(R=0,W=1,WRITER=0080:015E,1STREADER=0000:0000) FOR 30000 ms
THREAD [01E:0220] WAITING FOR RWSEM 0x0244 open database semaphore
(@00ECBDD2)
(R=0,W=1,WRITER=0080:015E,1STREADER=0000:0000) FOR 30000 ms
Output of SEMDEBUG.TXT for Unix:
THREAD [01676:00001] WAITING FOR RWSEM 0x412C (@EE100210)
(R=0,W=1,WRITER=05067:00001,1STREADER=05067:00001) FOR 30000 ms
THREAD [01684:00001] WAITING FOR RWSEM 0x412C (@EE100210)
(R=0,W=1,WRITER=05067:00001,1STREADER=05067:00001) FOR 30000 ms

What does the output mean?
0x030B - Indicates the type of semaphore
[01E:0220]
The first number (01E) indicates the process ID. This number is in hex and must be
converted to decimal. It will be completely different when the server is rebooted.
The second number (0220) is the thread ID.

What does the output mean (in the Unix example)?
0x412C - Indicates the type of semaphore
[01684:00001]
The first number (01684) indicates the process ID. This number is the PID as indicated in
the NSD output and will be completely different when the server is rebooted.
The second number (00001) is the thread ID.
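
To make the field layout concrete, here is a minimal Python sketch that pulls the process ID, thread ID and semaphore ID out of SEMDEBUG.TXT entries. The regular expression is inferred from the sample lines in this document only, so other releases may format entries differently, and the `pid_is_hex` switch is an assumption used to reflect the difference between the NT and Unix examples.

```python
import re

# Pattern inferred from the sample entries in this document.
ENTRY = re.compile(
    r"THREAD \[(?P<pid>[0-9A-Fa-f]+):(?P<tid>[0-9A-Fa-f]+)\]\s+"
    r"WAITING FOR (?:RW)?SEM (?P<sem>0x[0-9A-Fa-f]+)"
)

def parse_waiters(text, pid_is_hex=True):
    """Return (pid, thread_id_text, semaphore_id) for each waiting thread."""
    base = 16 if pid_is_hex else 10
    return [
        (int(m["pid"], base), m["tid"], m["sem"])
        for m in ENTRY.finditer(text)
    ]

nt_sample = "THREAD [01E:0220] WAITING FOR RWSEM 0x0244 open database semaphore"
unix_sample = "THREAD [01684:00001] WAITING FOR RWSEM 0x412C (@EE100210)"

nt = parse_waiters(nt_sample)                         # hex PID, converted
unix = parse_waiters(unix_sample, pid_is_hex=False)   # PID as shown by NSD
```

The thread ID is kept as text because the document does not say how it should be converted.
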

NSD output:
[1] 1684: /opt/lotus/notes/latest/sunspa/tmmscan <--- fatal thread
###### thread 1/4 :: tmmscan, pid=1684, lwp=1 ######
[1] eed396a0 lwp_sema_p (228a30)
[2] eed396a0 __lwp_sema_wait (228a30, 1d670, 0, 0, 0, 0) + 8
[3] eefc779c _park (228990, 228a30, 0, 1, eefe6240, 0) + a0
[4] eefc7554 _swtch (2289a0, 228b90, 228a10, 228a0c, 228a08, 228a04) + 2cc
[5] eefc5f8c _cond_timedwait_cancel (efffdd90, efffdd78, efffdd70, 228990, eefe52b0, 0) + 1e4
[6] eefd3420 _ti_sleep (1e, eefe52b0, eefe52b0, effff2e3, effff154, f8000600) + 100
[7] ef0776e4 fatal_error (b, ef663dd4, ef657598, efffde50, 228a04, 2289e4) + 2c0
[8] eefd2f20 __libthread_segvhdlr (b, efffe4b0, efffe1f8, efffe138, eefe52b0, 2289e4) + e0
[9] eefd2334 sigacthandler (b, efffe4b0, 228990, eefe52b0, efffe1f8, eefd2e40) + 6e0
[10] 000201d0 KC_ScanStart (efffeec0, 3ae18, efffeec0, efffeec4, efffeec8, efffeecc) + 13a0
[11] 000157b0 AddInMain (22, 1, effff144, ef7ed2b8, ef7ec9c0, 0) + d00
[12] 0002fb40 NotesMain (1, effff144, effff144, 0, 0, ef7c16e1) + 40
[13] 0002fa60 notes_main (0, 0, 0, 1, effff144, 0) + a8
[14] 000147ec _start (0, 0, 0, 0, 0, 0) + dc
From the Process Tree section:
username status pid program
notes06 R 1682 /opt/lotus/notes/latest/sunspa/http
notes06 R 1684 /opt/lotus/notes/latest/sunspa/tmmscan

What other information should be collected?

For NT:
1. A screenshot of the Windows NT Task Manager, or the output of the PSTAT utility in
the NT resource kit, redirected to a file (PSTAT > somefile.txt).
It is important to collect a screen capture of Task Manager or the PSTAT output
immediately when the semaphores begin to scroll on the Domino server console. When
matching a process ID from SEMDEBUG.TXT to Task Manager, it is important to take
the ID from the first semaphore error that is generated in the file. This is usually the
offending process.
2. Issue a "Show Task Debug" command as soon as the semaphores begin to scroll on the
Domino server console.
Example: sh task debug>debug.txt

For Unix:
1. Run NSD under root. Make sure "KillProcess=1" is removed from the NOTES.INI to
prevent the server from killing the process when the server crashes, which would make it
impossible to map the PID to SEMDEBUG.TXT.
It is important to collect an NSD immediately when the semaphores begin to scroll on the
Domino server console. When matching a process ID from SEMDEBUG.TXT to the NSD
Process Tree, it is important to take the ID from the first semaphore error that is
generated in the file. This is usually the offending process.
2. Issue a "Show Task Debug" command as soon as the semaphores begin to scroll on the
Domino server console.
Example: sh task debug>debug.txt

Please contact Notes Technical Support if you need further assistance.

How to utilize SEMDEBUG.TXT and Task Manager

Once the SEMDEBUG.TXT file is collected, it is the FIRST line that is the most
important in determining the problem. This line usually contains the offending process.
Specifically we want to look at the OWNER process.
Caveat: There is no time/date stamp in the file (unless the Debug_Capture_Timeout=10
parameter is used for a Domino 5.0.9 or later server).
The first number in brackets is the Process ID (PID). The second number (after the colon)
is the thread ID. The Process ID must be converted from hex to decimal.
In the excerpt below the OWNER is PID 009A. PID 009A converts to a decimal number
of 154 (Process 0053 is waiting for process 009A to release the semaphore). When
matched up to Figures 1 and 2 below (screen shots of NT Task Manager), it is apparent
that Agent Manager is the offending process.

Sample SEMDEBUG.TXT
THREAD [0053:002] WAITING FOR SEM 0x042D LSIWRAP shared semaphore
(@01342C68) (OWNER=009A:0099) FOR 120 ms
THREAD [0045:0058] WAITING FOR SEM 0x042D LSIWRAP shared semaphore
(@01342C68) (OWNER=009A:0099) FOR 120 ms
THREAD [00A4:00A3] WAITING FOR SEM 0x042D LSIWRAP shared semaphore
(@01342C68) (OWNER=009A:0099) FOR 120 ms
THREAD [009E:009D] WAITING FOR SEM 0x042D LSIWRAP shared semaphore
(@01342C68) (OWNER=0000:0000) FOR 120 ms
THREAD [0045:0058] WAITING FOR SEM 0x042D LSIWRAP shared semaphore
(@01342C68) (OWNER=009E:009D) FOR 120 ms
THREAD [00A0:009] WAITING FOR SEM 0x042D LSIWRAP shared semaphore
(@01342C68) (OWNER=009E:009D) FOR 120 ms
THREAD [00A2:00A1] WAITING FOR SEM 0x042D LSIWRAP shared semaphore
(@01
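
The lookup described in this section (take the OWNER from the first entry and convert its PID from hex to decimal) can be sketched in Python as follows. The function name is an assumption, and the pattern is based only on the sample entries shown here.

```python
import re

# OWNER=<pid>:<thread>, both written in hex in the NT sample.
OWNER = re.compile(r"OWNER=([0-9A-Fa-f]+):([0-9A-Fa-f]+)")

def first_owner_pid(semdebug_text):
    """Decimal PID of the OWNER in the first entry, or None if absent."""
    match = OWNER.search(semdebug_text)
    return int(match.group(1), 16) if match else None

sample = (
    "THREAD [0053:002] WAITING FOR SEM 0x042D LSIWRAP shared semaphore\n"
    "(@01342C68) (OWNER=009A:0099) FOR 120 ms\n"
)
owner_pid = first_owner_pid(sample)   # hex 009A is decimal 154
```

The resulting decimal value is the number to look for in the PID column of Task Manager.
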

Figure 1 (Task Manager):
Note: If you are using a browser or client that is not displaying Figures 1 and 2 below,
please reference the document titled "How To Utilize SEMDEBUG.TXT and Task
Manager" (#175384), where you should be able to view the figures.
Figure 2 (Task Manager):
