Contents
- Introduction
- What is Deadlock?
- The Dining Philosophers Program
- Finding Deadlocks Using the Thread Analyzer
- Understanding the Experiment Result
- Summary
This document presents a tutorial on using the Sun Studio Thread
Analyzer for detecting potential and actual deadlocks. The tutorial
will help you get started using the Thread Analyzer and understanding
deadlocks.
If you want a quick overview of the steps involved in using the Thread
Analyzer, refer to the Thread Analyzer Getting Started Guide.
Deadlock describes a condition where two or more threads are
blocked (hung) forever, waiting for each other. Suppose we have a
process with two or more threads. A deadlock occurs when the following
three conditions hold:
- Threads already holding locks request new locks,
- The requests are made concurrently, and
- Two or more threads form a circular chain where each thread waits
for a lock that the next thread in the chain holds.
Here is a simple example of a deadlock condition:
Thread 1: holds lock A, requests lock B
Thread 2: holds lock B, requests lock A
A deadlock can be of two types: A potentiaL deadlock or an
actual deadlock. A potential deadlock is a deadlock that did
not occur in a given run, but can occur in other runs of the program
depending on the scheduling of threads and the timings of the requests
for locks by the threads. An actual deadlock is one that actually
occured in a given run of the program. An actual deadlock causes the
threads involved to hang, but may or may not cause the whole process
to hang.
Throughout this document, we will use a multi-threaded program to
illustrate deadlocks and the use of the Sun Studio Thread Analyzer.
The program is a C program that uses POSIX threads to simulate the
dining philosophers problem. The source file is called
din_philo.c. The program can exhibit both potential and
actual deadlocks.
Here is a listing of the program din_philo.c:
Program din_philo.c
[Figure 1]
POSIX Thread Program for Simulating the Dining Philosophers Problem
1 #include
2 #include
3 #include
4 #include
5 #include
6 #include
7
8 #define PHILO 5
9 #define DELAY 5000
10 #define FOOD 50
11
12 pthread_mutex_t forks[PHILO];
13 pthread_t phils[PHILO];
14 void *philosopher (void *id);
15 int food_on_table ();
16 void get_fork (int, int, char *);
17 void down_forks (int, int);
18 pthread_mutex_t foodlock;
19
20 int sleep_seconds = 0;
21
22 int
23 main (int argn,
24 char **argv)
25 {
26 int i;
27
28 if (argn == 2)
29 sleep_seconds = atoi (argv[1]);
30
31 pthread_mutex_init (&foodlock, NULL);
32 for (i = 0; i < PHILO; i++)
33 pthread_mutex_init (&forks[i], NULL);
34 for (i = 0; i < PHILO; i++)
35 pthread_create (&phils[i], NULL, philosopher, (void *)i);
36 for (i = 0; i < PHILO; i++)
37 pthread_join (phils[i], NULL);
38 return 0;
39 }
40
41 void *
42 philosopher (void *num)
43 {
44 int id;
45 int i, left_fork, right_fork, f;
46
47 id = (int)num;
48 printf ("Philosopher %d sitting down to dinner.\n", id);
49 right_fork = id;
50 left_fork = id + 1;
51
52 /* Wrap around the forks. */
53 if (left_fork == PHILO)
54 left_fork = 0;
55
56 while (f = food_on_table ()) {
57
58 /* Thanks to philosophers #1 who would like to
59 * take a nap before picking up the forks, the other
60 * philosophers may be able to eat their dishes and
61 * not deadlock.
62 */
63 if (id == 1)
64 sleep (sleep_seconds);
65
66 printf ("Philosopher %d: get dish %d.\n", id, f);
67 get_fork (id, right_fork, "right");
68 get_fork (id, left_fork, "left ");
69
70 printf ("Philosopher %d: eating.\n", id);
71 usleep (DELAY * (FOOD - f + 1));
72 down_forks (left_fork, right_fork);
73 }
74 printf ("Philosopher %d is done eating.\n", id);
75 return (NULL);
76 }
77
78 int
79 food_on_table ()
80 {
81 static int food = FOOD;
82 int myfood;
83
84 pthread_mutex_lock (&foodlock);
85 if (food > 0) {
86 food--;
87 }
88 myfood = food;
89 pthread_mutex_unlock (&foodlock);
90 return myfood;
91 }
92
93 void
94 get_fork (int phil,
95 int fork,
96 char *hand)
97 {
98 pthread_mutex_lock (&forks[fork]);
99 printf ("Philosopher %d: got %s fork %d\n", phil, hand, fork);
100 }
101
102 void
103 down_forks (int f1,
104 int f2)
105 {
106 pthread_mutex_unlock (&forks[f1]);
107 pthread_mutex_unlock (&forks[f2]);
108 }
|
In the above program,
din_philo.c we have five philosophers,
numbered 0-4. The philosophers are seated in numerical order around a
round table in clockwise fashion. The philosophers are separated by
forks. The fork on the right of each philosopher has the same number
as the philosopher's own number. The fork on the left has a larger
number except for philosopher #4. Figure 2 shows how the philosophers
and forks are arranged around the table.
The Dining Philosophers and Their Forks
[Figure 2]
P0
f0 f1 P=Philosopher
f=fork
P4 P1
f4 f2
P3 P2
f3
|
Each philosopher will first reach for the fork on his right. Once a
philosopher has grabbed the fork on his right, he will try to grab the
fork on his left. When a philospher has grabbed both forks, he can
proceed to eat. After the philosopher has had some food, he will
place both forks back on the table, then repeat the whole process
again (reach for the fork on his right, etc.). The process is
repeated until there is no food for the philosopher to consume.
An actual deadlock occurs when every philosopher is holding the fork
on the right, and waiting for the fork on the left to become
available. In this situation, no philospher can proceed to eat and
the philosophers will wait forever, i.e. will deadlock.
In the program, philosopher #1 can be put to sleep for a specified
amount of time (sleep_seconds) before picking his forks. If
he sleeps long enough, then the program may finish without any actual
deadlock. The number of seconds philosopher #1 sleeps can be specified
as an argument to the excutable. If no argument is specified, then
the philospher will not sleep by default.
In our dining philosophers example, an actual deadlock occurs when
we have the following scenario:
Philosopher #0: holds fork #0, waits for fork #1
Philosopher #1: holds fork #1, waits for fork #2
Philosopher #2: holds fork #2, waits for fork #3
Philosopher #3: holds fork #3, waits for fork #4
Philosopher #4: holds fork #4, waits for fork #0
If you try to run the dining philosophers program, din_philo.c, the
program may hang as in Figure 3.
Running the Program:
[Figure 3]
% cc din_phil.c -mt
% a.out
Philosopher 0 sitting down to dinner.
Philosopher 2 sitting down to dinner.
Philosopher 1 sitting down to dinner.
Philosopher 0: get dish 49.
Philosopher 3 sitting down to dinner.
Philosopher 1: get dish 47.
Philosopher 4 sitting down to dinner.
Philosopher 1: got right fork 1
Philosopher 3: get dish 46.
Philosopher 1: got left fork 2
Philosopher 1: eating.
Philosopher 4: get dish 45.
Philosopher 3: got right fork 3
Philosopher 0: got right fork 0
Philosopher 4: got right fork 4
Philosopher 2: get dish 48.
Philosopher 2: got right fork 2
Philosopher 1: get dish 44.
Philosopher 0: got left fork 1
Philosopher 0: eating.
...
Philosopher 2: get dish 5.
Philosopher 1: get dish 4.
Philosopher 2: got right fork 2
Philosopher 0: got left fork 1
Philosopher 0: eating.
Philosopher 1: got right fork 1
Philosopher 4: got left fork 0
Philosopher 4: eating.
Philosopher 0: get dish 3.
Philosopher 0: got right fork 0
Philosopher 4: get dish 2.
Philosopher 4: got right fork 4
(hang)
Execution terminated by pressing CTRL-C
|
If you rerun the program a number of times, you will see that the
program may sometimes hang (as shown in Figure 3), while the program
may run to completion at other times. Try running the program several
times. Also try specifying different sleep arguments to the
executable a.out.
Figure 4 shows a run where a sleep argument of 30 seconds is
specified, indicating that philosopher philosopher #1 is to sleep for
30 seconds before reaching for his right-hand fork. The program in
Figure 4 runs to completion and all five philosophers are done eating.
Running the Program with a Sleep Argument:
[Figure 4]
% a.out 30
Philosopher 0 sitting down to dinner.
Philosopher 2 sitting down to dinner.
Philosopher 2: get dish 48.
Philosopher 3 sitting down to dinner.
Philosopher 3: get dish 47.
Philosopher 0: get dish 49.
Philosopher 0: got right fork 0
Philosopher 0: got left fork 1
Philosopher 4 sitting down to dinner.
Philosopher 4: get dish 46.
Philosopher 4: got right fork 4
Philosopher 0: eating.
Philosopher 3: got right fork 3
Philosopher 2: got right fork 2
Philosopher 1 sitting down to dinner.
Philosopher 1: get dish 45.
Philosopher 1: got right fork 1
Philosopher 4: got left fork 0
Philosopher 4: eating.
Philosopher 0: get dish 44.
...
Philosopher 4: got right fork 4
Philosopher 3: got right fork 3
Philosopher 1: got left fork 2
Philosopher 1: eating.
Philosopher 2: get dish 1.
Philosopher 1 is done eating.
Philosopher 0: got left fork 1
Philosopher 0: eating.
Philosopher 2: got right fork 2
Philosopher 0 is done eating.
Philosopher 4: got left fork 0
Philosopher 4: eating.
Philosopher 4 is done eating.
Philosopher 3: got left fork 4
Philosopher 3: eating.
Philosopher 3 is done eating.
Philosopher 2: got left fork 3
Philosopher 2: eating.
Philosopher 2 is done eating.
%
Execution terminated normally
|
You can use the thread Analyzer to investigate potential and actual
deadlocks in your program. The Thread Analyzer follows the same
"collect-analyze" model that the Sun Studio Performance Analyzer
uses. There are three steps involved in using the Thread Analyzer:
- 3.1 Compile the Source Code
- 3.2 Create an Experiment
- 3.3 Examine the Experiment
These three steps are described below.
3.1 Compile the Source Code
It is recommended that you use
-g and no
optimization level when compiling your program for error checking by
the thread analyzer. Compile an OpenMP program with
-g
-xopenmp=noopt, and compile a POSIX threads program with just
-g. The information reported, such as line numbers and
callstacks, may be incorrect when a high optimization level is used.
3.2 Create an Experiment
Use the collect command with the -r deadlock option
to run the program and create a deadlock-detection experiment during
the execution of the process.
Note: To increase the likelihood of detecting deadlocks, it
is recommended that you create several deadlock-detection experiments
using collect with the -r deadlock option. Use a
different number of threads and different input data in the different
experiments.
3.3 Examine the Experiment
A deadlock-detection experiment can be examined with the tha
command, the analyzer command, or the er_print
utility.
Both the tha and the analyzer commands present a GUI
interface; the former presents a simplified set of default tabs, but
is otherwise identical to the analyzer.
The tha GUI has a menu bar, a tool bar, and a split pane that
contains tabs for the various displays. On the left-hand pane, you
will notice that the following three tabs are shown by default:
- The Deadlocks tab. Shows a list of potential and
actual deadlocks detected in the program. This tab is selected by
default. For each deadlock, the total number of threads involved in
the deadlock is displayed, in addition, the threads participating in
the deadlock are shown. The threads form a circular chain where each
thread holds a lock and requests a lock that the next thread in the
chain holds.
- The Dual Source tab. If a threads in the circular
chain is selected, then the Dual Source tab shows the two source
locations where the thread held a lock and requested a lock. The
source line where a thread is held or requested will appear
highlighted.
- The Experiments tab. Shows the load objects in the
experiment, and lists error and warning messages.
On the right-hand pane of the tha display, the following
two tabs are shown:
- The Summary tab. Shows summary information about a
data race access selected from the Races tab.
- The Deadlocks Details tab. Shows detailed
information about a deadlock selected from the Deadlocks tab.
The er_print utility, on the other hand, presents a
command-line interface. The subcommands that are generally useful for
examining deadlocks with er_print are:
deadlocks
Report any potential and actual detected in the experiment.
ddetail deadlock_id
Display detailed information about the deadlock with the specified
deadlock_id. If the specified
deadlock_id is
all, then detailed information
about all deadlocks will be displayed.
header
Display descriptive information about the experiment, and report any
errors and warnings.
Refer to the
collect(1),
collector(1),
tha(1),
analyzer(1),
and
er_print(1)
man pages for more information.
The following examples show the steps involved in compiling the
program din_philo.c, creating a deadlock-detection
experiment, and using er_print or the tha GUI to
display actual and potential deadlocks.
First, we will look at a run where an actual deadlock occurs and the
program hangs; the thread Analyzer tool can be used to investigate the
cause of the actual deadlock. Second, we will look at a run where the
program terminates normally; the Thread Analyzer in this case is able
to detect a potential deadlock.
4.1 Actual Deadlock
The following run of the program (Figure 5) results in an actual
deadlock. The
er_print and
tha GUI provide
information about the deadlock.
[Figure 5]
% cc din_philo.c -mt -g
% collect -r deadlock a.out
Creating experiment database tha.1.er ...
Philosopher 0 sitting down to dinner.
Philosopher 1 sitting down to dinner.
Philosopher 2 sitting down to dinner.
Philosopher 0: get dish 49.
Philosopher 2: get dish 48.
Philosopher 1: get dish 47.
Philosopher 2: got right fork 2
Philosopher 3 sitting down to dinner.
Philosopher 1: got right fork 1
Philosopher 0: got right fork 0
Philosopher 2: got left fork 3
Philosopher 2: eating.
Philosopher 4 sitting down to dinner.
Philosopher 3: get dish 46.
Philosopher 4: get dish 45.
Philosopher 4: got right fork 4
Philosopher 3: got right fork 3
Philosopher 1: got left fork 2
Philosopher 1: eating.
Philosopher 2: get dish 44.
...
Philosopher 2: eating.
Philosopher 2: get dish 4.
Philosopher 1: got left fork 2
Philosopher 1: eating.
Philosopher 3: got right fork 3
Philosopher 2: got right fork 2
Philosopher 0: got left fork 1
Philosopher 0: eating.
Philosopher 1: get dish 3.
Philosopher 1: got right fork 1
Philosopher 0: get dish 2.
Philosopher 0: got right fork 0
(hang)
Execution terminated by pressing CTRL-C
% er_print tha.1.er
(er_print) deadlocks
Deadlock #1, Potential deadlock
Thread #2
Lock being held: 0x21590, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x215a8, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Thread #3
Lock being held: 0x215a8, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x215c0, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Thread #4
Lock being held: 0x215c0, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x215d8, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Thread #5
Lock being held: 0x215d8, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x215f0, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Thread #6
Lock being held: 0x215f0, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x21590, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Deadlock #2, Actual deadlock
Thread #2
Lock being held: 0x21590, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x215a8, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Thread #3
Lock being held: 0x215a8, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x215c0, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Thread #4
Lock being held: 0x215c0, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x215d8, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Thread #5
Lock being held: 0x215d8, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x215f0, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Thread #6
Lock being held: 0x215f0, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x21590, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Deadlocks List Summary: Experiment: tha.1.er Total Deadlocks: 2
(er_print)
|
The following screen-shot shows the deadlocks detected in
din_philo.c displayed through the tha GUI, using
the command:
% tha tha.1.er
Click on the image to see it full sized
Looking at the Deadlocks tab, we find that two deadlocks are reported
for din_philo.c, one potential and the other actual. On
closer inspection, we find that the two deadlocks are actually
identical. The circular chain involved in the deadlock is as follows:
Thread 2: holds lock at address 0x21590, requests lock at address 0x215a8
Thread 3: holds lock at address 0x215a8, requests lock at address 0x215c0
Thread 4: holds lock at address 0x215c0, requests lock at address 0x215d8
Thread 5: holds lock at address 0x215d8, requests lock at address 0x215f0
Thread 6: holds lock at address 0x215f0, requests lock at address 0x21590
If you select the first deadlock in the list (Thread #2) and then
click on the Dual Source tab, you will see where in the source code
Thread #2 held the lock at address 0x21590, and where in the source
code it requested the lock at address 0x215a8. The following
screen-shot shows the Dual Source tab for Thread #2.
Click on the image to see it full sized.
4.2 Potential Deadlock
If program din_philo.c is run with a large enough sleep
argument, it may be possible to avoid actual deadlock and allow the
program to terminate normally. Normal termination, however, is not an
indication that the program is free of deadlocks. It simply means
that the timings of the requests for locks by the threads did not
result in a circular chain in this particular run. Different timings
of the requests in other runs may result in actual deadlock.
The Thread Analyzer tool can be very helpful here as it can detect
potential deadlocks and alert the programmer to their potential
danger. The following run of the program (Figure 6) results in a
potential deadlock. The er_print and tha GUI provide information about
the deadlock.
[Figure 6]
% cc din_philo.c -mt -g
% collect -r deadlock a.out 60
Creating experiment database tha.2.er ...
cattoy-a/deadlock_0005 (316)% collect -r deadlocks a.out 60
Creating experiment database tha.2.er ...
Philosopher 0 sitting down to dinner.
Philosopher 1 sitting down to dinner.
Philosopher 2 sitting down to dinner.
Philosopher 0: get dish 49.
Philosopher 2: get dish 48.
Philosopher 0: got right fork 0
Philosopher 2: got right fork 2
Philosopher 3 sitting down to dinner.
Philosopher 4 sitting down to dinner.
Philosopher 3: get dish 46.
Philosopher 2: got left fork 3
Philosopher 2: eating.
Philosopher 4: get dish 45.
Philosopher 0: got left fork 1
Philosopher 0: eating.
Philosopher 4: got right fork 4
Philosopher 4: got left fork 0
Philosopher 4: eating.
Philosopher 2: get dish 43.
Philosopher 3: got right fork 3
...
Philosopher 2: eating.
Philosopher 3 is done eating.
Philosopher 4: got left fork 0
Philosopher 4: eating.
Philosopher 0 is done eating.
Philosopher 2 is done eating.
Philosopher 4 is done eating.
Philosopher 1: get dish 47.
Philosopher 1: got right fork 1
Philosopher 1: got left fork 2
Philosopher 1: eating.
Philosopher 1 is done eating.
%
Execution terminated normally
% er_print tha.2.er
(er_print) deadlocks
Deadlock #1, Potential deadlock
Thread #2
Lock being held: 0x21590, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x215a8, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Thread #3
Lock being held: 0x215a8, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x215c0, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Thread #4
Lock being held: 0x215c0, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x215d8, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Thread #5
Lock being held: 0x215d8, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x215f0, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Thread #6
Lock being held: 0x215f0, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Lock being requested: 0x21590, at: get_fork + 0x0000002C, line 98 in "din_philo.c"
Deadlocks List Summary: Experiment: tha.2.er Total Deadlocks: 1
(er_print)
|
The following screen-shot shows the potential deadlock displayed
through the tha GUI, using the command:
% tha tha.2.er
Click on the image to see it full sized.
The above discussion presented a multi-threaded program that simulates
the dining philosophers problem. The program has deadlocks. We have
shown how the Thread Analyzer can be used to detect the deadlocks. An
especially useful feature of the Thread Analyzer is that it can detect
potential deadlocks in a program that terminates normally, alerting
the programmer to potential problems in the code.
(Last updated February 23, 2007)