Contents
- Introduction
- What is a Data-Race?
- Finding Data-Races Using
DRDT
- Understanding the
Experiment Result
- What to Do after a Data-Race
is Found?
- False Positive
and Benign Data-Races
- User APIs
This document presents a tutorial for using the Sun Studio Data-Race
Detection Tool (DRDT). The tutorial will help you get started with
using DRDT and understanding data-races.
If you want a quick overview of the steps involved in using DRDT, refer
to the DRDT Getting Started Guide.
Throughout this document, we will use two multi-threaded programs to
illustrate data-races and the use of the Sun Studio Data-Race
Detection Tool (DRDT).
The first program (Program 1) is a C program that finds prime numbers,
parallelized using OpenMP. The source file is called
omp_prime.c. There are data-races in the program.
The second program (Program 2) is also a C program that finds prime
numbers, but is parallelized using POSIX threads instead of
OpenMP. The source file is called pthr_prime.c. There are
data-races in the program.
Here is a listing of Program 1 (omp_prime.c):
Program 1:
[Figure 1]
OpenMP Program for Finding Prime Numbers (omp_prime.c)
1 #include <stdio.h>
2 #include <math.h>
3 #include <omp.h>
4
5 #define THREADS 4
6 #define N 3000
7
8 int primes[N];
9 int pflag[N];
10
11 int is_prime(int v)
12 {
13 int i;
14 int bound = floor(sqrt ((double)v)) + 1;
15
16 for (i = 2; i < bound; i++) {
17 /* No need to check against known composites */
18 if (!pflag[i])
19 continue;
20 if (v % i == 0) {
21 pflag[v] = 0;
22 return 0;
23 }
24 }
25 return (v > 1);
26 }
27
28 int main(int argn, char **argv)
29 {
30 int i;
31 int total = 0;
32
33 #ifdef _OPENMP
34 omp_set_num_threads(THREADS);
35 omp_set_dynamic(0);
36 #endif
37
38 for (i = 0; i < N; i++) {
39 pflag[i] = 1;
40 }
41
42 #pragma omp parallel for
43 for (i = 2; i < N; i++) {
44 if ( is_prime(i) ) {
45 primes[total] = i;
46 total++;
47 }
48 }
49 printf("Number of prime numbers between 2 and %d: %d\n",
50 N, total);
51 for (i = 0; i < total; i++) {
52 printf("%d\n", primes[i]);
53 }
54
55 return 0;
56 }
|
Here is a listing of Program 2 (pthr_prime.c):
Program 2:
[Figure 2]
POSIX Thread Program for Finding Prime Numbers (pthr_prime.c)
1 #include <stdio.h>
2 #include <math.h>
3 #include <pthread.h>
4
5 #define THREADS 4
6 #define N 3000
7
8 int primes[N];
9 int pflag[N];
10 int total = 0;
11
12 int is_prime(int v)
13 {
14 int i;
15 int bound = floor(sqrt ((double)v)) + 1;
16
17 for (i = 2; i < bound; i++) {
18 /* No need to check against known composites */
19 if (!pflag[i])
20 continue;
21 if (v % i == 0) {
22 pflag[v] = 0;
23 return 0;
24 }
25 }
26 return (v > 1);
27 }
28
29 void *work(void *arg)
30 {
31 int start;
32 int end;
33 int i;
34
35 start = (N/THREADS) * (*(int *)arg) ;
36 end = start + N/THREADS;
37 for (i = start; i < end; i++) {
38 if ( is_prime(i) ) {
39 primes[total] = i;
40 total++;
41 }
42 }
43 return NULL;
44 }
45
46 int main(int argn, char **argv)
47 {
48 int i;
49 pthread_t tids[THREADS-1];
50
51 for (i = 0; i < N; i++) {
52 pflag[i] = 1;
53 }
54
55 for (i = 0; i < THREADS-1; i++) {
56 pthread_create(&tids[i], NULL, work, (void *)&i);
57 }
58
59 i = THREADS-1;
60 work((void *)&i);
61
62 printf("Number of prime numbers between 2 and %d: %d\n",
63 N, total);
64 for (i = 0; i < total; i++) {
65 printf("%d\n", primes[i]);
66 }
67
68 return 0;
69 }
|
A data-race occurs when all of the following conditions hold:
- Two or more threads in a single process access the same
memory location concurrently, and
- At least one of the accesses is for writing, and
- The threads are not using any exclusive locks to control their
accesses to that memory
When the above three conditions hold, the order of accesses is
non-deterministic, and the computation may give different results from
one run to another, depending on that order. Some data-races may be
benign (for example, when the memory access is used for a busy-wait),
but many data-races are either bugs or caused by bugs in the program..
For example, as a result of data-races in Program 1
(omp_prime.c), different runs of the program may produce
incorrect and inconsistent results as shown below.
Running Program 1:
[Figure 3]
% cc -xopenmp=noopt omp_prime.c -lm .
% a.out | sort -n
0
0
0
0
0
0
0
Number of prime numbers between 2 and 3000: 336
2
3
5
7
11
13
17
19
23
29
31
37
41
43
47
53
59
61
67
71
...
2971
2999
% a.out | sort -n
0
0
0
0
0
0
0
0
0
Number of prime numbers between 2 and 3000: 325
3
5
7
13
17
19
23
29
31
41
43
47
61
67
71
73
79
83
89
101
...
2971
2999
|
Similarly, as a result of data-races in Program 2
(pthr_prime.c), different runs of the program may produce
incorrect and inconsistent results as shown below.
Running Program 2:
[Figure 4]
% cc pthr_prime.c -lm -mt .
% a.out | sort -n
Number of prime numbers between 2 and 3000: 304
751
757
761
769
773
787
797
809
811
821
823
827
829
839
853
857
859
863
877
881
...
2999
2999
% a.out | sort -n
Number of prime numbers between 2 and 3000: 314
751
757
761
769
773
787
797
809
811
821
823
827
839
853
859
877
881
883
907
911
...
2999
2999
|
The Data Race Detection Tools (DRDT) follows the same
"collect-analyze" model that the Sun Studio Analyzer uses.
There are three steps involved in using DRDT:
- Instrument the source code
- Create a data-race-detection experiment
- Examine the data-race-detection experiment
These three steps are described below.
3.1 Instrument the source code
In order to enable data-race detection in a program, the program must
first be instrumented. You can instrument a program by compiling the
source file(s) with a special compiler option. This special option for
each of C, C++, and F90 is: -xinstrument=datarace
Add the -xinstrument=datarace compiler option to the existing
set of options you use to compile your program. You can apply the
option to only the source files that you suspect to have data-races.
Note:
It is recommended that you use no optimization level when compiling
your program for data-race detection. Compile an OpenMP program with
-xopenmp=noopt. The information reported, such as line numbers and
callstacks, may be incorrect when a high optimization level is used.
3.2 Create a Data-Race-Detection Experiment
Use the collect command with the -r on flag to run
the program and create a data-race-detection experiment during the
execution of the process. If running an OpenMP program, make sure
that the number of threads used is larger than 1.
Note:
To increase the likelihood of detecting data-races, it is recommended
that you create several data-race-detection experiments using
collect with the -r on flag. Use a different number
of threads and different input data in the different experiments.
3.3 Examine the Data-Race-Detection Experiment
A data-race-detection experiment can be examined with the rdt
command, the analyzer command, or the er_print
utility.
Both the rdt and the analyzer commands present a GUI
interface; the former presents a simplified set of default tabs, but
is otherwise identical to the analyzer.
The rdt GUI has a menu bar, a tool bar, and a split pane that
contains tabs for the various displays. On the left-hand pane, the
following three tabs are shown by default:
- The Races tab. Shows a list of data-races detected in the
program. This tab is selected by default.
- The Race Source tab. Shows the two source locations corresponding
to the two accesses of a selected data-race. The source line where a
data-race access occurred will appear highlighted.
- The Experiments tab. Shows the load objects in the
experiment, and lists error and warning messages.
On the right-hand pane of the rdt display, the following
two tabs are shown:
- The Summary tab. Shows summary information about a data-race
access selected from the Races tab.
- The Race Details tab. Shows detailed information about a
data-race trace selected from the Races tab.
The er_print utility, on the other hand, presents a
command-line interface. The subcommands that are generally useful for
examining races with er_print are:
races
Report any data-races detected in the experiment.
rdetail race_id
Display detailed information about the data-race with the specified
race_id. If the specified race_id is
all, then detailed information about all data-races will be
displayed.
header
Display descriptive information about the experiment, and report any
errors and warnings.
Refer to the collect.1, rdt.1, analyzer.1,
and er_print.1 man pages for more information.
Note:
The data-races detected in a program depend on the order of memory
accesses by the different threads, the number of threads used, and the
input data used, among other things. Therefore, different runs of the
program may reveal different data races than the ones shown below.
The following examples show the steps involved in instrumenting
Program 1 and Program 2, creating a data-race-detection experiment,
and using the er_print or the rdt GUI to display the
data-races detected.
When displaying the data-races detected, you will see the
following information about each data-race:
- A unique id that identifies the data-race.
- The virtual address (Vaddr) associated with the data-race. If
there is more than one virtual address, then (Multiple
Addresses) will appear.
- The two accesses by two different threads that constitute the
data-race. For each access, the type of the access (Read or Write) is
shown, as well as the function, offset, and line number in the source
code where the access occurred.
- The total number of traces associated with the data-race.
Each trace refers to the pair of thread callstacks at the time the two
data-race accesses occurred. If using the rdt GUI, the two
callstacks will be displayed in in the Race Details tab when an
individual trace is selected. If using the er_print utility,
the two callstacks will be displayed by the rdetail
command.
4.1 Using DRDT with Program 1
[Figure 5]
% cc -xopenmp=noopt omp_prime.c -lm -xinstrument=datarace .
% collect -r on a.out | sort -n
0
0
0
0
0
0
0
0
0
0
...
0
0
Creating experiment database test.1.er ...
Number of prime numbers between 2 and 3000: 429
2
3
5
7
11
13
17
19
23
29
31
37
41
47
53
59
61
67
71
73
...
2971
2999
% er_print test.1.er
(er_print) races
Total Races: 4 Experiment: test.1.er
Race #1, Vaddr: 0xffbfeec4
Access 1: Read, main -- MP doall from line 42 [_$d1A42.main] + 0x00000060,
line 45 in "omp_prime.c"
Access 2: Write, main -- MP doall from line 42 [_$d1A42.main] + 0x0000008C,
line 46 in "omp_prime.c"
Total Traces: 2
Race #2, Vaddr: 0xffbfeec4
Access 1: Write, main -- MP doall from line 42 [_$d1A42.main] + 0x0000008C,
line 46 in "omp_prime.c"
Access 2: Write, main -- MP doall from line 42 [_$d1A42.main] + 0x0000008C,
line 46 in "omp_prime.c"
Total Traces: 1
Race #3, Vaddr: (Multiple Addresses)
Access 1: Write, main -- MP doall from line 42 [_$d1A42.main] + 0x0000007C,
line 45 in "omp_prime.c"
Access 2: Write, main -- MP doall from line 42 [_$d1A42.main] + 0x0000007C,
line 45 in "omp_prime.c"
Total Traces: 1
Race #4, Vaddr: 0x21418
Access 1: Read, is_prime + 0x00000074,
line 18 in "omp_prime.c"
Access 2: Write, is_prime + 0x00000114,
line 21 in "omp_prime.c"
Total Traces: 1
(er_print)
|
The following screen-shot shows the races detected in
omp_primes.c displayed through the rdt GUI, using
the command:
% rdt test.1.er

Looking at the data-races reported for Program 1, we find there
are four data-races in omp_prime.c:
- Race #1: A data-race between a Read from total on line
45 and a Write to total on line 46
- Race #2: A data-race between a Write to total on line 46
and another Write to total on the same line
- Race #3: A data-race between a Write to primes[] on line
45 and another Write to primes[] on the same line
- Race #4: A data-race between a Read from pflag[] on line
18 and a Write to pflag[] on line 21
4.2 Using DRDT with Program 2
[Figure 6]
% cc pthr_prime.c -lm -mt -xinstrument=datarace .
% collect -r on a.out | sort -n
Creating experiment database test.2.er ...
of type "nfs", which may distort the measured performance.
0
0
0
0
0
0
0
0
0
0
...
0
0
Creating experiment database test.2.er ...
Number of prime numbers between 2 and 3000: 328
751
757
761
773
797
809
811
821
823
827
829
839
853
857
859
877
881
883
887
907
...
2999
2999
% er_print test.2.er
(er_print) races
Total Races: 6 Experiment: test.2.er
Race #1, Vaddr: 0x218d0
Access 1: Write, work + 0x00000154,
line 40 in "pthr_prime.c"
Access 2: Write, work + 0x00000154,
line 40 in "pthr_prime.c"
Total Traces: 3
Race #2, Vaddr: 0x218d0
Access 1: Read, work + 0x000000CC,
line 39 in "pthr_prime.c"
Access 2: Write, work + 0x00000154,
line 40 in "pthr_prime.c"
Total Traces: 3
Race #3, Vaddr: 0xffbfeec4
Access 1: Write, main + 0x00000204,
line 55 in "pthr_prime.c"
Access 2: Read, work + 0x00000024,
line 35 in "pthr_prime.c"
Total Traces: 2
Race #4, Vaddr: (Multiple Addresses)
Access 1: Write, work + 0x00000108,
line 39 in "pthr_prime.c"
Access 2: Write, work + 0x00000108,
line 39 in "pthr_prime.c"
Total Traces: 1
Race #5, Vaddr: 0x23bfc
Access 1: Write, is_prime + 0x00000210,
line 22 in "pthr_prime.c"
Access 2: Write, is_prime + 0x00000210,
line 22 in "pthr_prime.c"
Total Traces: 1
Race #6, Vaddr: 0x247bc
Access 1: Write, work + 0x00000108,
line 39 in "pthr_prime.c"
Access 2: Read, main + 0x00000394,
line 65 in "pthr_prime.c"
Total Traces: 1
(er_print)
|
The following screen-shot shows the races detected in
omp_primes.c displayed through the rdt GUI, using
the command:
% rdt test.2.er
Looking at the data-races reported for Program 2, we find there
are six data-races in pthr_prime.c:
- Race #1: A data-race between a Write to total on line 40
and another Write to total on the same line
- Race #2: A data-race between a Read from total on line
39 and a Write to total on line 40
- Race #3: A data-race between a Write to i on line 55 and
a Read from i on line 35
- Race #4: A data-race between a Write to primes[] on line
39 and another Write to primes[] the same line
- Race #5: A data-race between a Write to pflag[] on line
22 and another Write to pflag[] on the same line
- Race #6: A data-race between a Write to primes[] on line
39 and a Read from primes[] on line 65
One advantage of the rdt GUI is that it allows you to see,
side by side, the two source locations associated with a
data-race. For example, if you first select Race #6 reported for
Program 2 in the Races tab and then click on the Race Source tab, you
will see the following:
In the above screen-shot, the first access for Race #6 (line 39) is
shown in the the top source display, while the second access for that
data-race is shown in the bottom source display. The source lines, 39
and 65, where the data-race accesses occurred appear highlighted. To
the left of each source line, the default metric, exclusive number of
data-race accesses, is shown. The number shown is a lower bound of the
number of data-race accesses that occurred at that line
5.1 Check Whether the Data-Race is a False Positive
A false positive data-race is a data-race that is reported by the
Data-Race Detection tool (DRDT), but has actually not occurred. DRDT
tries to reduce the number of false positives reported. However,
there are cases where the tool is not able to do a precise job and may
report false positive data-races.
Since a false positive data-race is simply a false alarm, and is
not really a data-race, it does not affect the behavior of the program
and can therefore be ignored.
See Section 6.1 (False Positive Data-Races) for some examples of false positive
data-races. Also see Section 7 (User APIs), for information on how
to remove false positive data-races from the report.
5.2 Check Whether the Data-Race is Benign
A benign data-race is an intentional data-race whose existence
does not affect the correctness of the program.
Some multi-threaded applications intentionally use code that may
cause data-races. Since the data-races are there by design, no fix is
required. In many cases, however, it is quite tricky to get such codes
to run correctly. These data-races should be reviewed carefully.
See Section 6.2 (Benign Data-Races) for more detailed information
about benign races.
5.3 Fix the Bug, Not the Data Race
The Data-Race Detection Tool can help find data-races in the program,
but it cannot automatically find bugs in the program. While a
data-race may be the root cause of a bug, it may also have been
introduced by a bug. It is important to find and fix the bug. Merely
removing the data-race is not the right approach, and could make
further debugging even more difficult. Fix the bug, not the data-race.
Fixing Bugs in Program 1:
To remove the data-race in omp_prime.c between the Read from
total on line 45 and the Write to total on line 46,
we could put each of lines 45 and 46 in a critical section, as
follows:
[Figure 7]
Incorrect fix
42 #pragma omp parallel for .
43 for (i = 2; i < N; i++) {
44 if ( is_prime(i) ) {
#pragma omp critical
{
45 primes[total] = i;
}
#pragma omp critical
{
46 total++;
}
47 }
48 }
|
The critical sections around each of lines 45 and 46 get rid of the
data-race between lines 46 and 45, because the third condition for a
data-race, namely that the threads are not using any exclusive locks
to control their accesses to total, is not satisfied -- see
Section 2 (What is a Data-Race).
The critical section around line 46 ensures that the computed value of
total is correct. However, the program is still incorrect.
Two threads may update the same element of primes[] using the
same total value. Moreover, some elements in
primes[] may not be assigned a value at all.
One correct fix is to put both lines 45 and 46 in a single critical
section that protects the two lines and prevents the data-race, as
follows:
[Figure 8]
A correct fix
42 #pragma omp parallel for .
43 for (i = 2; i < N; i++) {
44 if ( is_prime(i) ) {
#pragma omp critical
{
45 primes[total] = i;
46 total++;
}
47 }
48 }
|
Note that the single critical section shown above also fixes two other
data races in omp_prime.c: The data-race on prime[]
at line 45, as well as the data-race on total at line 46.
The fourth data-race reported, between a Read from pflag[] on
line 18 and a Write to pflag[] on line 21, is actually a
benign race as it does not lead to incorrect results. Therefore, it
is not essential to fix this data-race. Refer to Section 6.2.1 (First
Example of a Benign Data-Races) for more information.
Fixing Bugs in Program 2:
To remove the data-race in pthr_prime.c between the Read from
total on line 39 and the Write to total on line 40,
a single mutex can be used to protect these two lines. This also
fixes two other data races in pthr_prime.c: The data-race on
prime[] at line 39, as well as the data-race on
total at line 40.
The two data races, (a) data-race between the Write to i on
line 55 and the Read from i on line 35, and (b) data-race on
pflag[] on line 22, reveal a problem in the shared-access to
variable i by different threads. The initial thread in
pthr_prime.c creates the child threads in a loop (source
lines 55-57), and dispatches them to work on the function
work(). The loop index i is passed to
work() by address. Since all threads access the same
memory location for i, the value of i for each
thread will not remain unique, but will change as the initial thread
increments the loop index. As different threads use the same value of
i, the aforementioned data-races occur. One way to fix the
problem is to pass i to work() by value. This
ensures that each thread will have its own private copy of i
with a unique value.
To remove the data-race on primes[] between the Write access
on line 39 and the Read access on line 65, we can protect line 65 with
the same mutex lock as the one used above for lines 39 and 40.
However, this is not the right fix. The real problem is that the main
thread may report the result (lines 50 through 53) while the child
threads are still updating total and primes[] in
function work(). Using mutex locks does not provide the
proper ordering synchronization between the threads. One correct fix
is to let the main thread wait for all child threads to join it before
printing out the results.
The following is a fixed version of Program 2 (pthr_prime.c).
[Figure 9]
A fixed version of pthr_prime.c
1 #include <stdio.h>
2 #include <math.h>
3 #include <pthread.h>
4
5 #define THREADS 4
6 #define N 3000
7
8 int primes[N];
9 int pflag[N];
10 int total = 0;
11 pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
12
13 int is_prime(int v)
14 {
15 int i;
16 int bound = floor(sqrt(v)) + 1;
17
18 for (i = 2; i < bound; i++) {
19 /* no need to check against known composites */
20 if (!pflag[i])
21 continue;
22 if (v % i == 0) {
23 pflag[v] = 0;
24 return 0;
25 }
26 }
27 return (v > 1);
28 }
29
30 void *work(void *arg)
31 {
32 int start;
33 int end;
34 int i;
35
36 start = (N/THREADS) * ((int)arg) ;
37 end = start + N/THREADS;
38 for (i = start; i < end; i++) {
39 if ( is_prime(i) ) {
40 pthread_mutex_lock(&mutex);
41 primes[total] = i;
42 total++;
43 pthread_mutex_unlock(&mutex);
44 }
45 }
46 return NULL;
47 }
48
49 int main(int argn, char **argv)
50 {
51 int i;
52 pthread_t tids[THREADS-1];
53
54 for (i = 0; i < N; i++) {
55 pflag[i] = 1;
56 }
57
58 for (i = 0; i < THREADS-1; i++) {
59 pthread_create(&tids[i], NULL, work, (void *)i);
60 }
61
62 i = THREADS-1;
63 work((void *)i);
64
65 for (i = 0; i < THREADS-1; i++) {
66 pthread_join(tids[i], NULL);
67 }
68
69 printf("Number of prime numbers between 2 and %d: %d\n",
70 N, total);
71 for (i = 0; i < total; i++) {
72 printf("%d\n", primes[i]);
73 }
74 }
|
5.4 Run the Data-Race Detection Tool Again
After fixing the bugs in the program revealed by the detected
data-races, the updated program should be tested again using the
Data-Race Detection Tool. The bug fixes may change the behavior of the
program. The changes may expose data-races that were previously
hidden, or may introduce new data-races. Therefore, the updated
program should be re-tested.
The following diagram shows the recommended flow of using the
Data-Race Detection Tool.
[Figure 10]
START
|
V
--> Instrument the program for data-race detection
| |
| |
| V
| L1: Perform a data-race detection experiment: Repeat the same setup, or
| | Use different input data, or
| | Use a different number of threads, or
| | Use different loop schedules, or
| | Use a different machine
| |
| V
| New data-races reported? -- NO --> Confident about the result? --NO--> GOTO L1
| | |
| YES YES
| | |
| | |----> DONE
| V
| Is the data-race a false positive? <---------------------
| | | ^
| NO YES |
| | | |
| | |----> Ignore it, or try to |
| | remove it using the |
| | user APIs. GOTO L3. |
| V |
| L2: Is the data-race benign? |
| | | |
| NO YES |
| | | |
| | | |
| V |----> Ignore it. GOTO L3 |
| Fix the bug |
| | |
| | |
| V |
| L3: Any more races to examine? --- YES ------------------>
| |
| NO
| |
|<---------------V
|
6.1 False Positive Data-Races
Occasionally, the Data-Race Detection Tool may report data-races that
are false and that actually have not occurred in the program. These
are called false positives.
In most cases, false positives are caused by one of the following two
reasons:
6.1.1 Use of User-Defined Synchronizations
The Data-Race Detection Tool can recognize most standard
synchronization APIs and constructs provided by OpenMP, POSIX threads,
and Solaris threads. However, the tool cannot recognize roll-your-own
style synchronizations, and may report false data-races if such
synchronizations are used. For example, the tool cannot recognize
implementation of locks using CAS instructions, post and wait
operations using busy-waits, etc.
Here is a typical example of a class of false positives where the
program employs a common idiom of using POSIX thread condition
variables.
[Figure 11]
/* Initially ready_flag is 0 */
/* Thread 1: Producer */
100 data = ...
101 pthread_mutex_lock (&mutex);
102 ready_flag = 1;
103 pthread_cond_signal (&cond);
103 pthread_mutex_unlock (&mutex);
...
/* Thread 2: Consumer */
200 pthread_mutex_lock (&mutex);
201 while (!ready_flag) {
202 pthread_cond_wait (&cond, &mutex);
203 }
204 pthread_mutex_unlock (&mutex);
205 ... = data;
|
To protect against program errors and spurious wake-ups, the
pthread_cond_wait() call is usually made within a loop that
tests the predicate. The test and set of the predicate is often
protected by a mutex lock. In the above code, thread 1 produces the
value for variable data at line 100, sets the
ready_flag to 1 at line 102 to indicate that the data has
been produced, and then calls pthread_cond_signal() to wake
up the consumer thread 2. Thread 2 tests the predicate
(!ready_flag) in a loop. When it finds that the flag is set,
it consumes the data at line 205.
The Write of ready_flag at line 102 and Read of
ready_flag at line 201 are protected by the same mutex lock,
so there is no data-race between the two accesses and the tool
recognizes that correctly.
The Write of data at line 100 and the Read of data
at line 205 are not protected by mutex locks. However, in the program
logic, the Read at line 205 always happens after the Write at line 100
because of the flag variable ready_flag. So there is no
data-race between these two accesses to data. However, the
tool will report that there is a data-race between the two accesses if
the call to pthread_cond_wait() (line 202) is actually not
called at run time. If line 102 is executed before line 201 is ever
executed, then when line 201 is executed, the loop entry test fails
and line 202 is skipped. The tool monitors
pthread_cond_signal() calls and pthread_cond_wait()
calls and can pair them to derive synchronization. When the
pthread_cond_wait() at line 202 is not called, the tool does
not know that the Write at line 100 is always executed before the Read
at line 205. Therefore, it considers them as executed concurrently and
reports a data-race between them.
In order to avoid reporting this kind of false positive data-race, the
Data-Race Detection Tool provides a set of APIs that can be used to
notify the tool when user-defined synchronizations are performed. See
Section 7 (User APIs) for details.
6.1.2 Memory Recycled by Different Threads
Some memory management routines will recycle memory freed by one
thread for the use by another thread. The Data-Race Detection Tool is
sometimes not able to recognize that the life-times of the same memory
location used by different threads do not overlap. When this happens,
the tool may report a false positive data-race. The following example
illustrates this kind of false positive.
[Figure 12]
/*----------*/ /*----------*/
/* Thread 1 */ /* Thread 2 */
/*----------*/ /*----------*/
ptr1 = mymalloc(sizeof(data_t));
ptr1->data = ...
...
myfree(ptr1);
ptr2 = mymalloc(sizeof(data_t));
ptr2->data = ...
...
myfree(ptr2);
|
Thread 1 and thread 2 execute concurrently. Each thread allocates a
chunk of memory that is used as its private memory. The routine
mymalloc() may supply the memory freed by a previous
myfree() call. If thread 2 calls mymalloc() before
thread 1 calls myfree(), then ptr1 and ptr2
will get different values and there is no data-race between the two
threads. However, if thread 2 calls mymalloc() after thread 1
calls myfree(), then ptr1 and ptr2 may have
the same value. There is no data-race because thread 1 will no longer
access that memory. However, if the tool does not know
mymalloc() is recycling memory, it will report a data-race
between the Write of ptr1->data and the Write of
ptr2->data.
This kind of false positive often happens in C++ applications when
the C++ runtime library recycles memory for temporary variables. It
also often happens in user applications that implement their own
memory management routines.
Currently, the Data-Race Detection Tool is able to recognize memory
allocation and free operations performed via the standard
malloc(), calloc(), and realloc()
interfaces.
6.2 Benign Data-Races
Some multi-threaded applications intentionally allow data-races in
order to get better performance. A benign data-race is an intentional
data-race whose existence does not affect the correctness of the
program. Two examples are given below.
6.2.1 First Example of a Benign Data-Race
In Program 1 (omp_prime.c), the threads check whether an
integer is prime by executing the function is_prime():
[Figure 13]
11 int is_prime(int v)
12 {
13 int i;
14 int bound = floor(sqrt ((double)v)) + 1;
15
16 for (i = 2; i < bound; i++) {
17 /* No need to check against known composites */
18 if (!pflag[i])
19 continue;
20 if (v % i == 0) {
21 pflag[v] = 0;
22 return 0;
23 }
24 }
25 return (v > 1);
26 }
|
The Data-Race Detection Tool reports that there is a data-race between
the Write to pflag[] on line 21 and the Read of
pflag[] on line 18. However, this data-race is benign as it
does not affect the correctness of the final result. At line 18, a
thread checks whether pflag[i], for a given value of
i is equal to 0. If pflag[i] is equal to 0, then
the thread continues on to the next value of i. If pflag[]
is not equal to 0 and v is divisible by i, then the
thread writes the value 0 to pflag[i].
It does not matter if, from a correctness point of view,
multiple threads check the same pflag[i] and write to it
concurrently, since the only value that is written to
pflag[i] is 0.
6.2.2 Second Example of a Benign Data-Race
A group of threads call check_bad_array() concurrently to
check whether any element of array data_array is "bad". Each
thread checks a different section of the array. If a thread finds that
an element is "bad", it sets the value of a global shared variable
is_bad to true.
[Figure 14]
20 volatile int is_bad = 0;
...
100 /*
102 * Each thread checks its assigned portion of data_array, and sets
102 * the global flag is_bad to 1 once it finds a bad data element.
103 */
104 void check_bad_array(volatile data_t *data_array, unsigned int thread_id)
105 {
106 int i;
107 for (i=my_start(thread_id); i<my_start(thread_id); i++) {
108 if (is_bad)
109 return;
110 else {
111 if (is_bad_element(data_array[i])) {
112 is_bad = 1;
113 return;
114 }
115 }
116 }
117 }
|
There is a data-race between the Read of is_bad on line 108
and the Write of is_bad on line 112. However, the data-race
does not affect the correctness of the final result.
6.2.3 Third Example of a Benign Data-Race
Double-checked-locking is a program idiom that provides an efficient
way to initialize a singleton in multi-threaded applications. The
following code illustrates such an implementation.
[Figure 15]
100 class Singleton {
101 public:
102 static Singleton* instance();
103 ...
104 private:
105 static Singleton* ptr_instance;
106 };
...
200 Singleton* Singleton::ptr_instance = 0;
...
300 Singleton* Singleton::instance() {
301 Singleton *tmp = ptr_instance;
302 memory_barrier();
303 if (tmp == NULL) {
304 Lock();
305 if (ptr_instance == NULL) {
306 tmp = new Singleton;
307 memory_barrier();
308 ptr_instance = tmp;
309 }
310 Unlock();
311 }
312 return tmp;
313 }
|
The Read of ptr_instance (line 301) is intentionally not
protected by a lock to make the checking of whether the singleton has
already been instantiated efficient in a multi-threaded
environment. Notice that there is a data-race on variable
ptr_instance between the Read on line 301 and the Write on
line 308, but the program works correctly.
Writing a correct program that allows data-races is often a tricky
task. For example, in the above double-checked-locking example, the
calls to memory_barrier() at lines 302 and 307 are used to ensure that
the singleton and
ptr_instance is set and read in the proper order, so all threads read them
consistently. This program idiom is broken if the memory barriers are
not used.
Finally, in addition to the above examples of benign data-races, a
large class of applications that allow data-races are those that use
lock-free/wait-free algorithms which are difficult to design
correctly. The Data-Race Detection Tool can help pin-point the
locations of data-races in these applications.
The Data-Race Detection Tool can recognize most standard
synchronization APIs and constructs provided by OpenMP, POSIX threads,
and Solaris threads. However, the tool cannot recognize roll-your-own
style synchronizations, and may report false data-races if such
synchronizations are used. For example, the tool cannot recognize
spin locking implemented using hand-coded assembly.
If the program includes roll-your-own style synchronizations, then a
set of user API calls can be inserted in the program to inform the
Data-Race Detection Tool of those synchronizations. This allows the
tool to recognize the synchronizations, thus reducing the number of
false positives reported.
| rdt_notify_acquire_lock(id) |
Insert immediately after a user-defined lock has been acquired
successfully |
| rdt_notify_acquire_write_lock(id) |
Insert immediately after a user-defined read-write lock has been
acquired successfully in Write mode |
| rdt_notify_acquire_read_lock(id) |
Insert immediately after a user-defined read-write lock has been
acquired successfully in Read mode |
| rdt_notify_release_lock(id) |
Insert immediately after a user-defined lock (including a
read-write lock) has been released successfully |
| rdt_notify_sync_post_begin(id) |
Insert immediately before a user-defined post synchronization is
performed |
| rdt_notify_sync_post_end(id) |
Insert immediately after a user-defined post synchronization has
been performed |
| rdt_notify_sync_wait_begin(id) |
Insert immediately before a user-defined wait synchronization is
performed |
| rdt_notify_sync_wait_end(id) |
Insert immediately after a user-defined wait synchronization has
been performed |
A C/C++ version and a Fortran version of the APIs are provided. Each
API call takes a single argument id, whose value should
uniquely identify the synchronization object.
In the C/C++ version of the APIs, the type of the argument is
uintptr, which is 4 bytes long in 32-bit mode and 8 bytes
long in 64-bit mode. You need to add #include
<rdt_hooks.h> to your C/C++ source file when calling
any of the APIs.
In the Fortran version of the APIs, the type of the argument is
integer of kind rdt_sobj_kind which is 8-byte long in both
32-bit mode and 64-bit mode. You need to add include
"rdt_fhooks.h" to your Fortran source file when calling
any of the APIs.
To uniquely identify a synchronization object, the argument
id should have a different value for each different
synchronization object. One way to do this is to let the value of
id be the address of the synchronization object.
The following code shows how to use the API to remove the false
positive in Figure 11.
[Figure 16]
# include <rdt_hooks.h>
...
/* Initially ready_flag is 0 */
...
/* Thread 1: Producer */
100 data = ...
101 pthread_mutex_lock (&mutex);
rdt_notify_sync_post_begin ((uintptr) &ready_flag);
102 ready_flag = 1;
rdt_notify_sync_post_end ((uintptr) &ready_flag);
103 pthread_cond_signal (&cond);
103 pthread_mutex_unlock (&mutex);
/* Thread 2: Consumer */
200 pthread_mutex_lock (&mutex);
rdt_notify_sync_wait_begin ((uintptr) &ready_flag);
201 while (!ready_flag) {
202 pthread_cond_wait (&cond, &mutex);
203 }
rdt_notify_sync_wait_end ((uintptr) &ready_flag);
204 pthread_mutex_unlock (&mutex);
205 ... = data;
|
Refer to the librdthooks.3 man page for more information
about using the User APIs.
(Updated June 26, 2006)
|
|