
Detecting and Avoiding OpenMP Race Conditions in C++

 
By Phyllis Gustafson, Sun ONE Studio Tools Group, Sun Microsystems  

While OpenMP is now the preferred model for writing portable shared-memory parallel programs, there are no safeguards in either the OpenMP specifications or its implementation to protect you from inadvertently introducing race conditions into your code. A race condition exists when two unsynchronized threads access the same shared variable with at least one thread modifying the variable. The outcome may be unpredictable and depends on the timing of the threads in the team. Race conditions are an insidious problem because they can remain undetected for many thousands of executions, and it is not always obvious that the program has generated incorrect results. Because communications and synchronizations are often implicit in shared memory programming, race conditions can arise unexpectedly. It is the programmer's responsibility to ensure that the code is free from situations that could give rise to race conditions that corrupt the computational results. This article discusses some common scenarios that cause race conditions and provides easy coding alternatives to avoid them.

How Shared Variables Can Cause Race Conditions

Almost all race conditions that arise in OpenMP programs can be traced to the use of shared variables. The most important rule to remember when you use OpenMP shared variables is to enclose operations on them in a critical region or to attach an atomic directive. The following scenarios show what can happen when this is not done.

  • Simple Increments

    It is easy to overlook very simple accesses such as:

            int i=0;
            #pragma omp parallel
            {
                :
                i++;
                :
            }
    

    While you might think that nothing can go wrong since each thread merely increments the variable, the fact that the operation is not atomic can cause unexpected results. Consider a possible time-line for a two-thread example.

    Timeline

    Clock   Thread 0              Thread 1
    1       load i (i = 0)
    2       incr i (i = 1)
    3       swapped out           load i (i = 0)
    4                             incr i (i = 1)
    5                             store i (i = 1)
    6       store i (i = 1)       swapped out
     

    In this case, the result in i is 1 and not 2, as you would expect. Because the increment (++) operation is not atomic, it can be interrupted before completion and cause incorrect results. A simple increment on a shared variable like this is a prime candidate for the use of the OpenMP atomic directive, which eliminates the possibility of a race condition.

            #pragma omp atomic
                i++;
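
As a minimal sketch of the two points above, the first function below replays the six clock ticks of the timeline in a single thread, with local variables standing in for each thread's register, and the second applies the atomic directive inside a parallel loop. The names replay_lost_update and count_with_atomic are illustrative and do not come from the article.

```cpp
// Replays the two-thread timeline in one thread. reg0 and reg1 are
// hypothetical stand-ins for the register copy each thread holds.
int replay_lost_update()
{
    int i = 0;
    int reg0 = i;       // clock 1: thread 0 loads i (0)
    reg0 = reg0 + 1;    // clock 2: thread 0 increments its copy (1)
    int reg1 = i;       // clock 3: thread 0 swapped out, thread 1 loads i (0)
    reg1 = reg1 + 1;    // clock 4: thread 1 increments its copy (1)
    i = reg1;           // clock 5: thread 1 stores 1
    i = reg0;           // clock 6: thread 0 stores 1, so one increment is lost
    return i;
}

// Every iteration increments the shared counter exactly once, so the
// result should equal n no matter how many threads run the loop.
int count_with_atomic(int n)
{
    int hits = 0;                   // shared by all threads in the team
    #pragma omp parallel for
    for (int k = 0; k < n; k++)
    {
        #pragma omp atomic
        hits++;                     // load, increment, and store as one indivisible update
    }
    return hits;
}
```

Under these assumptions replay_lost_update() returns 1, matching the timeline, while count_with_atomic(n) returns n. If the code is compiled without OpenMP support, the pragmas are ignored and the loop runs serially.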
    
  • Loop indices

    The following example demonstrates another race condition situation that may not be immediately obvious:

            int i, j;
            #pragma omp parallel for
            for (i = 0; i < N; i++)
                for (j = 0; j < M; j++)
                {
                    a[i][j] = get_val(i, j);
                }
    

    The OpenMP specification makes the index variable of a loop that is associated with an OpenMP for directive private to each thread. That guarantee does not extend to the other loop index variables within the parallel region. In the above example, the index variable i is private, but the index variable j, which is declared outside the region, is shared. Every thread reads and writes the same j without coordination, so its value at any given time depends on which thread changed it most recently and on which iteration of the inner loop that thread had reached.

    Consider a possible timeline for a two-thread execution of this code, where thread 0 is in iteration i = 0 and thread 1 is in iteration i = N/2.

    Timeline

    Clock   Thread 0                    Thread 1
    1       load i
    2       call get_val(i)
    3       swapped out                 load i
    4                                   call get_val(i)
    5                                   load j (j = 0)
    6                                   store a[i][j] ([N/2][0])
    7                                   incr j (j = 1)
    8                                   store j
    9       load j (j = 1)              swapped out
    10      store a[i][j] ([0][1])
     

    Here the value intended for a[0][0] is actually stored into a[0][1] because thread 1 changed the value of j before thread 0 was able to store its result. To avoid this type of data corruption, make sure that all loop indices are private within a parallel region. The above example might be changed to:

            #pragma omp parallel for private(i, j)
            for (i = 0; i < N; i++)
                for (j = 0; j < M; j++)
                {
                    a[i][j] = get_val(i, j);
                }
    

    or to this:

            #pragma omp parallel for
            for (i = 0; i < N; i++)
            {
                for (int j = 0; j < M; j++)
                {
                    a[i][j] = get_val(i, j);
                }
            }
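
The following is a small, self-contained sketch of the second variant, with a check that every element landed where it was intended. The body of get_val, the use of std::vector for the array, and the helper names fill_grid and count_misplaced are assumptions made for illustration; the article does not define them.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the article's get_val(); any pure function of
// (i, j) would do. The value encodes the position it belongs to.
static int get_val(int i, int j) { return i * 1000 + j; }

// Fills an n x m grid in parallel over the rows.
std::vector<std::vector<int>> fill_grid(int n, int m)
{
    std::vector<std::vector<int>> a(n, std::vector<int>(m, -1));
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
    {
        // j is declared inside the parallel loop body, so each thread
        // gets its own copy and no other thread can advance it.
        for (int j = 0; j < m; j++)
        {
            a[i][j] = get_val(i, j);
        }
    }
    return a;
}

// Counts the elements whose stored value does not match their position.
int count_misplaced(const std::vector<std::vector<int>>& a)
{
    int bad = 0;
    for (std::size_t i = 0; i < a.size(); i++)
        for (std::size_t j = 0; j < a[i].size(); j++)
            if (a[i][j] != get_val(static_cast<int>(i), static_cast<int>(j)))
                bad++;
    return bad;
}
```

With a private j, count_misplaced(fill_grid(n, m)) should be 0 on every run. A nonzero count, or elements still holding the initial -1, would point to the kind of corruption shown in the timeline.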
    
  • Shared class objects and methods

    Class objects usually have methods that modify them, and when such an object is used as a shared variable within an OpenMP parallel region, calls to those methods can give rise to race conditions.

    Consider, for example:

     
            class A {
            public:
                int x;
                A(int i = 0) { x = i; }
                void mul(int y) { x *= y; }
            };

            int main()
            {
                A a(5);
                #pragma omp parallel
                {
                    a.mul(2);   // unsynchronized update of the shared object a
                }
            }
    

    With four threads the final value of a.x should be 80 (5 doubled four times), but because of race conditions one or more of the updates can be lost, and the value is often 40.

    When you invoke a method on a shared class object within a parallel region there are two ways to avoid data corruption through race conditions:

    • Guard the invocation with a critical region
    • Guard the actual update code in the method with a critical region

    In the first case, the code should be changed to:

            #pragma omp parallel
            {
                #pragma omp critical
                a.mul(2);
            }
    

    In the second case, the method should be changed to:

            void mul(int y)
            {
                #pragma omp critical
                x *= y;
            }
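
Here is a minimal sketch of the second approach as a complete unit. To make the expected result independent of the number of threads, it calls mul(2) from a parallel loop with exactly four iterations rather than once per thread; that change, and the function name doubled_four_times, are assumptions made for this illustration.

```cpp
class A {
public:
    int x;
    A(int i = 0) : x(i) {}
    void mul(int y)
    {
        // Only one thread at a time may execute the update.
        #pragma omp critical
        x *= y;
    }
};

// Doubles a shared object four times, once per loop iteration.
int doubled_four_times()
{
    A a(5);                         // shared by all threads in the team
    #pragma omp parallel for
    for (int k = 0; k < 4; k++)
    {
        a.mul(2);                   // safe: the method guards its own update
    }
    return a.x;                     // 5 * 2 * 2 * 2 * 2
}
```

Under these assumptions doubled_four_times() returns 80. One design point to weigh: all unnamed critical regions in a program share a single lock, so guarding inside the method also serializes calls made on unrelated objects. Guarding at the call site, as in the first approach, leaves that choice to the caller.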
    

Conclusion

The above is by no means an exhaustive listing of the causes of race conditions in OpenMP programs, but it does review the most common simple situations that give rise to problems. The following two-step process goes a long way toward eliminating race conditions from your code:

  1. Identify all shared variables within an OpenMP region
  2. Guard all modifications of those variables with critical regions or atomic directives, even when they look innocuous
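
One way to carry out step 1 mechanically, offered here as a suggestion rather than something the article prescribes, is the OpenMP default(none) clause. It turns off implicit sharing, so every variable used in the region must be listed explicitly and a forgotten loop index becomes a compile-time error instead of a silent race. The function sum_table and its arithmetic are hypothetical, chosen only to give the loops something to do.

```cpp
#include <vector>

// Sums the values i * m + j over an n x m index space, one row per iteration.
long sum_table(int n, int m)
{
    std::vector<long> row_sum(n, 0);
    int i, j;   // declared outside the region, as in the loop-index example

    // With default(none), omitting private(j) is rejected by the compiler
    // because j would have no data-sharing attribute. The index i of the
    // parallel loop itself is made private automatically.
    #pragma omp parallel for default(none) shared(row_sum, n, m) private(j)
    for (i = 0; i < n; i++)
    {
        for (j = 0; j < m; j++)
        {
            row_sum[i] += i * m + j;    // each thread writes only its own rows
        }
    }

    long total = 0;
    for (i = 0; i < n; i++)
        total += row_sum[i];
    return total;
}
```

For example, sum_table(10, 20) adds the integers 0 through 199 and returns 19900. Step 2 still applies to every variable that ends up in the shared list and is modified by more than one thread.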

Remember that even though it is easy to write shared-memory programs, it is not easy to write correct ones.

Further Reading

  1. Rohit Chandra et al., "Parallel Programming in OpenMP", Morgan Kaufmann, 2001
  2. "OpenMP C and C++ Application Program Interface", OpenMP Architecture Review Board, 2002

About the Author

Phyllis Gustafson is a staff engineer at Sun Microsystems and team lead for the C++ OpenMP project. She is a member of the OpenMP Architecture Review Board's working group that documents the OpenMP Application Program Interface.