Race
Conditions
A race
condition is where the behavior of code depends on the interleaving
of multiple threads. This is perhaps the most fundamental problem with
multi-threaded programming.
When analyzing or writing single-threaded code we only have to
think about the sequence of statements right in front of us; we can assume that
data will not magically change between statements. However, with improperly
written multi-threaded code non-local data can change unexpectedly due to the
actions of another thread.
Race conditions can result in a high-level logical fault in your
program, or (more excitingly) it may even pierce C++'s statement-level
abstraction. That is, we cannot even assume that single C++ statements execute
atomically because they may compile to multiple assembly instructions. In
short, this means that we cannot guarantee the outcome of a statement such as
foo +=
1;
if foo
is
non-local and may be accessed from multiple threads.
A contrived example follows.
Listing 1. A logical race condition
int sharedCounter = 50;
void* workerThread(void*)
{
while(sharedCounter > 0)
{
doSomeWork();
--sharedCounter;
}
}
Now imagine that we start a number of threads, all executing workerThread()
. If we
have just one thread, doSomeWork()
is going
to be executed the correct number of times (whatever sharedCounter
starts
out at).
However, with more than one thread doSomeWork()
will
most likely be executed too many times. Exactly how many times depends on the
number of threads spawned, computer architecture, operating system scheduling
and... chance. The problem arises because we do not test and update sharedCounter
as an
atomic operation, so there is a period where the value of sharedCounter
is
incorrect. During this time other threads can pass the test when they really
shouldn't have.
The value of sharedCounter
on exit tells us how many extra times doSomeWork()
is
called. With a single thread, the final value of sharedCounter
is of
course 0. With multiple threads running, it will be between 0 and -N where N is
the number of threads.
Moving the update adjacent to the test will not make these two
operations atomic. The window during which sharedCounter
is out of date will be smaller,
but the race condition remains. An illustration of this non-solution follows:
Listing 2. Still a race condition
void* workerThread(void*)
{
while(sharedCounter > 0)
{
--sharedCounter;
doSomeWork();
}
}
The
solution is to use a mutex
to synchronize the threads with respect to the test and update.
Another way of saying this is that we need to define a critical section in
which we both test and update the sharedCounter
. The next section introduces mutex and solves
the example race condition.
See also: