...one of the most highly
regarded and expertly designed C++ library projects in the
world.
— Herb Sutter and Andrei
Alexandrescu, C++
Coding Standards
The most common use of Boost.Atomic is to realize custom thread synchronization protocols: The goal is to coordinate accesses of threads to shared variables in order to avoid "conflicts". The programmer must be aware of the fact that compilers, CPUs and the cache hierarchies may generally reorder memory references at will. As a consequence a program such as:
int x = 0, int y = 0; thread1: x = 1; y = 1; thread2 if (y == 1) { assert(x == 1); }
might indeed fail as there is no guarantee that the read of x
by thread2 "sees" the write by thread1.
Boost.Atomic uses a synchronisation concept based on the happens-before relation to describe the guarantees under which situations such as the above one cannot occur.
The remainder of this section will discuss happens-before in a "hands-on" way instead of giving a fully formalized definition. The reader is encouraged to additionally have a look at the discussion of the correctness of a few of the examples afterwards.
As an introductory example to understand how arguing using happens-before works, consider two threads synchronizing using a common mutex:
mutex m; thread1: m.lock(); ... /* A */ m.unlock(); thread2: m.lock(); ... /* B */ m.unlock();
The "lockset-based intuition" would be to argue that A and B cannot be executed concurrently as the code paths require a common lock to be held.
One can however also arrive at the same conclusion using happens-before:
Either thread1 or thread2 will succeed first at m.lock()
.
If this is be thread1, then as a consequence, thread2 cannot succeed at
m.lock()
before thread1 has executed m.unlock()
,
consequently A happens-before B in this case. By symmetry,
if thread2 succeeds at m.lock()
first, we can conclude
B happens-before A.
Since this already exhausts all options, we can conclude that either A happens-before B or B happens-before A must always hold. Obviously cannot state which of the two relationships holds, but either one is sufficient to conclude that A and B cannot conflict.
Compare the spinlock implementation to see how the mutual exclusion concept can be mapped to Boost.Atomic.
The most basic pattern for coordinating threads via Boost.Atomic
uses release
and acquire
on an atomic
variable for coordination: If ...
release
semantic,
acquire
semantic
and
... then A happens-before B.
Consider the following example
atomic<int> a(0); thread1: ... /* A */ a.fetch_add(1, memory_order_release); thread2: int tmp = a.load(memory_order_acquire); if (tmp == 1) { ... /* B */ } else { ... /* C */ }
In this example, two avenues for execution are possible:
store
operation by thread1 precedes the load
by thread2: In this case thread2 will execute B and "A happens-before
B" holds as all of the criteria above are satisfied.
load
operation by thread2 precedes the store
by thread1: In this case, thread2 will execute C, but "A happens-before
C" does not hold: thread2 does not read the
value written by thread1 through a
.
Therefore, A and B cannot conflict, but A and C can conflict.
Ordering constraints are generally specified together with an access to an
atomic variable. It is however also possible to issue "fence" operations
in isolation, in this case the fence operates in conjunction with preceding
(for acquire
, consume
or seq_cst
operations) or succeeding (for release
or seq_cst
) atomic operations.
The example from the previous section could also be written in the following way:
atomic<int> a(0); thread1: ... /* A */ atomic_thread_fence(memory_order_release); a.fetch_add(1, memory_order_relaxed); thread2: int tmp = a.load(memory_order_relaxed); if (tmp == 1) { atomic_thread_fence(memory_order_acquire); ... /* B */ } else { ... /* C */ }
This provides the same ordering guarantees as previously, but elides a (possibly expensive) memory ordering operation in the case C is executed.
The second pattern for coordinating threads via Boost.Atomic
uses release
and consume
on an atomic
variable for coordination: If ...
release
semantic,
consume
semantic
and
... then A happens-before B.
Consider the following example
atomic<int> a(0); complex_data_structure data[2]; thread1: data[1] = ...; /* A */ a.store(1, memory_order_release); thread2: int index = a.load(memory_order_consume); complex_data_structure tmp = data[index]; /* B */
In this example, two avenues for execution are possible:
store
operation by thread1 precedes the load
by thread2: In this case thread2 will read data[1]
and "A happens-before B" holds as all
of the criteria above are satisfied.
load
operation by thread2 precedes the store
by thread1: In this case thread2 will read data[0]
and "A happens-before B" does not
hold: thread2 does not read the value written by thread1 through a
.
Here, the happens-before relationship helps ensure that
any accesses (presumable writes) to data[1]
by thread1
happen before before the accesses (presumably reads) to data[1]
by thread2: Lacking this relationship, thread2 might see stale/inconsistent
data.
Note that in this example, the fact that operation B is computationally dependent on the atomic variable, therefore the following program would be erroneous:
atomic<int> a(0); complex_data_structure data[2]; thread1: data[1] = ...; /* A */ a.store(1, memory_order_release); thread2: int index = a.load(memory_order_consume); complex_data_structure tmp; if (index == 0) tmp = data[0]; else tmp = data[1];
consume
is most commonly (and most safely! see limitations)
used with pointers, compare for example the singleton
with double-checked locking.
The third pattern for coordinating threads via Boost.Atomic
uses seq_cst
for coordination: If ...
seq_cst
,
seq_cst
,
then either "A happens-before D" or "C happens-before B" holds.
In this case it does not matter whether thread1 and thread2 operate on the
same or different atomic variables, or use a "stand-alone" atomic_thread_fence
operation.