
Guaranteeing Alignment

Terminology

Review the concepts document if you are not already familiar with it. Remember that a block is a contiguous section of memory, which is partitioned or segregated into fixed-size chunks. These chunks are what the user allocates and deallocates.

Overview

Each Pool has a single free list that can extend over a number of memory blocks. Thus, Pool also has a linked list of allocated memory blocks. Each memory block, by default, is allocated using new[], and all memory blocks are freed on destruction. It is the use of new[] that allows us to guarantee alignment.

Proof of Concept: Guaranteeing Alignment

Each block of memory is allocated as a POD type (specifically, an array of characters) through operator new[]. Let POD_size be the number of characters allocated.

Predicate 1: Arrays may not have padding

This follows from the following quote:

[5.3.3/2] (Expressions::Unary expressions::Sizeof) "... When applied to an array, the result is the total number of bytes in the array. This implies that the size of an array of n elements is n times the size of an element."

Therefore, arrays cannot contain padding, though the elements within the arrays may contain padding.
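Predicate 1 can be checked at compile time. The following is a minimal sketch, not part of the Pool implementation; the struct `Padded` and the helper `array_has_no_padding` are hypothetical names introduced only for illustration.

```cpp
#include <cstddef>

// A struct whose members typically force padding inside each element.
struct Padded {
    char tag;
    double value;
};

// sizeof(T[N]) must equal N * sizeof(T): the array adds no padding of its own.
template <typename T, std::size_t N>
constexpr bool array_has_no_padding() {
    return sizeof(T[N]) == N * sizeof(T);
}

static_assert(array_has_no_padding<char, 16>(),
              "an array of char is exactly 16 bytes");
static_assert(array_has_no_padding<Padded, 5>(),
              "any padding lives inside the elements, not between them");
```

Any padding that `Padded` needs is already counted in `sizeof(Padded)`, which is why the second assertion holds whatever the platform's alignment rules are.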

Predicate 2: Any block of memory allocated as an array of characters through operator new[] (hereafter referred to as the block) is properly aligned for any object of that size or smaller

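This follows from the Standard's alignment requirements on allocation functions ([3.7.3.1/2]) and on new-expressions that create arrays of char ([5.3.4/10]).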

Consider: imaginary object type Element of a size which is a multiple of some actual object size; assume sizeof(Element) <= POD_size, so that Predicate 2 applies to it

Note that an object of that size can exist. One object of that size is an array of the "actual" objects.

Note that the block is properly aligned for an Element. This directly follows from Predicate 2.
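Predicate 2 can be spot-checked at run time. The sketch below is illustrative only: `is_aligned_for` and `new_char_block_is_aligned` are hypothetical helpers, and `std::size_t` stands in for the pool's size_type as an assumption. It allocates a character array through new[] and tests the returned address against the alignment of the two bookkeeping types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if address p is a multiple of the alignment of T.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Allocate a block as an array of char through new[] and report whether
// it is aligned for a pointer to void and for std::size_t.
bool new_char_block_is_aligned(std::size_t pod_size) {
    // Predicate 2 only covers objects no larger than the block itself.
    assert(pod_size >= sizeof(void*) && pod_size >= sizeof(std::size_t));
    char* block = new char[pod_size];
    const bool ok = is_aligned_for<void*>(block) &&
                    is_aligned_for<std::size_t>(block);
    delete[] block;
    return ok;
}
```

A passing check on one platform is evidence, not proof; the guarantee itself comes from the Standard, as argued above.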

Corollary 1: The block is properly aligned for an array of Elements

This follows from Predicates 1 and 2, and the following quote:

[3.9/9] (Basic concepts::Types) "An object type is a (possibly cv-qualified) type that is not a function type, not a reference type, and not a void type." (Specifically, array types are object types.)

Corollary 2: For any pointer p and integer i, if p is properly aligned for the type it points to, then p + i (when well-defined) is properly aligned for that type; in other words, if an array is properly aligned, then each element in that array is properly aligned

There are no quotes from the Standard to directly support this argument, but it fits the common conception of the meaning of "alignment".

Note that the conditions for p + i being well-defined are outlined in [5.7/5]. We do not quote that here, but only make note that it is well-defined if p and p + i both point into or one past the same array.

Let: sizeof(Element) be the least common multiple of sizes of several actual objects (T1, T2, T3, ...)

Let: block be a pointer to the memory block, pe be (Element *) block, and pn be (Tn *) block

Corollary 3: For each integer i such that pe + i is well-defined, and for each n, there exists some integer jn such that pn + jn is well-defined and refers to the same memory address as pe + i

This follows naturally, since the memory block is an array of Elements, and for each n, sizeof(Element) % sizeof(Tn) == 0; thus, the boundary of each element in the array of Elements is also a boundary of each element in each array of Tn.

Theorem: For each integer i, such that pe + i is well-defined, that address (pe + i) is properly aligned for each type Tn

Since pe + i is well-defined, then by Corollary 3, pn + jn is well-defined. It is properly aligned from Predicate 2 and Corollaries 1 and 2.
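The arithmetic behind Corollary 3 and the Theorem can be sketched as follows. The sizes 7, 3 and 5 are hypothetical stand-ins for sizeof(T1), sizeof(T2) and sizeof(T3), and `matching_index` and `byte_offset` are names invented for this illustration.

```cpp
#include <cstddef>

// Hypothetical object sizes for T1, T2 and T3 (not real platform sizes).
constexpr std::size_t t1_size = 7;
constexpr std::size_t t2_size = 3;
constexpr std::size_t t3_size = 5;

// These sizes are pairwise coprime, so their least common multiple
// is simply their product.
constexpr std::size_t element_size = t1_size * t2_size * t3_size;  // 105

static_assert(element_size % t1_size == 0 && element_size % t2_size == 0 &&
                  element_size % t3_size == 0,
              "sizeof(Element) is a multiple of every sizeof(Tn)");

// For Element index i, the index jn with jn * tn_size == i * element_size.
constexpr std::size_t matching_index(std::size_t i, std::size_t tn_size) {
    return i * (element_size / tn_size);
}

// Byte offset reached by `index` steps through objects of `object_size` bytes.
constexpr std::size_t byte_offset(std::size_t index, std::size_t object_size) {
    return index * object_size;
}
```

For i = 2, the Element boundary lies at byte 210, which is also element 30 of the T1 array, element 70 of the T2 array and element 42 of the T3 array.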

Use of the Theorem

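The proof above covers alignment requirements for cutting chunks out of a block. The implementation uses three actual object sizes: requested_size (the size of the chunks requested by the user), sizeof(void *) (because the free list is interleaved through the chunks), and sizeof(size_type) (because each memory block stores the size of the next block).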

Each block also contains a pointer to the next block; but that is stored as a pointer to void and cast when necessary, to simplify alignment requirements to the three types above.

Therefore, alloc_size is defined to be the lcm of the sizes of the three types above.
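As a sketch of that definition, the least common multiple can be computed at compile time. The function names `gcd_of`, `lcm_of` and `alloc_size_for` are hypothetical, and the one-argument overload assumes that size_type is `std::size_t`; the real Pool code may compute the value differently.

```cpp
#include <cstddef>

// Euclid's algorithm.
constexpr std::size_t gcd_of(std::size_t a, std::size_t b) {
    return b == 0 ? a : gcd_of(b, a % b);
}

// Divide before multiplying to keep the intermediate value small.
constexpr std::size_t lcm_of(std::size_t a, std::size_t b) {
    return a / gcd_of(a, b) * b;
}

// lcm of the three sizes that every chunk must accommodate.
constexpr std::size_t alloc_size_for(std::size_t requested_size,
                                     std::size_t pointer_size,
                                     std::size_t size_type_size) {
    return lcm_of(lcm_of(requested_size, pointer_size), size_type_size);
}

// Convenience overload for the host platform (assumes size_type is std::size_t).
constexpr std::size_t alloc_size_for(std::size_t requested_size) {
    return alloc_size_for(requested_size, sizeof(void*), sizeof(std::size_t));
}
```

With 4-byte pointers and a 4-byte size_type, a requested_size of 1 yields an alloc_size of 4, and a requested_size of 8 yields 8.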

A Look at the Memory Block

Each memory block consists of three main sections. The first section is the part that chunks are cut out of, and contains the interleaved free list. The second section is the pointer to the next block, and the third section is the size of the next block.

Each of these sections may contain padding as necessary to guarantee alignment for each of the next sections. The size of the first section is number_of_chunks * lcm(requested_size, sizeof(void *), sizeof(size_type)); the size of the second section is lcm(sizeof(void *), sizeof(size_type)); and the size of the third section is sizeof(size_type).
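The three section sizes can be computed directly from those formulas. This is a sketch under stated assumptions: `BlockLayout`, `layout_for` and `total_size` are hypothetical names, and the pointer and size_type sizes are passed as parameters so that any combination of sizes can be tried.

```cpp
#include <cstddef>

constexpr std::size_t gcd_of(std::size_t a, std::size_t b) {
    return b == 0 ? a : gcd_of(b, a % b);
}

constexpr std::size_t lcm_of(std::size_t a, std::size_t b) {
    return a / gcd_of(a, b) * b;
}

// Sizes in bytes of the three sections of one memory block.
struct BlockLayout {
    std::size_t chunks_section;        // chunks plus interleaved free list
    std::size_t next_pointer_section;  // pointer to the next block
    std::size_t next_size_section;     // size of the next block
};

constexpr BlockLayout layout_for(std::size_t number_of_chunks,
                                 std::size_t requested_size,
                                 std::size_t pointer_size,
                                 std::size_t size_type_size) {
    return BlockLayout{
        number_of_chunks *
            lcm_of(lcm_of(requested_size, pointer_size), size_type_size),
        lcm_of(pointer_size, size_type_size),
        size_type_size};
}

// Total number of characters to allocate for the block.
constexpr std::size_t total_size(const BlockLayout& b) {
    return b.chunks_section + b.next_pointer_section + b.next_size_section;
}
```

For example, `layout_for(4, 4, 4, 4)` gives sections of 16, 4 and 4 bytes, and `layout_for(2, 7, 3, 5)` gives sections of 210, 15 and 5 bytes.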

Here's an example memory block, where requested_size == sizeof(void *) == sizeof(size_type) == 4:

Memory block containing 4 chunks, showing overlying array structures; FLP = Interleaved Free List Pointer

| Sections | size_type alignment | void * alignment | requested_size alignment |
|---|---|---|---|
| Memory not belonging to process | | | |
| Chunks section (16 bytes) | (4 bytes) | FLP for Chunk 1 (4 bytes) | Chunk 1 (4 bytes) |
| | (4 bytes) | FLP for Chunk 2 (4 bytes) | Chunk 2 (4 bytes) |
| | (4 bytes) | FLP for Chunk 3 (4 bytes) | Chunk 3 (4 bytes) |
| | (4 bytes) | FLP for Chunk 4 (4 bytes) | Chunk 4 (4 bytes) |
| Pointer to next Block (4 bytes) | (4 bytes) | Pointer to next Block (4 bytes) | |
| Size of next Block (4 bytes) | Size of next Block (4 bytes) | | |
| Memory not belonging to process | | | |

To show a visual example of possible padding, here's an example memory block where requested_size == 8 and sizeof(void *) == sizeof(size_type) == 4:

Memory block containing 4 chunks, showing overlying array structures; FLP = Interleaved Free List Pointer

| Sections | size_type alignment | void * alignment | requested_size alignment |
|---|---|---|---|
| Memory not belonging to process | | | |
| Chunks section (32 bytes) | (4 bytes) | FLP for Chunk 1 (4 bytes) | Chunk 1 (8 bytes) |
| | (4 bytes) | (4 bytes) | |
| | (4 bytes) | FLP for Chunk 2 (4 bytes) | Chunk 2 (8 bytes) |
| | (4 bytes) | (4 bytes) | |
| | (4 bytes) | FLP for Chunk 3 (4 bytes) | Chunk 3 (8 bytes) |
| | (4 bytes) | (4 bytes) | |
| | (4 bytes) | FLP for Chunk 4 (4 bytes) | Chunk 4 (8 bytes) |
| | (4 bytes) | (4 bytes) | |
| Pointer to next Block (4 bytes) | (4 bytes) | Pointer to next Block (4 bytes) | |
| Size of next Block (4 bytes) | Size of next Block (4 bytes) | | |
| Memory not belonging to process | | | |

Finally, here is a convoluted example where the requested_size is 7, sizeof(void *) == 3, and sizeof(size_type) == 5, showing how the least common multiple guarantees alignment requirements even in the oddest of circumstances:

Memory block containing 2 chunks, showing overlying array structures

| Sections | size_type alignment | void * alignment | requested_size alignment |
|---|---|---|---|
| Memory not belonging to process | | | |
| Chunks section (210 bytes) | (5 bytes) | Interleaved free list pointer for Chunk 1 (15 bytes; 3 used) | Chunk 1 (105 bytes; 7 used) |
| | (5 bytes) | | |
| | (5 bytes) | | |
| | (5 bytes) | (15 bytes) | |
| | (5 bytes) | | |
| | (5 bytes) | | |
| | (5 bytes) | (15 bytes) | |
| | (5 bytes) | | |
| | (5 bytes) | | |
| | (5 bytes) | (15 bytes) | |
| | (5 bytes) | | |
| | (5 bytes) | | |
| | (5 bytes) | (15 bytes) | |
| | (5 bytes) | | |
| | (5 bytes) | | |
| | (5 bytes) | (15 bytes) | |
| | (5 bytes) | | |
| | (5 bytes) | | |
| | (5 bytes) | (15 bytes) | |
| | (5 bytes) | | |
| | (5 bytes) | | |
| | (5 bytes) | Interleaved free list pointer for Chunk 2 (15 bytes; 3 used) | Chunk 2 (105 bytes; 7 used) |
| | (5 bytes) | | |
| | (5 bytes) | | |
| | (5 bytes) | (15 bytes) | |
| | (5 bytes) | | |
| | (5 bytes) | | |
| | (5 bytes) | (15 bytes) | |
| | (5 bytes) | | |
| | (5 bytes) | | |
| | (5 bytes) | (15 bytes) | |
| | (5 bytes) | | |
| | (5 bytes) | | |
| | (5 bytes) | (15 bytes) | |
| | (5 bytes) | | |
| | (5 bytes) | | |
| | (5 bytes) | (15 bytes) | |
| | (5 bytes) | | |
| | (5 bytes) | | |
| | (5 bytes) | (15 bytes) | |
| | (5 bytes) | | |
| | (5 bytes) | | |
| Pointer to next Block (15 bytes; 3 used) | (5 bytes) | Pointer to next Block (15 bytes; 3 used) | |
| | (5 bytes) | | |
| | (5 bytes) | | |
| Size of next Block (5 bytes; 5 used) | Size of next Block (5 bytes; 5 used) | | |
| Memory not belonging to process | | | |

How Contiguous Chunks are Handled

The theorem above guarantees all alignment requirements for allocating chunks and also implementation details such as the interleaved free list. However, it does so by adding padding when necessary; therefore, we have to treat allocations of contiguous chunks in a different way.

Using array arguments similar to the above, we can translate any request for contiguous memory for n objects of requested_size into a request for m contiguous chunks. m is simply ceil(n * requested_size / alloc_size), where alloc_size is the actual size of the chunks. To illustrate:

Here's an example memory block, where requested_size == 1 and sizeof(void *) == sizeof(size_type) == 4:

Memory block containing 4 chunks; requested_size is 1

| Sections | size_type alignment | void * alignment | requested_size alignment |
|---|---|---|---|
| Memory not belonging to process | | | |
| Chunks section (16 bytes) | (4 bytes) | FLP to Chunk 2 (4 bytes) | Chunk 1 (4 bytes) |
| | (4 bytes) | FLP to Chunk 3 (4 bytes) | Chunk 2 (4 bytes) |
| | (4 bytes) | FLP to Chunk 4 (4 bytes) | Chunk 3 (4 bytes) |
| | (4 bytes) | FLP to end-of-list (4 bytes) | Chunk 4 (4 bytes) |
| Pointer to next Block (4 bytes) | (4 bytes) | Ptr to end-of-list (4 bytes) | |
| Size of next Block (4 bytes) | 0 (4 bytes) | | |
| Memory not belonging to process | | | |

After user requests 7 contiguous elements of requested_size

| Sections | size_type alignment | void * alignment | requested_size alignment |
|---|---|---|---|
| Memory not belonging to process | | | |
| Chunks section (16 bytes) | (4 bytes) | (4 bytes) | 4 bytes in use by program |
| | (4 bytes) | (4 bytes) | 3 bytes in use by program (1 byte unused) |
| | (4 bytes) | FLP to Chunk 4 (4 bytes) | Chunk 3 (4 bytes) |
| | (4 bytes) | FLP to end-of-list (4 bytes) | Chunk 4 (4 bytes) |
| Pointer to next Block (4 bytes) | (4 bytes) | Ptr to end-of-list (4 bytes) | |
| Size of next Block (4 bytes) | 0 (4 bytes) | | |
| Memory not belonging to process | | | |
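The chunk count used in this example follows from the formula m = ceil(n * requested_size / alloc_size). A minimal sketch in integer arithmetic, with the hypothetical helper name `chunks_needed`:

```cpp
#include <cstddef>

// Number of alloc_size-byte chunks needed to hold n objects of
// requested_size bytes each: the integer form of
// ceil(n * requested_size / alloc_size).
constexpr std::size_t chunks_needed(std::size_t n, std::size_t requested_size,
                                    std::size_t alloc_size) {
    return (n * requested_size + alloc_size - 1) / alloc_size;
}
```

With n = 7, requested_size = 1 and alloc_size = 4, the result is 2 chunks, so 8 bytes are taken from the free list and 1 byte is left unused.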

Then, when the user deallocates the contiguous memory, we can split it up into chunks again.
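Splitting a run of memory back into chunks amounts to writing a free list pointer into the start of each chunk. The sketch below is a simplified illustration rather than the library's code; `next_of`, `thread_free_list` and `free_list_length` are hypothetical names, and the chunk size is assumed to be a multiple of sizeof(void *) so that every stored pointer is properly aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Access the free list pointer stored in the first bytes of a chunk.
inline void*& next_of(void* chunk) {
    return *static_cast<void**>(chunk);
}

// Thread a free list through chunk_count chunks of chunk_size bytes each,
// linking them in address order and ending the list with `end`.
// Returns the head of the list, which is the first chunk.
void* thread_free_list(char* block, std::size_t chunk_count,
                       std::size_t chunk_size, void* end) {
    assert(chunk_count > 0);
    assert(chunk_size % sizeof(void*) == 0);  // keeps each link aligned
    void* next = end;
    // Walk backwards so that each chunk can point at its successor.
    for (std::size_t i = chunk_count; i-- > 0;) {
        char* chunk = block + i * chunk_size;
        ::new (static_cast<void*>(chunk)) void*(next);  // store the link
        next = chunk;
    }
    return next;
}

// Walk the list from head to the null terminator and count its nodes.
std::size_t free_list_length(void* head) {
    std::size_t n = 0;
    for (void* p = head; p != nullptr; p = next_of(p)) ++n;
    return n;
}
```

Because the chunks are linked in address order, a list rebuilt this way is ordered within the block.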

Note that the implementation provided for allocating contiguous chunks uses a linear rather than a quadratic algorithm. This means that it may fail to find contiguous free chunks if the free list is not ordered. It is therefore recommended to always use an ordered free list when allocating contiguous chunks. (In the example above, if Chunk 1 pointed to Chunk 3, which pointed to Chunk 2, which pointed to Chunk 4, instead of being in order, the contiguous allocation algorithm would have failed to find any contiguous chunks.)
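The effect of list order on a single-pass search can be sketched with a simplified model. Here the free list is represented as a vector of chunk numbers in the order the list links them; this representation and the function `find_contiguous_run` are assumptions made for illustration and do not mirror the library's actual data structures.

```cpp
#include <cstddef>
#include <vector>

// free_order holds chunk numbers in the order the free list links them.
// Returns the first chunk number of a run of m chunks that are adjacent
// both in memory and in the list, or -1 if one pass finds no such run.
long find_contiguous_run(const std::vector<long>& free_order, std::size_t m) {
    if (m == 0 || free_order.empty()) return -1;
    std::size_t run_start = 0;  // list position where the current run begins
    for (std::size_t k = 0; k < free_order.size(); ++k) {
        // The run breaks when this node is not the chunk immediately
        // after the previous node in memory.
        if (k > 0 && free_order[k] != free_order[k - 1] + 1) run_start = k;
        if (k - run_start + 1 >= m) return free_order[run_start];
    }
    return -1;
}
```

With the ordered list 1, 2, 3, 4 a request for two chunks succeeds starting at Chunk 1. With the order 1, 3, 2, 4 the same request fails, even though all four chunks are free and adjacent in memory.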



Revised 05 December, 2006

Copyright © 2000, 2001 Stephen Cleary (scleary AT jerviswebb DOT com)

Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)