Overview
Abstract
Boost.Endian provides facilities to manipulate the endianness of integers and user-defined types.
- Three approaches to endianness are supported. Each has a long history of successful use, and each approach has use cases where it is preferred over the other approaches.
- Primary uses:
  - Data portability. The Endian library supports binary data exchange, via either external media or network transmission, regardless of platform endianness.
  - Program portability. POSIX-based and Windows-based operating systems traditionally supply libraries with non-portable functions to perform endian conversion. There are at least four incompatible sets of functions in common use. The Endian library is portable across all C++ platforms.
- Secondary use: Minimizing data size via sizes and/or alignments not supported by the standard C++ integer types.
Introduction to endianness
Consider the following code:
int16_t i = 0x0102;
FILE * file = fopen("test.bin", "wb"); // binary file!
fwrite(&i, sizeof(int16_t), 1, file);
fclose(file);
On OS X, Linux, or Windows systems with an Intel CPU, a hex dump of the "test.bin" output file produces:
0201
On OS X systems with a PowerPC CPU, or Solaris systems with a SPARC CPU, a hex dump of the "test.bin" output file produces:
0102
What’s happening here is that Intel CPUs order the bytes of an integer with the least-significant byte first, while SPARC CPUs place the most-significant byte first. Some CPUs, such as the PowerPC, allow the operating system to choose which ordering applies.
Most-significant-byte-first ordering is traditionally called "big-endian" ordering and least-significant-byte-first is traditionally called "little-endian" ordering. The names are derived from Jonathan Swift's satirical novel Gulliver’s Travels, where rival kingdoms opened their soft-boiled eggs at different ends.
See Wikipedia’s Endianness article for an extensive discussion of endianness.
Programmers can usually ignore endianness, except when reading a core dump on little-endian systems. But programmers have to deal with endianness when exchanging binary integers and binary floating point values between computer systems with differing endianness, whether by physical file transfer or over a network. And programmers may also want to use the library when minimizing either internal or external data sizes is advantageous.
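The byte order discussed in this section can also be observed in memory, without writing a file. The following is a minimal sketch that uses only the C++ standard library; the helper names first_byte_in_memory and is_little_endian are invented for this illustration and are not part of Boost.Endian.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the byte stored at the lowest address of a 16-bit integer.
// (Hypothetical helper, for illustration only.)
inline unsigned char first_byte_in_memory(std::uint16_t value)
{
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value); // copy the object representation
    return bytes[0];
}

// True when the least-significant byte is stored first.
inline bool is_little_endian()
{
    return first_byte_in_memory(0x0102) == 0x02;
}

// Usage: the first byte of 0x0102 is 0x02 on little endian machines
// and 0x01 on big endian machines.
inline void byte_order_usage()
{
    const unsigned char expected = is_little_endian() ? 0x02 : 0x01;
    assert(first_byte_in_memory(0x0102) == expected);
}
```

On an Intel CPU the first byte is expected to be 0x02, matching the 0201 hex dump described earlier in this section.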
Introduction to the Boost.Endian library
Boost.Endian provides three different approaches to dealing with endianness. All three approaches support integers and user-defined types (UDTs).
Each approach has a long history of successful use, and each approach has use cases where it is preferred to the other approaches. See Choosing between Conversion Functions, Buffer Types, and Arithmetic Types.
- Endian conversion functions
-
The application uses the built-in integer types to hold values, and calls the provided conversion functions to convert byte ordering as needed. Both mutating and non-mutating conversions are supplied, and each comes in unconditional and conditional variants.
- Endian buffer types
-
The application uses the provided endian buffer types to hold values, and explicitly converts to and from the built-in integer types. Buffer sizes of 8, 16, 24, 32, 40, 48, 56, and 64 bits (i.e. 1, 2, 3, 4, 5, 6, 7, and 8 bytes) are provided. Unaligned integer buffer types are provided for all sizes, and aligned buffer types are provided for 16, 32, and 64-bit sizes. The provided specific types are typedefs for a generic class template that may be used directly for less common use cases.
- Endian arithmetic types
-
The application uses the provided endian arithmetic types, which supply the same operations as the built-in C++ arithmetic types. All conversions are implicit. Arithmetic sizes of 8, 16, 24, 32, 40, 48, 56, and 64 bits (i.e. 1, 2, 3, 4, 5, 6, 7, and 8 bytes) are provided. Unaligned integer types are provided for all sizes and aligned arithmetic types are provided for 16, 32, and 64-bit sizes. The provided specific types are typedefs for a generic class template that may be used directly in generic code or for less common use cases.
Boost.Endian is a header-only library. C++11 features affecting interfaces, such as noexcept, are used only if available. See C++03 support for C++11 features for details.
Built-in support for Intrinsics
Most compilers, including GCC, Clang, and Visual C++, supply built-in support for byte swapping intrinsics. The Endian library uses these intrinsics when available since they may result in smaller and faster generated code, particularly for optimized builds.
Defining the macro BOOST_ENDIAN_NO_INTRINSICS will suppress use of the intrinsics. This is useful when a compiler has no intrinsic support or fails to locate the appropriate header, perhaps because it is an older release or has very limited supporting libraries.
The macro BOOST_ENDIAN_INTRINSIC_MSG is defined as either "no byte swap intrinsics" or a string describing the particular set of intrinsics being used. This is useful for eliminating missing intrinsics as a source of performance issues.
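The general idea of preferring an intrinsic and falling back to portable code can be sketched as follows. This is not the library's implementation; it is a minimal illustration that assumes the GCC/Clang builtin __builtin_bswap32 and the Visual C++ function _byteswap_ulong. The macro MY_NO_INTRINSICS and the function names are hypothetical and exist only in this sketch.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#  include <stdlib.h> // declares _byteswap_ulong
#endif

// Portable fallback: reverse the four bytes with shifts and masks.
inline std::uint32_t byteswap32_portable(std::uint32_t x)
{
    return  (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         |  (x << 24);
}

// Use a compiler intrinsic when one is known to exist, otherwise fall back.
// MY_NO_INTRINSICS is a hypothetical opt-out macro for this sketch only.
inline std::uint32_t byteswap32(std::uint32_t x)
{
#if defined(MY_NO_INTRINSICS)
    return byteswap32_portable(x);
#elif defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(x);
#elif defined(_MSC_VER)
    return _byteswap_ulong(x);
#else
    return byteswap32_portable(x);
#endif
}

// Usage: both paths must agree on every input.
inline void byteswap_usage()
{
    assert(byteswap32(0x01020304u) == byteswap32_portable(0x01020304u));
}
```

Keeping the portable version available makes it easy to compare timings with and without intrinsics when investigating a performance problem.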
Performance
Consider this problem:
Example 1
Add 100 to a big endian value in a file, then write the result to a file
Endian arithmetic type approach:
big_int32_at x;
... read into x from a file ...
x += 100;
... write x to a file ...
Endian conversion function approach:
int32_t x;
... read into x from a file ...
big_to_native_inplace(x);
x += 100;
native_to_big_inplace(x);
... write x to a file ...
There will be no performance difference between the two approaches in optimized builds, regardless of the native endianness of the machine. That’s because optimizing compilers will generate exactly the same code for each. That conclusion was confirmed by studying the generated assembly code for GCC and Visual C++. Furthermore, time spent doing I/O will determine the speed of this application.
Now consider a slightly different problem:
Example 2
Add a million values to a big endian value in a file, then write the result to a file
Endian arithmetic type approach:
big_int32_at x;
... read into x from a file ...
for (int32_t i = 0; i < 1000000; ++i)
x += i;
... write x to a file ...
Endian conversion function approach:
int32_t x;
... read into x from a file ...
big_to_native_inplace(x);
for (int32_t i = 0; i < 1000000; ++i)
x += i;
native_to_big_inplace(x);
... write x to a file ...
With the Endian arithmetic approach, on little endian platforms an implicit conversion from and then back to big endian is done inside the loop. With the Endian conversion function approach, the user has ensured the conversions are done outside the loop, so the code may run more quickly on little endian platforms.
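The difference between converting inside the loop and converting once around it can be sketched with hand-written helpers. This is an illustration only: be32, load_be32, and store_be32 are hypothetical stand-ins written for this sketch, not Boost.Endian names, and unsigned arithmetic is used so that wraparound is well defined.

```cpp
#include <cassert>
#include <cstdint>

// Four bytes kept in big endian order (stand-in for a value read from a file).
struct be32 { unsigned char bytes[4]; };

inline std::uint32_t load_be32(const be32& b)
{
    return (std::uint32_t(b.bytes[0]) << 24) | (std::uint32_t(b.bytes[1]) << 16)
         | (std::uint32_t(b.bytes[2]) << 8)  |  std::uint32_t(b.bytes[3]);
}

inline void store_be32(be32& b, std::uint32_t v)
{
    b.bytes[0] = static_cast<unsigned char>(v >> 24);
    b.bytes[1] = static_cast<unsigned char>(v >> 16);
    b.bytes[2] = static_cast<unsigned char>(v >> 8);
    b.bytes[3] = static_cast<unsigned char>(v);
}

// Converts on every iteration, as an implicitly converting type would.
inline void add_converting_each_time(be32& x, std::uint32_t n)
{
    for (std::uint32_t i = 0; i < n; ++i)
        store_be32(x, load_be32(x) + i);
}

// Converts once before the loop and once after it.
inline void add_converting_once(be32& x, std::uint32_t n)
{
    std::uint32_t native = load_be32(x);
    for (std::uint32_t i = 0; i < n; ++i)
        native += i;
    store_be32(x, native);
}

// Usage: both functions leave the same bytes behind.
inline void hoisting_usage()
{
    be32 a = {{0, 0, 0, 7}};
    be32 b = a;
    add_converting_each_time(a, 1000);
    add_converting_once(b, 1000);
    assert(load_be32(a) == load_be32(b));
}
```

Both functions produce the same stored bytes; only the number of conversions differs, which is the cost the timings in the next section measure.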
Timings
These tests were run against release builds on a circa 2012 4-core little endian X64 Intel Core i5-3570K CPU @ 3.40GHz under Windows 7.
Caution: The Windows CPU timer has very coarse granularity. Repeated runs of the same tests often yield considerably different results.
See test/loop_time_test.cpp for the actual code and benchmark/Jamfile.v2 for the build setup.
GNU C++ version 4.8.2 on Linux virtual machine
Iterations: 10'000'000'000, Intrinsics: __builtin_bswap16, etc.
Test Case | Endian arithmetic type | Endian conversion function |
---|---|---|
16-bit aligned big endian | 8.46 s | 5.28 s |
16-bit aligned little endian | 5.28 s | 5.22 s |
32-bit aligned big endian | 8.40 s | 2.11 s |
32-bit aligned little endian | 2.11 s | 2.10 s |
64-bit aligned big endian | 14.02 s | 3.10 s |
64-bit aligned little endian | 3.00 s | 3.03 s |
Microsoft Visual C++ version 14.0
Iterations: 10'000'000'000, Intrinsics: <cstdlib> _byteswap_ushort, etc.
Test Case | Endian arithmetic type | Endian conversion function |
---|---|---|
16-bit aligned big endian | 8.27 s | 5.26 s |
16-bit aligned little endian | 5.29 s | 5.32 s |
32-bit aligned big endian | 8.36 s | 5.24 s |
32-bit aligned little endian | 5.24 s | 5.24 s |
64-bit aligned big endian | 13.65 s | 3.34 s |
64-bit aligned little endian | 3.35 s | 2.73 s |
C++03 support for C++11 features
C++11 Feature | Action with C++03 Compilers |
---|---|
Scoped enums | Uses header boost/core/scoped_enum.hpp to emulate C++11 scoped enums. |
noexcept | Elided for C++03 compilers. |
C++11 PODs (N2342) | Takes advantage of C++03 compilers that relax C++03 POD rules, but see Limitations here and here. Also see macros for explicit POD control here and here. |
Overall FAQ
- Is the implementation header only?
-
Yes.
- Are C++03 compilers supported?
-
Yes.
- Does the implementation use compiler intrinsic built-in byte swapping?
-
Yes, if available. See Intrinsic built-in support.
- Why bother with endianness?
-
Binary data portability is the primary use case.
- Does endianness have any uses outside of portable binary file or network I/O formats?
-
Using the unaligned integer types with a size tailored to the application’s needs is a minor secondary use that saves internal or external memory space. For example, using big_int40_buf_t or big_int40_t in a large array saves a lot of space compared to one of the 64-bit types.
- Why bother with binary I/O? Why not just use C++ Standard Library stream inserters and extractors?
-
Data interchange formats often specify binary integer data. Binary integer data is smaller and therefore I/O is faster and file sizes are smaller. Transfer between systems is less expensive.
-
Furthermore, binary integer data is of fixed size, and so fixed-size disk records are possible without padding, easing sorting and allowing random access.
-
Disadvantages, such as the inability to use text utilities on the resulting files, limit usefulness to applications where the binary I/O advantages are paramount.
-
- Which is better, big-endian or little-endian?
-
Big-endian tends to be preferred in a networking environment and is a bit more of an industry standard, but little-endian may be preferred for applications that run primarily on x86, x86-64, and other little-endian CPUs. The Wikipedia article gives more pros and cons.
- Why are only big and little native endianness supported?
-
These are the only endian schemes that have any practical value today. PDP-11 and the other middle endian approaches are interesting curiosities but have no relevance for today’s C++ developers. The same is true for architectures that allow runtime endianness switching. The specification for native ordering has been carefully crafted to allow support for such orderings in the future, should the need arise. Thanks to Howard Hinnant for suggesting this.
- Why do both the buffer and arithmetic types exist?
-
Conversions in the buffer types are explicit. Conversions in the arithmetic types are implicit. This fundamental difference is a deliberate design feature that would be lost if the inheritance hierarchy were collapsed. The original design provided only arithmetic types. Buffer types were requested during formal review by those wishing total control over when conversion occurs. They also felt that buffer types would be less likely to be misused by maintenance programmers not familiar with the implications of performing a lot of integer operations on the endian arithmetic integer types.
- What is gained by using the buffer types rather than always just using the arithmetic types?
-
Assurance that hidden conversions are not performed. This is of overriding importance to users concerned about achieving the ultimate in terms of speed. "Always just using the arithmetic types" is fine for other users. When the ultimate in speed needs to be ensured, the arithmetic types can be used in the same design patterns or idioms that would be used for buffer types, resulting in the same code being generated for either type.
- What are the limitations of integer support?
-
Tests have only been performed on machines that use two’s complement arithmetic. The Endian conversion functions only support 8, 16, 32, and 64-bit aligned integers. The endian types only support 8, 16, 24, 32, 40, 48, 56, and 64-bit unaligned integers, and 8, 16, 32, and 64-bit aligned integers.
- Is there floating point support?
-
An attempt was made to support four-byte floats and eight-byte doubles, limited to IEEE 754 (also known as ISO/IEC/IEEE 60559) floating point and further limited to systems where floating point endianness does not differ from integer endianness. Even with those limitations, support for floating point types was not reliable and was removed. For example, simply reversing the endianness of a floating point number can result in a signaling NaN.
Support for float and double has since been reinstated for endian_buffer, endian_arithmetic and the conversion functions that reverse endianness in place. The conversion functions that take and return by value still do not support floating point due to the above issues; reversing the bytes of a floating point number does not necessarily produce another valid floating point number.
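Why in-place reversal is safe for floating point while by-value reversal is not can be illustrated with a standard-library-only sketch. The function name reverse_float_bytes_inplace is hypothetical and is not the library's endian_reverse_inplace; the point is that the object is only ever handled as raw bytes, so no floating point value is read from the reversed representation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>

// Reverse the bytes of a float in place by treating it as raw storage.
// The reversed bit pattern is never loaded as a floating point value here.
inline void reverse_float_bytes_inplace(float& f)
{
    unsigned char bytes[sizeof f];
    std::memcpy(bytes, &f, sizeof f);      // object -> bytes
    std::reverse(bytes, bytes + sizeof f); // reverse the byte order
    std::memcpy(&f, bytes, sizeof f);      // bytes -> object
}

// Usage: reversing twice restores the original value exactly.
inline void float_reverse_usage()
{
    float f = 1.5f;
    reverse_float_bytes_inplace(f); // f now holds wire-order bytes; do not read it
    reverse_float_bytes_inplace(f); // back to native order
    assert(f == 1.5f);
}
```

A function that returned the reversed float by value would have to pass the intermediate bit pattern through a floating point object, which is where an invalid value such as a signaling NaN could cause trouble.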
Revision History
Changes in 1.74.0
- Enabled scoped enumeration types in endian_reverse
- Enabled bool, enum, float, double in endian_reverse_inplace
- Added an overload of endian_reverse_inplace for arrays
Changes in 1.72.0
- Made endian_reverse, conditional_reverse and *_to_* constexpr on GCC and Clang
- Added convenience load and store functions
- Added floating point convenience typedefs
- Added a non-const overload of data(); changed its return type to unsigned char*
- Added __int128 support to endian_reverse when available
- Added a convenience header boost/endian.hpp
Changes in 1.71.0
- Clarified requirements on the value type template parameter
- Added support for float and double to endian_buffer and endian_arithmetic
- Added endian_load, endian_store
- Updated endian_reverse to correctly support all non-bool integral types
- Moved deprecated names to the deprecated header endian.hpp
Choosing between Conversion Functions, Buffer Types, and Arithmetic Types
Note: Deciding which is the best endianness approach (conversion functions, buffer types, or arithmetic types) for a particular application involves complex engineering trade-offs. It is hard to assess those trade-offs without some understanding of the different interfaces, so you might want to read the conversion functions, buffer types, and arithmetic types pages before proceeding.
The best approach to endianness for a particular application depends on the interaction between the application’s needs and the characteristics of each of the three approaches.
Recommendation: If you are new to endianness, uncertain, or don’t want to invest the time to study engineering trade-offs, use endian arithmetic types. They are safe, easy to use, and easy to maintain. Use the anticipating need design pattern locally around performance hot spots like lengthy loops, if needed.
Background
Dealing with endianness usually implies a program portability or a data portability requirement, and often both. That means real programs dealing with endianness are usually complex, so the examples shown here would really be written as multiple functions spread across multiple translation units. They would involve interfaces that cannot be altered as they are supplied by third parties or the standard library.
Characteristics
The characteristics that differentiate the three approaches to endianness are the endianness invariants, conversion explicitness, arithmetic operations, sizes available, and alignment requirements.
Endianness invariants
Endian conversion functions use objects of the ordinary C++ arithmetic types like int or unsigned short to hold values. That breaks the implicit invariant that the C++ language rules apply. The usual language rules only apply if the endianness of the object is currently set to the native endianness for the platform. That can make it very hard to reason about logic flow, and result in difficult to find bugs.
For example:
struct data_t // big endian
{
int32_t v1; // description ...
int32_t v2; // description ...
... additional character data members (i.e. non-endian)
int32_t v3; // description ...
};
data_t data;
read(data);
big_to_native_inplace(data.v1);
big_to_native_inplace(data.v2);
...
++data.v1;
third_party::func(data.v2);
...
native_to_big_inplace(data.v1);
native_to_big_inplace(data.v2);
write(data);
The programmer didn’t bother to convert data.v3 to native endianness because that member isn’t used. A later maintainer needs to pass data.v3 to the third-party function, so adds third_party::func(data.v3); somewhere deep in the code. This causes a silent failure because the usual invariant that an object of type int32_t holds a value as described by the C++ core language does not apply.
Endian buffer and arithmetic types hold values internally as arrays of characters with an invariant that the endianness of the array never changes. That makes these types easier to use and programs easier to maintain.
Here is the same example, using an endian arithmetic type:
struct data_t
{
big_int32_t v1; // description ...
big_int32_t v2; // description ...
... additional character data members (i.e. non-endian)
big_int32_t v3; // description ...
};
data_t data;
read(data);
...
++data.v1;
third_party::func(data.v2);
...
write(data);
A later maintainer can add third_party::func(data.v3) and it will just work.
Conversion explicitness
Endian conversion functions and buffer types never perform implicit conversions. This gives users explicit control of when conversion occurs, and may help avoid unnecessary conversions.
Endian arithmetic types perform conversion implicitly. That makes these types very easy to use, but can result in unnecessary conversions. Failure to hoist conversions out of inner loops can bring a performance penalty.
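The contrast between explicit and implicit conversion can be sketched with two small hand-written classes. These are illustrations only and are not the library's class templates: be32_buffer and be32_arithmetic are hypothetical names, and the sketch covers just one size and one byte order.

```cpp
#include <cassert>
#include <cstdint>

// Big endian 32-bit storage with explicit conversion only (buffer style).
class be32_buffer
{
public:
    explicit be32_buffer(std::uint32_t v = 0) { store(v); }

    // Explicit conversion to a native value: the caller decides when it happens.
    std::uint32_t value() const
    {
        return (std::uint32_t(b_[0]) << 24) | (std::uint32_t(b_[1]) << 16)
             | (std::uint32_t(b_[2]) << 8)  |  std::uint32_t(b_[3]);
    }

    const unsigned char* data() const { return b_; }

protected:
    void store(std::uint32_t v)
    {
        b_[0] = static_cast<unsigned char>(v >> 24);
        b_[1] = static_cast<unsigned char>(v >> 16);
        b_[2] = static_cast<unsigned char>(v >> 8);
        b_[3] = static_cast<unsigned char>(v);
    }

private:
    unsigned char b_[4]; // always big endian, whatever the platform
};

// Adds implicit conversion and arithmetic on top (arithmetic style).
class be32_arithmetic : public be32_buffer
{
public:
    be32_arithmetic(std::uint32_t v = 0) : be32_buffer(v) {}

    operator std::uint32_t() const { return value(); } // implicit conversion

    be32_arithmetic& operator+=(std::uint32_t rhs)
    {
        store(value() + rhs); // converts from and back to big endian each call
        return *this;
    }
};

// Usage: the buffer needs an explicit value() call; the arithmetic type does not.
inline void explicitness_usage()
{
    be32_buffer buf(100);
    std::uint32_t a = buf.value() + 5; // conversion is visible in the source
    be32_arithmetic x(100);
    x += 5;                            // conversion is hidden inside operator+=
    std::uint32_t b = x;
    assert(a == b);
}
```

With the buffer-style class every conversion appears as a call to value() in the source, which is what makes unnecessary conversions easy to spot during review.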
Arithmetic operations
Endian conversion functions do not supply arithmetic operations, but this is not a concern since this approach uses ordinary C++ arithmetic types to hold values.
Endian buffer types do not supply arithmetic operations. Although this approach avoids unnecessary conversions, it can result in the introduction of additional variables and confuse maintenance programmers.
Endian arithmetic types do supply arithmetic operations. They are very easy to use if lots of arithmetic is involved.
Sizes
Endianness conversion functions only support 1, 2, 4, and 8 byte integers. That’s sufficient for many applications.
Endian buffer and arithmetic types support 1, 2, 3, 4, 5, 6, 7, and 8 byte integers. For an application where memory use or I/O speed is the limiting factor, using sizes tailored to application needs can be useful.
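How a size with no matching built-in integer type can be stored is sketched here for 3-byte values. The functions store_be_u24 and load_be_u24 are hypothetical helpers written for this illustration with the standard library only; they are not the library's load and store functions.

```cpp
#include <cassert>
#include <cstdint>

// Store the low 24 bits of v as three bytes, most-significant byte first.
inline void store_be_u24(unsigned char* p, std::uint32_t v)
{
    p[0] = static_cast<unsigned char>((v >> 16) & 0xFF);
    p[1] = static_cast<unsigned char>((v >> 8) & 0xFF);
    p[2] = static_cast<unsigned char>(v & 0xFF);
}

// Load three big endian bytes into the low 24 bits of a 32-bit integer.
inline std::uint32_t load_be_u24(const unsigned char* p)
{
    return (std::uint32_t(p[0]) << 16) | (std::uint32_t(p[1]) << 8) | std::uint32_t(p[2]);
}

// Usage: a 24-bit value occupies three bytes instead of four.
inline void u24_usage()
{
    unsigned char buf[3];
    store_be_u24(buf, 0x010203u);
    assert(load_be_u24(buf) == 0x010203u);
}
```

An array of such 3-byte records uses a quarter less space than the same array of 32-bit integers, which is the saving this section refers to.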
Alignments
Endianness conversion functions only support aligned integer and floating-point types. That’s sufficient for most applications.
Endian buffer and arithmetic types support both aligned and unaligned integer and floating-point types. Unaligned types are rarely needed, but when needed they are often very useful and workarounds are painful. For example:
Non-portable code like this:
struct S {
uint16_t a; // big endian
uint32_t b; // big endian
} __attribute__ ((packed));
Can be replaced with portable code like this:
struct S {
big_uint16_t a;
big_uint32_t b;
};
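The reason byte-based members avoid the need for a packing attribute can be sketched with plain unsigned char arrays. This is an illustration under the assumption, true on mainstream ABIs but not guaranteed by the standard, that a struct of byte arrays has no padding; wire_header, get_a, and get_b are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Six bytes of big endian data: a 16-bit field followed by a 32-bit field.
// The members are byte arrays, so no alignment padding is needed between them.
struct wire_header
{
    unsigned char a[2]; // big endian 16-bit value
    unsigned char b[4]; // big endian 32-bit value
};

// Assumption checked at compile time rather than taken on trust.
static_assert(sizeof(wire_header) == 6, "unexpected padding in wire_header");
static_assert(alignof(wire_header) == 1, "unexpected alignment of wire_header");

inline std::uint16_t get_a(const wire_header& h)
{
    return static_cast<std::uint16_t>((h.a[0] << 8) | h.a[1]);
}

inline std::uint32_t get_b(const wire_header& h)
{
    return (std::uint32_t(h.b[0]) << 24) | (std::uint32_t(h.b[1]) << 16)
         | (std::uint32_t(h.b[2]) << 8)  |  std::uint32_t(h.b[3]);
}

// Usage: the 32-bit field starts at offset 2, which a uint32_t member could not.
inline void wire_header_usage()
{
    wire_header h = {{0x01, 0x02}, {0x0A, 0x0B, 0x0C, 0x0D}};
    assert(get_a(h) == 0x0102);
    assert(get_b(h) == 0x0A0B0C0Du);
}
```

A struct with uint16_t and uint32_t members would normally be eight bytes, with two bytes of padding before the second member.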
Design patterns
Applications often traffic in endian data as records or packets containing multiple endian data elements. For simplicity, we will just call them records.
If desired endianness differs from native endianness, a conversion has to be performed. When should that conversion occur? Three design patterns have evolved.
Convert only as needed (i.e. lazy)
This pattern defers conversion to the point in the code where the data element is actually used.
This pattern is appropriate when which endian element is actually used varies greatly according to record content or other circumstances.
Convert in anticipation of need
This pattern performs conversion to native endianness in anticipation of use, such as immediately after reading records. If needed, conversion to the output endianness is performed after all possible needs have passed, such as just before writing records.
One implementation of this pattern is to create a proxy record with endianness converted to native in a read function, and expose only that proxy to the rest of the implementation. A write function, if needed, handles the conversion from native to the desired output endianness.
This pattern is appropriate when all endian elements in a record are typically used regardless of record content or other circumstances.
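The proxy record idea described above can be sketched as a pair of conversion functions. The record layout and every name here (wire_record, native_record, to_native, to_wire) are hypothetical and chosen only for this illustration; the sketch uses the standard library rather than Boost.Endian.

```cpp
#include <cassert>
#include <cstdint>

// Record as it appears in the file or on the wire: two big endian 32-bit fields.
struct wire_record { unsigned char id[4]; unsigned char count[4]; };

// Proxy exposed to the rest of the program: ordinary native integers.
struct native_record { std::uint32_t id; std::uint32_t count; };

inline std::uint32_t load_be32(const unsigned char* p)
{
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16)
         | (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}

inline void store_be32(unsigned char* p, std::uint32_t v)
{
    p[0] = static_cast<unsigned char>(v >> 24);
    p[1] = static_cast<unsigned char>(v >> 16);
    p[2] = static_cast<unsigned char>(v >> 8);
    p[3] = static_cast<unsigned char>(v);
}

// Called by the read function: convert every field once, up front.
inline native_record to_native(const wire_record& w)
{
    native_record n = { load_be32(w.id), load_be32(w.count) };
    return n;
}

// Called by the write function: convert back just before output.
inline wire_record to_wire(const native_record& n)
{
    wire_record w;
    store_be32(w.id, n.id);
    store_be32(w.count, n.count);
    return w;
}

// Usage: code between read and write sees only native integers.
inline void proxy_usage()
{
    wire_record in = {{0, 0, 0, 1}, {0, 0, 1, 0}};
    native_record rec = to_native(in);
    rec.count += 1;
    wire_record out = to_wire(rec);
    assert(load_be32(out.count) == 257u);
}
```

Because only the two conversion functions know the external byte order, the rest of the code cannot accidentally use an unconverted field.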
Convert only as needed, except locally in anticipation of need
This pattern in general defers conversion but for specific local needs does anticipatory conversion. Although particularly appropriate when coupled with the endian buffer or arithmetic types, it also works well with the conversion functions.
Example:
struct data_t
{
big_int32_t v1;
big_int32_t v2;
big_int32_t v3;
};
data_t data;
read(data);
...
++data.v1;
...
int32_t v3_temp = data.v3; // hoist conversion out of loop
for (int32_t i = 0; i < large-number; ++i)
{
... lengthy computation that accesses v3_temp
...
}
data.v3 = v3_temp;
write(data);
In general the above pseudo-code leaves conversion up to the endian arithmetic type big_int32_t. But to avoid conversion inside the loop, a temporary is created before the loop is entered, and then used to set the new value of data.v3 after the loop is complete.
Question: Won’t the compiler’s optimizer hoist the conversion out of the loop anyhow?
Answer: VC++ 2015 Preview, and probably others, does not, even for a toy test program. Although the savings is small (two register bswap instructions), the cost might be significant if the loop is repeated enough times. On the other hand, the program may be so dominated by I/O time that even a lengthy loop will be immaterial.
Use case examples
Porting endian unaware codebase
An existing codebase runs on big endian systems. It does not currently deal with endianness. The codebase needs to be modified so it can run on little endian systems under various operating systems. To ease transition and protect value of existing files, external data will continue to be maintained as big endian.
The endian arithmetic approach is recommended to meet these needs. A relatively small number of header files dealing with binary I/O layouts need to change types. For example, short or int16_t would change to big_int16_t. No changes are required for .cpp files.
Porting endian aware codebase
An existing codebase runs on little-endian Linux systems. It already deals with endianness via Linux provided functions. Because of a business merger, the codebase has to be quickly modified for Windows and possibly other operating systems, while still supporting Linux. The codebase is reliable and the programmers are all well-aware of endian issues.
These factors all argue for an endian conversion approach that just mechanically changes the calls to htobe32, etc. to boost::endian::native_to_big, etc. and replaces <endian.h> with <boost/endian/conversion.hpp>.
Reliability and arithmetic-speed
A new, complex, multi-threaded application is to be developed that must run on little endian machines, but do big endian network I/O. The developers believe computational speed for endian variables is critical but have seen numerous bugs result from inability to reason about endian conversion state. They are also worried that future maintenance changes could inadvertently introduce a lot of slow conversions if full-blown endian arithmetic types are used.
The endian buffers approach is made-to-order for this use case.
Reliability and ease-of-use
A new, complex, multi-threaded application is to be developed that must run on little endian machines, but do big endian network I/O. The developers believe computational speed for endian variables is not critical but have seen numerous bugs result from inability to reason about endian conversion state. They are also concerned about ease-of-use both during development and long-term maintenance.
Removing concern about conversion speed and adding concern about ease-of-use tips the balance strongly in favor of the endian arithmetic approach.
Endian Conversion Functions
Introduction
Header boost/endian/conversion.hpp provides byte order reversal and conversion functions that convert objects of the built-in integer types between native, big, or little endian byte ordering. User defined types are also supported.
Reference
Functions are implemented inline if appropriate. For C++03 compilers, noexcept is elided. Boost scoped enum emulation is used so that the library still works for compilers that do not support scoped enums.
Definitions
Endianness refers to the ordering of bytes within internal or external integers and other arithmetic data. Most-significant byte first is called big endian ordering. Least-significant byte first is called little endian ordering. Other orderings are possible and some CPU architectures support both big and little ordering.
Note: The names are derived from Jonathan Swift's satirical novel Gulliver’s Travels, where rival kingdoms opened their soft-boiled eggs at different ends. Wikipedia has an extensive description of Endianness.
The standard integral types (C++std [basic.fundamental]) except bool and the scoped enumeration types (C++std [dcl.enum]) are collectively called the endian types. In the absence of padding bits, which is true on the platforms supported by the Boost.Endian library, endian types have the property that all of their bit patterns are valid values, which means that when an object of an endian type has its constituent bytes reversed, the result is another valid value. This allows endian_reverse to take and return by value.
Other built-in types, such as bool, float, or unscoped enumerations, do not have the same property, which means that reversing their constituent bytes may produce an invalid value, leading to undefined behavior. These types are therefore disallowed in endian_reverse, but are still allowed in endian_reverse_inplace. Even if an object becomes invalid as a result of reversing its bytes, as long as its value is never read, there would be no undefined behavior.
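The restriction of by-value reversal to types whose every bit pattern is valid can be sketched with a compile-time check. This is not the library's endian_reverse: reverse_bytes is a hypothetical function, and for simplicity it accepts only unsigned integer types (the library's set of endian types is wider), rejecting bool and floating point at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Reverse the bytes of an unsigned integer and return the result by value.
// Every bit pattern of such a type is a valid value, so returning by value is safe.
template <class T>
T reverse_bytes(T x)
{
    static_assert(std::is_integral<T>::value && std::is_unsigned<T>::value
                      && !std::is_same<T, bool>::value,
                  "reverse_bytes: unsigned integer type other than bool required");
    T result = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
    {
        result = static_cast<T>((result << 8) | (x & 0xFF)); // append lowest byte
        x = static_cast<T>(x >> 8);                          // drop it from the input
    }
    return result;
}

// Usage: reversing twice is the identity.
inline void reverse_bytes_usage()
{
    const std::uint32_t v = 0x01020304u;
    assert(reverse_bytes(reverse_bytes(v)) == v);
    // reverse_bytes(1.0f);  // would not compile: float is rejected
    // reverse_bytes(true);  // would not compile: bool is rejected
}
```

Rejecting the problematic types at compile time means a mistaken call is reported by the compiler rather than showing up as undefined behavior at run time.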
Header <boost/endian/conversion.hpp> Synopsis
#define BOOST_ENDIAN_INTRINSIC_MSG \
"message describing presence or absence of intrinsics"
namespace boost
{
namespace endian
{
enum class order
{
native = see below,
big = see below,
little = see below,
};
// Byte reversal functions
template <class Endian>
Endian endian_reverse(Endian x) noexcept;
template <class EndianReversible>
EndianReversible big_to_native(EndianReversible x) noexcept;
template <class EndianReversible>
EndianReversible native_to_big(EndianReversible x) noexcept;
template <class EndianReversible>
EndianReversible little_to_native(EndianReversible x) noexcept;
template <class EndianReversible>
EndianReversible native_to_little(EndianReversible x) noexcept;
template <order O1, order O2, class EndianReversible>
EndianReversible conditional_reverse(EndianReversible x) noexcept;
template <class EndianReversible>
EndianReversible conditional_reverse(EndianReversible x,
order order1, order order2) noexcept;
// In-place byte reversal functions
template <class EndianReversible>
void endian_reverse_inplace(EndianReversible& x) noexcept;
template<class EndianReversibleInplace, std::size_t N>
void endian_reverse_inplace(EndianReversibleInplace (&x)[N]) noexcept;
template <class EndianReversibleInplace>
void big_to_native_inplace(EndianReversibleInplace& x) noexcept;
template <class EndianReversibleInplace>
void native_to_big_inplace(EndianReversibleInplace& x) noexcept;
template <class EndianReversibleInplace>
void little_to_native_inplace(EndianReversibleInplace& x) noexcept;
template <class EndianReversibleInplace>
void native_to_little_inplace(EndianReversibleInplace& x) noexcept;
template <order O1, order O2, class EndianReversibleInplace>
void conditional_reverse_inplace(EndianReversibleInplace& x) noexcept;
template <class EndianReversibleInplace>
void conditional_reverse_inplace(EndianReversibleInplace& x,
order order1, order order2) noexcept;
// Generic load and store functions
template<class T, std::size_t N, order Order>
T endian_load( unsigned char const * p ) noexcept;
template<class T, std::size_t N, order Order>
void endian_store( unsigned char * p, T const & v ) noexcept;
// Convenience load functions
boost::int16_t load_little_s16( unsigned char const * p ) noexcept;
boost::uint16_t load_little_u16( unsigned char const * p ) noexcept;
boost::int16_t load_big_s16( unsigned char const * p ) noexcept;
boost::uint16_t load_big_u16( unsigned char const * p ) noexcept;
boost::int32_t load_little_s24( unsigned char const * p ) noexcept;
boost::uint32_t load_little_u24( unsigned char const * p ) noexcept;
boost::int32_t load_big_s24( unsigned char const * p ) noexcept;
boost::uint32_t load_big_u24( unsigned char const * p ) noexcept;
boost::int32_t load_little_s32( unsigned char const * p ) noexcept;
boost::uint32_t load_little_u32( unsigned char const * p ) noexcept;
boost::int32_t load_big_s32( unsigned char const * p ) noexcept;
boost::uint32_t load_big_u32( unsigned char const * p ) noexcept;
boost::int64_t load_little_s40( unsigned char const * p ) noexcept;
boost::uint64_t load_little_u40( unsigned char const * p ) noexcept;
boost::int64_t load_big_s40( unsigned char const * p ) noexcept;
boost::uint64_t load_big_u40( unsigned char const * p ) noexcept;
boost::int64_t load_little_s48( unsigned char const * p ) noexcept;
boost::uint64_t load_little_u48( unsigned char const * p ) noexcept;
boost::int64_t load_big_s48( unsigned char const * p ) noexcept;
boost::uint64_t load_big_u48( unsigned char const * p ) noexcept;
boost::int64_t load_little_s56( unsigned char const * p ) noexcept;
boost::uint64_t load_little_u56( unsigned char const * p ) noexcept;
boost::int64_t load_big_s56( unsigned char const * p ) noexcept;
boost::uint64_t load_big_u56( unsigned char const * p ) noexcept;
boost::int64_t load_little_s64( unsigned char const * p ) noexcept;
boost::uint64_t load_little_u64( unsigned char const * p ) noexcept;
boost::int64_t load_big_s64( unsigned char const * p ) noexcept;
boost::uint64_t load_big_u64( unsigned char const * p ) noexcept;
// Convenience store functions
void store_little_s16( unsigned char * p, boost::int16_t v ) noexcept;
void store_little_u16( unsigned char * p, boost::uint16_t v ) noexcept;
void store_big_s16( unsigned char * p, boost::int16_t v ) noexcept;
void store_big_u16( unsigned char * p, boost::uint16_t v ) noexcept;
void store_little_s24( unsigned char * p, boost::int32_t v ) noexcept;
void store_little_u24( unsigned char * p, boost::uint32_t v ) noexcept;
void store_big_s24( unsigned char * p, boost::int32_t v ) noexcept;
void store_big_u24( unsigned char * p, boost::uint32_t v ) noexcept;
void store_little_s32( unsigned char * p, boost::int32_t v ) noexcept;
void store_little_u32( unsigned char * p, boost::uint32_t v ) noexcept;
void store_big_s32( unsigned char * p, boost::int32_t v ) noexcept;
void store_big_u32( unsigned char * p, boost::uint32_t v ) noexcept;
void store_little_s40( unsigned char * p, boost::int64_t v ) noexcept;
void store_little_u40( unsigned char * p, boost::uint64_t v ) noexcept;
void store_big_s40( unsigned char * p, boost::int64_t v ) noexcept;
void store_big_u40( unsigned char * p, boost::uint64_t v ) noexcept;
void store_little_s48( unsigned char * p, boost::int64_t v ) noexcept;
void store_little_u48( unsigned char * p, boost::uint64_t v ) noexcept;
void store_big_s48( unsigned char * p, boost::int64_t v ) noexcept;
void store_big_u48( unsigned char * p, boost::uint64_t v ) noexcept;
void store_little_s56( unsigned char * p, boost::int64_t v ) noexcept;
void store_little_u56( unsigned char * p, boost::uint64_t v ) noexcept;
void store_big_s56( unsigned char * p, boost::int64_t v ) noexcept;
void store_big_u56( unsigned char * p, boost::uint64_t v ) noexcept;
void store_little_s64( unsigned char * p, boost::int64_t v ) noexcept;
void store_little_u64( unsigned char * p, boost::uint64_t v ) noexcept;
void store_big_s64( unsigned char * p, boost::int64_t v ) noexcept;
void store_big_u64( unsigned char * p, boost::uint64_t v ) noexcept;
} // namespace endian
} // namespace boost
The values of order::little and order::big shall not be equal to one another.
The value of order::native shall be:
-
equal to order::big if the execution environment is big endian, otherwise
-
equal to order::little if the execution environment is little endian, otherwise
-
unequal to both order::little and order::big.
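The rule for order::native can be pictured with a small runtime probe. The sketch below is an illustration only: the library determines order::native at compile time, and the names byte_order and probe_native_order are hypothetical, not part of Boost.Endian.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for boost::endian::order; not the library's definition.
enum class byte_order { big, little, other };

// Inspect the object representation of a known 32-bit value.
inline byte_order probe_native_order()
{
    const std::uint32_t value = 0x01020304u;
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);

    if (bytes[0] == 0x01 && bytes[1] == 0x02 && bytes[2] == 0x03 && bytes[3] == 0x04)
        return byte_order::big;      // most significant byte stored first
    if (bytes[0] == 0x04 && bytes[1] == 0x03 && bytes[2] == 0x02 && bytes[3] == 0x01)
        return byte_order::little;   // least significant byte stored first
    return byte_order::other;        // neither pure big nor pure little
}
```

The third enumerator plays the role of the "unequal to both" case in the list above.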
Requirements
Template argument requirements
The template definitions in the boost/endian/conversion.hpp header refer to
various named requirements whose details are set out in the tables in this
subsection. In these tables, T is an object or reference type to be supplied
by a C++ program instantiating a template; x is a value of type (possibly
const) T; mlx is a modifiable lvalue of type T.
EndianReversible requirements (in addition to CopyConstructible)
Expression | Return | Requirements |
---|---|---|
|
|
If If
|
EndianReversibleInplace requirements
Expression | Requirements |
---|---|
|
If If
If |
Note
|
Because there is a function template for endian_reverse_inplace that
calls endian_reverse for class types, only endian_reverse is required for a
user-defined type to meet the EndianReversibleInplace requirements. Although
user-defined types are not required to supply an endian_reverse_inplace function,
doing so may improve efficiency.
|
Customization points for user-defined types (UDTs)
This subsection describes requirements on the Endian library’s implementation.
The library’s function templates requiring EndianReversible are required to perform
reversal of endianness if needed by making an unqualified call to endian_reverse().
The library’s function templates requiring EndianReversibleInplace are required to
perform reversal of endianness if needed by making an unqualified call to
endian_reverse_inplace().
See example/udt_conversion_example.cpp for an example user-defined type.
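The effect of the unqualified call can be shown with a standalone sketch. It does not reproduce example/udt_conversion_example.cpp and does not use the library; the namespaces lib and user, the type point, and the function reverse_via_customization_point are all hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

namespace lib {   // hypothetical stand-in for a library namespace

// Built-in overload: swap the two bytes of a 16-bit value.
inline std::uint16_t endian_reverse(std::uint16_t x) noexcept
{
    return static_cast<std::uint16_t>((x << 8) | (x >> 8));
}

// Generic algorithm: the unqualified call lets argument-dependent lookup
// find an endian_reverse overload declared in the namespace of T.
template <class T>
T reverse_via_customization_point(T x) noexcept
{
    return endian_reverse(x);
}

} // namespace lib

namespace user {  // hypothetical user code

struct point
{
    std::uint16_t x;
    std::uint16_t y;
};

// User-supplied overload, declared in the same namespace as the type.
inline point endian_reverse(point p) noexcept
{
    p.x = lib::endian_reverse(p.x);
    p.y = lib::endian_reverse(p.y);
    return p;
}

} // namespace user
```

Calling lib::reverse_via_customization_point with a user::point selects user::endian_reverse even though the generic function never names the user namespace.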
Byte Reversal Functions
template <class Endian>
Endian endian_reverse(Endian x) noexcept;
-
- Requires
-
Endian must be a standard integral type that is not bool, or a scoped enumeration type.
- Returns
-
x, with the order of its constituent bytes reversed.
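What "order of its constituent bytes reversed" means can be sketched in portable C++. The helper reverse_bytes below is a hypothetical illustration restricted to unsigned integer types; it is not the library's implementation, which may use compiler intrinsics.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: reverse the bytes of an unsigned integer by moving
// the low byte of x onto the low end of the result, one byte per iteration.
template <class U>
U reverse_bytes(U x) noexcept
{
    static_assert(std::is_unsigned<U>::value && !std::is_same<U, bool>::value,
                  "this sketch handles unsigned integer types other than bool");
    U result = 0;
    for (std::size_t i = 0; i < sizeof(U); ++i)
    {
        result = static_cast<U>((result << 8) | (x & 0xFFu));
        x = static_cast<U>(x >> 8);
    }
    return result;
}
```

For example, reverse_bytes applied to the 32-bit value 0x01020304 yields 0x04030201, and a one-byte value is returned unchanged.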
template <class EndianReversible>
EndianReversible big_to_native(EndianReversible x) noexcept;
-
- Returns
-
conditional_reverse<order::big, order::native>(x).
template <class EndianReversible>
EndianReversible native_to_big(EndianReversible x) noexcept;
-
- Returns
-
conditional_reverse<order::native, order::big>(x).
template <class EndianReversible>
EndianReversible little_to_native(EndianReversible x) noexcept;
-
- Returns
-
conditional_reverse<order::little, order::native>(x).
template <class EndianReversible>
EndianReversible native_to_little(EndianReversible x) noexcept;
-
- Returns
-
conditional_reverse<order::native, order::little>(x).
template <order O1, order O2, class EndianReversible>
EndianReversible conditional_reverse(EndianReversible x) noexcept;
-
- Returns
-
x if O1 == O2, otherwise endian_reverse(x).
- Remarks
-
Whether x or endian_reverse(x) is to be returned shall be determined at compile time.
template <class EndianReversible>
EndianReversible conditional_reverse(EndianReversible x,
order order1, order order2) noexcept;
-
- Returns
-
order1 == order2 ? x : endian_reverse(x).
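The two conditional_reverse forms differ only in when the orders are known. The sketch below mirrors both for a 32-bit value without using the library; the enum order here has only two enumerators, and reverse32 and conditional_reverse32 are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in; the names mirror the text but this is not library code.
enum class order { big, little };

inline std::uint32_t reverse32(std::uint32_t x) noexcept
{
    return (x << 24) | ((x & 0x0000FF00u) << 8) | ((x >> 8) & 0x0000FF00u) | (x >> 24);
}

// Compile-time form: O1 and O2 are template arguments, so the comparison
// is a constant expression and the unused branch can be discarded.
template <order O1, order O2>
std::uint32_t conditional_reverse32(std::uint32_t x) noexcept
{
    return O1 == O2 ? x : reverse32(x);
}

// Runtime form: the orders are ordinary function arguments.
inline std::uint32_t conditional_reverse32(std::uint32_t x, order o1, order o2) noexcept
{
    return o1 == o2 ? x : reverse32(x);
}
```

The runtime form suits code that learns the byte order of its input from a file header or a protocol field rather than from the source code.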
In-place Byte Reversal Functions
template <class EndianReversible>
void endian_reverse_inplace(EndianReversible& x) noexcept;
-
- Effects
-
When EndianReversible is a class type, x = endian_reverse(x);. When EndianReversible is an integral type, an enumeration type, float, or double, reverses the order of the constituent bytes of x. Otherwise, the program is ill-formed.
template<class EndianReversibleInplace, std::size_t N>
void endian_reverse_inplace(EndianReversibleInplace (&x)[N]) noexcept;
-
- Effects
-
Calls endian_reverse_inplace(x[i]) for i from 0 to N-1.
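The scalar and array forms of in-place reversal can be sketched together. This is a standalone illustration for trivially copyable scalar types, assuming byte-wise reversal of the object representation is what is wanted; reverse_bytes_inplace is a hypothetical name and not the library's function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: reverse the object representation of one value in place.
template <class T>
void reverse_bytes_inplace(T& x) noexcept
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch copies raw bytes, so T must be trivially copyable");
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &x, sizeof(T));
    std::reverse(bytes, bytes + sizeof(T));
    std::memcpy(&x, bytes, sizeof(T));
}

// Array overload: apply the single-element version to x[0] .. x[N-1].
// Overload resolution prefers this more specialized template for arrays.
template <class T, std::size_t N>
void reverse_bytes_inplace(T (&x)[N]) noexcept
{
    for (std::size_t i = 0; i < N; ++i)
        reverse_bytes_inplace(x[i]);
}
```

Passing an array reverses each element separately; the order of the elements within the array is left unchanged.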
template <class EndianReversibleInplace>
void big_to_native_inplace(EndianReversibleInplace& x) noexcept;
-
- Effects
-
conditional_reverse_inplace<order::big, order::native>(x).
template <class EndianReversibleInplace>
void native_to_big_inplace(EndianReversibleInplace& x) noexcept;
-
- Effects
-
conditional_reverse_inplace<order::native, order::big>(x).
template <class EndianReversibleInplace>
void little_to_native_inplace(EndianReversibleInplace& x) noexcept;
-
- Effects
-
conditional_reverse_inplace<order::little, order::native>(x).
template <class EndianReversibleInplace>
void native_to_little_inplace(EndianReversibleInplace& x) noexcept;
-
- Effects
-
conditional_reverse_inplace<order::native, order::little>(x).
template <order O1, order O2, class EndianReversibleInplace>
void conditional_reverse_inplace(EndianReversibleInplace& x) noexcept;
-
- Effects
-
None if O1 == O2, otherwise endian_reverse_inplace(x).
- Remarks
-
Which effect applies shall be determined at compile time.
template <class EndianReversibleInplace>
void conditional_reverse_inplace(EndianReversibleInplace& x,
order order1, order order2) noexcept;
-
- Effects
-
If order1 != order2 then endian_reverse_inplace(x).
Generic Load and Store Functions
template<class T, std::size_t N, order Order>
T endian_load( unsigned char const * p ) noexcept;
-
- Requires
-
sizeof(T) must be 1, 2, 4, or 8. N must be between 1 and sizeof(T), inclusive. T must be trivially copyable. If N is not equal to sizeof(T), T must be integral or enum.
- Effects
-
Reads N bytes starting from p, in forward or reverse order depending on whether Order matches the native endianness or not, interprets the resulting bit pattern as a value of type T, and returns it. If sizeof(T) is bigger than N, zero-extends when T is unsigned, sign-extends otherwise.
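The zero-extension and sign-extension rules are easiest to see on a fixed 64-bit result. The sketch below is a standalone illustration, not the library's endian_load: it assembles the value with shifts instead of consulting the native byte order, and load_unsigned and load_signed are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: read N bytes (1..8) from p as an unsigned value.
// The upper bytes of the result stay zero, which is the zero-extension case.
template <std::size_t N>
std::uint64_t load_unsigned(const unsigned char* p, bool big_endian) noexcept
{
    static_assert(N >= 1 && N <= 8, "N must be between 1 and 8");
    std::uint64_t result = 0;
    for (std::size_t i = 0; i < N; ++i)
    {
        // Visit the bytes from most significant to least significant.
        const std::size_t index = big_endian ? i : N - 1 - i;
        result = (result << 8) | p[index];
    }
    return result;
}

// Signed variant: copy the sign bit of the N-byte value into every higher bit.
template <std::size_t N>
std::int64_t load_signed(const unsigned char* p, bool big_endian) noexcept
{
    std::uint64_t raw = load_unsigned<N>(p, big_endian);
    const std::uint64_t sign_bit = std::uint64_t(1) << (8 * N - 1);
    if (raw & sign_bit)
        raw |= ~((sign_bit << 1) - 1);   // set every bit above the sign bit
    // Two's complement conversion; guaranteed modular since C++20.
    return static_cast<std::int64_t>(raw);
}
```

With the three big-endian bytes FF FF FE, load_unsigned gives 0xFFFFFE while load_signed gives -2.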
template<class T, std::size_t N, order Order>
void endian_store( unsigned char * p, T const & v ) noexcept;
-
- Requires
-
sizeof(T) must be 1, 2, 4, or 8. N must be between 1 and sizeof(T), inclusive. T must be trivially copyable. If N is not equal to sizeof(T), T must be integral or enum.
- Effects
-
Writes to p the N least significant bytes from the object representation of v, in forward or reverse order depending on whether Order matches the native endianness or not.
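The phrase "the N least significant bytes" can be illustrated with a store of three bytes from a wider value. As with the load sketch, this is a standalone illustration using shifts rather than the library's endian_store, and store_low_bytes is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: write the N least significant bytes of v to p,
// most significant byte first when big_endian is true.
template <std::size_t N>
void store_low_bytes(unsigned char* p, std::uint64_t v, bool big_endian) noexcept
{
    static_assert(N >= 1 && N <= 8, "N must be between 1 and 8");
    for (std::size_t i = 0; i < N; ++i)
    {
        // Byte i counts from the least significant end of v.
        const unsigned char byte = static_cast<unsigned char>((v >> (8 * i)) & 0xFFu);
        p[big_endian ? N - 1 - i : i] = byte;
    }
}
```

Storing 0x11223344 with N equal to 3 writes 22 33 44 in big-endian order and 44 33 22 in little-endian order; the top byte 0x11 is dropped in both cases.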
Convenience Load Functions
inline boost::intM_t load_little_sN( unsigned char const * p ) noexcept;
-
Reads an N-bit signed little-endian integer from p.
- Returns
-
endian_load<boost::intM_t, N/8, order::little>( p ).
inline boost::uintM_t load_little_uN( unsigned char const * p ) noexcept;
-
Reads an N-bit unsigned little-endian integer from p.
- Returns
-
endian_load<boost::uintM_t, N/8, order::little>( p ).
inline boost::intM_t load_big_sN( unsigned char const * p ) noexcept;
-
Reads an N-bit signed big-endian integer from p.
- Returns
-
endian_load<boost::intM_t, N/8, order::big>( p ).
inline boost::uintM_t load_big_uN( unsigned char const * p ) noexcept;
-
Reads an N-bit unsigned big-endian integer from p.
- Returns
-
endian_load<boost::uintM_t, N/8, order::big>( p ).
Convenience Store Functions
inline void store_little_sN( unsigned char * p, boost::intM_t v ) noexcept;
-
Writes an N-bit signed little-endian integer to p.
- Effects
-
endian_store<boost::intM_t, N/8, order::little>( p, v ).
inline void store_little_uN( unsigned char * p, boost::uintM_t v ) noexcept;
-
Writes an N-bit unsigned little-endian integer to p.
- Effects
-
endian_store<boost::uintM_t, N/8, order::little>( p, v ).
inline void store_big_sN( unsigned char * p, boost::intM_t v ) noexcept;
-
Writes an N-bit signed big-endian integer to p.
- Effects
-
endian_store<boost::intM_t, N/8, order::big>( p, v ).
inline void store_big_uN( unsigned char * p, boost::uintM_t v ) noexcept;
-
Writes an N-bit unsigned big-endian integer to p.
- Effects
-
endian_store<boost::uintM_t, N/8, order::big>( p, v ).
FAQ
See the Overview FAQ for a library-wide FAQ.
- Why are both value returning and modify-in-place functions provided?
-
Returning the result by value is the standard C and C++ idiom for functions that compute a value from an argument. Modify-in-place functions allow cleaner code in many real-world endian use cases and are more efficient for user-defined types that have members such as string data that do not need to be reversed. Thus both forms are provided.
- Why not use the Linux names (htobe16, htole16, be16toh, le16toh, etc.)?
-
Those names are non-standard and vary even between POSIX-like operating systems. A C++ library TS was going to use those names, but found they were sometimes implemented as macros. Since macros do not respect scoping and namespace rules, to use them would be very error prone.
Acknowledgements
Tomas Puverle was instrumental in identifying and articulating the need to
support endian conversion as separate from endian integer types. Phil Endecott
suggested the form of the value returning signatures. Vicente Botet and other
reviewers suggested supporting user defined types. General reverse template
implementation approach using std::reverse suggested by Mathias Gaunard.
Portable implementation approach for 16, 32, and 64-bit integers suggested by
tymofey, with avoidance of undefined behavior as suggested by Giovanni Piero
Deretta, and a further refinement suggested by Pyry Jahkola. Intrinsic builtins
implementation approach for 16, 32, and 64-bit integers suggested by several
reviewers, and by David Stone, who provided his Boost licensed macro
implementation that became the starting point for
boost/endian/detail/intrinsic.hpp. Pierre Talbot provided the
int8_t endian_reverse() and templated endian_reverse_inplace() implementations.
Endian Buffer Types
Introduction
The internal byte order of arithmetic types is traditionally called endianness. See Wikipedia for a full exploration of endianness, including definitions of big endian and little endian.
Header boost/endian/buffers.hpp provides endian_buffer, a portable endian
integer binary buffer class template with control over byte order, value type,
size, and alignment independent of the platform’s native endianness. Typedefs
provide easy-to-use names for common configurations.
Use cases primarily involve data portability, either via files or network connections, but these byte-holders may also be used to reduce memory use, file size, or network activity since they provide binary numeric sizes not otherwise available.
Class endian_buffer is aimed at users who wish explicit control over when
endianness conversions occur. It also serves as the base class for the
endian_arithmetic class template, which is aimed at users who
wish fully automatic endianness conversion and direct support for all normal
arithmetic operations.
Example
The example/endian_example.cpp program writes a binary file containing
four-byte, big-endian and little-endian integers:
#include <iostream>
#include <cstdio>
#include <boost/endian/buffers.hpp> // see Synopsis below
#include <boost/static_assert.hpp>

using namespace boost::endian;

namespace
{
  // This is an extract from a very widely used GIS file format.
  // Why the designer decided to mix big and little endians in
  // the same file is not known. But this is a real-world format
  // and users wishing to write low level code manipulating these
  // files have to deal with the mixed endianness.
  struct header
  {
    big_int32_buf_t file_code;
    big_int32_buf_t file_length;
    little_int32_buf_t version;
    little_int32_buf_t shape_type;
  };

  const char* filename = "test.dat";
}

int main(int, char* [])
{
  header h;

  BOOST_STATIC_ASSERT(sizeof(h) == 16U); // reality check

  h.file_code = 0x01020304;
  h.file_length = sizeof(header);
  h.version = 1;
  h.shape_type = 0x01020304;

  // Low-level I/O such as POSIX read/write or <cstdio>
  // fread/fwrite is sometimes used for binary file operations
  // when ultimate efficiency is important. Such I/O is often
  // performed in some C++ wrapper class, but to drive home the
  // point that endian integers are often used in fairly
  // low-level code that does bulk I/O operations, <cstdio>
  // fopen/fwrite is used for I/O in this example.
  std::FILE* fi = std::fopen(filename, "wb"); // MUST BE BINARY

  if (!fi)
  {
    std::cout << "could not open " << filename << '\n';
    return 1;
  }

  if (std::fwrite(&h, sizeof(header), 1, fi) != 1)
  {
    std::cout << "write failure for " << filename << '\n';
    return 1;
  }

  std::fclose(fi);

  std::cout << "created file " << filename << '\n';

  return 0;
}
After compiling and executing example/endian_example.cpp, a hex dump of
test.dat shows:
01020304 00000010 01000000 04030201
Notice that the first two 32-bit integers are big endian while the second two are little endian, even though the machine this was compiled and run on was little endian.
Limitations
Requires <climits>, CHAR_BIT == 8. If CHAR_BIT is some other value,
compilation will result in an #error. This restriction is in place because the
design, implementation, testing, and documentation have only considered issues
related to 8-bit bytes, and there have been no real-world use cases presented
for other sizes.
In C++03, endian_buffer does not meet the requirements for POD types because
it has constructors and a private data member. This means that
common use cases are relying on unspecified behavior in that the C++ Standard
does not guarantee memory layout for non-POD types. This has not been a problem
in practice since all known C++ compilers lay out memory as if endian_buffer were
a POD type. In C++11, it is possible to specify the default constructor as
trivial, and private data members and base classes no longer disqualify a type
from being a POD type. Thus under C++11, endian_buffer will no longer be
relying on unspecified behavior.
Feature set
-
Big endian | little endian | native endian byte ordering
-
Signed | unsigned
-
Unaligned | aligned
-
1-8 byte (unaligned) | 1, 2, 4, 8 byte (aligned)
-
Choice of value type
Enums and typedefs
Two scoped enums are provided:
enum class order { big, little, native };
enum class align { no, yes };
One class template is provided:
template <order Order, typename T, std::size_t Nbits,
align Align = align::no>
class endian_buffer;
Typedefs, such as big_int32_buf_t, provide convenient naming conventions for
common use cases:
Name | Alignment | Endianness | Sign | Sizes in bits (N) |
---|---|---|---|---|
big_intN_buf_t | no | big | signed | 8,16,24,32,40,48,56,64 |
big_uintN_buf_t | no | big | unsigned | 8,16,24,32,40,48,56,64 |
little_intN_buf_t | no | little | signed | 8,16,24,32,40,48,56,64 |
little_uintN_buf_t | no | little | unsigned | 8,16,24,32,40,48,56,64 |
native_intN_buf_t | no | native | signed | 8,16,24,32,40,48,56,64 |
native_uintN_buf_t | no | native | unsigned | 8,16,24,32,40,48,56,64 |
big_intN_buf_at | yes | big | signed | 8,16,32,64 |
big_uintN_buf_at | yes | big | unsigned | 8,16,32,64 |
little_intN_buf_at | yes | little | signed | 8,16,32,64 |
little_uintN_buf_at | yes | little | unsigned | 8,16,32,64 |
The unaligned types do not cause compilers to insert padding bytes in classes and structs. This is an important characteristic that can be exploited to minimize wasted space in memory, files, and network transmissions.
Caution
|
Code that uses aligned types is possibly non-portable because alignment requirements vary between hardware architectures and because alignment may be affected by compiler switches or pragmas. For example, alignment of a 64-bit integer may be to a 32-bit boundary on a 32-bit machine and to a 64-bit boundary on a 64-bit machine. Furthermore, aligned types are only available on architectures with 8, 16, 32, and 64-bit integer types. |
Tip
|
Prefer unaligned buffer types. |
Tip
|
Protect yourself against alignment ills. For example: |
-
static_assert(sizeof(containing_struct) == 12, "sizeof(containing_struct) is wrong");
Note: One-byte big and little buffer types have identical layout on all platforms, so they never actually reverse endianness. They are provided to enable generic code, and to improve code readability and searchability.
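The padding point and the static_assert tip can be tried without the library. The sketch below assumes a mainstream compiler that adds no padding to a struct made only of unsigned char arrays; packed_record and native_record are hypothetical types for illustration.

```cpp
#include <cstdint>

// Hypothetical record built from plain byte arrays. Each member has an
// alignment requirement of 1, so no padding is needed between or after them.
struct packed_record
{
    unsigned char id[3];      // 24-bit field
    unsigned char flags[1];   // 8-bit field
    unsigned char length[4];  // 32-bit field
};

// Similar fields expressed with built-in integers. The compiler is free to
// insert padding after flags, so the size is not fixed by the field sizes.
struct native_record
{
    std::uint8_t  flags;
    std::uint32_t length;
    std::uint32_t id;
};

// Guard the layout assumptions at compile time, as the tip above recommends.
static_assert(sizeof(packed_record) == 8, "sizeof(packed_record) is wrong");
static_assert(sizeof(native_record) >= 9, "native_record holds at least 9 bytes of data");
```

On common 32-bit and 64-bit platforms native_record occupies 12 bytes, four more than packed_record, although the exact figure depends on the platform's alignment rules.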
Class template endian_buffer
An endian_buffer is a byte-holder for arithmetic types with
user-specified endianness, value type, size, and alignment.
Synopsis
namespace boost
{
namespace endian
{
// C++11 features emulated if not available
enum class align { no, yes };
template <order Order, class T, std::size_t Nbits,
align Align = align::no>
class endian_buffer
{
public:
typedef T value_type;
endian_buffer() noexcept = default;
explicit endian_buffer(T v) noexcept;
endian_buffer& operator=(T v) noexcept;
value_type value() const noexcept;
unsigned char* data() noexcept;
unsigned char const* data() const noexcept;
private:
unsigned char value_[Nbits / CHAR_BIT]; // exposition only
};
// stream inserter
template <class charT, class traits, order Order, class T,
std::size_t n_bits, align Align>
std::basic_ostream<charT, traits>&
operator<<(std::basic_ostream<charT, traits>& os,
const endian_buffer<Order, T, n_bits, Align>& x);
// stream extractor
template <class charT, class traits, order Order, class T,
std::size_t n_bits, align Align>
std::basic_istream<charT, traits>&
operator>>(std::basic_istream<charT, traits>& is,
endian_buffer<Order, T, n_bits, Align>& x);
// typedefs
// unaligned big endian signed integer buffers
typedef endian_buffer<order::big, int_least8_t, 8> big_int8_buf_t;
typedef endian_buffer<order::big, int_least16_t, 16> big_int16_buf_t;
typedef endian_buffer<order::big, int_least32_t, 24> big_int24_buf_t;
typedef endian_buffer<order::big, int_least32_t, 32> big_int32_buf_t;
typedef endian_buffer<order::big, int_least64_t, 40> big_int40_buf_t;
typedef endian_buffer<order::big, int_least64_t, 48> big_int48_buf_t;
typedef endian_buffer<order::big, int_least64_t, 56> big_int56_buf_t;
typedef endian_buffer<order::big, int_least64_t, 64> big_int64_buf_t;
// unaligned big endian unsigned integer buffers
typedef endian_buffer<order::big, uint_least8_t, 8> big_uint8_buf_t;
typedef endian_buffer<order::big, uint_least16_t, 16> big_uint16_buf_t;
typedef endian_buffer<order::big, uint_least32_t, 24> big_uint24_buf_t;
typedef endian_buffer<order::big, uint_least32_t, 32> big_uint32_buf_t;
typedef endian_buffer<order::big, uint_least64_t, 40> big_uint40_buf_t;
typedef endian_buffer<order::big, uint_least64_t, 48> big_uint48_buf_t;
typedef endian_buffer<order::big, uint_least64_t, 56> big_uint56_buf_t;
typedef endian_buffer<order::big, uint_least64_t, 64> big_uint64_buf_t;
// unaligned big endian floating point buffers
typedef endian_buffer<order::big, float, 32> big_float32_buf_t;
typedef endian_buffer<order::big, double, 64> big_float64_buf_t;
// unaligned little endian signed integer buffers
typedef endian_buffer<order::little, int_least8_t, 8> little_int8_buf_t;
typedef endian_buffer<order::little, int_least16_t, 16> little_int16_buf_t;
typedef endian_buffer<order::little, int_least32_t, 24> little_int24_buf_t;
typedef endian_buffer<order::little, int_least32_t, 32> little_int32_buf_t;
typedef endian_buffer<order::little, int_least64_t, 40> little_int40_buf_t;
typedef endian_buffer<order::little, int_least64_t, 48> little_int48_buf_t;
typedef endian_buffer<order::little, int_least64_t, 56> little_int56_buf_t;
typedef endian_buffer<order::little, int_least64_t, 64> little_int64_buf_t;
// unaligned little endian unsigned integer buffers
typedef endian_buffer<order::little, uint_least8_t, 8> little_uint8_buf_t;
typedef endian_buffer<order::little, uint_least16_t, 16> little_uint16_buf_t;
typedef endian_buffer<order::little, uint_least32_t, 24> little_uint24_buf_t;
typedef endian_buffer<order::little, uint_least32_t, 32> little_uint32_buf_t;
typedef endian_buffer<order::little, uint_least64_t, 40> little_uint40_buf_t;
typedef endian_buffer<order::little, uint_least64_t, 48> little_uint48_buf_t;
typedef endian_buffer<order::little, uint_least64_t, 56> little_uint56_buf_t;
typedef endian_buffer<order::little, uint_least64_t, 64> little_uint64_buf_t;
// unaligned little endian floating point buffers
typedef endian_buffer<order::little, float, 32> little_float32_buf_t;
typedef endian_buffer<order::little, double, 64> little_float64_buf_t;
// unaligned native endian signed integer types
typedef endian_buffer<order::native, int_least8_t, 8> native_int8_buf_t;
typedef endian_buffer<order::native, int_least16_t, 16> native_int16_buf_t;
typedef endian_buffer<order::native, int_least32_t, 24> native_int24_buf_t;
typedef endian_buffer<order::native, int_least32_t, 32> native_int32_buf_t;
typedef endian_buffer<order::native, int_least64_t, 40> native_int40_buf_t;
typedef endian_buffer<order::native, int_least64_t, 48> native_int48_buf_t;
typedef endian_buffer<order::native, int_least64_t, 56> native_int56_buf_t;
typedef endian_buffer<order::native, int_least64_t, 64> native_int64_buf_t;
// unaligned native endian unsigned integer types
typedef endian_buffer<order::native, uint_least8_t, 8> native_uint8_buf_t;
typedef endian_buffer<order::native, uint_least16_t, 16> native_uint16_buf_t;
typedef endian_buffer<order::native, uint_least32_t, 24> native_uint24_buf_t;
typedef endian_buffer<order::native, uint_least32_t, 32> native_uint32_buf_t;
typedef endian_buffer<order::native, uint_least64_t, 40> native_uint40_buf_t;
typedef endian_buffer<order::native, uint_least64_t, 48> native_uint48_buf_t;
typedef endian_buffer<order::native, uint_least64_t, 56> native_uint56_buf_t;
typedef endian_buffer<order::native, uint_least64_t, 64> native_uint64_buf_t;
// unaligned native endian floating point types
typedef endian_buffer<order::native, float, 32> native_float32_buf_t;
typedef endian_buffer<order::native, double, 64> native_float64_buf_t;
// aligned big endian signed integer buffers
typedef endian_buffer<order::big, int8_t, 8, align::yes> big_int8_buf_at;
typedef endian_buffer<order::big, int16_t, 16, align::yes> big_int16_buf_at;
typedef endian_buffer<order::big, int32_t, 32, align::yes> big_int32_buf_at;
typedef endian_buffer<order::big, int64_t, 64, align::yes> big_int64_buf_at;
// aligned big endian unsigned integer buffers
typedef endian_buffer<order::big, uint8_t, 8, align::yes> big_uint8_buf_at;
typedef endian_buffer<order::big, uint16_t, 16, align::yes> big_uint16_buf_at;
typedef endian_buffer<order::big, uint32_t, 32, align::yes> big_uint32_buf_at;
typedef endian_buffer<order::big, uint64_t, 64, align::yes> big_uint64_buf_at;
// aligned big endian floating point buffers
typedef endian_buffer<order::big, float, 32, align::yes> big_float32_buf_at;
typedef endian_buffer<order::big, double, 64, align::yes> big_float64_buf_at;
// aligned little endian signed integer buffers
typedef endian_buffer<order::little, int8_t, 8, align::yes> little_int8_buf_at;
typedef endian_buffer<order::little, int16_t, 16, align::yes> little_int16_buf_at;
typedef endian_buffer<order::little, int32_t, 32, align::yes> little_int32_buf_at;
typedef endian_buffer<order::little, int64_t, 64, align::yes> little_int64_buf_at;
// aligned little endian unsigned integer buffers
typedef endian_buffer<order::little, uint8_t, 8, align::yes> little_uint8_buf_at;
typedef endian_buffer<order::little, uint16_t, 16, align::yes> little_uint16_buf_at;
typedef endian_buffer<order::little, uint32_t, 32, align::yes> little_uint32_buf_at;
typedef endian_buffer<order::little, uint64_t, 64, align::yes> little_uint64_buf_at;
// aligned little endian floating point buffers
typedef endian_buffer<order::little, float, 32, align::yes> little_float32_buf_at;
typedef endian_buffer<order::little, double, 64, align::yes> little_float64_buf_at;
// aligned native endian typedefs are not provided because
// <cstdint> types are superior for this use case
} // namespace endian
} // namespace boost
The expository data member value_ stores the current value of the
endian_buffer object as a sequence of bytes ordered as specified by the
Order template parameter. The CHAR_BIT macro is defined in <climits>.
The only supported value of CHAR_BIT is 8.
The valid values of Nbits are as follows:
-
When sizeof(T) is 1, Nbits shall be 8;
-
When sizeof(T) is 2, Nbits shall be 16;
-
When sizeof(T) is 4, Nbits shall be 24 or 32;
-
When sizeof(T) is 8, Nbits shall be 40, 48, 56, or 64.
Other values of sizeof(T) are not supported.
When Nbits is equal to sizeof(T)*8, T must be a trivially copyable type
(such as float) that is assumed to have the same endianness as uintNbits_t.
When Nbits is less than sizeof(T)*8, T must be either a standard integral
type (C++std, [basic.fundamental]) or an enum.
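The size rules for Nbits can be restated as a compile-time predicate. The function is_valid_nbits below is a hypothetical helper written for this illustration; the library performs its own checks and does not expose a function of this name.

```cpp
#include <cstddef>

// Hypothetical compile-time check restating the valid (sizeof(T), Nbits) pairs.
constexpr bool is_valid_nbits(std::size_t size_of_t, std::size_t nbits)
{
    return size_of_t == 1 ? nbits == 8
         : size_of_t == 2 ? nbits == 16
         : size_of_t == 4 ? (nbits == 24 || nbits == 32)
         : size_of_t == 8 ? (nbits == 40 || nbits == 48 || nbits == 56 || nbits == 64)
         : false;   // other values of sizeof(T) are not supported
}

static_assert(is_valid_nbits(4, 24), "a 4-byte value type may hold 24 bits");
static_assert(!is_valid_nbits(2, 24), "a 2-byte value type cannot hold 24 bits");
```

A predicate of this shape could back a static_assert in user code that instantiates buffers generically.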
Members
endian_buffer() noexcept = default;
-
- Effects
-
Constructs an uninitialized object.
explicit endian_buffer(T v) noexcept;
-
- Effects
-
endian_store<T, Nbits/8, Order>( value_, v ).
endian_buffer& operator=(T v) noexcept;
-
- Effects
-
endian_store<T, Nbits/8, Order>( value_, v ).
- Returns
-
*this.
value_type value() const noexcept;
-
- Returns
-
endian_load<T, Nbits/8, Order>( value_ ).
unsigned char* data() noexcept;
unsigned char const* data() const noexcept;
-
- Returns
-
A pointer to the first byte of value_.
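The member semantics can be sketched with a miniature, fixed-configuration byte-holder. The class mini_big_uint32_buf is hypothetical and is not the library's endian_buffer; it hard-codes big-endian order and a 32-bit unsigned value type to keep the illustration short.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of a big-endian 32-bit buffer, for illustration only.
class mini_big_uint32_buf
{
public:
    mini_big_uint32_buf() noexcept = default;   // leaves the bytes uninitialized
    explicit mini_big_uint32_buf(std::uint32_t v) noexcept { *this = v; }

    mini_big_uint32_buf& operator=(std::uint32_t v) noexcept
    {
        // Store the most significant byte first, whatever the host order is.
        value_[0] = static_cast<unsigned char>(v >> 24);
        value_[1] = static_cast<unsigned char>(v >> 16);
        value_[2] = static_cast<unsigned char>(v >> 8);
        value_[3] = static_cast<unsigned char>(v);
        return *this;
    }

    std::uint32_t value() const noexcept
    {
        return (std::uint32_t(value_[0]) << 24) | (std::uint32_t(value_[1]) << 16)
             | (std::uint32_t(value_[2]) << 8)  |  std::uint32_t(value_[3]);
    }

    unsigned char*       data() noexcept       { return value_; }
    unsigned char const* data() const noexcept { return value_; }

private:
    unsigned char value_[4];
};
```

Because the stored bytes are fixed by the assignment operator rather than by the host, writing data() to a file produces the same four bytes on big-endian and little-endian machines.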
Non-member functions
template <class charT, class traits, order Order, class T,
std::size_t n_bits, align Align>
std::basic_ostream<charT, traits>& operator<<(std::basic_ostream<charT, traits>& os,
const endian_buffer<Order, T, n_bits, Align>& x);
-
- Returns
-
os << x.value().
template <class charT, class traits, order Order, class T,
std::size_t n_bits, align Align>
std::basic_istream<charT, traits>& operator>>(std::basic_istream<charT, traits>& is,
endian_buffer<Order, T, n_bits, Align>& x);
-
- Effects
-
As if:
T i; if (is >> i) x = i;
- Returns
-
is.
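The "as if" wording for the extractor means a failed read leaves the buffer untouched. The sketch below shows that behavior with a hypothetical int_holder type standing in for an endian_buffer whose value type is int; it does not use the library.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical value holder standing in for endian_buffer<..., int, ...>.
struct int_holder
{
    int stored = 0;
    int_holder& operator=(int v) noexcept { stored = v; return *this; }
    int value() const noexcept { return stored; }
};

// Inserter: formats the held value, as in "os << x.value()".
template <class charT, class traits>
std::basic_ostream<charT, traits>&
operator<<(std::basic_ostream<charT, traits>& os, const int_holder& x)
{
    return os << x.value();
}

// Extractor: assigns only when the read succeeds, as in
// "T i; if (is >> i) x = i;".
template <class charT, class traits>
std::basic_istream<charT, traits>&
operator>>(std::basic_istream<charT, traits>& is, int_holder& x)
{
    int i;
    if (is >> i)
        x = i;
    return is;
}
```

Extracting from the text "42" sets the held value to 42, while extracting from "abc" sets the stream's fail state and keeps the previous value.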
FAQ
See the Overview FAQ for a library-wide FAQ.
- Why not just use Boost.Serialization?
-
Serialization involves a conversion for every object involved in I/O. Endian integers require no conversion or copying. They are already in the desired format for binary I/O. Thus they can be read or written in bulk.
- Are endian types PODs?
-
Yes for C++11. No for C++03, although several macros are available to force PODness in all cases.
- What are the implications of endian integer types not being PODs with C++03 compilers?
-
They can’t be used in unions. Also, compilers aren’t required to align or lay out storage in portable ways, although this potential problem hasn’t prevented use of Boost.Endian with real compilers.
- What good is native endianness?
-
It provides alignment and size guarantees not available from the built-in types. It eases generic programming.
- Why bother with the aligned endian types?
-
Aligned integer operations may be faster (as much as 10 to 20 times faster) if the endianness and alignment of the type matches the endianness and alignment requirements of the machine. The code, however, is likely to be somewhat less portable than with the unaligned types.
Design considerations for Boost.Endian buffers
-
Must be suitable for I/O - in other words, must be memcpyable.
-
Must provide exactly the size and internal byte ordering specified.
-
Must work correctly when the internal integer representation has more bits than the sum of the bits in the external byte representation. Sign extension must work correctly when the internal integer representation type has more bits than the sum of the bits in the external bytes. For example, using a 64-bit integer internally to represent 40-bit (5 byte) numbers must work for both positive and negative values.
-
Must work correctly (including using the same defined external representation) regardless of whether a compiler treats char as signed or unsigned.
-
Unaligned types must not cause compilers to insert padding bytes.
-
The implementation should supply optimizations with great care. Experience has shown that optimizations of endian integers often become pessimizations when changing machines or compilers. Pessimizations can also happen when changing compiler switches, compiler versions, or CPU models of the same architecture.
C++11
The availability of the C++11 Defaulted Functions feature is detected
automatically, and will be used if present to ensure that objects of
class endian_buffer are trivial, and thus PODs.
Compilation
Boost.Endian is implemented entirely within headers, with no need to link to any Boost object libraries.
Several macros allow user control over features:
-
BOOST_ENDIAN_NO_CTORS causes class endian_buffer to have no constructors. The intended use is for compiling user code that must be portable between compilers regardless of C++11 Defaulted Functions support. When this macro is defined, any use of the constructors fails to compile.
-
BOOST_ENDIAN_FORCE_PODNESS causes BOOST_ENDIAN_NO_CTORS to be defined if the compiler does not support C++11 Defaulted Functions. This ensures that objects of class endian_buffer are PODs, and so can be used in C++03 unions. In C++11, class endian_buffer objects are PODs, even though they have constructors, so can always be used in unions.
Endian Arithmetic Types
Introduction
Header boost/endian/arithmetic.hpp provides integer binary types with
control over byte order, value type, size, and alignment. Typedefs provide
easy-to-use names for common configurations.
These types provide portable byte-holders for integer data, independent of particular computer architectures. Use cases almost always involve I/O, either via files or network connections. Although data portability is the primary motivation, these integer byte-holders may also be used to reduce memory use, file size, or network activity since they provide binary integer sizes not otherwise available.
Such integer byte-holder types are traditionally called endian types. See Wikipedia for a full exploration of endianness, including definitions of big endian and little endian.
Boost endian integers provide the same full set of C++ assignment, arithmetic, and relational operators as C++ standard integral types, with the standard semantics.
Unary arithmetic operators are +, -, ~, !, plus both prefix and postfix -- and ++. Binary arithmetic operators are +, +=, -, -=, *, *=, /, /=, &, &=, |, |=, ^, ^=, <<, <<=, >>, and >>=. Binary relational operators are ==, !=, <, <=, >, and >=.
Implicit conversion to the underlying value type is provided. An implicit constructor converting from the underlying value type is provided.
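How implicit conversions in both directions let an endian type behave like an ordinary integer can be sketched with a tiny class. The type mini_big_uint16 is hypothetical, supports only a fraction of the operators listed above, and is not the library's endian_arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of a big-endian 16-bit arithmetic type.
class mini_big_uint16
{
public:
    mini_big_uint16() noexcept = default;

    // Implicit converting constructor from the value type.
    mini_big_uint16(std::uint16_t v) noexcept
    {
        bytes_[0] = static_cast<unsigned char>(v >> 8);   // high byte first
        bytes_[1] = static_cast<unsigned char>(v);
    }

    // Implicit conversion back to the value type.
    operator std::uint16_t() const noexcept
    {
        return static_cast<std::uint16_t>((bytes_[0] << 8) | bytes_[1]);
    }

    // Compound assignment: convert, compute natively, store back.
    mini_big_uint16& operator+=(std::uint16_t rhs) noexcept
    {
        return *this = static_cast<std::uint16_t>(std::uint16_t(*this) + rhs);
    }

    const unsigned char* data() const noexcept { return bytes_; }

private:
    unsigned char bytes_[2];
};
```

Expressions such as n + 1 or n == 0x0200 work through the conversion operator, while the bytes returned by data() stay in big-endian order after every update.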
Example
The endian_example.cpp program writes a binary file containing four-byte,
big-endian and little-endian integers:
#include <iostream>
#include <cstdio>
#include <boost/endian/arithmetic.hpp>
#include <boost/static_assert.hpp>

using namespace boost::endian;

namespace
{
  // This is an extract from a very widely used GIS file format.
  // Why the designer decided to mix big and little endians in
  // the same file is not known. But this is a real-world format
  // and users wishing to write low level code manipulating these
  // files have to deal with the mixed endianness.
  struct header
  {
    big_int32_t file_code;
    big_int32_t file_length;
    little_int32_t version;
    little_int32_t shape_type;
  };

  const char* filename = "test.dat";
}

int main(int, char* [])
{
  header h;

  BOOST_STATIC_ASSERT(sizeof(h) == 16U); // reality check

  h.file_code = 0x01020304;
  h.file_length = sizeof(header);
  h.version = 1;
  h.shape_type = 0x01020304;

  // Low-level I/O such as POSIX read/write or <cstdio>
  // fread/fwrite is sometimes used for binary file operations
  // when ultimate efficiency is important. Such I/O is often
  // performed in some C++ wrapper class, but to drive home the
  // point that endian integers are often used in fairly
  // low-level code that does bulk I/O operations, <cstdio>
  // fopen/fwrite is used for I/O in this example.
  std::FILE* fi = std::fopen(filename, "wb"); // MUST BE BINARY

  if (!fi)
  {
    std::cout << "could not open " << filename << '\n';
    return 1;
  }

  if (std::fwrite(&h, sizeof(header), 1, fi) != 1)
  {
    std::cout << "write failure for " << filename << '\n';
    return 1;
  }

  std::fclose(fi);

  std::cout << "created file " << filename << '\n';

  return 0;
}
After compiling and executing endian_example.cpp, a hex dump of test.dat
shows:
01020304 00000010 01000000 04030201
Notice that the first two 32-bit integers are big endian while the second two are little endian, even though the machine this was compiled and run on was little endian.
Limitations
Requires <climits>, CHAR_BIT == 8. If CHAR_BIT is some other value,
compilation will result in an #error. This restriction is in place because the
design, implementation, testing, and documentation have only considered issues
related to 8-bit bytes, and there have been no real-world use cases presented
for other sizes.
In C++03, endian_arithmetic does not meet the requirements for POD types because it has constructors, private data members, and a base class. This means that common use cases rely on unspecified behavior, in that the C++ Standard does not guarantee memory layout for non-POD types. This has not been a problem in practice since all known C++ compilers lay out memory as if endian_arithmetic were a POD type. In C++11, it is possible to specify the default constructor as trivial, and private data members and base classes no longer disqualify a type from being a POD type. Thus under C++11, endian_arithmetic no longer relies on unspecified behavior.
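The C++11 rules described above can be checked with <type_traits>. The sketch below uses two hypothetical classes, byte_buffer32 and arithmetic32, that only mimic the shape described in the text (a defaulted default constructor, a converting constructor, a private data member, and a base class). They are not the Boost.Endian classes.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a buffer base class (not endian_buffer).
class byte_buffer32
{
public:
    byte_buffer32() = default;                  // trivial default constructor (C++11)
    explicit byte_buffer32(std::uint32_t v)     // an extra constructor does not affect triviality
    {
        bytes_[0] = static_cast<unsigned char>(v >> 24);
        bytes_[1] = static_cast<unsigned char>(v >> 16);
        bytes_[2] = static_cast<unsigned char>(v >> 8);
        bytes_[3] = static_cast<unsigned char>(v);
    }
private:
    unsigned char bytes_[4];                    // private data member
};

// Hypothetical stand-in for an arithmetic wrapper (not endian_arithmetic).
class arithmetic32 : public byte_buffer32       // base class
{
public:
    arithmetic32() = default;
    arithmetic32(std::uint32_t v) : byte_buffer32(v) {}
};

// A C++11 POD is a type that is both trivial and standard-layout.
static_assert(std::is_trivial<arithmetic32>::value, "expected a trivial type");
static_assert(std::is_standard_layout<arithmetic32>::value, "expected standard layout");
// Holds on mainstream compilers; the standard does not strictly forbid tail padding.
static_assert(sizeof(arithmetic32) == 4, "expected exactly four bytes");
```

The checks use std::is_trivial and std::is_standard_layout rather than std::is_pod, since std::is_pod is deprecated in C++20.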
Feature set
-
Big endian | little endian | native endian byte ordering
-
Signed | unsigned
-
Unaligned | aligned
-
1-8 byte (unaligned) | 1, 2, 4, 8 byte (aligned)
-
Choice of value type
Enums and typedefs
Two scoped enums are provided:
enum class order { big, little, native };
enum class align { no, yes };
One class template is provided:
template <order Order, typename T, std::size_t n_bits,
align Align = align::no>
class endian_arithmetic;
Typedefs, such as big_int32_t, provide convenient naming conventions for common use cases:
Name | Alignment | Endianness | Sign | Sizes in bits (N) |
---|---|---|---|---|
big_intN_t | no | big | signed | 8,16,24,32,40,48,56,64 |
big_uintN_t | no | big | unsigned | 8,16,24,32,40,48,56,64 |
little_intN_t | no | little | signed | 8,16,24,32,40,48,56,64 |
little_uintN_t | no | little | unsigned | 8,16,24,32,40,48,56,64 |
native_intN_t | no | native | signed | 8,16,24,32,40,48,56,64 |
native_uintN_t | no | native | unsigned | 8,16,24,32,40,48,56,64 |
big_intN_at | yes | big | signed | 8,16,32,64 |
big_uintN_at | yes | big | unsigned | 8,16,32,64 |
little_intN_at | yes | little | signed | 8,16,32,64 |
little_uintN_at | yes | little | unsigned | 8,16,32,64 |
The unaligned types do not cause compilers to insert padding bytes in classes and structs. This is an important characteristic that can be exploited to minimize wasted space in memory, files, and network transmissions.
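The effect of padding can be seen by comparing two record layouts. This is a minimal sketch using only built-in types. packed_record is a hypothetical stand-in that stores its fields as byte arrays, which is how an unaligned type avoids padding in principle; it is not a Boost.Endian type. The amount of padding in padded_record is implementation-defined.

```cpp
#include <cstddef>
#include <cstdint>

// Built-in types: the compiler may insert padding after `tag`
// so that `value` is suitably aligned (commonly 3 bytes).
struct padded_record
{
    std::uint8_t  tag;
    std::uint32_t value;
};

// Byte-array members have an alignment requirement of 1,
// so no padding is needed between or after them.
struct packed_record
{
    unsigned char tag[1];
    unsigned char value[4];
};

static_assert(sizeof(packed_record) == 5, "unexpected padding in packed_record");
static_assert(offsetof(packed_record, value) == 1, "value should follow tag directly");
static_assert(sizeof(padded_record) >= sizeof(packed_record),
              "the padded layout is never smaller");
```

On typical 32-bit and 64-bit platforms sizeof(padded_record) is 8, so the byte-array layout saves three bytes per record in memory, files, and network messages.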
Caution
|
Code that uses aligned types is possibly non-portable because alignment requirements vary between hardware architectures and because alignment may be affected by compiler switches or pragmas. For example, alignment of a 64-bit integer may be to a 32-bit boundary on a 32-bit machine. Furthermore, aligned types are only available on architectures with 8, 16, 32, and 64-bit integer types. |
Tip
|
Prefer unaligned arithmetic types. |
Tip
|
Protect yourself against alignment ills. For example: |
-
static_assert(sizeof(containing_struct) == 12, "sizeof(containing_struct) is wrong");
Note
|
One-byte arithmetic types have identical layout on all platforms, so they never actually reverse endianness. They are provided to enable generic code, and to improve code readability and searchability. |
Class template endian_arithmetic
endian_arithmetic is an integer byte-holder with user-specified endianness, value type, size, and alignment. The usual operations on arithmetic types are supplied.
Synopsis
#include <boost/endian/buffers.hpp>
namespace boost
{
namespace endian
{
// C++11 features emulated if not available
enum class align { no, yes };
template <order Order, class T, std::size_t n_bits,
align Align = align::no>
class endian_arithmetic
: public endian_buffer<Order, T, n_bits, Align>
{
public:
typedef T value_type;
// if BOOST_ENDIAN_FORCE_PODNESS is defined && C++11 PODs are not
// available then these two constructors will not be present
endian_arithmetic() noexcept = default;
endian_arithmetic(T v) noexcept;
endian_arithmetic& operator=(T v) noexcept;
operator value_type() const noexcept;
value_type value() const noexcept; // for exposition; see endian_buffer
unsigned char* data() noexcept; // for exposition; see endian_buffer
unsigned char const* data() const noexcept; // for exposition; see endian_buffer
// arithmetic operations
// note that additional operations are provided by the value_type
value_type operator+() const noexcept;
endian_arithmetic& operator+=(value_type y) noexcept;
endian_arithmetic& operator-=(value_type y) noexcept;
endian_arithmetic& operator*=(value_type y) noexcept;
endian_arithmetic& operator/=(value_type y) noexcept;
endian_arithmetic& operator%=(value_type y) noexcept;
endian_arithmetic& operator&=(value_type y) noexcept;
endian_arithmetic& operator|=(value_type y) noexcept;
endian_arithmetic& operator^=(value_type y) noexcept;
endian_arithmetic& operator<<=(value_type y) noexcept;
endian_arithmetic& operator>>=(value_type y) noexcept;
endian_arithmetic& operator++() noexcept;
endian_arithmetic& operator--() noexcept;
endian_arithmetic operator++(int) noexcept;
endian_arithmetic operator--(int) noexcept;
// Stream inserter
template <class charT, class traits>
friend std::basic_ostream<charT, traits>&
operator<<(std::basic_ostream<charT, traits>& os, const endian_arithmetic& x);
// Stream extractor
template <class charT, class traits>
friend std::basic_istream<charT, traits>&
operator>>(std::basic_istream<charT, traits>& is, endian_arithmetic& x);
};
// typedefs
// unaligned big endian signed integer types
typedef endian_arithmetic<order::big, int_least8_t, 8> big_int8_t;
typedef endian_arithmetic<order::big, int_least16_t, 16> big_int16_t;
typedef endian_arithmetic<order::big, int_least32_t, 24> big_int24_t;
typedef endian_arithmetic<order::big, int_least32_t, 32> big_int32_t;
typedef endian_arithmetic<order::big, int_least64_t, 40> big_int40_t;
typedef endian_arithmetic<order::big, int_least64_t, 48> big_int48_t;
typedef endian_arithmetic<order::big, int_least64_t, 56> big_int56_t;
typedef endian_arithmetic<order::big, int_least64_t, 64> big_int64_t;
// unaligned big endian unsigned integer types
typedef endian_arithmetic<order::big, uint_least8_t, 8> big_uint8_t;
typedef endian_arithmetic<order::big, uint_least16_t, 16> big_uint16_t;
typedef endian_arithmetic<order::big, uint_least32_t, 24> big_uint24_t;
typedef endian_arithmetic<order::big, uint_least32_t, 32> big_uint32_t;
typedef endian_arithmetic<order::big, uint_least64_t, 40> big_uint40_t;
typedef endian_arithmetic<order::big, uint_least64_t, 48> big_uint48_t;
typedef endian_arithmetic<order::big, uint_least64_t, 56> big_uint56_t;
typedef endian_arithmetic<order::big, uint_least64_t, 64> big_uint64_t;
// unaligned big endian floating point types
typedef endian_arithmetic<order::big, float, 32> big_float32_t;
typedef endian_arithmetic<order::big, double, 64> big_float64_t;
// unaligned little endian signed integer types
typedef endian_arithmetic<order::little, int_least8_t, 8> little_int8_t;
typedef endian_arithmetic<order::little, int_least16_t, 16> little_int16_t;
typedef endian_arithmetic<order::little, int_least32_t, 24> little_int24_t;
typedef endian_arithmetic<order::little, int_least32_t, 32> little_int32_t;
typedef endian_arithmetic<order::little, int_least64_t, 40> little_int40_t;
typedef endian_arithmetic<order::little, int_least64_t, 48> little_int48_t;
typedef endian_arithmetic<order::little, int_least64_t, 56> little_int56_t;
typedef endian_arithmetic<order::little, int_least64_t, 64> little_int64_t;
// unaligned little endian unsigned integer types
typedef endian_arithmetic<order::little, uint_least8_t, 8> little_uint8_t;
typedef endian_arithmetic<order::little, uint_least16_t, 16> little_uint16_t;
typedef endian_arithmetic<order::little, uint_least32_t, 24> little_uint24_t;
typedef endian_arithmetic<order::little, uint_least32_t, 32> little_uint32_t;
typedef endian_arithmetic<order::little, uint_least64_t, 40> little_uint40_t;
typedef endian_arithmetic<order::little, uint_least64_t, 48> little_uint48_t;
typedef endian_arithmetic<order::little, uint_least64_t, 56> little_uint56_t;
typedef endian_arithmetic<order::little, uint_least64_t, 64> little_uint64_t;
// unaligned little endian floating point types
typedef endian_arithmetic<order::little, float, 32> little_float32_t;
typedef endian_arithmetic<order::little, double, 64> little_float64_t;
// unaligned native endian signed integer types
typedef endian_arithmetic<order::native, int_least8_t, 8> native_int8_t;
typedef endian_arithmetic<order::native, int_least16_t, 16> native_int16_t;
typedef endian_arithmetic<order::native, int_least32_t, 24> native_int24_t;
typedef endian_arithmetic<order::native, int_least32_t, 32> native_int32_t;
typedef endian_arithmetic<order::native, int_least64_t, 40> native_int40_t;
typedef endian_arithmetic<order::native, int_least64_t, 48> native_int48_t;
typedef endian_arithmetic<order::native, int_least64_t, 56> native_int56_t;
typedef endian_arithmetic<order::native, int_least64_t, 64> native_int64_t;
// unaligned native endian unsigned integer types
typedef endian_arithmetic<order::native, uint_least8_t, 8> native_uint8_t;
typedef endian_arithmetic<order::native, uint_least16_t, 16> native_uint16_t;
typedef endian_arithmetic<order::native, uint_least32_t, 24> native_uint24_t;
typedef endian_arithmetic<order::native, uint_least32_t, 32> native_uint32_t;
typedef endian_arithmetic<order::native, uint_least64_t, 40> native_uint40_t;
typedef endian_arithmetic<order::native, uint_least64_t, 48> native_uint48_t;
typedef endian_arithmetic<order::native, uint_least64_t, 56> native_uint56_t;
typedef endian_arithmetic<order::native, uint_least64_t, 64> native_uint64_t;
// unaligned native endian floating point types
typedef endian_arithmetic<order::native, float, 32> native_float32_t;
typedef endian_arithmetic<order::native, double, 64> native_float64_t;
// aligned big endian signed integer types
typedef endian_arithmetic<order::big, int8_t, 8, align::yes> big_int8_at;
typedef endian_arithmetic<order::big, int16_t, 16, align::yes> big_int16_at;
typedef endian_arithmetic<order::big, int32_t, 32, align::yes> big_int32_at;
typedef endian_arithmetic<order::big, int64_t, 64, align::yes> big_int64_at;
// aligned big endian unsigned integer types
typedef endian_arithmetic<order::big, uint8_t, 8, align::yes> big_uint8_at;
typedef endian_arithmetic<order::big, uint16_t, 16, align::yes> big_uint16_at;
typedef endian_arithmetic<order::big, uint32_t, 32, align::yes> big_uint32_at;
typedef endian_arithmetic<order::big, uint64_t, 64, align::yes> big_uint64_at;
// aligned big endian floating point types
typedef endian_arithmetic<order::big, float, 32, align::yes> big_float32_at;
typedef endian_arithmetic<order::big, double, 64, align::yes> big_float64_at;
// aligned little endian signed integer types
typedef endian_arithmetic<order::little, int8_t, 8, align::yes> little_int8_at;
typedef endian_arithmetic<order::little, int16_t, 16, align::yes> little_int16_at;
typedef endian_arithmetic<order::little, int32_t, 32, align::yes> little_int32_at;
typedef endian_arithmetic<order::little, int64_t, 64, align::yes> little_int64_at;
// aligned little endian unsigned integer types
typedef endian_arithmetic<order::little, uint8_t, 8, align::yes> little_uint8_at;
typedef endian_arithmetic<order::little, uint16_t, 16, align::yes> little_uint16_at;
typedef endian_arithmetic<order::little, uint32_t, 32, align::yes> little_uint32_at;
typedef endian_arithmetic<order::little, uint64_t, 64, align::yes> little_uint64_at;
// aligned little endian floating point types
typedef endian_arithmetic<order::little, float, 32, align::yes> little_float32_at;
typedef endian_arithmetic<order::little, double, 64, align::yes> little_float64_at;
// aligned native endian typedefs are not provided because
// <cstdint> types are superior for that use case
} // namespace endian
} // namespace boost
The only supported value of CHAR_BIT is 8.
The valid values of Nbits are as follows:
-
When sizeof(T) is 1, Nbits shall be 8;
-
When sizeof(T) is 2, Nbits shall be 16;
-
When sizeof(T) is 4, Nbits shall be 24 or 32;
-
When sizeof(T) is 8, Nbits shall be 40, 48, 56, or 64.
Other values of sizeof(T) are not supported.
When Nbits is equal to sizeof(T)*8, T must be a standard arithmetic type.
When Nbits is less than sizeof(T)*8, T must be a standard integral type (C++std, [basic.fundamental]) that is not bool.
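The size rules above can be written as a compile-time predicate. The function valid_nbits below is a hypothetical helper that restates the list; it is not part of Boost.Endian, and its n_bits argument corresponds to Nbits in the text (the n_bits template parameter in the synopsis).

```cpp
#include <cstddef>

// Hypothetical helper restating the documented rules; not a library function.
// Written as a single return expression so that it is valid C++11 constexpr.
constexpr bool valid_nbits(std::size_t size_of_t, std::size_t n_bits)
{
    return size_of_t == 1 ? n_bits == 8
         : size_of_t == 2 ? n_bits == 16
         : size_of_t == 4 ? (n_bits == 24 || n_bits == 32)
         : size_of_t == 8 ? (n_bits == 40 || n_bits == 48 ||
                             n_bits == 56 || n_bits == 64)
         : false;  // other values of sizeof(T) are not supported
}

static_assert(valid_nbits(4, 24),  "a 4-byte T may hold a 24-bit value");
static_assert(!valid_nbits(2, 24), "a 2-byte T cannot hold a 24-bit value");
static_assert(!valid_nbits(8, 32), "a 32-bit value uses a 4-byte T, not an 8-byte T");
```

This is consistent with the typedefs in the synopsis, where the 24-bit types use int_least32_t and the 40, 48, and 56-bit types use int_least64_t.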
Members
endian_arithmetic() noexcept = default; // C++03: endian(){}
-
- Effects
-
Constructs an uninitialized object.
endian_arithmetic(T v) noexcept;
-
- Effects
-
See endian_buffer::endian_buffer(T).
endian_arithmetic& operator=(T v) noexcept;
-
- Effects
-
See endian_buffer::operator=(T).
- Returns
-
*this.
operator T() const noexcept;
-
- Returns
-
value().
Other operators
Other operators on endian objects are forwarded to the equivalent operator on value_type.
Stream inserter
template <class charT, class traits>
friend std::basic_ostream<charT, traits>&
operator<<(std::basic_ostream<charT, traits>& os, const endian_arithmetic& x);
-
- Returns
-
os << +x.
Stream extractor
template <class charT, class traits>
friend std::basic_istream<charT, traits>&
operator>>(std::basic_istream<charT, traits>& is, endian_arithmetic& x);
-
- Effects
-
As if:
T i; if (is >> i) x = i;
- Returns
-
is.
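The "as if" wording of the stream extractor means that the target is assigned only when extraction succeeds. The sketch below shows that behavior with a hypothetical wrapper, wrapped_int, which stands in for an endian arithmetic type. The helper parse_or_keep is likewise an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>

// Hypothetical minimal wrapper standing in for an endian arithmetic type.
struct wrapped_int
{
    std::int32_t v;
    wrapped_int& operator=(std::int32_t x) { v = x; return *this; }
};

// Extractor following the documented "as if" form:
//   T i; if (is >> i) x = i;
template <class charT, class traits>
std::basic_istream<charT, traits>&
operator>>(std::basic_istream<charT, traits>& is, wrapped_int& x)
{
    std::int32_t i;
    if (is >> i)   // assign only if the extraction succeeded
        x = i;
    return is;
}

// Hypothetical helper: parse `text`, keeping `initial` if parsing fails.
inline std::int32_t parse_or_keep(const char* text, std::int32_t initial)
{
    wrapped_int w{initial};
    std::istringstream in(text);
    in >> w;
    return w.v;
}

inline void demo_extractor()
{
    assert(parse_or_keep("42", 7) == 42);   // successful extraction assigns
    assert(parse_or_keep("abc", 7) == 7);   // failed extraction leaves the value unchanged
}
```

After a failed extraction the stream's failbit is set as usual, so callers can test the stream to distinguish the two cases.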
FAQ
See the Overview FAQ for a library-wide FAQ.
- Why not just use Boost.Serialization?
-
Serialization involves a conversion for every object involved in I/O. Endian integers require no conversion or copying. They are already in the desired format for binary I/O. Thus they can be read or written in bulk.
- Are endian types PODs?
-
Yes for C++11. No for C++03, although several macros are available to force PODness in all cases.
- What are the implications of endian integer types not being PODs with C++03 compilers?
-
They can’t be used in unions. Also, compilers aren’t required to align or lay out storage in portable ways, although this potential problem hasn’t prevented use of Boost.Endian with real compilers.
- What good is native endianness?
-
It provides alignment and size guarantees not available from the built-in types. It eases generic programming.
- Why bother with the aligned endian types?
-
Aligned integer operations may be faster (as much as 10 to 20 times faster) if the endianness and alignment of the type matches the endianness and alignment requirements of the machine. The code, however, will be somewhat less portable than with the unaligned types.
- Why provide the arithmetic operations?
-
Providing a full set of operations reduces program clutter and makes code both easier to write and to read. Consider incrementing a variable in a record. It is very convenient to write:
++record.foo;
Rather than:
int temp(record.foo); ++temp; record.foo = temp;
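To make the convenience argument concrete, the following sketch shows how a byte-holding type can forward an increment to its value type. The class be_uint16 and the struct record are hypothetical, simplified stand-ins; they are not the library's classes and omit most operations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical big-endian 16-bit holder (a simplified stand-in).
class be_uint16
{
public:
    be_uint16() = default;
    be_uint16(std::uint16_t v) { store(v); }               // implicit, like the library's constructor
    operator std::uint16_t() const { return load(); }      // implicit conversion to the value type

    // The increment loads the value, adds one, and stores it back,
    // so user code never needs a temporary.
    be_uint16& operator++()
    {
        store(static_cast<std::uint16_t>(load() + 1u));
        return *this;
    }

    const unsigned char* data() const { return bytes_; }

private:
    void store(std::uint16_t v)
    {
        bytes_[0] = static_cast<unsigned char>(v >> 8);    // most-significant byte first
        bytes_[1] = static_cast<unsigned char>(v & 0xFFu);
    }
    std::uint16_t load() const
    {
        return static_cast<std::uint16_t>((bytes_[0] << 8) | bytes_[1]);
    }
    unsigned char bytes_[2];
};

struct record { be_uint16 foo; };

inline void demo_increment()
{
    record r;
    r.foo = 0x00FF;
    ++r.foo;                                               // no temporary needed
    assert(r.foo == 0x0100);
    assert(r.foo.data()[0] == 0x01 && r.foo.data()[1] == 0x00);
}
```

The stored bytes stay in big-endian order after the increment, regardless of the host's native byte order.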
Design considerations for Boost.Endian types
-
Must be suitable for I/O - in other words, must be memcpyable.
-
Must provide exactly the size and internal byte ordering specified.
-
Must work correctly when the internal integer representation has more bits than the sum of the bits in the external byte representation. Sign extension must work correctly when the internal integer representation type has more bits than the sum of the bits in the external bytes. For example, using a 64-bit integer internally to represent 40-bit (5 byte) numbers must work for both positive and negative values.
-
Must work correctly (including using the same defined external representation) regardless of whether a compiler treats char as signed or unsigned.
-
Unaligned types must not cause compilers to insert padding bytes.
-
The implementation should supply optimizations with great care. Experience has shown that optimizations of endian integers often become pessimizations when changing machines or compilers. Pessimizations can also happen when changing compiler switches, compiler versions, or CPU models of the same architecture.
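The sign-extension requirement for 40-bit values can be illustrated as follows. This is a sketch under stated assumptions: store_big40 and load_big40 are hypothetical helpers, not Boost.Endian functions, and the code shows one portable way to sign-extend rather than the library's implementation. It uses unsigned char throughout, so the result does not depend on whether plain char is signed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: store the low 40 bits of v, most-significant byte first.
inline void store_big40(unsigned char* p, std::int64_t v)
{
    std::uint64_t u = static_cast<std::uint64_t>(v);  // modular conversion, well-defined
    for (int i = 4; i >= 0; --i)
    {
        p[i] = static_cast<unsigned char>(u & 0xFFu);
        u >>= 8;
    }
}

// Hypothetical helper: load 5 big-endian bytes and sign-extend to 64 bits.
inline std::int64_t load_big40(const unsigned char* p)
{
    std::uint64_t u = 0;
    for (int i = 0; i < 5; ++i)
        u = (u << 8) | p[i];

    const std::uint64_t sign_bit = std::uint64_t(1) << 39;
    std::int64_t result = static_cast<std::int64_t>(u);  // u < 2^40, so this always fits
    if (u & sign_bit)
        result -= std::int64_t(1) << 40;                  // subtract 2^40 to sign-extend
    return result;
}

inline void demo_int40()
{
    unsigned char buf[5];
    store_big40(buf, -2);
    assert(buf[0] == 0xFF && buf[4] == 0xFE);  // two's-complement pattern in 5 bytes
    assert(load_big40(buf) == -2);             // negative value survives the round trip
}
```

Subtracting 2^40 instead of casting an out-of-range unsigned value avoids relying on implementation-defined conversion behavior in pre-C++20 compilers.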
Experience
Classes with similar functionality have been independently developed by several Boost programmers and used very successfully in high-value, high-use applications for many years. These independently developed endian libraries often evolved from C libraries that were also widely used. Endian types have proven useful across a wide range of computer architectures and applications.
Motivating use cases
Neil Mayhew writes: "I can also provide a meaningful use-case for this library: reading TrueType font files from disk and processing the contents. The data format has fixed endianness (big) and has unaligned values in various places. Using Boost.Endian simplifies and cleans the code wonderfully."
C++11
The availability of the C++11 Defaulted Functions feature is detected automatically, and will be used if present to ensure that objects of class endian_arithmetic are trivial, and thus PODs.
Compilation
Boost.Endian is implemented entirely within headers, with no need to link to any Boost object libraries.
Several macros allow user control over features:
-
BOOST_ENDIAN_NO_CTORS causes class endian_arithmetic to have no constructors. The intended use is for compiling user code that must be portable between compilers regardless of C++11 Defaulted Functions support. Use of constructors will always fail.
-
BOOST_ENDIAN_FORCE_PODNESS causes BOOST_ENDIAN_NO_CTORS to be defined if the compiler does not support C++11 Defaulted Functions. This ensures that objects of class endian_arithmetic are PODs, and so can be used in C++03 unions. In C++11, class endian_arithmetic objects are PODs, even though they have constructors, so can always be used in unions.
Acknowledgements
Original design developed by Darin Adler based on classes developed by Mark Borgerding. Four original class templates combined into a single endian_arithmetic class template by Beman Dawes, who put the library together, provided documentation, added the typedefs, and also added the unrolled_byte_loops sign partial specialization to correctly extend the sign when cover integer size differs from endian representation size.
Appendix A: History and Acknowledgments
History
Changes requested by formal review
The library was reworked from top to bottom to accommodate changes requested during the formal review. The issues that were required to be resolved before a mini-review are shown in bold below, with the resolution indicated.
- Common use case scenarios should be developed.
-
Done. The documentation has been refactored. A page is now devoted to Choosing the Approach to endianness. See Use cases for use case scenarios.
- Example programs should be developed for the common use case scenarios.
-
Done. See Choosing the Approach. Example code has been added throughout.
- Documentation should illuminate the differences between endian integer/float type and endian conversion approaches to the common use case scenarios, and provide guidelines for choosing the most appropriate approach in user’s applications.
-
Done. See Choosing the Approach.
- Conversion functions supplying results via return should be provided.
-
Done. See Conversion Functions.
- Platform specific performance enhancements such as use of compiler intrinsics or relaxed alignment requirements should be supported.
-
Done. Compiler (Clang, GCC, VisualC++, etc.) intrinsics and built-in functions are used in the implementation where appropriate, as requested. See Built-in support for Intrinsics. See Timings for Example 2 to gauge the impact of intrinsics.
- Endian integer (and floating) types should be implemented via the conversion functions. If that can’t be done efficiently, consideration should be given to expanding the conversion function signatures to resolve the inefficiencies.
-
Done. For the endian types, the implementation uses the endian conversion functions, and thus the intrinsics, as requested.
- Benchmarks that measure performance should be provided. It should be possible to compare platform specific performance enhancements against portable base implementations, and to compare endian integer approaches against endian conversion approaches for the common use case scenarios.
-
Done. See Timings for Example 2. The endian/test directory also contains several additional benchmark and speed test programs.
- Float (32-bits) and double (64-bits) should be supported. IEEE 754 is the primary use case.
-
Done. The endian buffer types, endian arithmetic types and endian conversion functions now support 32-bit (float) and 64-bit (double) floating point, as requested.
Note
|
This answer is outdated. The support for float and double was subsequently found to be problematic and was removed. Recently, support for float and double has been reinstated for endian_buffer and endian_arithmetic, but not for the conversion functions. |
- Support for user defined types (UDTs) is desirable, and should be provided where there would be no conflict with the other concerns.
-
Done. See Customization points for user-defined types (UDTs).
- There is some concern that endian integer/float arithmetic operations might be used inadvertently or inappropriately. The impact of adding an endian_buffer class without arithmetic operations should be investigated.
-
Done. The endian types have been decomposed into class template endian_buffer and class template endian_arithmetic. Class endian_buffer is a public base class for endian_arithmetic, and can also be used by users as a stand-alone class.
- Stream insertion and extraction of the endian integer/float types should be documented and included in the test coverage.
-
Done. See Stream inserter and Stream extractor.
- Binary I/O support that was investigated during development of the Endian library should be put up for mini-review for inclusion in the Boost I/O library.
-
Not done yet. Will be handled as a separate mini-review soon after the Endian mini-review.
- Other requested changes.
-
-
In addition to the named-endianness conversion functions, functions that perform compile-time (via template) and run-time (via function argument) dispatch are now provided.
-
order::native is now a synonym for order::big or order::little according to the endianness of the platform. This reduces the number of template specializations required.
-
Headers have been reorganized to make them easier to read, with a synopsis at the front and implementation following.
-
Other changes since formal review
-
Header boost/endian/endian.hpp has been renamed to boost/endian/arithmetic.hpp. Headers boost/endian/conversion.hpp and boost/endian/buffers.hpp have been added. Infrastructure file names were changed accordingly.
-
The endian arithmetic type aliases have been renamed, using a naming pattern that is consistent for both integer and floating point, and a consistent set of aliases supplied for the endian buffer types.
-
The unaligned-type alias names still have the _t suffix, but the aligned-type alias names now have an _at suffix.
-
endian_reverse() overloads for int8_t and uint8_t have been added for improved generality. (Pierre Talbot)
-
Overloads of endian_reverse_inplace() have been replaced with a single endian_reverse_inplace() template. (Pierre Talbot)
-
For X86 and X64 architectures, which permit unaligned loads and stores, unaligned little endian buffer and arithmetic types use regular loads and stores when the size is exact. This makes unaligned little endian buffer and arithmetic types significantly more efficient on these architectures. (Jeremy Maitin-Shepard)
-
C++11 features affecting interfaces, such as noexcept, are now used. C++03 compilers are still supported.
-
Acknowledgements have been updated.
Compatibility with interim releases
Prior to the official Boost release, class template endian_arithmetic was used for a decade or more with the same functionality but under the name endian. Other names also changed in the official release. If the macro BOOST_ENDIAN_DEPRECATED_NAMES is defined, those old, now deprecated names are still supported. However, the class template endian name is only provided for compilers supporting C++11 template aliases. For C++03 compilers, the name will have to be changed to endian_arithmetic.
To support backward header compatibility, deprecated header boost/endian/endian.hpp forwards to boost/endian/arithmetic.hpp. It requires that BOOST_ENDIAN_DEPRECATED_NAMES be defined. It should only be used while transitioning to the official Boost release of the library, as it will be removed in some future release.
Future directions
- Standardization.
-
The plan is to submit Boost.Endian to the C++ standards committee for possible inclusion in a Technical Specification or the C++ standard itself.
- Specializations for numeric_limits.
-
Roger Leigh requested that all boost::endian types provide numeric_limits specializations. See GitHub issue 4.
- Character buffer support.
-
Peter Dimov pointed out during the mini-review that getting and setting basic arithmetic types (or <cstdint> equivalents) from/to an offset into an array of unsigned char is a common need. See Boost.Endian mini-review posting.
- Out-of-range detection.
-
Peter Dimov suggested during the mini-review that throwing an exception on buffer values being out-of-range might be desirable. See the end of this posting and subsequent replies.
Acknowledgements
Comments and suggestions were received from Adder, Benaka Moorthi, Christopher Kohlhoff, Cliff Green, Daniel James, Dave Handley, Gennaro Proto, Giovanni Piero Deretta, Gordon Woodhull, dizzy, Hartmut Kaiser, Howard Hinnant, Jason Newton, Jeff Flinn, Jeremy Maitin-Shepard, John Filo, John Maddock, Kim Barrett, Marsh Ray, Martin Bonner, Mathias Gaunard, Matias Capeletto, Neil Mayhew, Nevin Liber, Olaf van der Spek, Paul Bristow, Peter Dimov, Pierre Talbot, Phil Endecott, Philip Bennefall, Pyry Jahkola, Rene Rivera, Robert Stewart, Roger Leigh, Roland Schwarz, Scott McMurray, Sebastian Redl, Tim Blechmann, Tim Moore, tymofey, Tomas Puverle, Vincente Botet, Yuval Ronen and Vitaly Budovsk. Apologies if anyone has been missed.
The documentation was converted into Asciidoc format by Glen Fernandes.
Appendix B: Copyright and License
This documentation is
-
Copyright 2011-2016 Beman Dawes
-
Copyright 2019 Peter Dimov
and is distributed under the Boost Software License, Version 1.0.