Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards


Appendices

Appendix A: History
Appendix B: Rationale
Appendix C: Implementation Notes
Appendix D: Acknowledgements

August 11, 2008

Proto v4 is merged to Boost trunk with a more powerful transform protocol.

April 7, 2008

Proto is accepted into Boost.

March 1, 2008

Proto's Boost review begins.

January 11, 2008

Boost.Proto v3 brings separation of grammars and transforms and a "round" lambda syntax for defining transforms in-place.

April 15, 2007

Boost.Xpressive is ported from Proto compilers to Proto transforms. Support for old Proto compilers is dropped.

April 4, 2007

Preliminary submission of Proto to Boost.

December 11, 2006

The idea for transforms that decorate grammar rules is born in a private email discussion with Joel de Guzman and Hartmut Kaiser. The first transforms are committed to CVS 5 days later on December 16.

November 1, 2006

The idea for proto::matches<> and the whole grammar facility is hatched during a discussion with Hartmut Kaiser on the spirit-devel list. The first version of proto::matches<> is checked into CVS 3 days later. Message is here.

October 28, 2006

Proto is reborn, this time with uniform expression types that are PODs. Announcement is here.

April 20, 2005

Proto is born as a major refactorization of Boost.Xpressive's meta-programming. Proto offers expression types, operator overloads and "compilers", an early formulation of what later became transforms. Announcement is here.

Proto expression types are PODs (Plain Old Data), and do not have constructors. They are brace-initialized, as follows:

terminal<int>::type const _i = {1};

The reason is so that expression objects like _i above can be statically initialized. Why is static initialization important? The terminals of many domain-specific embedded languages are likely to be global const objects, like _1 and _2 from the Boost Lambda Library. Were these objects to require run-time initialization, it might be possible to use them before they are initialized. That would be bad. Statically initialized objects cannot be misused that way.
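As a rough illustration of the point about static initialization, here is a minimal sketch that does not use Proto itself. The names sketch::terminal_sketch and sketch::_i are hypothetical stand-ins, not part of the library; the only assumption is that the expression type is an aggregate with no constructors, as the text describes.

```cpp
#include <type_traits>

namespace sketch
{
    // Hypothetical stand-in for a Proto terminal: an aggregate with no
    // user-declared constructors, so it can be brace-initialized.
    template<typename T>
    struct terminal_sketch
    {
        T value;
    };

    // A namespace-scope const object brace-initialized from a constant.
    // Because the type is a POD and the initializer is a constant
    // expression, this is static initialization: the object holds its
    // value before any dynamic initialization in the program runs.
    terminal_sketch<int> const _i = {1};
}

// POD in the C++11 sense means "trivial and standard layout".
static_assert(std::is_trivial<sketch::terminal_sketch<int> >::value,
              "the stand-in terminal is a trivial type");
static_assert(std::is_standard_layout<sketch::terminal_sketch<int> >::value,
              "the stand-in terminal is a standard-layout type");
```

Giving terminal_sketch a user-provided constructor would make the first static_assert fail, and, under the C++03 rules in force when Proto was written, would turn the initialization of _i into dynamic initialization.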

Anyone who has peeked at Proto's source code has probably wondered, "Why all the dirty preprocessor gunk? Couldn't this all have been implemented cleanly on top of libraries like MPL and Fusion?" The answer is that Proto could have been implemented this way, and in fact was at one point. The problem is that template metaprogramming (TMP) makes for longer compile times. As a foundation upon which other TMP-heavy libraries will be built, Proto itself should be as lightweight as possible. That is achieved by preferring preprocessor metaprogramming to template metaprogramming. Expanding a macro is far more efficient than instantiating a template. In some cases, the "clean" version takes 10x longer to compile than the "dirty" version.

The "clean and slow" version of Proto can still be found at http://svn.boost.org/svn/boost/branches/proto/v3. Anyone who is interested can download it and verify that it is, in fact, unusably slow to compile. Note that this branch's development was abandoned, and it does not conform exactly with Proto's current interface.

Much has already been written about dispatching on type traits using SFINAE (Substitution Failure Is Not An Error) techniques in C++. There is a Boost library, Boost.Enable_if, to make the technique idiomatic. Proto dispatches on type traits extensively, but it doesn't use enable_if<> very often. Rather, it dispatches based on the presence or absence of nested types, often typedefs for void.

Consider the implementation of is_expr<>. It could have been written as something like this:

template<typename T>
struct is_expr
  : is_base_and_derived<proto::some_expr_base, T>
{};

Rather, it is implemented as this:

template<typename T, typename Void = void>
struct is_expr
  : mpl::false_
{};

template<typename T>
struct is_expr<T, typename T::proto_is_expr_>
  : mpl::true_
{};

This relies on the fact that the specialization will be preferred if T has a nested proto_is_expr_ that is a typedef for void. All Proto expression types have such a nested typedef.

Why does Proto do it this way? The reason is that, after running extensive benchmarks while trying to improve compile times, I have found that this approach compiles faster. It requires exactly one template instantiation. The other approach requires at least 2: is_expr<> and is_base_and_derived<>, plus whatever templates is_base_and_derived<> may instantiate.
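The nested-typedef dispatch can be tried in isolation. The following is a minimal sketch that swaps mpl::true_ and mpl::false_ for their std:: equivalents so that it needs no Boost headers. The names is_expr_sketch, is_expr_marker_, my_expr, plain_struct and wrong_marker are hypothetical and chosen only for this illustration.

```cpp
#include <type_traits>

// Primary template: the answer is "no" unless the specialization matches.
template<typename T, typename Void = void>
struct is_expr_sketch
  : std::false_type
{};

// Chosen only when T::is_expr_marker_ exists and names void, because the
// second template argument must match the default argument (void).
template<typename T>
struct is_expr_sketch<T, typename T::is_expr_marker_>
  : std::true_type
{};

// Hypothetical test types.
struct my_expr      { typedef void is_expr_marker_; };
struct plain_struct {};
struct wrong_marker { typedef int is_expr_marker_; }; // not void

static_assert( is_expr_sketch<my_expr>::value,      "marker typedef found");
static_assert(!is_expr_sketch<plain_struct>::value, "no marker typedef");
static_assert(!is_expr_sketch<int>::value,          "non-class types are rejected via SFINAE");
static_assert(!is_expr_sketch<wrong_marker>::value, "marker must be a typedef for void");
```

The last assertion shows why the marker has to be a typedef for void: with any other type, the specialization's second argument no longer matches the primary template's default, so the primary template is used.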

In several places, Proto needs to know whether or not a function object Fun can be called with certain parameters and take a fallback action if not. This happens in proto::callable_context<> and in the proto::call<> transform. How does Proto know? It involves some tricky metaprogramming. Here's how.

Another way of framing the question is by trying to implement the following can_be_called<> Boolean metafunction, which checks to see if a function object Fun can be called with parameters of type A and B:

template<typename Fun, typename A, typename B>
struct can_be_called;

First, we define the following dont_care struct, which has an implicit conversion from anything. And not just any implicit conversion; it has an ellipsis conversion, which is the worst possible conversion for the purposes of overload resolution:

struct dont_care
{
    dont_care(...);
};

We also need some private type known only to us with an overloaded comma operator (!), and some functions that detect the presence of this type and return types with different sizes, as follows:

struct private_type
{
    private_type const &operator,(int) const;
};

typedef char yes_type;      // sizeof(yes_type) == 1
typedef char (&no_type)[2]; // sizeof(no_type)  == 2

template<typename T>
no_type is_private_type(T const &);

yes_type is_private_type(private_type const &);

Next, we implement a binary function object wrapper with a very strange conversion operator, whose meaning will become clear later.

template<typename Fun>
struct funwrap2 : Fun
{
    funwrap2();
    typedef private_type const &(*pointer_to_function)(dont_care, dont_care);
    operator pointer_to_function() const;
};

With all of these bits and pieces, we can implement can_be_called<> as follows:

template<typename Fun, typename A, typename B>
struct can_be_called
{
    static funwrap2<Fun> &fun;
    static A &a;
    static B &b;

    static bool const value = (
        sizeof(no_type) == sizeof(is_private_type( (fun(a,b), 0) ))
    );

    typedef mpl::bool_<value> type;
};

The idea is to make it so that fun(a,b) will always compile by adding our own binary function overload, but doing it in such a way that we can detect whether our overload was selected or not. And we rig it so that our overload is selected if there is really no better option. What follows is a description of how can_be_called<> works.

We wrap Fun in a type that has an implicit conversion to a pointer to a binary function. An object fun of class type can be invoked as fun(a, b) if it has such a conversion operator, but since it involves a user-defined conversion operator, it is less preferred than an overloaded operator(), which requires no such conversion.
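The language rule relied on here (an object that converts to a function pointer can be called like a function, but a real operator() wins when it is viable) can be checked with a small sketch. All names below (fallback, only_conversion, with_call_operator, with_unrelated_call) are hypothetical, and the functions are made constexpr purely so that the outcome can be verified at compile time.

```cpp
// Hypothetical fallback, reachable only through the conversion operator.
constexpr int fallback(long, long) { return -1; }

// Callable solely because it converts to a pointer to a binary function.
struct only_conversion
{
    typedef int (*fn_ptr)(long, long);
    constexpr operator fn_ptr() const { return &fallback; }
};

// Has both the inherited conversion and a matching operator().
struct with_call_operator : only_conversion
{
    constexpr int operator()(int, int) const { return 1; }
};

// Has an operator(), but one that cannot accept two ints.
struct with_unrelated_call : only_conversion
{
    constexpr int operator()(char const *, char const *) const { return 2; }
};

// Only the conversion exists, so the call goes through the function pointer.
static_assert(only_conversion()(1, 2) == -1, "conversion to function pointer used");

// operator() needs no user-defined conversion on the object, so it is preferred.
static_assert(with_call_operator()(1, 2) == 1, "operator() preferred");

// operator() is not viable for (int, int), so the conversion is the fallback.
static_assert(with_unrelated_call()(1, 2) == -1, "falls back to the conversion");
```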

The function pointer can accept any two arguments by virtue of the dont_care type. The conversion sequence for each argument is guaranteed to be the worst possible conversion sequence: an implicit conversion through an ellipsis, and a user-defined conversion to dont_care. In total, it means that funwrap2<Fun>()(a, b) will always compile, but it will select our overload only if there really is no better option.

If there is a better option (for example, if Fun has an overloaded function call operator such as void operator()(A a, B b)), then fun(a, b) will resolve to that one instead. The question now is how to detect which function got picked by overload resolution.

Notice how fun(a, b) appears in can_be_called<>: (fun(a, b), 0). Why do we use the comma operator there? The reason is that we are using this expression as the argument to a function. If the return type of fun(a, b) is void, it cannot legally be used as an argument to a function. The comma operator sidesteps the issue.

This should also make plain the purpose of the overloaded comma operator in private_type. The function that our conversion operator points to returns private_type (by const reference). If overload resolution selects our overload, then the type of (fun(a, b), 0) is private_type, because the overloaded comma operator is used. Otherwise, the built-in comma operator is used and the type is int. That fact is used to dispatch to either overload of is_private_type(), which encodes its answer in the size of its return type.
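The effect of the comma operator on the expression's type can be observed directly with decltype, which postdates this technique and is used here only as a convenient way to inspect types. In this sketch the functions returns_void, returns_int and returns_private are hypothetical and are declared but never defined, since they appear only in unevaluated operands.

```cpp
#include <type_traits>

struct private_type
{
    private_type const &operator,(int) const;
};

// Hypothetical functions standing in for the possible results of fun(a, b).
void returns_void();
int returns_int();
private_type const &returns_private();

// Built-in comma: the left operand is discarded, even if it is void,
// and the result has the type of the right operand.
static_assert(std::is_same<decltype((returns_void(), 0)), int>::value,
              "void on the left is fine; the result is int");
static_assert(std::is_same<decltype((returns_int(), 0)), int>::value,
              "an ordinary result on the left also yields int");

// Overloaded comma: private_type's operator, is selected instead,
// so the marker type survives.
static_assert(std::is_same<decltype((returns_private(), 0)),
                           private_type const &>::value,
              "private_type on the left yields private_type");
```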

That's how it works with binary functions. Now repeat the above process for functions up to some predefined function arity, and you're done.
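For reference, the pieces described above can be assembled into one self-contained translation unit and exercised as follows. This is a sketch of the binary case only. The mpl::bool_ typedef is omitted to avoid a Boost dependency, and the test types widget, adds_ints and logs_widget are hypothetical.

```cpp
// Converts from anything, via the worst possible conversion.
struct dont_care
{
    dont_care(...);
};

// Marker type whose overloaded comma operator preserves the marker.
struct private_type
{
    private_type const &operator,(int) const;
};

typedef char yes_type;      // sizeof(yes_type) == 1
typedef char (&no_type)[2]; // sizeof(no_type)  == 2

template<typename T>
no_type is_private_type(T const &);

yes_type is_private_type(private_type const &);

// Adds a last-resort binary "overload" to Fun via a conversion operator.
template<typename Fun>
struct funwrap2 : Fun
{
    funwrap2();
    typedef private_type const &(*pointer_to_function)(dont_care, dont_care);
    operator pointer_to_function() const;
};

template<typename Fun, typename A, typename B>
struct can_be_called
{
    static funwrap2<Fun> &fun;
    static A &a;
    static B &b;

    // True when the fallback was NOT chosen, i.e. Fun itself accepts (a, b).
    static bool const value = (
        sizeof(no_type) == sizeof(is_private_type( (fun(a,b), 0) ))
    );
};

// Hypothetical function objects used to exercise the trait.
struct widget {};
struct adds_ints   { int  operator()(int, int) const; };
struct logs_widget { void operator()(widget const &, int) const; };

static_assert( can_be_called<adds_ints, int, int>::value,      "exact match");
static_assert(!can_be_called<adds_ints, widget, int>::value,   "widget is not an int");
static_assert( can_be_called<logs_widget, widget, int>::value, "void return type is handled");
static_assert(!can_be_called<logs_widget, int, widget>::value, "arguments in the wrong order");
```

Everything here is evaluated inside sizeof, so none of the declared functions or static members needs a definition.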

I'd like to thank Joel de Guzman and Hartmut Kaiser for being willing to take a chance on using Proto for their work on Spirit-2 and Karma when Proto was little more than a vision. Their requirements and feedback have been indispensable.

Thanks also to the developers of PETE. I found many good ideas there.
