Proto v4 is merged to Boost trunk with more powerful transform protocol.
Proto is accepted into Boost.
Proto's Boost review begins.
Boost.Proto v3 brings separation of grammars and transforms and a "round" lambda syntax for defining transforms in-place.
Boost.Xpressive is ported from Proto compilers to Proto transforms. Support for old Proto compilers is dropped.
Preliminary submission of Proto to Boost.
The idea for transforms that decorate grammar rules is born in a private email discussion with Joel de Guzman and Hartmut Kaiser. The first transforms are committed to CVS 5 days later on December 16.
The idea for proto::matches<> and the whole grammar facility is hatched during a discussion with Hartmut Kaiser on the spirit-devel list. The first version of proto::matches<> is checked into CVS 3 days later. Message is here.
Proto is reborn, this time with uniform expression types that are POD. Announcement is here.
Proto is born as a major refactorization of Boost.Xpressive's meta-programming. Proto offers expression types, operator overloads and "compilers", an early formulation of what later became transforms. Announcement is here.
Proto expression types are PODs (Plain Old Data), and do not have constructors. They are brace-initialized, as follows:
terminal<int>::type const _i = {1};
The reason is so that expression objects like _i above can be statically initialized. Why is static initialization important? The terminals of many domain-specific embedded languages are likely to be global const objects, like _1 and _2 from the Boost Lambda Library. Were these objects to require run-time initialization, it might be possible to use them before they are initialized. That would be bad. Statically initialized objects cannot be misused that way.
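As a minimal sketch of this point, assume a hypothetical aggregate my_terminal<> standing in for a Proto terminal (it is not Proto's actual definition). Because it has no constructors and its initializer is a constant expression, the namespace-scope object below is initialized statically, before any dynamic initialization runs. The trait checks use C++11's <type_traits>, which postdates the compilers Proto originally targeted.

```cpp
#include <type_traits>

// Hypothetical stand-in for a Proto terminal expression: an aggregate
// with no user-declared constructors, so it can be brace-initialized.
template<typename T>
struct my_terminal
{
    T value;
};

// Namespace-scope const object with a constant initializer. It is
// initialized statically, so no other translation unit can observe it
// in an uninitialized state.
my_terminal<int> const _i = {1};

// The properties that make this possible (together, what "POD" means).
static_assert(std::is_trivial<my_terminal<int> >::value,
              "my_terminal<int> should be trivial");
static_assert(std::is_standard_layout<my_terminal<int> >::value,
              "my_terminal<int> should be standard-layout");

// Hypothetical helper that reads the statically initialized value.
int read_i()
{
    return _i.value;
}
```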
Anyone who has peeked at Proto's source code has probably wondered, "Why all the dirty preprocessor gunk? Couldn't this all have been implemented cleanly on top of libraries like MPL and Fusion?" The answer is that Proto could have been implemented this way, and in fact was at one point. The problem is that template metaprogramming (TMP) makes for longer compile times. As a foundation upon which other TMP-heavy libraries will be built, Proto itself should be as lightweight as possible. That is achieved by preferring preprocessor metaprogramming to template metaprogramming. Expanding a macro is far more efficient than instantiating a template. In some cases, the "clean" version takes 10x longer to compile than the "dirty" version.
The "clean and slow" version of Proto can still be found at http://svn.boost.org/svn/boost/branches/proto/v3. Anyone who is interested can download it and verify that it is, in fact, unusably slow to compile. Note that this branch's development was abandoned, and it does not conform exactly with Proto's current interface.
Much has already been written about dispatching on type traits using SFINAE (Substitution Failure Is Not An Error) techniques in C++. There is a Boost library, Boost.Enable_if, to make the technique idiomatic. Proto dispatches on type traits extensively, but it doesn't use enable_if<> very often. Rather, it dispatches based on the presence or absence of nested types, often typedefs for void. Consider the implementation of is_expr<>. It could have been written as something like this:
template<typename T> struct is_expr : is_base_and_derived<proto::some_expr_base, T> {};
Rather, it is implemented as this:
template<typename T, typename Void = void>
struct is_expr : mpl::false_ {};

template<typename T>
struct is_expr<T, typename T::proto_is_expr_> : mpl::true_ {};
This relies on the fact that the specialization will be preferred if T has a nested proto_is_expr_ that is a typedef for void. All Proto expression types have such a nested typedef.
Why does Proto do it this way? The reason is that, after running extensive benchmarks while trying to improve compile times, I have found that this approach compiles faster. It requires exactly one template instantiation. The other approach requires at least two: is_expr<> and is_base_and_derived<>, plus whatever templates is_base_and_derived<> may instantiate.
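The nested-typedef dispatch can be tried in isolation. The sketch below mirrors the is_expr<> shape shown above, but swaps mpl::true_ and mpl::false_ for std::true_type and std::false_type so that it needs no Boost headers. The names is_expr_sketch, fake_expr, plain_type and wrong_typedef are hypothetical and exist only for this illustration.

```cpp
#include <type_traits>

// Primary template: chosen when T has no nested proto_is_expr_,
// or when that nested type is not void.
template<typename T, typename Void = void>
struct is_expr_sketch : std::false_type {};

// Partial specialization: matches only if T::proto_is_expr_ names void,
// because the second argument defaults to void in the primary template.
template<typename T>
struct is_expr_sketch<T, typename T::proto_is_expr_> : std::true_type {};

// Hypothetical expression-like type carrying the marker typedef.
struct fake_expr
{
    typedef void proto_is_expr_;
};

// Hypothetical type with no marker at all.
struct plain_type {};

// Hypothetical type whose marker is not a typedef for void.
struct wrong_typedef
{
    typedef int proto_is_expr_;
};

static_assert(is_expr_sketch<fake_expr>::value, "marker present");
static_assert(!is_expr_sketch<plain_type>::value, "no marker");
static_assert(!is_expr_sketch<int>::value, "non-class types are rejected");
static_assert(!is_expr_sketch<wrong_typedef>::value, "marker must be void");
```

The wrong_typedef case shows why the typedef has to be void: the specialization is then is_expr_sketch<T, int>, which does not match the is_expr_sketch<T, void> that a one-argument use asks for.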
In several places, Proto needs to know whether or not a function object Fun can be called with certain parameters and take a fallback action if not. This happens in proto::callable_context<> and in the proto::call<> transform. How does Proto know? It involves some tricky metaprogramming. Here's how.
Another way of framing the question is by trying to implement the following can_be_called<> Boolean metafunction, which checks to see if a function object Fun can be called with parameters of type A and B:
template<typename Fun, typename A, typename B> struct can_be_called;
First, we define the following dont_care struct, which has an implicit conversion from anything. And not just any implicit conversion; it has an ellipsis conversion, which is the worst possible conversion for the purposes of overload resolution:
struct dont_care { dont_care(...); };
We also need some private type known only to us with an overloaded comma operator (!), and some functions that detect the presence of this type and return types with different sizes, as follows:
struct private_type
{
    private_type const &operator,(int) const;
};

typedef char yes_type;      // sizeof(yes_type) == 1
typedef char (&no_type)[2]; // sizeof(no_type) == 2

template<typename T>
no_type is_private_type(T const &);

yes_type is_private_type(private_type const &);
Next, we implement a binary function object wrapper with a very strange conversion operator, whose meaning will become clear later.
template<typename Fun>
struct funwrap2 : Fun
{
    funwrap2();
    typedef private_type const &(*pointer_to_function)(dont_care, dont_care);
    operator pointer_to_function() const;
};
With all of these bits and pieces, we can implement can_be_called<>
as follows:
template<typename Fun, typename A, typename B>
struct can_be_called
{
    static funwrap2<Fun> &fun;
    static A &a;
    static B &b;

    static bool const value = (
        sizeof(no_type) == sizeof(is_private_type( (fun(a,b), 0) ))
    );

    typedef mpl::bool_<value> type;
};
The idea is to make it so that fun(a,b) will always compile by adding our own binary function overload, but doing it in such a way that we can detect whether our overload was selected or not. And we rig it so that our overload is selected if there is really no better option. What follows is a description of how can_be_called<> works.
We wrap Fun in a type that has an implicit conversion to a pointer to a binary function. An object fun of class type can be invoked as fun(a, b) if it has such a conversion operator, but since it involves a user-defined conversion operator, it is less preferred than an overloaded operator(), which requires no such conversion.
The function pointer can accept any two arguments by virtue of the dont_care type. The conversion sequence for each argument is guaranteed to be the worst possible conversion sequence: an implicit conversion through an ellipsis, and a user-defined conversion to dont_care. In total, it means that funwrap2<Fun>()(a, b) will always compile, but it will select our overload only if there really is no better option.
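The language rule this relies on can be seen on its own, outside of Proto. In the sketch below, every name (binary_fn, add, only_conversion, both_ways and the two helper functions) is hypothetical. It shows that an object with a conversion to pointer-to-function can be called like a function, and that a viable operator() is preferred over that conversion when both are present.

```cpp
// Hypothetical function-pointer type and target function.
typedef int (*binary_fn)(int, int);

int add(int x, int y)
{
    return x + y;
}

// Callable only through its conversion to pointer-to-function.
struct only_conversion
{
    operator binary_fn() const { return &add; }
};

// Has both a call operator and the conversion. The call operator needs
// no user-defined conversion on the object, so overload resolution
// prefers it.
struct both_ways
{
    int operator()(int, int) const { return -1; }
    operator binary_fn() const { return &add; }
};

int call_only_conversion()
{
    only_conversion f;
    return f(2, 3); // converts f to binary_fn, then calls add(2, 3)
}

int call_both_ways()
{
    both_ways f;
    return f(2, 3); // selects both_ways::operator()
}
```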
If there is a better option (for example, if Fun has an overloaded function call operator such as void operator()(A a, B b)), then fun(a, b) will resolve to that one instead. The question now is how to detect which function got picked by overload resolution.
Notice how fun(a, b) appears in can_be_called<>: (fun(a, b), 0). Why do we use the comma operator there? The reason is that we are using this expression as the argument to a function. If the return type of fun(a, b) is void, it cannot legally be used as an argument to a function. The comma operator sidesteps the issue.
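A small sketch can confirm what type a comma expression has in each case. It assumes C++11 for decltype and <type_traits>, and the names returns_void, tag, returns_tag and the two constants are hypothetical. The functions are only declared, since they appear solely in unevaluated operands.

```cpp
#include <type_traits>

// Declared only; used inside decltype, so never actually called.
void returns_void();

// Hypothetical type with an overloaded comma operator, playing the role
// that private_type plays in the text.
struct tag
{
    tag const &operator,(int) const;
};

tag const &returns_tag();

// With a void left operand the built-in comma applies, so the whole
// expression has the type of the right operand: int.
static bool const void_case_is_int =
    std::is_same<decltype((returns_void(), 0)), int>::value;

// With a tag left operand the overloaded comma is selected, so the
// expression has that operator's return type.
static bool const tag_case_is_tag =
    std::is_same<decltype((returns_tag(), 0)), tag const &>::value;
```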
This should also make plain the purpose of the overloaded comma operator in private_type. The return type of the pointer to function is private_type. If overload resolution selects our overload, then the type of (fun(a, b), 0) is private_type. Otherwise, it is int. That fact is used to dispatch to either overload of is_private_type(), which encodes its answer in the size of its return type.
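Putting the pieces together, the following is a self-contained sketch of the binary case that can be compiled on its own. It leaves out the mpl::bool_ typedef so that no Boost headers are needed, and it checks the results with C++11 static_assert. The three function objects at the end (takes_two_ints, takes_nothing, takes_pointers) are hypothetical test subjects, not part of Proto.

```cpp
struct dont_care
{
    dont_care(...); // accepts anything, via the worst possible conversion
};

struct private_type
{
    private_type const &operator,(int) const;
};

typedef char yes_type;      // sizeof(yes_type) == 1
typedef char (&no_type)[2]; // sizeof(no_type) == 2

template<typename T>
no_type is_private_type(T const &);

yes_type is_private_type(private_type const &);

template<typename Fun>
struct funwrap2 : Fun
{
    funwrap2();
    typedef private_type const &(*pointer_to_function)(dont_care, dont_care);
    operator pointer_to_function() const; // the fallback "overload"
};

template<typename Fun, typename A, typename B>
struct can_be_called
{
    // Declared but never defined: they are used only inside sizeof.
    static funwrap2<Fun> &fun;
    static A &a;
    static B &b;

    // True when the fallback was NOT selected, i.e. Fun is callable.
    static bool const value = (
        sizeof(no_type) == sizeof(is_private_type( (fun(a, b), 0) ))
    );
};

// Hypothetical function objects used to exercise the metafunction.
struct takes_two_ints { void operator()(int, int) const; };
struct takes_nothing  { void operator()() const; };
struct takes_pointers { int operator()(char const *, char const *) const; };

static_assert(can_be_called<takes_two_ints, int, int>::value,
              "void operator()(int, int) accepts (int, int)");
static_assert(!can_be_called<takes_nothing, int, int>::value,
              "a nullary operator() does not accept (int, int)");
static_assert(!can_be_called<takes_pointers, int, int>::value,
              "an int lvalue does not convert to char const *");
```

Note that Fun must be a class type that can be derived from, since funwrap2<> inherits from it.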
That's how it works with binary functions. Now repeat the above process for functions up to some predefined function arity, and you're done.
I'd like to thank Joel de Guzman and Hartmut Kaiser for being willing to take a chance on using Proto for their work on Spirit-2 and Karma when Proto was little more than a vision. Their requirements and feedback have been indispensable.
Thanks also to the developers of PETE. I found many good ideas there.