The Spirit.Qi subrule is a component that creates a named parser which can then be referred to by name, much like rules and grammars. It is in effect a fully static version of the rule.
The strength of subrules is performance. Replacing some rules with subrules can make a parser slightly faster (see Performance below for measurements). The reason is that subrules allow aggressive inlining by the C++ compiler, whereas rules are implemented with a virtual function call which, depending on the compiler, can add run-time overhead and prevent inlining.
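The inlining argument can be illustrated without Spirit. The sketch below is an analogy under stated assumptions, not Spirit's implementation: `std::function` stands in for the indirect call inside a rule, and the hypothetical types `digit_parser` and `plus_parser` stand in for a statically composed subrule expression whose full type is visible to the compiler.

```cpp
#include <functional>
#include <string>

// Hypothetical parser concept: callable as bool(const char*& first, const char* last).
// On success the parser advances `first` past the consumed input.
struct digit_parser {
    bool operator()(const char*& first, const char* last) const {
        if (first != last && *first >= '0' && *first <= '9') {
            ++first;
            return true;
        }
        return false;
    }
};

// One-or-more combinator. The wrapped parser type is a template parameter,
// so the compiler sees the concrete type of `p` at every call site.
template <typename Parser>
struct plus_parser {
    Parser p;
    bool operator()(const char*& first, const char* last) const {
        if (!p(first, last)) return false;
        while (p(first, last)) {}
        return true;
    }
};

// "Rule-like" stand-in: the definition is hidden behind an indirect call.
using erased_parser = std::function<bool(const char*&, const char*)>;

// Statically composed: the optimizer can inline digit_parser into the loop.
inline bool all_digits_static(const std::string& s) {
    const char* first = s.data();
    const char* last = first + s.size();
    plus_parser<digit_parser> digits{digit_parser{}};
    return digits(first, last) && first == last;
}

// Type-erased: each step goes through std::function's indirect call,
// which the optimizer usually cannot see through.
inline bool all_digits_erased(const std::string& s) {
    const char* first = s.data();
    const char* last = first + s.size();
    plus_parser<erased_parser> digits{erased_parser(digit_parser{})};
    return digits(first, last) && first == last;
}
```

Both functions accept the same inputs; only the call mechanism differs. How much the difference matters in practice depends on the compiler, which is consistent with the varying measurements reported for subrules.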
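The weaknesses of subrules are:
- Subrules can only be defined and used within the same group, that is, within a single parser expression. A subrule cannot be defined in one place and used in another.
- Subrules put a heavy strain on the C++ compiler: they increase compile times and compiler memory usage, and raise the risk of hitting compiler limits.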
    entry = (
        expression =
            term
            >> *( ('+' >> term)
                | ('-' >> term)
                )
      , term =
            factor
            >> *( ('*' >> factor)
                | ('/' >> factor)
                )
      , factor =
            uint_
            | '(' >> expression >> ')'
            | ('-' >> factor)
            | ('+' >> factor)
    );
The example above can be found here: ../../example/qi/calc1_sr.cpp
As shown in this code snippet (an extract from the calc1_sr example), subrules can be freely mixed with rules and grammars. Here, a group of 3 subrules (expression, term, factor) is assigned to a rule (named entry).
This means that parts of a parser can use subrules (typically the innermost,
most performance-critical parts), whereas the rest can use rules and grammars.
    // forwards to <boost/spirit/repository/home/qi/nonterminal/subrule.hpp>
    #include <boost/spirit/repository/include/qi_subrule.hpp>
subrule<ID, A1, A2> sr(name);
Parameter | Description |
---|---|
ID | Required numeric argument. Gives the subrule a unique 'identification tag'. |
A1, A2 | Optional types, can be specified in any order. Each can be one of: 1. signature, 2. locals (see the rules reference for more information on those parameters). Note that the skipper type need not be specified in the parameters, unlike with grammars and rules. Subrules automatically use the skipper type that is in effect when they are invoked. |
name | Optional string. Gives the subrule a name, useful for debugging and error handling. |
Subrules are defined and used within groups, typically (and by convention) enclosed inside parentheses.
    // Group containing N subrules
    (
        sr1 = expr1
      , sr2 = expr2
      , ... // Any number of subrules
    )
The IDs of all subrules defined within the same group must be different. It is an error to define several subrules with the same ID (or to define the same subrule multiple times) in the same group.
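One way to see why duplicate IDs cannot work is to model the ID as part of a type. The sketch below is a simplified illustration and not Spirit's actual mechanism: the names `definition`, `group`, `make_group` and `lookup` are hypothetical, and a string stands in for the right-hand-side parser expression.

```cpp
#include <string>

// Hypothetical stand-in for one "sr = expr" definition.
// The ID is a template parameter, so each ID yields a distinct type.
template <int ID>
struct definition {
    std::string expr;  // placeholder for the right-hand-side parser
};

// Hypothetical group: inherits from every definition it contains.
// Two definitions with the same ID would be the same base class twice,
// which the compiler rejects.
template <typename... Defs>
struct group : Defs... {
    explicit group(Defs... defs) : Defs(defs)... {}
};

template <typename... Defs>
group<Defs...> make_group(Defs... defs) {
    return group<Defs...>(defs...);
}

// Compile-time lookup by ID: the group converts to exactly one
// definition<ID> base, so the result is unambiguous.
template <int ID>
const std::string& lookup(const definition<ID>& d) {
    return d.expr;
}
```

With this model, `make_group(definition<0>{"a"}, definition<0>{"b"})` fails to compile because `definition<0>` would appear twice as a direct base, which mirrors the rule that IDs within one group must be distinct.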
    // Auto-subrules and inherited attributes
    (
        srA %= exprA >> srB >> srC(c1, c2, ...) // Arguments to subrule srC
      , srB %= exprB
      , srC  = exprC
      , ...
    )(a1, a2, ...) // Arguments to group, i.e. to start subrule srA
Parameter | Description |
---|---|
sr1, sr2 | Subrules with different IDs. |
expr1, expr2 | Parser expressions. Can include sr1 and sr2. |
srA | Subrule with a synthesized attribute and inherited attributes. |
srB | Subrule with a synthesized attribute. |
srC | Subrule with inherited attributes. |
exprA, exprB, exprC | Parser expressions. |
a1, a2 | Arguments passed to the subrule group. They are passed as inherited attributes to the group's start subrule, srA. |
c1, c2 | Arguments passed as inherited attributes to subrule srC. |
A subrule group (a set of subrule definitions) is a parser, which can be used anywhere in a parser expression (in assignments to rules, as well as directly in arguments to functions such as parse).
In a group, parsing proceeds from the start subrule, which is the first (topmost) subrule defined in that group. In the two groups in the synopsis above, sr1 and srA are the respective start subrules: for example, when the first subrule group is invoked, the sr1 subrule is called.
A subrule can only be used in a group which defines it. Groups can be viewed as scopes: a definition of a subrule is limited to its enclosing group.
    rule<char const*> r1, r2, r3;
    subrule<1> sr1;
    subrule<2> sr2;

    r1 =
            ( sr1 = 'a' >> int_ )       // First group in r1.
        >>  ( sr2 = +sr1 )              // Second group in r1.
        //           ^^^
        // DOES NOT COMPILE: sr1 is not defined in this
        // second group, it cannot be used here (its
        // previous definition is out of scope).
    ;

    r2 =
            ( sr1 = 'a' >> int_ )       // Only group in r2.
        >>  sr1
        //  ^^^
        // DOES NOT COMPILE: not in a subrule group,
        // sr1 cannot be used here (here too, its
        // previous definition is out of scope).
    ;

    r3 =
            ( sr1 = 'x' >> double_ )    // Another group. The same subrule `sr1`
                                        // can have another, independent
                                        // definition in this group.
    ;
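A subrule has the same behavior as a rule with respect to attributes. In particular:
- The type of its synthesized attribute is the one specified in the subrule's signature, if any. Otherwise it is unused_type.
- An auto-subrule can be defined by assigning it with the %= syntax. In this case, the RHS parser's attribute is automatically propagated to the subrule's synthesized attribute.
- The Phoenix placeholders _val, _r1, _r2, ... are available to refer to the subrule's synthesized and inherited attributes, if present.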
A subrule has the same behavior as a rule with respect to locals. In particular, the Phoenix placeholders _a, _b, ... are available to refer to the subrule's locals, if present.
Some includes:
    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/repository/include/qi_subrule.hpp>
    #include <boost/spirit/include/phoenix_core.hpp>
    #include <boost/spirit/include/phoenix_operator.hpp>
Some namespace aliases:

    namespace qi = boost::spirit::qi;
    namespace repo = boost::spirit::repository;
    namespace ascii = boost::spirit::ascii;
A grammar containing only one rule, defined with a group of 5 subrules:
    template <typename Iterator>
    struct mini_xml_grammar
      : qi::grammar<Iterator, mini_xml(), ascii::space_type>
    {
        mini_xml_grammar()
          : mini_xml_grammar::base_type(entry)
        {
            using qi::lit;
            using qi::lexeme;
            using ascii::char_;
            using ascii::string;
            using namespace qi::labels;

            entry %= (
                xml %=
                        start_tag[_a = _1]
                    >>  *node
                    >>  end_tag(_a)
              , node %= xml | text
              , text %= lexeme[+(char_ - '<')]
              , start_tag %=
                        '<'
                    >>  !lit('/')
                    >>  lexeme[+(char_ - '>')]
                    >>  '>'
              , end_tag %=
                        "</"
                    >>  lit(_r1)
                    >>  '>'
            );
        }

        qi::rule<Iterator, mini_xml(), ascii::space_type> entry;

        repo::qi::subrule<0, mini_xml(), qi::locals<std::string> > xml;
        repo::qi::subrule<1, mini_xml_node()> node;
        repo::qi::subrule<2, std::string()> text;
        repo::qi::subrule<3, std::string()> start_tag;
        repo::qi::subrule<4, void(std::string)> end_tag;
    };
The definitions of the mini_xml and mini_xml_node data structures are not shown here. The full example above can be found here: ../../example/qi/mini_xml2_sr.cpp
This table compares run-time and compile-time performance when converting examples to subrules, with various compilers.
Table 2. Subrules performance
Example | Compiler | Speed (run-time) | Time (compile-time) | Memory (compile-time) |
---|---|---|---|---|
calc1_sr | gcc 4.4.1 | +6% | n/a | n/a |
calc1_sr | Visual C++ 2008 (VC9) | +5% | n/a | n/a |
mini_xml2_sr | gcc 3.4.6 | -1% | +54% | +32% |
mini_xml2_sr | gcc 4.1.2 | +5% | +58% | +25% |
mini_xml2_sr | gcc 4.4.1 | +8% | +20% | +14% |
mini_xml2_sr | Visual C++ 2005 (VC8) SP1 | +1% | +33% | +27% |
mini_xml2_sr | Visual C++ 2008 (VC9) | +9% | +52% | +40% |
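The columns are:
- Speed (run-time): change in parsing speed after converting the example to subrules (higher is better).
- Time (compile-time): change in compilation time (lower is better).
- Memory (compile-time): change in compiler memory usage (lower is better).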
Subrules push the C++ compiler hard. A group of subrules is a single C++ expression. Current C++ compilers cannot handle very complex expressions very well. One restricting factor is the typical compiler's limit on template recursion depth. Some, but not all, compilers allow this limit to be configured.
g++'s maximum can be set using a compiler flag: -ftemplate-depth. Set this appropriately if you use relatively complex subrules.
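The template recursion limit can be observed in isolation with a small sketch. The `nested` template below is a hypothetical stand-in for a deeply nested expression type and has nothing to do with Spirit itself: instantiating `nested<N>` forces the compiler to instantiate `nested<N-1>` and so on down to `nested<0>`.

```cpp
#include <cstddef>

// Hypothetical stand-in for a deeply nested expression type:
// each level requires the instantiation of the level below it.
template <std::size_t N>
struct nested {
    static constexpr std::size_t depth = 1 + nested<N - 1>::depth;
};

// Base case ends the recursion.
template <>
struct nested<0> {
    static constexpr std::size_t depth = 0;
};

// 64 levels are well within the default limits of current g++ and clang.
constexpr std::size_t demo_depth() {
    return nested<64>::depth;
}

static_assert(demo_depth() == 64, "64 nested instantiations");
```

Replacing 64 with a value above the compiler's configured limit makes the build fail with a template instantiation depth error. With g++ the limit can then be raised on the command line, for example `g++ -ftemplate-depth=4096 file.cpp`, where the value 4096 and the file name are arbitrary choices for illustration.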