Home | Libraries | People | FAQ | More |
It's a common question in the Spirit Mailing List: How do I parse and place the results into a C++ struct? Of course, at this point, you already know various ways to do it, using semantic actions. There are many ways to skin a cat. Spirit X3, being fully attributed, makes it even easier. The next example demonstrates some features of Spirit X3 that make this easy. In the process, you'll learn about:
First, let's create a struct representing an employee:
namespace client { namespace ast { struct employee { int age; std::string surname; std::string forename; double salary; }; }}
Then, we need to tell Boost.Fusion about our employee struct to make it a first-class fusion citizen that the grammar can utilize. If you don't know fusion yet, it is a Boost library for working with heterogeneous collections of data, commonly referred to as tuples. Spirit uses fusion extensively as part of its infrastructure.
In fusion's view, a struct is just a form of a tuple. You can adapt any struct to be a fully conforming fusion tuple:
BOOST_FUSION_ADAPT_STRUCT( client::ast::employee, age, surname, forename, salary )
Now we'll write a parser for our employee. Inputs will be of the form:
employee{ age, "surname", "forename", salary }
Here goes:
namespace parser { namespace x3 = boost::spirit::x3; namespace ascii = boost::spirit::x3::ascii; using x3::int_; using x3::lit; using x3::double_; using x3::lexeme; using ascii::char_; x3::rule<class employee, ast::employee> const employee = "employee"; auto const quoted_string = lexeme['"' >> +(char_ - '"') >> '"']; auto const employee_def = lit("employee") >> '{' >> int_ >> ',' >> quoted_string >> ',' >> quoted_string >> ',' >> double_ >> '}' ; BOOST_SPIRIT_DEFINE(employee); }
The full cpp file for this example can be found here: ../../../example/x3/employee.cpp
Let's walk through this one step at a time (not necessarily from top to bottom).
x3::rule<class employee, ast::employee> employee("employee");
lexeme['"' >> +(char_ - '"') >> '"'];
lexeme
inhibits space skipping
from the open brace to the closing brace. The expression parses quoted strings.
+(char_ - '"')
parses one or more chars, except the double quote. It stops when it sees a double quote.
The expression:
a - b
parses a
but not b
. Its attribute is just A
; the attribute of a
.
b
's attribute is ignored.
Hence, the attribute of:
char_ - '"'
is just char
.
+a
is similar to Kleene star. Rather than match everything, +a
matches one or more. Like it's related
function, the Kleene star, its attribute is a std::vector<A>
where A
is the attribute
of a
. So, putting all these
together, the attribute of
+(char_ - '"')
is then:
std::vector<char>
Now what's the attribute of
'"' >> +(char_ - '"') >> '"'
?
Well, typically, the attribute of:
a >> b >> c
is:
fusion::vector<A, B, C>
where A
is the attribute
of a
, B
is the attribute of b
and
C
is the attribute of c
. What is fusion::vector
?
- a tuple.
Note | |
---|---|
If you don't know what I am talking about, see: Fusion Vector. It might be a good idea to have a look into Boost.Fusion at this point. You'll definitely see more of it in the coming pages. |
Some parsers, especially those very little literal parsers you see, like
'"'
, do not have attributes.
Nodes without attributes are disregarded. In a sequence, like above, all
nodes with no attributes are filtered out of the fusion::vector
.
So, since '"'
has no attribute,
and +(char_
- '"')
has a std::vector<char>
attribute, the whole expression's attribute should have been:
fusion::vector<std::vector<char> >
But wait, there's one more collapsing rule: If the attribute is followed
by a single element fusion::vector
,
The element is stripped naked from its container. To make a long story short,
the attribute of the expression:
'"' >> +(char_ - '"') >> '"'
is:
std::vector<char>
employee = lit("employee") >> '{' >> int_ >> ',' >> quoted_string >> ',' >> quoted_string >> ',' >> double_ >> '}' ; BOOST_SPIRIT_DEFINE(employee);
Applying our collapsing rules above, the RHS has an attribute of:
fusion::vector<int, std::string, std::string, double>
These nodes do not have an attribute:
lit("employee")
'{'
','
'}'
Note | |
---|---|
In case you are wondering, |
Recall that the attribute of parser::employee
is the ast::employee
struct.
Now everything is clear, right? The struct
employee
IS
compatible with fusion::vector<int, std::string, std::string, double>
.
So, the RHS of start
uses
start's attribute (a struct employee
) in-situ when it does its work.