Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

libs/spirit/doc/x3/tutorial/employee.qbk

[/==============================================================================
    Copyright (C) 2001-2015 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section:employee Employee - Parsing into structs]

It's a common question in the __spirit_list__: How do I parse and place
the results into a C++ struct? Of course, at this point, you already
know various ways to do it, using semantic actions. There are many ways
to skin a cat. Spirit X3, being fully attributed, makes it even easier.
The next example demonstrates some features of Spirit X3 that make this
easy. In the process, you'll learn about:

* More about attributes
* Auto rules
* Some more built-in parsers
* Directives

First, let's create a struct representing an employee:

    namespace client { namespace ast
    {
        struct employee
        {
            int age;
            std::string forename;
            std::string surname;
            double salary;
        };
    }}

Then, we need to tell __fusion__ about our employee struct to make it a first-class
fusion citizen that the grammar can utilize. If you don't know fusion yet,
it is a __boost__ library for working with heterogeneous collections of data,
commonly referred to as tuples. Spirit uses fusion extensively as part of its
infrastructure.

In fusion's view, a struct is just a form of a tuple. You can adapt any struct
to be a fully conforming fusion tuple:

    BOOST_FUSION_ADAPT_STRUCT(
        client::ast::employee,
        age, forename, surname, salary
    )

Now we'll write a parser for our employee. Inputs will be of the form:

    employee{ age, "forename", "surname", salary }

[#__tutorial_employee_parser__]
Here goes:

    namespace parser
    {
        namespace x3 = boost::spirit::x3;
        namespace ascii = boost::spirit::x3::ascii;

        using x3::int_;
        using x3::lit;
        using x3::double_;
        using x3::lexeme;
        using ascii::char_;

        x3::rule<class employee, ast::employee> const employee = "employee";

        auto const quoted_string = lexeme['"' >> +(char_ - '"') >> '"'];

        auto const employee_def =
            lit("employee")
            >> '{'
            >>  int_ >> ','
            >>  quoted_string >> ','
            >>  quoted_string >> ','
            >>  double_
            >>  '}'
            ;

        BOOST_SPIRIT_DEFINE(employee);
    }

The full cpp file for this example can be found here:
[@../../../example/x3/employee.cpp employee.cpp]

Let's walk through this one step at a time (not necessarily from top to bottom).

[heading Rule Declaration]

We are assuming that you already know about rules. We introduced rules in the
previous [tutorial_roman Roman Numerals example]. Please go back and review
the previous tutorial if you have to.

    x3::rule<class employee, ast::employee> employee = "employee";

[heading Lexeme]

    lexeme['"' >> +(char_ - '"') >> '"'];

`lexeme` inhibits space skipping from the open brace to the closing brace.
The expression parses quoted strings.

    +(char_ - '"')

parses one or more chars, except the double quote. It stops when it sees
a double quote.

[heading Difference]

The expression:

    a - b

parses `a` but not `b`. Its attribute is just `A`; the attribute of `a`. `b`'s
attribute is ignored. Hence, the attribute of:

    char_ - '"'

is just `char`.

[heading Plus]

    +a

is similar to Kleene star. Rather than match everything, `+a` matches one or more.
Like it's related function, the Kleene star, its attribute is a `std::vector<A>`
where `A` is the attribute of `a`. So, putting all these together, the attribute
of

    +(char_ - '"')

is then:

    std::vector<char>

[heading Sequence Attribute]

Now what's the attribute of

    '"' >> +(char_ - '"') >> '"'

?

Well, typically, the attribute of:

    a >> b >> c

is:

    fusion::vector<A, B, C>

where `A` is the attribute of `a`, `B` is the attribute of `b` and `C` is the
attribute of `c`. What is `fusion::vector`? - a tuple.

[note If you don't know what I am talking about, see: [@http://tinyurl.com/6xun4j
Fusion Vector]. It might be a good idea to have a look into __fusion__ at this
point. You'll definitely see more of it in the coming pages.]

[heading Attribute Collapsing]

Some parsers, especially those very little literal parsers you see, like `'"'`,
do not have attributes.

Nodes without attributes are disregarded. In a sequence, like above, all nodes
with no attributes are filtered out of the `fusion::vector`. So, since `'"'` has
no attribute, and `+(char_ - '"')` has a `std::vector<char>` attribute, the
whole expression's attribute should have been:

    fusion::vector<std::vector<char> >

But wait, there's one more collapsing rule: If the attribute is followed by a
single element `fusion::vector`, The element is stripped naked from its container.
To make a long story short, the attribute of the expression:

    '"' >> +(char_ - '"') >> '"'

is:

    std::vector<char>

[heading Rule Definition]

Again, we are assuming that you already know about rules and rule
definitions. We introduced rules in the previous [tutorial_roman Roman
Numerals example]. Please go back and review the previous tutorial if you
have to.

    employee =
        lit("employee")
        >> '{'
        >>  int_ >> ','
        >>  quoted_string >> ','
        >>  quoted_string >> ','
        >>  double_
        >>  '}'
        ;

    BOOST_SPIRIT_DEFINE(employee);

Applying our collapsing rules above, the RHS has an attribute of:

    fusion::vector<int, std::string, std::string, double>

These nodes do not have an attribute:

* `lit("employee")`
* `'{'`
* `','`
* `'}'`

[note In case you are wondering, `lit("employee")` is the same as "employee". We
had to wrap it inside `lit` because immediately after it is `>> '{'`. You can't
right-shift a `char[]` and a `char` - you know, C++ syntax rules.]

Recall that the attribute of `parser::employee` is the `ast::employee` struct.

Now everything is clear, right? The `struct employee` *IS* compatible with
`fusion::vector<int, std::string, std::string, double>`. So, the RHS of `start`
uses start's attribute (a `struct employee`) in-situ when it does its work.

[endsect]