Samples

The Wave library contains several samples illustrating how to use its different features. This section describes these samples and their main characteristics.

The quick_start sample

The quick_start sample shows a minimal way to use the Wave preprocessor library. It opens the file given as the first command line argument, preprocesses it assuming that no additional include paths or macros are defined, and outputs the textual representation of the tokens generated from the given input file. This sample is a good introduction to Wave because it avoids the additional complexity of the more elaborate samples.

The lexed_tokens sample

The lexed_tokens sample shows a minimal way to use the C++ lexing component of Wave without using the preprocessor. It opens the file specified as the first command line argument and prints out the contents of the tokens returned from the lexer.

The cpp_tokens sample

The cpp_tokens sample dumps out the information contained within the tokens returned from the iterator supplied by the Wave library. It shows how to use the Wave library in conjunction with a custom lexer and custom token types. The lexer used within this sample is SLex [5] based, i.e. it is fed at runtime (at startup) with the token definitions (regular expressions) and generates a resulting DFA table. This table is used for token identification and is saved to disk afterwards to avoid the table generation process at the next program startup. The name of the file to which the DFA table is saved is wave_slex_lexer.dfa.
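The build-once, save, and reuse behaviour described above can be pictured with a minimal sketch. It does not use SLex or Wave: `Table`, `load_table`, `save_table`, `load_or_build` and the plain-text file format are hypothetical names and choices made only for illustration, and the real layout of wave_slex_lexer.dfa is not shown here.

```cpp
#include <cstddef>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a DFA table; the real SLex layout differs.
using Table = std::vector<int>;

// Read a table written by save_table(). Returns false if the file is
// missing or incomplete, leaving `table` untouched.
bool load_table(const std::string& path, Table& table) {
    std::ifstream in(path);
    std::size_t count = 0;
    if (!(in >> count)) return false;
    Table loaded;
    for (std::size_t i = 0; i < count; ++i) {
        int value = 0;
        if (!(in >> value)) return false;
        loaded.push_back(value);
    }
    table.swap(loaded);
    return true;
}

// Write the entry count followed by the entries as plain text.
bool save_table(const std::string& path, const Table& table) {
    std::ofstream out(path);
    out << table.size() << '\n';
    for (int value : table) out << value << ' ';
    out.flush();
    return static_cast<bool>(out);
}

// Reuse the saved table when possible; otherwise build it once and save
// it so that the next start-up can skip the build step.
Table load_or_build(const std::string& path,
                    const std::function<Table()>& build) {
    Table table;
    if (load_table(path, table)) return table;
    table = build();
    save_table(path, table);
    return table;
}
```

Calling `load_or_build` twice with the same path runs the `build` callback only on the first call, provided the file could be written.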

The main advantage of this SLex based lexer compared to the default Re2C [3] generated lexer is that it provides not only the line where a particular token was recognized, but also the related column position. Otherwise the SLex based lexer is functionally fully compatible with the Re2C based one, i.e. you may always switch your application to use it if you additionally need to get the column information back from the preprocessing.
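To illustrate what reporting a column in addition to the line involves, here is a small sketch that splits its input into whitespace-separated words and records where each one starts. It is not the SLex or Re2C lexer; `PositionedToken` and `scan_with_positions` are hypothetical names, and counting a tab as a single column is a simplifying assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// A word together with the 1-based line and column where it starts.
struct PositionedToken {
    std::string text;
    int line;
    int column;
};

std::vector<PositionedToken> scan_with_positions(const std::string& input) {
    std::vector<PositionedToken> tokens;
    int line = 1;
    int column = 1;
    std::size_t i = 0;
    while (i < input.size()) {
        const unsigned char c = static_cast<unsigned char>(input[i]);
        if (c == '\n') {
            // A newline starts the next line and resets the column.
            ++line;
            column = 1;
            ++i;
        } else if (std::isspace(c)) {
            // Other whitespace only advances the column.
            ++column;
            ++i;
        } else {
            // Remember where the word starts, then consume it.
            const std::size_t start = i;
            const int start_column = column;
            while (i < input.size() &&
                   !std::isspace(static_cast<unsigned char>(input[i]))) {
                ++i;
                ++column;
            }
            tokens.push_back({input.substr(start, i - start), line, start_column});
        }
    }
    return tokens;
}
```

For an input whose second line is `  y = 1;`, the word `y` is reported at line 2, column 3. A lexer that tracks only lines would have to drop the `column` bookkeeping above.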

Since this sample supports no additional command line parameters, it won't work well with include files that aren't located in the same directory as the inspected input file. The command line syntax is straightforward:

    cpp_tokens input_file

The list_includes sample

The list_includes sample shows how the Wave library may be used to generate an include file dependency list for a particular input file. It relies entirely on the default library configuration. The command line syntax for this sample is given below:

    Usage: list_includes [options] file ...:
        -h [ --help ]        : print out program usage (this message)
        -v [ --version ]     : print the version number
        -I [ --path ] dir    : specify additional include directory
        -S [ --syspath ] dir : specify additional system include directory

Please note that this sample outputs only those include file names that are visible to the preprocessor. For example, given the following code snippet, only one of the two include directives is triggered during preprocessing, so only the corresponding file name is reported by the list_includes sample:

    #if defined(INCLUDE_FILE_A)
    #  include "file_a.h" 
    #else
    #  include "file_b.h"
    #endif
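The effect shown in the snippet above can be modelled with a toy scanner that reports only the include directives found in active conditional branches. This is a sketch under strong assumptions, not how list_includes is implemented: it does not use Wave, it understands only `#if defined(NAME)`, `#else`, `#endif` and quoted includes, and `visible_includes` is a hypothetical name.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Toy model: collect the quoted #include names that sit in active branches.
std::vector<std::string> visible_includes(
        const std::string& source,
        const std::set<std::string>& defined_macros) {
    struct Frame { bool parent_active; bool condition; };
    const std::size_t npos = std::string::npos;
    std::vector<Frame> stack;
    std::vector<std::string> result;
    bool active = true;

    std::istringstream lines(source);
    std::string line;
    while (std::getline(lines, line)) {
        // Only lines whose first non-blank character is '#' matter here.
        const std::size_t hash = line.find_first_not_of(" \t");
        if (hash == npos || line[hash] != '#') continue;
        const std::size_t begin = line.find_first_not_of(" \t", hash + 1);
        if (begin == npos) continue;
        const std::size_t end = line.find_first_of(" \t(\"<", begin);
        const std::string directive =
            line.substr(begin, end == npos ? npos : end - begin);
        const std::string rest = end == npos ? std::string() : line.substr(end);

        if (directive == "if") {
            // Evaluate `defined(NAME)` against the given macro set.
            const std::size_t open = rest.find("defined(");
            const std::size_t close = rest.find(')');
            bool condition = false;
            if (open != npos && close != npos && close > open + 8)
                condition = defined_macros.count(
                    rest.substr(open + 8, close - open - 8)) > 0;
            stack.push_back({active, condition});
            active = active && condition;
        } else if (directive == "else" && !stack.empty()) {
            stack.back().condition = !stack.back().condition;
            active = stack.back().parent_active && stack.back().condition;
        } else if (directive == "endif" && !stack.empty()) {
            active = stack.back().parent_active;
            stack.pop_back();
        } else if (directive == "include" && active) {
            const std::size_t q1 = rest.find('"');
            const std::size_t q2 = q1 == npos ? npos : rest.find('"', q1 + 1);
            if (q2 != npos) result.push_back(rest.substr(q1 + 1, q2 - q1 - 1));
        }
    }
    return result;
}
```

Applied to the snippet above, the function returns only file_a.h when INCLUDE_FILE_A is in `defined_macros`, and only file_b.h otherwise.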

The advanced_hooks sample

The advanced_hooks sample is based on the quick_start sample mentioned above. It shows how to use the advanced preprocessing hooks of the Wave library so that the output contains not only the preprocessed tokens from the evaluated conditional blocks, but also the tokens recognized inside the non-evaluated conditional blocks. To keep the generated token stream useful for further processing, the tokens from the non-evaluated conditional blocks are commented out.

Here is a small example of what the advanced_hooks sample does. Consider the following input:

    #define TEST 1
    #if defined(TEST)
    "TEST was defined: " TEST
    #else
    "TEST was not defined!"
    #endif

which produces the following output:

    //"#if defined(TEST)
    "TEST was defined: " 1
    //"#else
    //"TEST was not defined!"
    //"#endif

As you can see, the sample application also prints out the conditional directives in commented-out form.
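The commenting behaviour can be approximated line by line without Wave. The following sketch is a toy model, not the sample's hook-based implementation: it handles only `#if defined(NAME)`, `#else` and `#endif` at the start of a line, performs no macro expansion, and uses `// ` as its comment prefix, so its output differs in detail from the output shown above. The name `comment_out_skipped` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Toy model: keep lines from taken branches, comment out everything else,
// including the conditional directives themselves.
std::string comment_out_skipped(const std::string& source,
                                const std::set<std::string>& defined_macros) {
    std::vector<bool> taken;  // one entry per open #if: is its branch taken?
    std::ostringstream out;
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, 12, "#if defined(") == 0) {
            const std::size_t close = line.find(')', 12);
            const std::string name = line.substr(
                12, close == std::string::npos ? std::string::npos : close - 12);
            taken.push_back(defined_macros.count(name) > 0);
            out << "// " << line << '\n';
        } else if (line == "#else" && !taken.empty()) {
            taken.back() = !taken.back();
            out << "// " << line << '\n';
        } else if (line == "#endif" && !taken.empty()) {
            taken.pop_back();
            out << "// " << line << '\n';
        } else {
            // A line is active only if every enclosing branch is taken.
            const bool active =
                std::find(taken.begin(), taken.end(), false) == taken.end();
            out << (active ? "" : "// ") << line << '\n';
        }
    }
    return out.str();
}
```

In the real sample this decision is made per token by the preprocessing hooks rather than per line, which is why it can also emit macro-expanded text for the evaluated branch.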

The wave sample

Because of its general usefulness, the wave sample is not located in the sample directory of the library, but inside the tools directory of Boost. The wave sample is usable as a full-fledged preprocessor executable on top of any other C++ compiler. It outputs the textual representation of the preprocessed tokens generated from a given input file. It is described in more detail here.

The waveidl sample

The main point of the waveidl sample is to show how a completely independent lexer type may be used in conjunction with the default token type of the Wave library. The lexer used in this sample is intended for a preprocessor for IDL languages. It is also based on the Re2C tool, but recognizes a different set of tokens than the default C++ lexer contained within the Wave library. In particular, this lexer does not recognize any keywords (except true and false, which are needed by the preprocessor itself). This is needed because there are different IDL languages, where identifiers of one language may be keywords of others. This means that keyword identification has to be postponed until after preprocessing, but it allows Wave to be used for all of the IDL derivatives.
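Postponing keyword identification amounts to a separate pass over the identifiers once preprocessing is done. The sketch below shows one possible shape of such a pass. It is not part of the waveidl sample: `TokenKind`, `ClassifiedToken` and `classify_identifiers` are hypothetical names, and the caller supplies whatever keyword set the IDL dialect in question uses.

```cpp
#include <set>
#include <string>
#include <vector>

enum class TokenKind { Identifier, Keyword };

struct ClassifiedToken {
    std::string text;
    TokenKind kind;
};

// Post-processing pass: the lexer reported every word as an identifier,
// so keywords are picked out afterwards using the keyword set of
// whichever IDL dialect is being processed.
std::vector<ClassifiedToken> classify_identifiers(
        const std::vector<std::string>& identifiers,
        const std::set<std::string>& dialect_keywords) {
    std::vector<ClassifiedToken> result;
    for (const std::string& word : identifiers) {
        const TokenKind kind = dialect_keywords.count(word) > 0
                                   ? TokenKind::Keyword
                                   : TokenKind::Identifier;
        result.push_back({word, kind});
    }
    return result;
}
```

The same identifier stream can be classified differently by passing a different keyword set, which is the property that lets a single keyword-free lexer serve several IDL derivatives.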

Using the Wave library to write an IDL preprocessor is possible only because the token sets of both languages are very similar. The set of tokens recognized by the waveidl IDL language preprocessor is nearly a complete subset of the full C++ token set.

The command line syntax usable for this sample is shown below:

    Usage: waveidl [options] [@config-file(s)] file:

    Options allowed on the command line only:
      -h [ --help ]           : print out program usage (this message)
      -v [ --version ]        : print the version number
      -c [ --copyright ]      : print out the copyright statement
      --config-file filepath  : specify a config file (alternatively: @filepath)

    Options allowed additionally in a config file:
      -o [ --output ] path    : specify a file to use for output instead of stdout
      -I [ --include ] path   : specify an additional include directory
      -S [ --sysinclude ] syspath : specify an additional system include directory
      -D [ --define ] macro[=[value]] : specify a macro to define
      -P [ --predefine ] macro[=[value]] : specify a macro to predefine
      -U [ --undefine ] macro : specify a macro to undefine

The hannibal sample

The hannibal sample shows how to base a Spirit grammar on the Wave library. It was initially written and contributed to the Wave library by Danny Havenith (see his related web page here). The grammar of this example uses Wave as its preprocessor. It implements around 120 of the approximately 250 grammar rules found in The C++ Programming Language, Third Edition. These 120 rules allow a C++ source file to be parsed for all type information and declarations. In practice, the grammar parses C++ declarations, including class and template definitions, but skips function bodies. If so configured, the program outputs an XML dump of the generated parse tree.

It may be a good starting point for a grammar used for tasks such as reverse engineering, as some UML modelling tools do, or for any other purpose that needs a list of all templates and classes in a file together with their members.