Suppose you want to write a filter to remove shell-style comments. The basic algorithm is as follows: you examine characters one at a time, forwarding them unchanged, until you encounter a comment character, typically '#'. When you find a comment character, you examine and ignore characters until you encounter a newline character, at which point the algorithm begins again. Note that this algorithm consists of two subalgorithms: one for reading ordinary text, and one for reading comments.
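Before bringing in any library machinery, the two subalgorithms can be sketched as a small state machine over an in-memory string. This is only an illustration: the function strip_shell_comments and the enumeration state are hypothetical names that are not part of Boost.Iostreams, and the sketch assumes that the newline ending a comment is itself forwarded, as in the filters shown later.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Boost.Iostreams: removes shell-style
// comments from a string held in memory.
std::string strip_shell_comments(const std::string& text,
                                 char comment_char = '#')
{
    enum class state { in_text, in_comment };  // the two subalgorithms
    state s = state::in_text;
    std::string out;
    for (char c : text) {
        if (s == state::in_text) {
            if (c == comment_char)
                s = state::in_comment;  // start ignoring characters
            else
                out.push_back(c);       // forward ordinary text unchanged
        } else if (c == '\n') {
            s = state::in_text;         // the newline ends the comment...
            out.push_back(c);           // ...and is itself forwarded
        }
    }
    return out;
}

// Usage, written as a check of the expected behaviour.
inline void strip_shell_comments_demo()
{
    assert(strip_shell_comments("echo hi # greet\nls\n") == "echo hi \nls\n");
}
```

The explicit state variable plays the same role as the skip flag used by the filters in the following sections.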
In the next three sections, I'll express this algorithm as a stdio_filter, an InputFilter and an OutputFilter. The source code can be found in the header <libs/iostreams/example/shell_comments_filter.hpp>. These examples were inspired by James Kanze's UncommentExtractor.hh (see [Kanze]).
shell_comments_stdio_filter
You can express a shell comments Filter as a stdio_filter as follows:
#include <cstdio>    // EOF
#include <iostream>  // cin, cout
#include <boost/iostreams/filter/stdio.hpp>

namespace boost { namespace iostreams { namespace example {

class shell_comments_stdio_filter : public stdio_filter {
public:
    explicit shell_comments_stdio_filter(char comment_char = '#')
        : comment_char_(comment_char)
        { }
private:
    void do_filter()
    {
        bool skip = false;  // true while inside a comment
        int  c;
        while ((c = std::cin.get()) != EOF) {
            skip = c == comment_char_ ?
                true :
                c == '\n' ?
                    false :
                    skip;
            if (!skip)
                std::cout.put(c);
        }
    }
    char comment_char_;
};

} } } // End namespace boost::iostreams::example
The implementation of the virtual function do_filter is straightforward: the local variable skip keeps track of whether you are currently processing a comment; the while loop reads a character c from std::cin, updates skip and writes c to std::cout unless skip is true.
Filters which derive from stdio_filter are DualUseFilters, which means they can be used either for output or for input, but not both simultaneously. Therefore shell_comments_stdio_filter can be used in place of shell_comments_input_filter and shell_comments_output_filter, below.
shell_comments_input_filter
Next you will express a shell comments Filter as an InputFilter. A typical narrow-character InputFilter looks like this:
#include <boost/iostreams/categories.hpp>   // input_filter_tag
#include <boost/iostreams/char_traits.hpp>  // EOF, WOULD_BLOCK
#include <boost/iostreams/operations.hpp>   // get, read, putback

namespace io = boost::iostreams;

class my_input_filter {
public:
    typedef char                  char_type;
    typedef io::input_filter_tag  category;

    template<typename Source>
    int get(Source& src)
    {
        // Attempt to produce one character of filtered
        // data, reading from src as necessary. If successful,
        // return the character; otherwise return EOF to
        // indicate end-of-stream, or WOULD_BLOCK to indicate
        // that input is temporarily unavailable.
    }

    /* Other members */
};
The function get attempts to produce a single character of filtered output. It accesses the unfiltered character sequence through the provided Source src, using the fundamental i/o operations get, read and putback. If a character is produced, get returns it. Otherwise get returns one of the status codes EOF or WOULD_BLOCK. EOF, which indicates end-of-stream, is a macro defined in the standard header <cstdio>. WOULD_BLOCK, which indicates that input is temporarily unavailable, is a constant defined in the namespace boost::iostreams, in the header <boost/iostreams/char_traits.hpp>.
You could also write the above example as follows:
#include <boost/iostreams/concepts.hpp>  // input_filter

class my_input_filter : public boost::iostreams::input_filter {
public:
    template<typename Source>
    int get(Source& src);

    /* Other members */
};
Here input_filter is a convenience base class which provides the member types char_type and category, as well as no-op implementations of member functions close and imbue. I will discuss close shortly.
You're now ready to express a shell comments Filter as an InputFilter:
#include <boost/iostreams/char_traits.hpp> // EOF, WOULD_BLOCK
#include <boost/iostreams/concepts.hpp>    // input_filter
#include <boost/iostreams/operations.hpp>  // get

namespace boost { namespace iostreams { namespace example {

class shell_comments_input_filter : public input_filter {
public:
    explicit shell_comments_input_filter(char comment_char = '#')
        : comment_char_(comment_char), skip_(false)
        { }

    template<typename Source>
    int get(Source& src)
    {
        int c;
        while (true) {
            if ((c = boost::iostreams::get(src)) == EOF || c == WOULD_BLOCK)
                break;
            skip_ = c == comment_char_ ?
                true :
                c == '\n' ?
                    false :
                    skip_;
            if (!skip_)
                break;
        }
        return c;
    }

    template<typename Source>
    void close(Source&) { skip_ = false; }
private:
    char comment_char_;
    bool skip_;
};

} } } // End namespace boost::iostreams::example
Here the member variable skip_ plays the same role as the local variable skip in shell_comments_stdio_filter::do_filter. The implementation of get is very similar to that of shell_comments_stdio_filter::do_filter: the while loop reads a character c, updates skip_ and returns c unless skip_ is true. The main difference is that you have to handle the special value WOULD_BLOCK, which indicates that no input is currently available.
So you see that implementing an InputFilter from scratch is a bit more involved than deriving from stdio_filter. When writing an InputFilter you must be prepared to be interrupted at any point in the middle of the algorithm; when this happens, you must record enough information about the current state of the algorithm to allow you to pick up later exactly where you left off. The same is true for OutputFilters. In fact, many InputFilters and OutputFilters can be seen as finite state machines; I will formalize this idea later. See Finite State Filters.
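To see why the state has to live in a member variable, it may help to run the same logic against a source that really does interrupt the filter. The following is a standalone sketch that uses only the standard library: chunked_source, comment_skipper, drain, kEof and kWouldBlock are all hypothetical stand-ins invented for this illustration, and the numeric values of the two status codes are assumptions rather than the library's actual definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-ins for the library's status codes; values assumed for this sketch.
const int kEof = -1;
const int kWouldBlock = -2;

// Hypothetical source that delivers its data in chunks and reports
// kWouldBlock once at every boundary between two chunks.
class chunked_source {
public:
    explicit chunked_source(std::vector<std::string> chunks)
        : chunks_(std::move(chunks)) { }
    int get()
    {
        if (chunk_ == chunks_.size())
            return kEof;
        if (pos_ == chunks_[chunk_].size()) {  // current chunk exhausted
            ++chunk_;
            pos_ = 0;
            return chunk_ == chunks_.size() ? kEof : kWouldBlock;
        }
        return static_cast<unsigned char>(chunks_[chunk_][pos_++]);
    }
private:
    std::vector<std::string> chunks_;
    std::size_t chunk_ = 0;
    std::size_t pos_ = 0;
};

// Same logic as shell_comments_input_filter::get, without the library.
class comment_skipper {
public:
    template<typename Source>
    int get(Source& src)
    {
        while (true) {
            int c = src.get();
            if (c == kEof || c == kWouldBlock)
                return c;  // skip_ keeps its value until the next call
            skip_ = c == '#' ? true : c == '\n' ? false : skip_;
            if (!skip_)
                return c;
        }
    }
private:
    bool skip_ = false;
};

// Drains the filter, retrying whenever input is temporarily unavailable.
std::string drain(comment_skipper& f, chunked_source& src)
{
    std::string out;
    int c;
    while ((c = f.get(src)) != kEof)
        if (c != kWouldBlock)
            out.push_back(static_cast<char>(c));
    return out;
}

// The comment "#note" straddles the chunk boundary and is still removed.
inline void drain_demo()
{
    std::vector<std::string> chunks{"a #no", "te\nb"};
    chunked_source src(chunks);
    comment_skipper f;
    assert(drain(f, src) == "a \nb");
}
```

If skip_ were a local variable of get, the second half of the comment would be forwarded after the interruption, because the filter would have forgotten that it was inside a comment.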
One feature of shell_comments_input_filter still needs explaining: the member function close. Without it, instances could be used only once, because someone might close a stream while the skip_ flag is set. If the stream were later reopened, with a fresh sequence of unfiltered data, the first line of text would be filtered out, regardless of whether it was commented.
The way to fix this is to make your Filter Closable. To do this, you must implement a member function close. You must also give your filter a category tag convertible to closable_tag, to tell the Iostreams library that your filter implements close.
With close in place, the Filter's interface looks like this:
namespace boost { namespace iostreams { namespace example {

class shell_comments_input_filter : public input_filter {
public:
    shell_comments_input_filter();

    template<typename Source>
    int get(Source& src);

    template<typename Source>
    void close(Source&) { skip_ = false; }
private:
    bool skip_;
};

} } } // End namespace boost::iostreams::example
Here I've derived from the helper class input_filter, which provides a member type char_type equal to char and a category tag convertible to input_filter_tag and to closable_tag. The implementation of close simply clears the skip_ flag so that the Filter will be ready to be used again.
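The effect of a stale skip_ flag is easy to reproduce outside the library. The sketch below assumes a simplified, hypothetical class comment_stripper that filters whole strings instead of reading from a Source, but keeps the same flag and the same close logic; it is not the library's Closable mechanism itself.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for the filter above: it processes
// one "stream" per call to filter() and keeps skip_ between calls.
class comment_stripper {
public:
    std::string filter(const std::string& text)
    {
        std::string out;
        for (char c : text) {
            skip_ = c == '#' ? true : c == '\n' ? false : skip_;
            if (!skip_)
                out.push_back(c);
        }
        return out;
    }
    void close() { skip_ = false; }  // same reset as in the listing above
private:
    bool skip_ = false;
};

inline void reuse_demo()
{
    // The first stream ends in the middle of a comment.
    comment_stripper stale;
    stale.filter("x # stream ends inside a comment");
    // Without close(), the flag is still set, so the first line of the
    // next stream is dropped even though it contains no comment.
    assert(stale.filter("first\nsecond\n") == "\nsecond\n");

    // With close() between the two streams, the second one is untouched.
    comment_stripper fresh;
    fresh.filter("x # stream ends inside a comment");
    fresh.close();
    assert(fresh.filter("first\nsecond\n") == "first\nsecond\n");
}
```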
shell_comments_output_filter
Next, let's express a shell comments Filter as an OutputFilter. A typical narrow-character OutputFilter looks like this:
#include <boost/iostreams/categories.hpp>  // output_filter_tag
#include <boost/iostreams/operations.hpp>  // put, write

namespace io = boost::iostreams;

class my_output_filter {
public:
    typedef char                   char_type;
    typedef io::output_filter_tag  category;

    template<typename Sink>
    bool put(Sink& dest, int c)
    {
        // Attempt to consume the given character of unfiltered
        // data, writing filtered data to dest as appropriate.
        // Return true if the character was successfully consumed.
    }

    /* Other members */
};
The function put attempts to filter the single character c, writing filtered output to the Sink dest. It accesses dest using the fundamental i/o operations put and write. Both of these functions may fail: iostreams::put can return false, and iostreams::write can consume fewer characters than requested. If this occurs, the member function put is allowed to return false, indicating that c could not be consumed. Otherwise, it must consume c and return true.
You could also write the above example as follows:
#include <boost/iostreams/concepts.hpp>  // output_filter

class my_output_filter : public boost::iostreams::output_filter {
public:
    template<typename Sink>
    bool put(Sink& dest, int c);

    /* Other members */
};
Here output_filter is a convenience base class which provides the member types char_type and category, as well as no-op implementations of member functions close and imbue.
You're now ready to express a shell comments Filter as an OutputFilter:
#include <boost/iostreams/concepts.hpp>    // output_filter
#include <boost/iostreams/operations.hpp>  // put

namespace boost { namespace iostreams { namespace example {

class shell_comments_output_filter : public output_filter {
public:
    explicit shell_comments_output_filter(char comment_char = '#')
        : comment_char_(comment_char), skip_(false)
        { }

    template<typename Sink>
    bool put(Sink& dest, int c)
    {
        skip_ = c == comment_char_ ?
            true :
            c == '\n' ?
                false :
                skip_;

        if (skip_)
            return true;

        return iostreams::put(dest, c);
    }

    template<typename Source>
    void close(Source&) { skip_ = false; }
private:
    char comment_char_;
    bool skip_;
};

} } } // End namespace boost::iostreams::example
The member function put first examines the given character c and updates the member variable skip_; next, unless skip_ is true, it attempts to write c. The member function close simply clears the skip_ flag so that the Filter will be ready to be used again.
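The case in which put returns false can be illustrated with a sink that sometimes refuses characters. This is a standalone sketch using only the standard library: bounded_sink, comment_dropper and feed are hypothetical names invented here, and the retry loop is just one assumed way a caller might react to a refused character.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sink that accepts at most `capacity` characters; after
// that, put() returns false until grow() is called.
class bounded_sink {
public:
    explicit bounded_sink(std::size_t capacity) : capacity_(capacity) { }
    bool put(char c)
    {
        if (data_.size() == capacity_)
            return false;
        data_.push_back(c);
        return true;
    }
    void grow(std::size_t extra) { capacity_ += extra; }
    const std::string& data() const { return data_; }
private:
    std::size_t capacity_;
    std::string data_;
};

// Same logic as shell_comments_output_filter::put, without the library.
class comment_dropper {
public:
    template<typename Sink>
    bool put(Sink& dest, int c)
    {
        skip_ = c == '#' ? true : c == '\n' ? false : skip_;
        if (skip_)
            return true;  // consumed: comment characters are discarded
        return dest.put(static_cast<char>(c));  // false: c was not consumed
    }
private:
    bool skip_ = false;
};

// Offers every character of text to the filter; when a character is
// refused, makes room in the sink and offers the same character again.
// Returns the number of refusals.
int feed(comment_dropper& f, bounded_sink& sink, const std::string& text)
{
    int refusals = 0;
    for (char c : text) {
        while (!f.put(sink, c)) {
            ++refusals;
            sink.grow(1);
        }
    }
    return refusals;
}

inline void feed_demo()
{
    bounded_sink sink(2);
    comment_dropper f;
    int refusals = feed(f, sink, "a#x\nbc");
    assert(refusals == 2);           // 'b' and 'c' were each refused once
    assert(sink.data() == "a\nbc");  // nothing was lost or duplicated
}
```

Offering a refused character a second time is harmless here because updating skip_ with the same character twice yields the same value.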
© Copyright 2008 CodeRage, LLC
© Copyright 2004-2007 Jonathan Turkanis
Use, modification, and distribution are subject to the Boost Software License, Version 2.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)