Suppose you want to write a Filter to convert UNIX line endings to DOS line endings. The basic idea is simple: you process the characters in a sequence one at a time, and whenever you encounter the character '\n' you replace it with the two-character sequence '\r', '\n'. In the following sections I'll implement this algorithm as a stdio_filter, an InputFilter and an OutputFilter. The source code can be found in the header <libs/iostreams/example/unix2dos_filter.hpp>.
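Before looking at the Filter versions, it may help to see the bare algorithm on its own. The following is a minimal sketch in standard C++ only; the function name unix_to_dos is hypothetical and is not part of Boost.Iostreams.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Boost.Iostreams): returns a copy of `in`
// in which every '\n' is replaced by the two-character sequence "\r\n".
std::string unix_to_dos(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        if (c == '\n')
            out.push_back('\r');   // emit the extra carriage return first
        out.push_back(c);          // then the character itself
    }
    return out;
}
```

Like the Filters in this section, the sketch assumes its input contains only UNIX line endings: an existing "\r\n" in the input would come out as "\r\r\n".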
unix2dos_stdio_filter
You can express a UNIX-to-DOS Filter as a stdio_filter by deriving from stdio_filter and overriding the private virtual function do_filter as follows:
    #include <cstdio>    // EOF
    #include <iostream>  // cin, cout
    #include <boost/iostreams/filter/stdio.hpp>

    namespace boost { namespace iostreams { namespace example {

    class unix2dos_stdio_filter : public stdio_filter {
    private:
        void do_filter()
        {
            int c;
            while ((c = std::cin.get()) != EOF) {
                if (c == '\n')
                    std::cout.put('\r');
                std::cout.put(c);
            }
        }
    };

    } } } // End namespace boost::iostreams::example
The function do_filter consists of a straightforward implementation of the algorithm I described above: it reads characters from standard input and writes them to standard output unchanged, except that when it encounters '\n' it writes '\r', '\n'.
unix2dos_input_filter
Now, let's express a UNIX-to-DOS Filter as an InputFilter.
    #include <boost/iostreams/categories.hpp> // input_filter_tag
    #include <boost/iostreams/operations.hpp> // get

    namespace boost { namespace iostreams { namespace example {

    class unix2dos_input_filter {
    public:
        typedef char              char_type;
        typedef input_filter_tag  category;

        unix2dos_input_filter() : has_linefeed_(false) { }

        template<typename Source>
        int get(Source& src)
        {
            // Handle unfinished business
            if (has_linefeed_) {
                has_linefeed_ = false;
                return '\n';
            }

            // Forward all characters except '\n'
            int c;
            if ((c = iostreams::get(src)) == '\n') {
                has_linefeed_ = true;
                return '\r';
            }

            return c;
        }

        template<typename Source>
        void close(Source&);
    private:
        bool has_linefeed_;
    };

    } } } // End namespace boost::iostreams::example
The implementation of get can be described as follows. Most of the time, you simply read a character from src and return it. The special values EOF and WOULD_BLOCK are treated the same way: they are simply forwarded as-is. The exception is when iostreams::get returns '\n'. In this case, you return '\r' instead and make a note to return '\n' the next time get is called.
As usual, the member function close resets the Filter's state:

    template<typename Source>
    void close(Source&) { has_linefeed_ = false; }
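To watch the state machine in get at work without a Boost installation, here is a standalone sketch. It rests on several assumptions: string_source, unix2dos_pull and pull_all are hypothetical names introduced for illustration, and the filter calls src.get() directly where the real Filter calls iostreams::get(src).

```cpp
#include <cassert>
#include <cstdio>   // EOF
#include <string>

// Hypothetical stand-in for a Source: yields the characters of a string,
// then EOF. It is not a Boost.Iostreams type.
class string_source {
public:
    explicit string_source(const std::string& s) : data_(s), pos_(0) { }
    int get()
    {
        return pos_ < data_.size()
            ? static_cast<unsigned char>(data_[pos_++])
            : EOF;
    }
private:
    std::string            data_;
    std::string::size_type pos_;
};

// Same state machine as unix2dos_input_filter::get, but calling src.get()
// instead of boost::iostreams::get(src).
class unix2dos_pull {
public:
    unix2dos_pull() : has_linefeed_(false) { }

    template<typename Source>
    int get(Source& src)
    {
        if (has_linefeed_) {        // second half of a pending "\r\n"
            has_linefeed_ = false;
            return '\n';
        }
        int c = src.get();
        if (c == '\n') {            // first half: return '\r', remember '\n'
            has_linefeed_ = true;
            return '\r';
        }
        return c;                   // everything else, including EOF
    }
private:
    bool has_linefeed_;
};

// Drains a string through the filter and collects the result.
std::string pull_all(const std::string& input)
{
    string_source src(input);
    unix2dos_pull filter;
    std::string   out;
    for (int c; (c = filter.get(src)) != EOF; )
        out.push_back(static_cast<char>(c));
    return out;
}
```

Each '\n' in the source costs two calls to get: the first returns '\r' and sets the flag, the second returns '\n' without touching the source at all.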
unix2dos_output_filter
You can express a UNIX-to-DOS Filter as an OutputFilter as follows:
    #include <boost/iostreams/concepts.hpp>   // output_filter
    #include <boost/iostreams/operations.hpp> // put

    namespace boost { namespace iostreams { namespace example {

    class unix2dos_output_filter : public output_filter {
    public:
        unix2dos_output_filter() : has_linefeed_(false) { }

        template<typename Sink>
        bool put(Sink& dest, int c);

        template<typename Sink>
        void close(Sink&) { has_linefeed_ = false; }
    private:
        template<typename Sink>
        bool put_char(Sink& dest, int c);

        bool has_linefeed_;
    };

    } } } // End namespace boost::iostreams::example
Here I've derived from the helper class output_filter, which provides a member type char_type equal to char and a category tag convertible to output_filter_tag and to closable_tag.
Let's look first at the helper function put_char:
    template<typename Sink>
    bool put_char(Sink& dest, int c)
    {
        bool result;
        if ((result = iostreams::put(dest, c)) == true) {
            has_linefeed_ =
                c == '\r' ?
                    true :
                    c == '\n' ?
                        false :
                        has_linefeed_;
        }
        return result;
    }
This function attempts to write a single character to the Sink dest, returning true for success. If successful, it updates the flag has_linefeed_, which indicates that an attempt to write a DOS line-ending sequence failed after the first character was written.
Using put_char you can implement put as follows:
    template<typename Sink>
    bool put(Sink& dest, int c)
    {
        if (c == '\n')
            return has_linefeed_ ?
                put_char(dest, '\n') :
                put_char(dest, '\r') ?
                    this->put(dest, '\n') :
                    false;
        return iostreams::put(dest, c);
    }
The implementation works like so:

1. If you're at the beginning of a DOS line-ending sequence (that is, if c is '\n' and has_linefeed_ is false), you attempt to write '\r' and then '\n' to dest.
2. If you're in the middle of a DOS line-ending sequence (that is, if c is '\n' and has_linefeed_ is true), you attempt to complete it by writing '\n'.
3. Otherwise, you attempt to write c to dest.
There are two subtle points. First, why does c == '\n' and has_linefeed_ == true mean that you're in the middle of a DOS line-ending sequence? Because when you attempt to write '\r', '\n' but only the first character succeeds, you set has_linefeed_ and return false. This causes the user of the Filter to resend the character '\n' which triggered the line-ending sequence. Second, note that to write the second character of a line-ending sequence you call put recursively instead of calling put_char. By that point has_linefeed_ is already set, so the recursive call takes the branch that completes the sequence.
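The retry behavior is easier to follow with a Sink that actually fails. The sketch below is standalone and built on assumptions: flaky_sink, unix2dos_push and push_all are hypothetical names, the sink refuses exactly one write (chosen by index), and dest.put(c) stands in for iostreams::put(dest, c).

```cpp
#include <cassert>
#include <string>

// Hypothetical Sink stand-in (not a Boost.Iostreams type): refuses the write
// whose zero-based index equals fail_at and accepts all others.
struct flaky_sink {
    std::string written;
    int         attempts;
    int         fail_at;

    explicit flaky_sink(int fail) : attempts(0), fail_at(fail) { }

    bool put(char c)
    {
        if (attempts++ == fail_at)
            return false;
        written.push_back(c);
        return true;
    }
};

// Same logic as unix2dos_output_filter, with dest.put(c) in place of
// boost::iostreams::put(dest, c).
class unix2dos_push {
public:
    unix2dos_push() : has_linefeed_(false) { }

    template<typename Sink>
    bool put(Sink& dest, int c)
    {
        if (c == '\n')
            return has_linefeed_ ?
                put_char(dest, '\n') :
                put_char(dest, '\r') ? this->put(dest, '\n') : false;
        return dest.put(static_cast<char>(c));
    }
private:
    template<typename Sink>
    bool put_char(Sink& dest, int c)
    {
        bool result = dest.put(static_cast<char>(c));
        if (result)   // update the flag only when the write succeeded
            has_linefeed_ = c == '\r' ? true : c == '\n' ? false : has_linefeed_;
        return result;
    }

    bool has_linefeed_;
};

// Writes `input` through the filter, resending any character whose put
// returned false, as a user of the Filter would.
std::string push_all(const std::string& input, int fail_at)
{
    flaky_sink    sink(fail_at);
    unix2dos_push filter;
    for (std::string::size_type i = 0; i < input.size(); ) {
        if (filter.put(sink, input[i]))
            ++i;      // advance only after a successful put
    }
    return sink.written;
}
```

With the input "a\nb" and fail_at set to 2, the write of '\r' succeeds and the write of '\n' fails, so put returns false with has_linefeed_ set. When the caller resends '\n', only the missing '\n' is written, and the sink ends up with "a\r\nb" rather than "a\r\r\nb".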
Comparing the implementations of unix2dos_input_filter and unix2dos_output_filter, you can see that this is a case where a filtering algorithm is much easier to express as an InputFilter than as an OutputFilter. If you wanted to avoid the complexity of the above definition, you could use the class template inverse to construct an OutputFilter from unix2dos_input_filter:
    #include <boost/iostreams/invert.hpp> // inverse

    namespace io = boost::iostreams;
    namespace ex = boost::iostreams::example;

    typedef io::inverse<ex::unix2dos_input_filter> unix2dos_output_filter;
Even this is more work than necessary, however, since line-ending conversions can be handled easily with the built-in component newline_filter.
© Copyright 2008 CodeRage, LLC
© Copyright 2004-2007 Jonathan Turkanis
Use, modification, and distribution are subject to the Boost Software License, Version 2.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)