Filters are used to modify character sequences. For example, you might use a filter to replace all instances of one word with another, to convert all alphabetic characters to lower case, or to encrypt a document. Sometimes the filter is a mere observer; in this case the filtered character sequence is the same as the unfiltered sequence. For example, you might use a filter to count the number of occurrences of a given word.
The Iostreams library supports two basic categories of Filters, InputFilters and OutputFilters. InputFilters represent a “pull” model of filtering: a source of unfiltered data is provided — represented as a Source — and the Filter is expected to generate a certain number of characters of the filtered sequence. The filtered sequence is generated incrementally, meaning that to filter a given character sequence the Filter typically must be invoked several times. OutputFilters represent a “push” model of filtering: a sequence of unfiltered characters and a Sink are provided, and the Filter is expected to filter the characters and write them to the Sink. Like InputFilters, OutputFilters also process data incrementally.
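To make the difference between the two models concrete, here is a minimal sketch of a lower-casing filter written both ways. It uses only the standard library. StringSource, StringSink, ToLowerPull, ToLowerPush and the two driver functions are hypothetical names invented for this illustration; they are simplified stand-ins and not the library's own Source, Sink or Filter types.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdio>   // EOF
#include <string>

// Hypothetical stand-in for a Source: hands out one character per call.
struct StringSource {
    std::string data;
    std::size_t pos;
    explicit StringSource(const std::string& s) : data(s), pos(0) {}
    int get() {
        return pos < data.size() ? static_cast<unsigned char>(data[pos++]) : EOF;
    }
};

// Hypothetical stand-in for a Sink: accepts one character per call.
struct StringSink {
    std::string data;
    bool put(char c) { data.push_back(c); return true; }
};

// Pull model: the caller asks the filter for the next filtered character,
// and the filter pulls whatever it needs from the source.
struct ToLowerPull {
    template <typename Source>
    int get(Source& src) {
        int c = src.get();
        return c == EOF ? EOF : std::tolower(c);
    }
};

// Push model: the caller hands the filter one unfiltered character,
// and the filter writes the filtered result to the sink.
struct ToLowerPush {
    template <typename Sink>
    bool put(Sink& dest, int c) {
        return dest.put(static_cast<char>(std::tolower(c)));
    }
};

// Drives the pull filter: it is invoked repeatedly until the sequence ends.
std::string filter_by_pulling(const std::string& text) {
    StringSource src(text);
    ToLowerPull filter;
    std::string out;
    for (int c; (c = filter.get(src)) != EOF; )
        out.push_back(static_cast<char>(c));
    return out;
}

// Drives the push filter: it is fed one unfiltered character at a time.
std::string filter_by_pushing(const std::string& text) {
    StringSink dest;
    ToLowerPush filter;
    for (unsigned char c : text)
        filter.put(dest, c);
    return dest.data;
}
```

In both drivers the filter is invoked once per character, which is what "incrementally" means here: neither filter ever sees the whole sequence at once.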
The simplest InputFilters and OutputFilters process characters one at a time. This type of Filter is easy to write, but is less efficient than Filters that process several characters at a time. Filters which process several characters at a time are called Multi-Character filters.
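As a rough illustration of the multi-character idea, the sketch below filters a block of characters per call instead of a single character. It is standard-library C++ only. BlockSource, ToLowerBlock and filter_in_blocks are hypothetical names made up for this example, and the convention of returning -1 at the end of the sequence is an assumption of the sketch rather than a statement about the library's interface.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a Source that can deliver several characters per call.
class BlockSource {
public:
    explicit BlockSource(const std::string& s) : data_(s), pos_(0) {}

    // Copies up to n characters into buf; returns the number copied,
    // or -1 once the sequence is exhausted (assumed convention).
    long read(char* buf, long n) {
        if (pos_ == data_.size()) return -1;
        long count = 0;
        while (count < n && pos_ < data_.size())
            buf[count++] = data_[pos_++];
        return count;
    }

private:
    std::string data_;
    std::size_t pos_;
};

// A pull-style filter that lower-cases a whole block per invocation.
struct ToLowerBlock {
    template <typename Source>
    long read(Source& src, char* buf, long n) {
        long got = src.read(buf, n);
        for (long i = 0; i < got; ++i)
            buf[i] = static_cast<char>(
                std::tolower(static_cast<unsigned char>(buf[i])));
        return got;
    }
};

// Drives the filter with a fixed block size; block_size must be positive.
std::string filter_in_blocks(const std::string& text, long block_size) {
    BlockSource src(text);
    ToLowerBlock filter;
    std::string out;
    std::string buf(static_cast<std::size_t>(block_size), '\0');
    long got;
    while ((got = filter.read(src, &buf[0], block_size)) != -1)
        out.append(buf.data(), static_cast<std::size_t>(got));
    return out;
}
```

The filter is still invoked several times for a long sequence, but the per-call overhead is spread over a block of characters rather than paid once per character.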
The Iostreams library provides several utilities to make Filter writing easier:
aggregate_filter
allows a programmer to define a Filter by reading unfiltered data from one std::vector and writing filtered data to another std::vector.
stdio_filter
allows a programmer to define a Filter by reading unfiltered data from standard input and writing filtered data to standard output.
symmetric_filter
allows a programmer to define a Filter by reading unfiltered data from one array and writing filtered data to another array.
finite_state_filter
allows a programmer to define a Filter as a finite state machine. This component is included with the example filters; it is not currently an official part of the library.
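The vector-to-vector style described for aggregate_filter can be sketched without the library at all. The example below shows only the shape of such a filtering algorithm: the whole unfiltered sequence is available in one std::vector and the whole filtered sequence is written to another. replace_all_in_vector and replace_all_in are hypothetical helper names for this illustration, not library functions, and the replacement matches substrings rather than whole words.

```cpp
#include <string>
#include <vector>

// Hypothetical free function, not part of the library: filters a complete
// sequence held in src into dest, replacing every occurrence of `from`
// with `to`. Matching is by substring, not by word boundary.
void replace_all_in_vector(const std::vector<char>& src,
                           std::vector<char>& dest,
                           const std::string& from,
                           const std::string& to) {
    std::string text(src.begin(), src.end());
    std::string::size_type pos = 0;
    // An empty search string would match everywhere, so it is skipped.
    while (!from.empty() &&
           (pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
    dest.insert(dest.end(), text.begin(), text.end());
}

// Convenience wrapper that converts between std::string and std::vector<char>.
std::string replace_all_in(const std::string& input,
                           const std::string& from,
                           const std::string& to) {
    std::vector<char> src(input.begin(), input.end());
    std::vector<char> dest;
    replace_all_in_vector(src, dest, from, to);
    return std::string(dest.begin(), dest.end());
}
```

Because the algorithm sees the complete input at once, it is easy to write, but it must hold the entire sequence in memory and cannot start until the input has ended.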
Suppose you need to write a Filter to perform a given filtering task. How do you decide whether to write an InputFilter or OutputFilter, or to use one of the Filter helpers? The first two Filter helpers mentioned above, aggregate_filter and stdio_filter, use a lot of memory and only work with character sequences that have a well-defined end. They allow filtering algorithms to be expressed in a very straightforward way, however, and so provide a good introduction to filtering. The third Filter helper, symmetric_filter, is useful for defining filters based on C-language APIs such as zlib, libbz2 or OpenSSL. If none of the Filter helpers is appropriate, you should generally write an InputFilter if you plan to use the filter for reading and an OutputFilter if you plan to use it for writing. In some cases, however, it is much easier to express an algorithm as an InputFilter than as an OutputFilter, or vice versa. In such cases, you can write the filter whichever way is easier, and use the class template inverse or the function template invert to turn an InputFilter into an OutputFilter or vice versa.
In all but the last of the filtering examples below, I will first show how to implement an algorithm using stdio_filter before implementing it from scratch as an InputFilter or OutputFilter.
© Copyright 2008 CodeRage, LLC
© Copyright 2004-2007 Jonathan Turkanis
Use, modification, and distribution are subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)