Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

This is the documentation for an old version of Boost. Click here to view this page for the latest version.
PrevUpHomeNext

(Experimental) Client-side SQL query formatting

When issuing queries that contain untrusted input, prepared statement are usually the way to go. However, some use cases like dynamic filters or batch inserts are not attainable using them.

The functions described in this section can be used to compose SQL query strings dynamically client-side while keeping your application secure.

[Note] Note

This feature is experimental. Its API may change in subsequent releases.

Common use cases

This feature can also be used to improve efficiency, as text queries perform less round-trips to the server. See this section for more info.

[Warning] Warning

Security considerations: misusing the tools described in this section can lead to SQL injection vulnerabilities. Please read the documentation carefully and thoroughly test your code when using this feature.

Formatting simple queries

You can use format_sql to generate a SQL query from a format string and a set of parameters, using a notation similar to std::format:

std::string employee_name = get_name();  // employee_name is an untrusted string

// Compose the SQL query in the client
std::string query = boost::mysql::format_sql(
    conn.format_opts().value(),
    "SELECT id, salary FROM employee WHERE last_name = {}",
    employee_name
);

// If employee_name is "John", query now contains:
// "SELECT id, salary FROM employee WHERE last_name = 'John'"
// If employee_name contains quotes, they will be escaped as required

// Execute the generated query as usual
results result;
conn.execute(query, result);

format_sql doesn't involve communication with the server. In order to work, it requires a format_options instance describing connection configuration, like the character set currently in use. any_connection::format_opts provides an easy way to retrieve these.

[Note] Note

Getting an unknown_character_set error? Have a look at this section.

All fundamental types can be used with query formatting. This includes integers, floating point types, strings, blobs, dates and times:

std::string query = boost::mysql::format_sql(
    conn.format_opts().value(),
    "SELECT id FROM employee WHERE salary > {}",
    42000
);

ASSERT(query == "SELECT id FROM employee WHERE salary > 42000");

std::optional<T> and boost::optional<T> can also be used:

std::optional<std::int64_t> salary;  // get salary from a possibly untrusted source

std::string query = boost::mysql::format_sql(
    conn.format_opts().value(),
    "UPDATE employee SET salary = {} WHERE id = {}",
    salary,
    1
);

// Depending on whether salary has a value or not, generates:
// UPDATE employee SET salary = 42000 WHERE id = 1
// UPDATE employee SET salary = NULL WHERE id = 1

See this table for a reference of types that have built-in support for SQL formatting.

[Note] Note

Like with std::format, the format string passed to format_sql must be known at compile-time. You can skip this check using the runtime function.

Like std::format, you can use arguments with explicit indices:

// Recall that you need to set connect_params::multi_queries to true when connecting
// before running semicolon-separated queries.
std::string query = boost::mysql::format_sql(
    conn.format_opts().value(),
    "UPDATE employee SET first_name = {1} WHERE id = {0}; SELECT * FROM employee WHERE id = {0}",
    42,
    "John"
);

ASSERT(
    query ==
    "UPDATE employee SET first_name = 'John' WHERE id = 42; SELECT * FROM employee WHERE id = 42"
);

You can also use named arguments, using the initializer list overload:

std::string query = boost::mysql::format_sql(
    conn.format_opts().value(),
    "UPDATE employee SET first_name = {name} WHERE id = {id}; SELECT * FROM employee WHERE id = {id}",
    {
        {"id",   42    },
        {"name", "John"}
    }
);

ASSERT(
    query ==
    "UPDATE employee SET first_name = 'John' WHERE id = 42; SELECT * FROM employee WHERE id = 42"
);

See this section for an in-depth explanation on format strings.

Formatting identifiers

By default, strings are formatted as values. If you need to generate a dynamic SQL identifier (like a field or table name), use the identifier class:

std::string query = boost::mysql::format_sql(
    conn.format_opts().value(),
    "SELECT id, last_name FROM employee ORDER BY {} DESC",
    boost::mysql::identifier("company_id")
);

ASSERT(query == "SELECT id, last_name FROM employee ORDER BY `company_id` DESC");

identifier can also generate qualified identifiers:

std::string query = boost::mysql::format_sql(
    conn.format_opts().value(),
    "SELECT salary, tax_id FROM employee "
    "INNER JOIN company ON employee.company_id = company.id "
    "ORDER BY {} DESC",
    boost::mysql::identifier("company", "id")
);
// SELECT ... ORDER BY `company`.`id` DESC

Building SQL strings incrementally using format_sql_to

Many use cases, including batch inserts and dynamic filters, may not be expressible in terms of a single format string. In such cases, you can use format_context and format_sql_to to build query strings incrementally:

// Compose an update query that sets first_name, last_name, or both
std::string compose_update_query(
    boost::mysql::format_options opts,
    std::int64_t employee_id,
    std::optional<std::string> new_first_name,
    std::optional<std::string> new_last_name
)
{
    // There should be at least one update
    assert(new_first_name || new_last_name);

    // format_context will accumulate the query as we compose it
    boost::mysql::format_context ctx(opts);

    // append_raw adds raw SQL to the generated query, without quoting or escaping.
    // You can only pass strings known at compile-time to append_raw,
    // unless you use the runtime function.
    ctx.append_raw("UPDATE employee SET ");

    if (new_first_name)
    {
        // format_sql_to expands a format string and appends the result
        // to a format context. This way, we can build our query in small pieces
        // Add the first_name update clause
        boost::mysql::format_sql_to(ctx, "first_name = {}", *new_first_name);
    }
    if (new_last_name)
    {
        if (new_first_name)
            ctx.append_raw(", ");

        // Add the last_name update clause
        boost::mysql::format_sql_to(ctx, "last_name = {}", *new_last_name);
    }

    // Add the where clause
    boost::mysql::format_sql_to(ctx, " WHERE id = {}", employee_id);

    // Retrieve the generated query string
    return std::move(ctx).get().value();
}
std::string query = compose_update_query(conn.format_opts().value(), 42, "John", {});

ASSERT(query == "UPDATE employee SET first_name = 'John' WHERE id = 42");

append_raw and append_value can be used on format contexts to add raw SQL and formatted values without a format string.

Unit testing

If you are composing very complex queries, it's very advisable to unit test them. For instance:

// For reference, the function under test
std::string compose_update_query(
    boost::mysql::format_options opts,
    std::int64_t employee_id,
    std::optional<std::string> new_first_name,
    std::optional<std::string> new_last_name
);

// Your test body
void test_compose_update_query()
{
    // You can safely use these format_options for testing,
    // since they are the most common ones.
    boost::mysql::format_options opts{boost::mysql::utf8mb4_charset, true};

    // Test for the different cases
    ASSERT(
        compose_update_query(opts, 42, "Bob", {}) == "UPDATE employee SET first_name = 'Bob' WHERE id = 42"
    );
    ASSERT(
        compose_update_query(opts, 42, {}, "Alice") == "UPDATE employee SET last_name = 'Alice' WHERE id = 42"
    );
    ASSERT(
        compose_update_query(opts, 0, "Bob", "Alice") ==
        "UPDATE employee SET first_name = 'Bob', last_name = 'Alice' WHERE id = 0"
    );
}

In tests, it's safe to manually create a format_options value like the one above. But don't do it in production code!

Solving the unknown_character_set error

If you are getting a boost::system::system_error with a client_errc::unknown_character_set error code (or getting this error code by other means), your connection is currently unaware of the character set it's using, which is required by format operations. Try the following:

[Warning] Warning

Security considerations: don't craft format_options values manually. Always use any_connection::format_opts.

For an explanation on why format_options is necessary and how character set tracking works, please read this section.

Solving the invalid_encoding error

SQL formatting can fail if you provide values that can't be securely formatted. The most common cause is passing string values that are not valid according to the passed character set. This triggers a client_errc::invalid_encoding error:

// If the connection is using UTF-8 (the default), this will throw an error,
// because the string to be formatted contains invalid UTF8.
format_sql(conn.format_opts().value(), "SELECT {}", "bad\xff UTF-8");

You can validate your strings beforehand or handle the error once it happened and reject the input. Other types may also produce format errors.

[Tip] Tip

If you prefer handling errors with error codes, instead of exceptions, use format_sql_to. Please read this section for details.

Efficiency considerations

Both client-side SQL formatting and prepared statements have pros and cons efficiency-wise:

Types with built-in support for SQL formatting

C++ type

Formatted as...

Example

signed char, short, int, long, long long

Integral literal

format_sql(opts, "SELECT {}", 42) == "SELECT 42"
format_sql(opts, "SELECT {}", -1) == "SELECT -1"

unsigned char, unsigned short, unsigned int, unsigned long, unsigned long long

Integral literal

format_sql(opts, "SELECT {}", 42u) == "SELECT 42"

bool

Integral literal 1 if true, 0 if false

format_sql(opts, "SELECT {}", false) == "SELECT 0"
format_sql(opts, "SELECT {}", true) == "SELECT 1"

std::basic_string<char, std::char_traits<char>, Allocator> (including std::string)
string_view

std::string_view
const char*

Single-quoted escaped string literal
Note that LIKE special characters (% and _) are not escaped.

format_sql(opts, "SELECT {}", "Hello world") == "SELECT 'Hello world'"
format_sql(opts, "SELECT {}", "Hello 'world'") == R"(SELECT 'Hello \'world\'')"

std::basic_vector<unsigned char, Allocator> (including blob)
blob_view

Hex string literal

format_sql(opts, "SELECT {}", blob{0x00, 0x48, 0xff}) == R"(SELECT x'0048ff')"

float, except NaN and inf

Floating-point literal, after casting to double.
MySQL does not support NaNs and infinities. Attempting to format these cause a client_errc::unformattable_value error.

// Equivalent to format_sql(opts, "SELECT {}", static_cast<double>(4.2f))
// Note that MySQL uses doubles for all floating point literals
format_sql(opts, "SELECT {}", 4.2f) == "SELECT 4.199999809265137e+00"

double, except NaN and inf

Floating-point literal.
MySQL does not support NaNs and infinities. Attempting to format these cause a client_errc::unformattable_value error.

format_sql(opts, "SELECT {}", 4.2) == "SELECT 4.2e+00"

date

Single quoted, DATE-compatible string literal

format_sql(opts, "SELECT {}", date(2021, 1, 2)) == "SELECT '2021-01-02'"

datetime

Single quoted DATETIME-compatible string literal

format_sql(opts, "SELECT {}", datetime(2021, 1, 2, 23, 51, 14)) == "SELECT '2021-01-02 23:51:14.000000'"

time and std::chrono::duration types convertible to time

Single quoted TIME-compatible string literal

format_sql(opts, "SELECT {}", std::chrono::seconds(121)) == "SELECT '00:02:01.000000'"

std::nullptr_t

NULL

format_sql(opts, "SELECT {}", nullptr) == "SELECT NULL"

boost::optional<T> and std::optional<T>, T being one of the fundamental types above.
Not applicable to custom types or identifier.

Formats the underlying value if there is any.
NULL otherwise.

format_sql(opts, "SELECT {}", optional<int>(42)) == "SELECT 42"
format_sql(opts, "SELECT {}", optional<int>()) == "SELECT NULL"

field and field_view

Formats the underlying value.

format_sql(opts, "SELECT {}", field(42)) == "SELECT 42"
format_sql(opts, "SELECT {}", field("abc")) == "SELECT 'abc'"
format_sql(opts, "SELECT {}", field()) == "SELECT NULL"

identifier

Backtick-quoted, escaped SQL identifier

format_sql(opts, "SELECT {} FROM t", identifier("salary")) == "SELECT `salary` FROM t"
format_sql(opts, "SELECT {} FROM t", identifier("sal`ary")) == "SELECT `sal``ary` FROM t"
format_sql(opts, "SELECT {} FROM t", identifier("mytable", "myfield")) == "SELECT `mytable`.`myfield` FROM t"
format_sql(opts, "SELECT {} FROM t", identifier("mydb", "mytable", "myfield")) == "SELECT `mydb`.`mytable`.`myfield` FROM t"

Custom type that specializes formatter

Calls formatter::format


PrevUpHomeNext