Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

This is the documentation for an old version of Boost. Click here to view this page for the latest version.
PrevUpHomeNext

Overview

This section briefly explains the library main classes and functions, and how to use them.

connection is the most important class in the library. A connection is an I/O object, templated on a Stream type. A connection contains an instance of that Stream type and additional state required by the protocol. connection's constructor takes the same arguments as the underlying Stream constructor.

The library defines some typedefs to make things less verbose. The most common one is tcp_ssl_connection. In this case, Stream is boost::asio::ssl::stream<boost::asio::ip::tcp::socket>, which can be constructed from a boost::asio::any_io_executor and a boost::asio::ssl::context:

// The execution context, required to run I/O operations.
boost::asio::io_context ctx;

// The SSL context, required to establish TLS connections.
// The default SSL options are good enough for us at this point.
boost::asio::ssl::context ssl_ctx(boost::asio::ssl::context::tls_client);

// Represents a connection to the MySQL server.
boost::mysql::tcp_ssl_connection conn(ctx.get_executor(), ssl_ctx);

Typedefs for other transports are also available. See this section for more info.

The MySQL client/server protocol is session-oriented. Before anything else, you must perform session establishment, usually by calling connection::connect. This function performs two actions:

  • It establishes the "physical" connection, by calling connect() on the underlying Stream object. For a tcp_ssl_connection, this establishes the TCP connection.
  • It performs the handshake with the MySQL server. This is part of the MySQL client/server protocol. It performs authentication, sets session parameters like the default database to use, and performs the TLS handshake, if required.

connection::connect takes two parameters, one for each of the above actions:

  • The physical endpoint where the server is listening. For TCP streams, this is a boost::asio::ip::tcp::endpoint. For UNIX sockets, it's a boost::asio::local::stream_protocol::endpoint. For TCP, we can resolve a string hostname and port into an endpoint using a resolver object.
  • handshake_params to use for the handshake operation. This parameter doesn't depend on the Stream type. See this section for more info.
// Resolve the hostname to get a collection of endpoints
boost::asio::ip::tcp::resolver resolver(ctx.get_executor());
auto endpoints = resolver.resolve(argv[3], boost::mysql::default_port_string);

// The username and password to use
boost::mysql::handshake_params params(
    argv[1],                // username
    argv[2],                // password
    "boost_mysql_examples"  // database
);

// Connect to the server using the first endpoint returned by the resolver
conn.connect(*endpoints.begin(), params);

Note that connection::connect can only be used with socket-like streams. If your stream is not a socket, you must use the lower-level connection::handshake function. Please read this section for more info.

The two main ways to use a connection are text queries and prepared statements:

Feature

Used for...

Code

Text queries: connection::query.

Simple queries, without parameters:

  • "START TRANSACTION"
  • "SET NAMES utf8"
  • "SHOW TABLES"
results result;
conn.query("START TRANSACTION", result);

Prepared statements:
connection::prepare_statement
connection::execute_statement

Queries with parameters unknown at compile-time.

statement stmt = conn.prepare_statement(
    "SELECT first_name FROM employee WHERE company_id = ? AND salary > ?"
);

results result;
conn.execute_statement(stmt, std::make_tuple("HGS", 30000), result);

When you execute a text query or a prepared statement, you get a results object, which will be the subject of the next section. We will delve deeper into prepared statements later.

In MySQL, a resultset referes to the results generated by a SQL query. The results class is an in-memory reprentation of a MySQL resultset. The following diagram shows an approximate representation of what a resultset looks like:

We can see that a resultset is composed of three pieces of information:

You can obtain a results object by executing a text query (connection::query) or a prepared statement (connection::execute_statement).

All SQL statements generate resultsets. Statements that generate no rows, like INSERTs, generate empty resultsets (i.e. result.rows().empty() == true). The interface to execute SELECTs and INSERTs is the same.

We saw that results::rows returns a matrix-like data structure containing the rows retrieved by SQL query. This library defines six data structures to represent MySQL data:

field

The smallest unit of data. A single "cell" in a MySQL table. This is an owning, variant-like type.

field_view

Like field, but non-owning.

row

An owning, vector-like collection of fields.

row_view

Like row, but non-owning.

rows

An owning, matrix-like collection of fields. Represents several rows of the same size in an optimized way.

rows_view

Like rows, but non-owning.

results::rows returns a rows_view object. The memory for the rows is owned by the results object. Indexing the returned view also returns view objects:

// Populate a results object
results result;
conn.query("SELECT 'Hello world'", result);

// results::rows() returns a rows_view. The underlying memory is owned by the results object
rows_view all_rows = result.rows();

// Indexing a rows_view yields a row_view. The underlying memory is owned by the results object
row_view first_row = all_rows.at(0);

// Indexing a row_view yields a field_view. The underlying memory is owned by the results object
field_view first_field = first_row.at(0);  // Contains the string "Hello world"

Views behave similarly to std::string_view. You must make sure that you don't use a view after the storage it points to has gone out of scope. In this case, you must not use any of the views after the results object has gone out of scope.

As it happens with std::string_view, you can take ownership of a view using its owning counterpart:

// You may use all_rows_owning after result has gone out of scope
rows all_rows_owning{all_rows};

// You may use first_row_owning after result has gone out of scope
row first_row_owning{first_row};

// You may use first_field_owning after result has gone out of scope
field first_field_owning{first_field};

field and field_view are specialized variant-like types that can hold any type you may find in a MySQL table. Once you obtain a field, you can access its contents using the following functions:

  • You can query a field's type by using kind, which returns a field_kind enum.
  • You can query whether a field contains a certain type with field::is_xxx.
  • You can get the underlying value with field::as_xxx and field::get_xxx. The as_xxx functions are checked (they will throw an exception if the actual type doesn't match), while the get_xxx are unchecked (they result in undefined behavior on type mismatch).
  • You can stream fields and compare them for equality.

For example:

results result;
conn.query("SELECT 'abc', 42", result);

// Obtain a field's underlying value using the is_xxx and get_xxx accessors
field_view f = result.rows().at(0).at(0);  // f points to the string "abc"
if (f.is_string())
{
    // we know it's a string, unchecked access
    string_view s = f.get_string();
    std::cout << s << std::endl;  // Use the string as required
}
else
{
    // Oops, something went wrong - schema msimatch?
}

// Alternative: use the as_xxx accessor
f = result.rows().at(0).at(1);
std::int64_t value = f.as_int64();  // Checked access. Throws if f doesn't contain an int
std::cout << value << std::endl;    // Use the int as required

NULL values are represented as field objects having kind() == field_kind::null. You can check whether a value is NULL or not using is_null. This is how NULLs are typically handled:

results result;

// Create some test data
conn.query(
    R"%(
        CREATE TEMPORARY TABLE products (
            id VARCHAR(50) PRIMARY KEY,
            description VARCHAR(256)
        )
    )%",
    result
);
conn.query("INSERT INTO products VALUES ('PTT', 'Potatoes'), ('CAR', NULL)", result);

// Retrieve the data. Note that some fields are NULL
conn.query("SELECT id, description FROM products", result);

for (row_view r : result.rows())
{
    field_view description_fv = r.at(1);
    if (description_fv.is_null())
    {
        // Handle the NULL value
        // Note: description_fv.is_string() will return false here; NULL is represented as a separate
        // type
        std::cout << "No description for product_id " << r.at(0) << std::endl;
    }
    else
    {
        // Handle the non-NULL case. Get the underlying value and use it as you want
        // If there is any schema mismatch (and description was not defined as VARCHAR), this will
        // throw
        string_view description = description_fv.as_string();

        // Use description as required
        std::cout << "product_id " << r.at(0) << ": " << description << std::endl;
    }
}

Every MySQL type is mapped to a single C++ type. The following table shows these mappings:

field_kind

C++ type

MySQL types

is accessor

as accessor

get accessor

int64

std::int64_t

TINYINT, SMALLINT, MEDIUMINT, INT, BIGINT

is_int64

as_int64

get_int64

uint64

std::uint64_t

Unsigned TINYINT, SMALLINT, MEDIUMINT, INT, BIGINT, YEAR, BIT

is_uint64

as_uint64

get_uint64

string

string_view for field_view

std::string for field

CHAR, VARCHAR, TEXT (all sizes), ENUM, SET, DECIMAL, NUMERIC, JSON

is_string

as_string

get_string

blob

blob_view for field_view

blob for field

BINARY, VARBINARY, BLOB (all sizes), GEOMETRY

is_blob

as_blob

get_blob

float_

float

FLOAT

is_float

as_float

get_float

double_

double

DOUBLE

is_double

as_double

get_double

date

date

DATE

is_date

as_date

get_date

datetime

datetime

DATETIME, TIMESTAMP

is_datetime

as_datetime

get_datetime

time

time

TIME

is_time

as_time

get_time

null

Any of the above, when they're NULL

is_null

This section contains more information about fields.

Until now, we've used simple text queries that did not contain any user-provided input. In the real world, most queries will contain some piece of user-provided input.

One approach could be to use string concatenation to construct a SQL query from user input, and then execute it using query(). Avoid this approach as much as possible, as it can lead to SQL injection vulnerabilities. Instead, use prepared statements.

Prepared statements are server-side objects that represent a parameterized query. A statement is represented using the statement class, which is a lightweight object holding a handle to the server-side prepared statement.

Let's say you've got an inventory table, and you're writing a command-line program to get products by ID. You've got the following table definition:

results result;
conn.query(
    R"%(
        CREATE TEMPORARY TABLE products (
            id VARCHAR(50) PRIMARY KEY,
            description VARCHAR(256)
        )
    )%",
    result
);
conn.query("INSERT INTO products VALUES ('PTT', 'Potatoes'), ('CAR', 'Carrots')", result);

You can prepare a statement to retrieve products by ID using:

statement stmt = conn.prepare_statement("SELECT description FROM products WHERE id = ?");

You can execute the statement using connection::execute_statement:

// Obtain the product_id from the user. product_id is untrusted input
const char* product_id = argv[2];

// Execute the statement
results result;
conn.execute_statement(stmt, std::make_tuple(product_id), result);

// Use result as required

The statement object is passed as first parameter, which tells the server which statement it should execute. Actual parameters are passed as the second argument, as a std::tuple. You must pass as many parameters as ? placeholders the statement has.

To learn more about prepared statements, please refer to this section.

Until now, we've been using connection::query and connection::execute_statement, which execute some SQL and read all generated data into an in-memory results object.

Some use cases may not fit in this simple pattern. For example:

  • When reading a very big resultset, it may not be efficient (or even possible) to completely load it in memory. Reading rows in batches may be more adequate.
  • If rows contain very long TEXT or BLOB fields, it may not be adequate to copy these values from the network buffer into the results object. A view-based approach may be better.

For these cases, we can break the query() or execute_statement() operation into several steps, using a multi-function operation (the term is coined by this library). This example reads an entire table in batches, which can be the case in an ETL process:

// Create the table and some sample data
// In a real system, body may be megabaytes long.
results result;
conn.query(
    R"%(
        CREATE TEMPORARY TABLE posts (
            id INT PRIMARY KEY AUTO_INCREMENT,
            title VARCHAR (256),
            body TEXT
        )
    )%",
    result
);
conn.query(
    R"%(
        INSERT INTO posts (title, body) VALUES
            ('Post 1', 'A very long post body'),
            ('Post 2', 'An even longer post body')
    )%",
    result
);

// execution_state stores state about our operation, and must be passed to all functions
execution_state st;

// Writes the query request and reads the server response, but not the rows
conn.start_query("SELECT title, body FROM posts", st);

// Reads all the returned rows, in batches.
// st.complete() returns true once there are no more rows to read
while (!st.complete())
{
    // row_batch will be valid until conn performs the next network operation
    rows_view row_batch = conn.read_some_rows(st);

    for (row_view post : row_batch)
    {
        // Process post as required
        std::cout << "Title:" << post.at(0) << std::endl;
    }
}
[Warning] Warning

Once you start a multi-function operation with connection::start_query or connection::start_statement_execution, the server immediately sends all rows to the client. You must read all rows before engaging in further operations. Otherwise, you will encounter packet mismatches, which can lead to bugs and vulnerabilities!

Multi-function operations are powerful but complex. Only use them when there is a strong reason to do so. Please refer to this section for more information on these operations.

The functions we've been using communicate errors throwing exceptions. There are also non-throwing overloads that use error codes.

If the server fails to fulfill a request (for example, because the provided SQL was invalid or a referenced table didn't exist), the operation is considered failed and will return an error. The server provides an error message that can be accessed using the diagnostics class. For example:

error_code ec;
diagnostics diag;
results result;

// The provided SQL is invalid. The server will return an error.
// ec will be set to a non-zero value
conn.query("this is not SQL!", result, ec, diag);

if (ec)
{
    // The error code will likely report a syntax error
    std::cout << "Operation failed with error code: " << ec << '\n';

    // diag.server_message() will contain the classic phrase
    // "You have an error in your SQL syntax; check the manual..."
    // Bear in mind that server_message() may contain user input, so treat it with caution
    std::cout << "Server diagnostics: " << diag.server_message() << std::endl;
}

With exceptions, this would be:

try
{
    // The provided SQL is invalid. This function will throw an exception.
    results result;
    conn.query("this is not SQL!", result);
}
catch (const error_with_diagnostics& err)
{
    // error_with_diagnostics contains an error_code and a diagnostics object.
    // It inherits from boost::system::system_error.
    std::cout << "Operation failed with error code: " << err.code() << '\n'
              << "Server diagnostics: " << err.get_diagnostics().server_message() << std::endl;
}

As with Boost.Asio, every sync operation has an async counterpart. This library follows Asio's async model, so you may use async operations with any valid Asio CompletionToken, including callbacks and coroutines.

For example, if you can use C++20, you can write:

// Using this CompletionToken, you get C++20 coroutines that communicate
// errors with error_codes. This way, you can access the diagnostics object.
constexpr auto token = boost::asio::as_tuple(boost::asio::use_awaitable);

// Run our query as a coroutine
diagnostics diag;
results result;
auto [ec] = co_await conn.async_query("SELECT 'Hello world!'", result, diag, token);

// This will throw an error_with_diagnostics in case of failure
boost::mysql::throw_on_error(ec, diag);

The examples section contains material that can help you. This section also provides more info on this topic.

Single outstanding operation per connection

At any given point in time, a connection may only have a single async operation outstanding. This is because the connection uses the underlying Stream object directly, without any locking or queueing. If you perform several async operations concurrently on a single connection without any serialization, the stream may interleave reads and writes from different operations, leading to undefined behavior.

For example, connection::async_query performs both reads and writes. Doing this is illegal and should be avoided:

// Coroutine body
// DO NOT DO THIS!!!!
results result1, result2;
co_await (
    conn.async_query("SELECT 1", result1, boost::asio::use_awaitable) &&
    conn.async_query("SELECT 2", result2, boost::asio::use_awaitable)
);

If you need to perform queries in parallel, open more connections to the server.


PrevUpHomeNext