MySQLd BPF testing

MySQL 5.7 has DTrace support, which can be used as USDT tracepoints. Unfortunately, DTrace support was removed in 8.0, because presumably no one used it, as DTrace is only recently supported on Linux. This repo has a BPF program that can trace both via the DTrace tracepoints, as well as via uprobe. The tracer script is written at ./mysql/trace/mysql_query_tracer.py. There are four trace functions: do_trace_start_usdt, do_trace_done_usdt, do_trace_start_uprobe, do_trace_done_uprobe.

DTrace/USDT

Example output (columns are: timestamp, time taken in ms, query length (truncated), query):

$ scripts/trace-simple mysql57 -m usdt
26528791.634560782      0.5     6 (0)   b'BEGIN'
26528791.634793196      1.1     34 (0)  b'SELECT c FROM sbtest12 WHERE id=?'
26528791.635121044      1.1     34 (0)  b'SELECT c FROM sbtest12 WHERE id=?'
26528791.635502957      0.9     34 (0)  b'SELECT c FROM sbtest12 WHERE id=?'
26528791.635441314      0.5     6 (0)   b'BEGIN'
26528791.635673296      1.0     33 (0)  b'SELECT c FROM sbtest8 WHERE id=?'
26528791.636214893      1.3     34 (0)  b'SELECT c FROM sbtest12 WHERE id=?'
26528791.636536162      1.1     34 (0)  b'SELECT c FROM sbtest12 WHERE id=?'
26528791.636299416      2.9     33 (0)  b'SELECT c FROM sbtest8 WHERE id=?'
26528791.636855353      2.3     33 (0)  b'SELECT c FROM sbtest8 WHERE id=?'

MySQL in 5.7 has built-in support for DTrace tracepoints. The tracepoints are organized through the entire query process as follows:

tracepoints

Based on experimentation, I've observed that prepared queries only fire the query-exec-start and query-exec-done tracepoints and skip the query-start and query-done tracepoints. So the scripts in this repo is based on query-exec-start and query-exec-done.

References:

UProbe

Example output (columns are: timestamp, time taken in ms, query length (truncated), query):

$ scripts/trace-simple mysql57 -m uprobe
26528778.247213945      5.8     58 (0)  b'SELECT c FROM sbtest13 WHERE id BETWEEN ? AND ? ORDER BY c'
26528778.247934274      2.0     56 (0)  b'INSERT INTO sbtest16 (id, k, c, pad) VALUES (?, ?, ?, ?)'
26528778.248027775      13.3    67 (0)  b'SELECT DISTINCT c FROM sbtest15 WHERE id BETWEEN ? AND ? ORDER BY c'
26528778.249757726      2.2     35 (0)  b'UPDATE sbtest4 SET k=k+1 WHERE id=?'
26528778.250416216      1.1     33 (0)  b'UPDATE sbtest7 SET c=? WHERE id=?'
26528778.250926454      1.4     30 (0)  b'DELETE FROM sbtest7 WHERE id=?'
26528778.251274034      0.8     55 (0)  b'INSERT INTO sbtest7 (id, k, c, pad) VALUES (?, ?, ?, ?)'
26528778.244629823      123.4   6 (0)   b'COMMIT'
26528778.257279973      0.1     5 (0)   b'BEGIN'
26528778.25748157       0.7     33 (0)  b'SELECT c FROM sbtest15 WHERE id=?'

MySQL deprecated Dtrace in 5.7 and removed it in 8.0. This means that it can no longer be traced via USDT. The present repo contains a BPF program that can trace query-exec-start and query-exec-done via uprobe and the mysql_execute_command function. The query is extracted by reading the first argument of mysql_execute_command which is object of class THD (for 5.7 and 8.0 at least) and then finding the offset for the query in this class. This means the script may have to change if MySQL version updates as the THD class might change. USDT doesn't have this issue as the query is emitted from the code. It also means that the more granular data available via dtrace will not be easily (nor stably) exposable via the uprobe system.

To figure out that I had to trace the mysql_execute_command function:

  1. I noted that in MySQL 5.7, the query__exec__start and query__exec__done tracepoints corresponds to the macro MYSQL_QUERY_EXEC_START and MYSQL_QUERY_EXEC_END.
  2. By reading through the MySQL source code, one can see that there are four places where these tracepoints are placed: 1, 2, 3, 4. In every location, the function mysql_execute_command is called. Based on this information, it seems like mysql_execute_command is the function to trace if we want to replicate the query__exec__* tracepoints.
  3. The query string is stored in a member variable of the first argument, which is a pointer to the THD class. The member variable for the entire query string is called m_query_string, which is itself a LEX_CSTRING, which is a st_mysql_const_lex_string that has a pointer to const char and a uint64 length field. These are the fields we need to read in the BPF program in this repo.
  4. To read the actual query string, we need to find the offset to m_query_string. To do this, I compiled Percona server 5.7.35-38 from source and used pahole to examine sql/CMakeFiles/sql.dir/sql_class.cc.o and found the following output which indicates that the query is at an offset of 472 bytes:
LEX_CSTRING                m_query_string;       /*   472    16 */
class String              m_normalized_query;    /*   488    32 */

The same process can be followed for any version of MySQL, and for any other variables that one would like to look at (although the BPF program will have to be changed).

I also saved a copy of the pahole output in this repo, under the data directory.

Running this

Setup:

$ docker-compose up -d
$ scripts/prepare-db mysql57
$ scripts/prepare-db mysql80

Loading MySQL with queries:

$ scripts/load-db mysql57 # For loading the 5.7 instance
$ scripts/load-db mysql80 # For loading the 8.0 instance.

Tracing every query:

$ scripts/trace-simple mysql57 -m usdt
$ scripts/trace-simple mysql57 -m uprobe
$ scripts/trace-simple mysql80 -m uprobe

Tracing with a historgram:

TODO