query-pdb is a server-side software for parsing PDB files. The software provides PDB online parsing service, the client only needs to send a request to the server to get the required structures and enumerations, and no longer needs to download the complete PDB file.
- parse PDB online
- fast PDB parsing engine
- cache PDB file
- exchange data via JSON
PDB file is a debugging symbol file for executable programs on windows platform, it records debugging information of executable programs, which includes structures, unions and enumerations. Microsoft provides public PDB files for a number of executables on the Windows platform, including ntdll.dll, ntoskrnl.exe, win32k.sys, and other executables.
Some kernel-level security software needs to parse the internal structure of the operating system, which may vary depending on the system version, so it needs to parse the current operating system executable on the machine and find the corresponding PDB file to get the exact data structure.
The traditional approach is to download the PDB file directly to the client and then parse it on the client side. However, since PDB files are often large (>10M), they consume large network traffic. The software provides PDB online parsing service, the client only needs to send a request to the server to get the required structures and enumerations, and no longer needs to download the complete PDB file.
The IP address of the public test server is http://39.105.177.215:9025
If you want to use this server, you can skip the next step.
Notice: Please do not use this server in a production environment, use your own server if needed.
Download the version that matches your system from the GitHub release page and run it on your server.
Here are some command line configuration options for the software. You can also get help information with the -h
option.
Usage:
query-pdb [OPTION...]
--ip arg ip address (default: 0.0.0.0)
--port arg port (default: 8080)
--path arg download path (default: save)
--server arg download server (default:
http://msdl.microsoft.com/download/symbols/)
--log write log to file (1 means true, 0 means false)
-h, --help print help
When you successfully start the server, a similar message should be displayed.
[2023-02-13 21:36:01.888] [info] create downloader, path: save, server: http://msdl.microsoft.com/download/symbols/
If you are in China, there is a possibility that you cannot access Microsoft PDB server directly, you can use a mirror source instead. See pdb_proxy.
You can send a POST request to your server to test if the server side is working properly. When parsing a PDB file that is not cached on the server for the first time, there may be a large delay because it needs to download the PDB file from the PDB storage server. Subsequent parsing requests will use the cached PDB file for parsing and will be fast.
- get symbols (contain functions and global variables RVA)
send POST request to http://localhost:8080/symbol (replace with your IP and port)
(Don't worry about fields like name
, guid
and age
, I'll explain them later.)
{
"name": "ntkrnlmp.pdb",
"guid": "8F0F3D677778391600F4EB2301FFC7A5",
"age": 1,
"query": [
"KdpStub",
"MmAccessFault",
"xxxxxx"
]
}
Possible return results.
{
"KdpStub": 3773768,
"MmAccessFault": 2454256,
"xxxxxx": -1
}
As you can see, when you request a name that exists, you get the correct result, and if you request a name that does not exist, -1 is returned. In several subsequent requests, we will still use -1 to represent invalid query results.
- get structure offsets (include union)
send POST request to http://localhost:8080/struct (replace with your IP and port)
{
"name": "ntkrnlmp.pdb",
"guid": "8F0F3D677778391600F4EB2301FFC7A5",
"age": 1,
"query": {
"_KPROCESS": [
"UserTime",
"DisableQuantum",
"xxx"
],
"xxx": [
"xxx"
]
}
}
Possible return results.
{
"_KPROCESS": {
"DisableQuantum": {
"bitfield_offset": 2,
"offset": 632
},
"UserTime": {
"bitfield_offset": 0,
"offset": 896
},
"xxx": {
"bitfield_offset": 0,
"offset": -1
}
},
"xxx": {
"xxx": {
"bitfield_offset": 0,
"offset": -1
}
}
}
- get enumeration values
send POST request to http://localhost:8080/enum (replace with your IP and port)
{
"name": "ntkrnlmp.pdb",
"guid": "8F0F3D677778391600F4EB2301FFC7A5",
"age": 1,
"query": {
"_OBJECT_INFORMATION_CLASS": [
"ObjectTypeInformation",
"xxx"
],
"xxx": [
"xxx"
]
}
}
Possible return results.
{
"_OBJECT_INFORMATION_CLASS": {
"ObjectTypeInformation": 2,
"xxx": -1
},
"xxx": {
"xxx": -1
}
}
name
, guid
and age
are important information used to find the PDB file corresponding to the executable. You can read them out directly from the PE format file.
(The following code is not very strict and may access out of bounds when specific PE files are encountered.)
struct raw_debug_info {
DWORD signature;
GUID guid;
DWORD age;
char pdb_file_name[1];
};
pdb_path_info get_pdb_path_info() {
auto p = const_cast<char *>(pe_.c_str());
auto dos_header = reinterpret_cast<IMAGE_DOS_HEADER *>(p);
auto nt_header = reinterpret_cast<IMAGE_NT_HEADERS *>(p + dos_header->e_lfanew);
IMAGE_DATA_DIRECTORY *data_directory;
switch (nt_header->FileHeader.Machine) {
case IMAGE_FILE_MACHINE_I386:
data_directory = reinterpret_cast<IMAGE_NT_HEADERS32 *>(nt_header)->
OptionalHeader.DataDirectory;
break;
case IMAGE_FILE_MACHINE_AMD64:
data_directory = reinterpret_cast<IMAGE_NT_HEADERS64 *>(nt_header)->
OptionalHeader.DataDirectory;
break;
default:
throw std::runtime_error("unsupported machine type");
}
auto debug_directory = reinterpret_cast<IMAGE_DEBUG_DIRECTORY *>(
p + rva_to_offset(p, data_directory[IMAGE_DIRECTORY_ENTRY_DEBUG].VirtualAddress));
auto raw = reinterpret_cast<raw_debug_info *>(p + debug_directory->PointerToRawData);
return parse_raw_debug_info(raw);
}
This is an example. It relies on the following two open source repositories.
- nlohmann JSON for constructing and parsing JSON
- httplib for sending requests
just do it yourself...
The software is based on the CMake build system. If your computer does not have CMake, please install it. You can use the MSVC compiler on Windows and the gcc compiler on Linux.
Require:
- CMake >= 3.20
- OpenSSL >= 1.1.1
Typical build commands are as follows.
cd query-pdb
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --target query_pdb_server --config Release
If you wish to specify a compiler, or add cpp files, please refer to the CMake manual.
Since there are many Windows programmers who use Visual Studio for development, the software also provides a VS-based client project for programmers to easily apply the project to their projects. The Visual Studio solution is located in the client-vs
folder in the root directory. Open the solution and you can compile and run the client application directly.
The project can be seen in Visual Studio's Solution Explorer as containing the source file main.cpp
and the header file query_pdb.h
. You can refer to main.cpp to learn how to use the PDB query functions provided by the query_pdb.h
header file.
In addition to that, if you want to use query_pdb.h
in your project you need to download and introduce the dependencies nlohmann JSON and cpp-httplib for the header file query_pdb.h
in the project property page.
Properties -> VC++ Directories -> Include Directories -> Edit -> New Line
The project provides an example of sending requests directly in kernel mode. I don't want to add too much code to the kernel layer because it might introduce additional instability, but sometimes it's more convenient to send a request directly in the kernel than to get the data in the application and send it to the driver.
This example is also written in C++ and it is based on the ucxxrt runtime library. With ucxxrt, we can write code in the kernel mode just like in the user mode.
Some codes.
try {
kqpdb::set_default_server("http://39.105.177.215:9025/");
kqpdb pdb("\\SystemRoot\\System32\\ntoskrnl.exe");
////////////////////////////////////////////////////////////////////////
// query global offset
{
// method 1
auto offset = pdb.get_symbol("KdpStub");
DbgPrint("KdpStub: %x\n", offset);
}
...
} catch (std::exception &e) {
DbgPrint("[DriverMain] exception: %s\n", e.what());
} catch (...) {
DbgPrint("[DriverMain] exception\n");
}
Possible outputs.
KdpStub: 2deb54
KdpStub: 2deb54
MmAccessFault: 2a4970
DirectoryTableBase: 28, 0
DirectoryTableBase: 28, 0
DisableQuantum: 278, 2
DirectoryTableBase: 28, 0
PagedPool: 1
PagedPool: 1
NonPagedPool: 0
PagedPool: 1
[DriverMain] start successfully
On servers, to provide isolation, services typically run in Docker. The Dockerfile
is provided in the root of the repository for building Docker containers and supervisord.conf
for configuring process monitoring.
You can modify supervisord.conf
to customize the startup options.
[program:query-pdb-server]
command=/query-pdb/build/server/query_pdb_server <your options...>
autostart=true
autorestart=true
Once you have modified the supervisord.conf
, you are ready to build the Docker container. You can use the following command to build it.
cd query-pdb
docker build -t zouxianyu/query-pdb-server .
If you don't like to build your own Docker images, I also provide already built Docker images on Docker Hub. You can download Docker images from Docker Hub with the following command. These images all save the cached PDB files in the save
folder in the root directory, you can map this path to the host so that you can store these PDB files persistently. The mirror labeled microsoft-latest
uses the official Microsoft PDB download server, and the mirror labeled mirror-latest
uses the mirror server within China.
docker pull zouxianyu/query-pdb-server:microsoft-latest
or
docker pull zouxianyu/query-pdb-server:mirror-latest
Then you can run the container with the following command.
Note: <your tag>
needs to be replaced with the tag of the Docker image you are using, such as microsoft-latest
or mirror-latest
. If you built your image yourself, you may need to replace it with latest
or some other tag. If you do not understand the meaning of the following commands, please consult the Docker manual.
docker run -p 80:8080 -itd zouxianyu/query-pdb-server:<your tag>
- cpp-httplib: A C++ header-only HTTP/HTTPS server and client library
- cxxopts: Lightweight C++ command line option parser
- nlohmann JSON: JSON for Modern C++
- raw_pdb: A C++11 library for reading Microsoft Program DataBase PDB files
- spdlog: Fast C++ logging library.
- pdb_proxy: Microsoft PDB Proxy Server
- dwm-screen-shot: Provides ideas for using raw_pdb to parse PDB files
- ucxxrt: The Universal C++ RunTime library, supporting kernel-mode C++ exception-handler and STL.
- KSOCKET: KSOCKET provides a very basic example how to make a network connections in the Windows Driver by using WSK
- httpparser: HTTP request, response and urls parser