/bao

Primary LanguageRustThe UnlicenseUnlicense

bao - the PDB compiler

bao allows you to generate debug information from C code in the CodeView format, which is mostly known for its use in PDBs. JSON is used to assign the types and functions to addresses within the binary.

Showcase

To showcase how bao works I will use a module from Valve's anti-cheat "solution" VAC. It's an ideal sample to test on, because the modules are rather small and as such can be fully analyzed very easily. Complimenting these conditions is the fact that the different modules are similar to each other as all of them I've analyzed contain the ICE cipher.

Before

Before

After applying a generated PDB by Bao

After

Code

int ice_sboxes_initialised;
struct IceKey {
    int		_size;
    int		_rounds;
    struct  IceSubkey	*_keysched;
};

struct IceKey* __thiscall IceKey__IceKey(struct IceKey*, int nRounds);
void* __fastcall VAC_malloc(size_t dwBytes);
void ice_sboxes_init(void);

Configuration

{
  "functions": [
    {
      "name": "IceKey__IceKey",
      "pattern": "56 57 33 FF"
    },
    {
      "name": "VAC_malloc",
      "pattern": "E8 ? ? ? ? 89 7E 04",
      "extra": 1,
      "rip_relative": true,
      "rip_offset": 4
    },
    {
      "name": "ice_sboxes_init",
      "pattern": "75 0B E8 ? ? ? ?",
      "extra": 3,
      "rip_relative": true,
      "rip_offset": 4
    }
  ],
  "globals": [
    {
      "name": "ice_sboxes_initialised",
      "pattern": "47 83 3D ? ? ? ? ?",
      "offsets": [
        3
      ],
      "relative": true
    }
  ]
}

Dependencies

  • LLVM 10 is used to generate the resulting PDB files. This is delegated to the accompanying pdb_wrapper.
  • Clang 10 is used for parsing the C files. This enables us to use the powerful preprocessor.

Known issues

  • enums are not supported (yet?).
  • unions are not supported (yet?).
  • C++ support is experimental and untested.
  • Functions that don't have parameters will not be assigned a type by IDA Pro. This might be a bug in IDA Pro.
  • The GUID and age from the original binary aren't applied to the generated PDB. This means that you'll have to confirm an extra warning dialog in IDA Pro when loading the PDB.
  • The memory model is LLP64 (sizeof(long) == 4), regardless of host platform.

Usage

Generating the PDB as seen in the example is as easy as running bao -c config.json -- vac.dll src/structs.c. The resulting PDB will be saved as vac.dll.pdb in this example. You may pass the -o option to save the resulting PDB somewhere else.

Setup

The easiest way to run bao is to use Docker:

~$ git clone https://github.com/not-wlan/bao.git
~$ cd bao
~$ docker build . -t bao:latest
~$ docker run -v /path/to/project:/project -it bao:latest
#$ cd /project
#$ bao-pdb -o vac.pdb -c vac.json vac.dll structs.c

The first three commands are only necessary on your first run or after an update of bao.

Alternatively you can install the dependencies on your own machine. Be warned though, this is not recommended on a Windows machine!

Usecases

You can use bao to:

  • transfer your reverse engineering efforts from one version of a binary to another one
  • emulate virtual method tables by building structs of function pointers
  • generate PDBs from leaked source code that wouldn't build in its entirety
  • transfer your reverse engineering efforts from one tool to another one

Thanks to

  • FakePDB for inspiring this project and helping me with some of the LLVM API calls.
  • hazedumper-rs for the pattern format.
  • All the people who have helped me along the way.