A C++ version of the Contextual Quantizer.
An implementation of the paper "Compact Token Representations with Contextual Quantization for Efficient Document Re-ranking".
All the pretrained models are kept in the /model folder. We primarily use .pt files and load them as models in the code. For BERT, we use model.proto as the pretrained weights. The vocabulary and codebook files are also present in the /model folder.
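For illustration, loading a TorchScript .pt file in C++ typically looks like the sketch below. The file name model/encoder.pt and the input shape are placeholders, not necessarily the actual files in /model:

```cpp
#include <torch/script.h>
#include <iostream>
#include <vector>

int main() {
    // Load a TorchScript model exported from PyTorch (path is illustrative).
    torch::jit::script::Module module;
    try {
        module = torch::jit::load("model/encoder.pt");
    } catch (const c10::Error& e) {
        std::cerr << "Failed to load model: " << e.what() << std::endl;
        return 1;
    }
    // Run a forward pass on a dummy input tensor.
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::randn({1, 768}));  // shape is illustrative
    torch::Tensor out = module.forward(inputs).toTensor();
    std::cout << out.sizes() << std::endl;
    return 0;
}
```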
BERT uses protobuf to convert the PyTorch pretrained model into a protobuf (.proto) file and load it in C++. Make sure to download protobuf. One way to install it: pip install protobuf
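As a sketch of the loading side: the header name model.pb.h and the message class BertModel below are assumptions (the actual schema is defined by the repo's .proto file), but ParseFromIstream is the standard protobuf C++ API for deserializing a binary message:

```cpp
#include <fstream>
#include <iostream>
#include "model.pb.h"  // generated by protoc from the repo's schema (name assumed)

int main() {
    GOOGLE_PROTOBUF_VERIFY_VERSION;
    BertModel model;  // placeholder message type; use the schema's real one
    std::ifstream in("model/model.proto", std::ios::binary);
    if (!model.ParseFromIstream(&in)) {
        std::cerr << "Failed to parse serialized weights" << std::endl;
        return 1;
    }
    // ... copy the parsed weight arrays into the C++ model's tensors ...
    google::protobuf::ShutdownProtobufLibrary();
    return 0;
}
```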
BERT uses MKL to implement the bias operator. To install: pip install mkl. Make sure that the /bin folder of MKL is present in /opt/intel/mkl; otherwise you might need to make changes in CMakeLists.txt.
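As an illustration of what a bias add via MKL can look like (a sketch, not the repo's actual operator): cblas_saxpy computes y = a*x + y, so applying it row by row adds the bias vector to each row of an activation matrix.

```cpp
#include <mkl.h>
#include <vector>

// Add a bias vector to every row of a [rows x cols] row-major matrix
// using MKL's saxpy. Illustrative helper, not repo code.
void add_bias(float* out, const float* bias, int rows, int cols) {
    for (int i = 0; i < rows; ++i) {
        cblas_saxpy(cols, 1.0f, bias, 1, out + static_cast<long>(i) * cols, 1);
    }
}

int main() {
    std::vector<float> out(4 * 8, 0.0f);  // e.g. 4 tokens x 8 hidden dims
    std::vector<float> bias(8, 0.5f);
    add_bias(out.data(), bias.data(), 4, 8);
    return 0;
}
```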
BERT uses utf8proc to process input strings. To install: sudo apt-get install libutf8proc-dev
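For example, Unicode-aware lowercasing, the kind of preprocessing a BERT tokenizer needs, can be built from utf8proc's iterate/encode primitives. This is an illustrative sketch, not the repo's tokenizer:

```cpp
#include <utf8proc.h>
#include <string>
#include <iostream>

// Lowercase a UTF-8 string codepoint by codepoint with utf8proc.
std::string to_lower_utf8(const std::string& s) {
    std::string out;
    const utf8proc_uint8_t* p = reinterpret_cast<const utf8proc_uint8_t*>(s.data());
    utf8proc_ssize_t remaining = static_cast<utf8proc_ssize_t>(s.size());
    while (remaining > 0) {
        utf8proc_int32_t cp;
        utf8proc_ssize_t n = utf8proc_iterate(p, remaining, &cp);
        if (n <= 0) break;  // invalid byte sequence; stop (or handle as needed)
        utf8proc_uint8_t buf[4];
        utf8proc_ssize_t m = utf8proc_encode_char(utf8proc_tolower(cp), buf);
        out.append(reinterpret_cast<char*>(buf), m);
        p += n;
        remaining -= n;
    }
    return out;
}

int main() {
    std::cout << to_lower_utf8("Héllo WORLD") << std::endl;  // prints "héllo world"
    return 0;
}
```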
We use PyTorch's C++ API to carry out various neural network operations. Install the stable version of libtorch for C++ from here: C++ PyTorch. libtorch should be present in /cq-cpp.
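A minimal libtorch snippet, useful both to verify the installation and to show the style of tensor operations involved (the shapes are arbitrary):

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
    // A linear projection followed by a softmax, the kind of operation
    // that appears inside transformer layers.
    torch::Tensor x = torch::randn({2, 4});  // 2 tokens, 4 features (illustrative)
    torch::Tensor w = torch::randn({4, 4});
    torch::Tensor scores = torch::softmax(torch::matmul(x, w), /*dim=*/1);
    std::cout << scores << std::endl;
    return 0;
}
```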
mkdir build
cd build
cmake .. -DCMAKE_MODULE_PATH=/path/to/cq-cpp
make -j4
./bert-sample (this is the entry point that runs the sample code)
The main() function resides in the bert-sample.cpp file. All the necessary documentation for the various functions is included in the code.
The BERTCPP model is taken from here: BERTCPP.