
C++ implementation of CRAFT text detector with TensorRT

CRAFT: Character-Region Awareness For Text detection | Paper | Official Pytorch code

Overview

This is a C++ implementation of the CRAFT text detector that uses TensorRT for accelerated inference. Compared to the official PyTorch implementation, it significantly improves text detection efficiency and simplifies deployment.

In testing, inference on an RTX 4090 is about 12x faster than the original CRAFT-pytorch project.

In addition, a Chinese and English video subtitle detection model fine-tuned on a custom dataset is provided; it offers higher accuracy for subtitle detection.

Getting started

Requirements

  • gcc
  • CUDA
  • TensorRT

The tested environment is GCC 7.3.1 + CUDA 11.2 + TensorRT 8.5.3.1.

Generate TRT engine

  1. Download the .pth model and place it in the 'pretrained' directory.

    • Official pretrained model: craft_mlt_25k.pth

    • Chinese and English subtitle detection model, fine-tuned on a custom dataset: epoch_91.pth

  2. .pth to ONNX

    cd engine_generation
    python torch2onnx.py --usefp16 --torch_path ../pretrained/craft_mlt_25k.pth
  3. ONNX to TRT engine (see the engine-loading sketch after this list)

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/trt/lib
    make
    ./onnx2trt ../pretrained/craft_mlt_25k_fp16.onnx ../pretrained/craft_mlt_25k_fp16_dynamic_shape.cache
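
For reference, the .cache file written by onnx2trt is the serialized TensorRT engine that the demos load. If you want to inspect or load it yourself, a minimal sketch using the standard TensorRT C++ API might look like the following (illustrative only; the infer_init interface described below already handles engine loading, and the file path here is just an example):

    #include <NvInfer.h>

    #include <fstream>
    #include <iostream>
    #include <memory>
    #include <vector>

    // Minimal logger required by the TensorRT runtime.
    class Logger : public nvinfer1::ILogger {
        void log(Severity severity, const char* msg) noexcept override {
            if (severity <= Severity::kWARNING) std::cerr << msg << std::endl;
        }
    };

    int main() {
        // Read the serialized engine produced by ./onnx2trt (example path).
        std::ifstream file("../pretrained/craft_mlt_25k_fp16_dynamic_shape.cache",
                           std::ios::binary | std::ios::ate);
        if (!file) { std::cerr << "engine file not found\n"; return 1; }
        const auto size = static_cast<size_t>(file.tellg());
        std::vector<char> blob(size);
        file.seekg(0);
        file.read(blob.data(), static_cast<std::streamsize>(size));

        // Deserialize the blob into an ICudaEngine.
        Logger logger;
        std::unique_ptr<nvinfer1::IRuntime> runtime(
            nvinfer1::createInferRuntime(logger));
        std::unique_ptr<nvinfer1::ICudaEngine> engine(
            runtime->deserializeCudaEngine(blob.data(), blob.size()));
        std::cout << (engine ? "engine loaded" : "deserialization failed") << "\n";
        return 0;
    }

Compile and link against TensorRT (e.g. -I/usr/local/trt/include -L/usr/local/trt/lib -lnvinfer), matching the LD_LIBRARY_PATH shown above.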

Make and run demo

  1. make

    cd src
    make
    cd ..
    make
  2. run demo

    (1) If the input file is in image format:

    ./test_img <engine_path> <input_path>

    example:

    ./test_img ./pretrained/craft_mlt_25k_fp16_dynamic_shape.cache ./images/subtitle2.png

    (2) If the input file is in raw YUV (NV12) format (see the NV12 layout sketch after this list):

    ./test_yuv <engine_path> <height> <width> <yuv_file_path>

    example:

    ./test_yuv ./pretrained/craft_mlt_25k_fp16_dynamic_shape.cache 2160 3840 ./images/test_3840x2160_nv12.yuv
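
For reference, the raw .yuv input is expected in NV12 layout: a full-resolution Y plane followed by an interleaved UV plane with half as many rows, i.e. width * height * 3 / 2 bytes per frame. Below is a minimal sketch of reading one frame and locating the two planes; it assumes a tightly packed file with no row padding, and the plane-pointer ordering is an assumption to be checked against the code in src/.

    #include <cstdint>
    #include <fstream>
    #include <iostream>
    #include <vector>

    int main() {
        const int width = 3840, height = 2160;                       // example resolution
        const size_t y_size  = static_cast<size_t>(width) * height;  // Y plane bytes
        const size_t uv_size = y_size / 2;                           // interleaved UV plane bytes
        std::vector<uint8_t> frame(y_size + uv_size);

        std::ifstream file("./images/test_3840x2160_nv12.yuv", std::ios::binary);
        if (!file.read(reinterpret_cast<char*>(frame.data()),
                       static_cast<std::streamsize>(frame.size()))) {
            std::cerr << "failed to read one NV12 frame\n";
            return 1;
        }

        // Plane pointers as they would be handed to infer_pipe via in_yuv
        // (assumed order: Y plane first, then the interleaved UV plane).
        uint8_t* planes[2] = { frame.data(), frame.data() + y_size };
        (void)planes;  // pass to infer_pipe together with format and line_size

        std::cout << "Y plane: " << y_size << " bytes, UV plane: "
                  << uv_size << " bytes\n";
        return 0;
    }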

Interface Specification

The following interface can be used to integrate the detector into your own code. A usage sketch follows at the end of this section.

  • Initialization and loading of the TRT engine

    void infer_init(int height, int width, const char* engine_path, float ratio)
    • height: Height of the video/image

    • width: Width of the video/image

    • engine_path: Path to the engine

    • ratio: Scaling ratio for the input image, in the range (0, 1]; 0.5 is a typical value

  • Inference

    (1) If the input file is in image format:

    vector<int> infer_pipe_rgb(uint8_t *rgb)
    • rgb: Memory address of the RGB image stored in planar form

    Returns a vector that sequentially stores x_min, x_max, y_min, y_max for each detected box.

    (2) If the input file is in YUV format:

    vector<int> infer_pipe(uint8_t **in_yuv, int format, int* line_size);
    • in_yuv: Memory addresses of the NV12 Y and UV planes

    Returns a vector that sequentially stores x_min, x_max, y_min, y_max for each detected box.

  • Destruction

    void destroyObj()
    

    Call this function after all inferences are completed.
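
Putting the pieces together, a minimal usage sketch for the planar-RGB path might look like the following. The header name craft_infer.h and the way the planar buffer is filled are assumptions; adapt them to the actual headers and data layout in src/.

    #include <cstdint>
    #include <iostream>
    #include <vector>

    #include "craft_infer.h"  // hypothetical header exposing the interface above

    int main() {
        const int width = 1920, height = 1080;

        // 1. Load the TensorRT engine; 0.5 is the typical scaling ratio.
        infer_init(height, width,
                   "./pretrained/craft_mlt_25k_fp16_dynamic_shape.cache", 0.5f);

        // 2. Build a planar RGB buffer (all R values, then all G, then all B;
        //    this is an assumed interpretation of "stored in planar form").
        //    Fill it from your image loader or video decoder.
        std::vector<uint8_t> planar_rgb(static_cast<size_t>(3) * width * height, 0);

        // 3. Run detection; the result stores x_min, x_max, y_min, y_max per box.
        std::vector<int> boxes = infer_pipe_rgb(planar_rgb.data());
        for (size_t i = 0; i + 3 < boxes.size(); i += 4) {
            std::cout << "box: x_min=" << boxes[i]     << ", x_max=" << boxes[i + 1]
                      << ", y_min=" << boxes[i + 2] << ", y_max=" << boxes[i + 3] << "\n";
        }

        // 4. Release the engine and associated resources after all frames are processed.
        destroyObj();
        return 0;
    }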