Wraps open-source OCR models, table detection, layout recognition, and related capabilities behind a unified service interface. Only deepdoc is integrated at present; more services are planned.
- In deepdoc, pdfplumber reads the text embedded in the PDF, while OCR recognizes text from the page image. Text from pdfplumber is preferred when it is available; scanned documents rely entirely on OCR.
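The preference rule described above can be sketched as follows. This is only an illustration of the idea, not deepdoc's actual code: the function name `select_page_text`, the `run_ocr` callback, and the `MIN_EMBEDDED_CHARS` threshold are all hypothetical. In a real pipeline the embedded text would typically come from pdfplumber's `page.extract_text()`.

```python
from typing import Callable, Optional

# Hypothetical threshold: a page with fewer embedded characters than this
# is treated as scanned. The heuristic deepdoc really uses may differ.
MIN_EMBEDDED_CHARS = 1


def select_page_text(
    embedded_text: Optional[str],
    run_ocr: Callable[[], str],
) -> tuple[str, str]:
    """Return (text, source), where source is "pdfplumber" or "ocr".

    embedded_text is whatever the PDF text layer yielded for the page
    (None or empty for scanned pages). run_ocr is only called when the
    text layer is unusable, so the expensive OCR step is skipped otherwise.
    """
    text = (embedded_text or "").strip()
    if len(text) >= MIN_EMBEDDED_CHARS:
        return text, "pdfplumber"
    return run_ocr().strip(), "ocr"
```

Passing OCR as a callback keeps the sketch independent of any particular OCR engine and makes the fallback easy to test.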
- Python >= 3.11 (conda recommended)
- GPU memory > 6 GB
- tensorrt == 10.0.1
- CUDA == 12.3 (other versions may work theoretically, but have not been tested)
- pycuda == 2024.1
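A quick way to compare an environment against the list above is a small script like this one. It is a sketch under stated assumptions: the helper names `python_ok` and `package_status` are made up for illustration, only the Python and package versions are checked (GPU memory and the CUDA toolkit version are not), and the version strings reported by the installed packages are assumed to match the pinned ones exactly.

```python
import sys
from importlib import metadata

# Pinned versions copied from the requirements list above.
REQUIRED_PACKAGES = {"tensorrt": "10.0.1", "pycuda": "2024.1"}
MIN_PYTHON = (3, 11)


def python_ok(version_info=sys.version_info) -> bool:
    """True when the interpreter is at least Python 3.11."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def package_status(required=REQUIRED_PACKAGES) -> dict:
    """Map each package name to (installed_version_or_None, matches_pin)."""
    status = {}
    for name, wanted in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        status[name] = (installed, installed == wanted)
    return status
```

Calling `python_ok()` and `package_status()` after the install steps gives a quick summary of what still differs from the tested configuration.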
- Install Python 3.11 (conda is recommended).
- Install poetry:
curl -sSL https://install.python-poetry.org | python3 -
- Install dependencies using poetry:
poetry install
- Run the project:
uvicorn main:app
- Install TensorRT. Note that the package name tensorrt-cu12 must be changed to match your CUDA version:
pip install tensorrt==10.0.1
pip install tensorrt-cu12==10.0.1
- Install pycuda:
pip install pycuda==2024.1
Below are screenshots of my environment for reference:
After the service starts, see the documentation for usage: http://localhost:8000/docs
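To confirm from a script that the service is up, a minimal check along these lines may help. It assumes the defaults shown above (host localhost, port 8000, documentation at /docs); the helper names `docs_url` and `docs_reachable` are illustrative and not part of the project.

```python
import urllib.error
import urllib.request


def docs_url(host: str = "localhost", port: int = 8000) -> str:
    """Build the documentation URL for a given host and port."""
    return f"http://{host}:{port}/docs"


def docs_reachable(url: str, timeout: float = 3.0) -> bool:
    """True when the documentation page answers with HTTP 200.

    Connection errors, timeouts, and HTTP error responses all count
    as "not reachable" rather than raising.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

For example, `docs_reachable(docs_url())` returns False while uvicorn is still starting or if it was launched on a different port.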