bds-hw3a

HW3a for NTU 112-1 Big Data System Course

Primary language: Python. License: MIT.

Stage-A Document Intelligence

Installation

conda create -n docint python=3.11
conda activate docint
conda install -c conda-forge ghostscript
pip install -r requirements.txt

Target

Build an AI system that, given a set of PDF files and a search query, finds which table contains the desired information.

Input

  1. PDF files that contain only tables
  2. The search keywords

Output

The whole table that contains the desired information
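To make the input/output contract concrete, here is a minimal sketch of the search step only. It assumes the tables have already been extracted from the PDFs into lists of rows of cell strings; the extraction step is not shown. The names `char_ngrams`, `score_table`, and `find_best_table` are hypothetical, and the character-bigram overlap score is an illustrative stand-in chosen because it works for Chinese queries without word segmentation. It is not necessarily the method this project uses.

```python
from typing import List, Optional, Set

Table = List[List[str]]  # a table is a list of rows; each row is a list of cell strings


def char_ngrams(text: str, n: int = 2) -> Set[str]:
    """Return the set of character n-grams of text, ignoring whitespace and case."""
    compact = "".join(text.lower().split())
    if len(compact) < n:
        # Text shorter than n: keep it as a single gram (or nothing if empty).
        return {compact} if compact else set()
    return {compact[i:i + n] for i in range(len(compact) - n + 1)}


def score_table(table: Table, query: str) -> float:
    """Fraction of the query's n-grams that occur in at least one cell of the table."""
    query_grams = char_ngrams(query)
    if not query_grams:
        return 0.0
    table_grams: Set[str] = set()
    for row in table:
        for cell in row:
            # Build n-grams per cell so they never span two different cells.
            table_grams |= char_ngrams(cell)
    return len(query_grams & table_grams) / len(query_grams)


def find_best_table(tables: List[Table], query: str) -> Optional[int]:
    """Return the index of the highest-scoring table, or None if nothing matches."""
    if not tables:
        return None
    scores = [score_table(table, query) for table in tables]
    best = max(range(len(tables)), key=lambda i: scores[i])
    return best if scores[best] > 0 else None


# Hypothetical, hand-written tables standing in for extracted PDF tables.
tables = [
    [["主題", "說明"], ["監督式學習", "分類"]],
    [["主題", "說明"], ["非監督式學習的應用", "分群、降維"]],
]
query = "非監督式學習的應用"
best_index = find_best_table(tables, query)
```

With the sample data above, `tables[best_index]` is the whole table to return. A real system would likely need a stronger relevance measure (for example, embeddings) when the query is paraphrased rather than quoted from the table.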

Example

The given PDF file: [image]

Search query:

非監督式學習的應用 (Applications of unsupervised learning)

Output: [image]

How to contribute

  • Everyone completes the whole project and opens a pull request; do not edit the main branch directly.
  • If your code is acceptable, we will merge it into the main branch.

Background Knowledge

Azure Document Intelligence

Test Document

Document 1 Document 2