Classifies the OCR output of tweet screenshots and uses the result to search for the original tweets.
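A rough sketch of the idea, not the project's actual code: OCR the screenshot with pytesseract, then turn the recovered author and text into a Twitter search link. The function names `ocr_lines` and `build_search_url`, the search URL format, and the use of the `from:` operator are assumptions made for illustration. The classification step (deciding which OCR line is the name, author, text, or source) is left out here.

```python
from urllib.parse import urlencode


def ocr_lines(image_path):
    """Return the non-empty text lines that Tesseract finds in a screenshot."""
    # Imported lazily so the URL helper below works without the OCR stack.
    import pytesseract
    from PIL import Image

    text = pytesseract.image_to_string(Image.open(image_path))
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_search_url(author_handle, text):
    """Build a search link from a (hypothetical) predicted author and tweet text."""
    parts = []
    if author_handle:
        # "from:" restricts the search to one account; drop a leading "@".
        parts.append("from:" + author_handle.lstrip("@"))
    if text:
        # Collapse OCR line breaks and quote the text for an exact-phrase match.
        parts.append('"' + " ".join(text.split()) + '"')
    return "https://twitter.com/search?" + urlencode({"q": " ".join(parts)})
```

For example, `build_search_url("@jack", "just setting up my twttr")` yields a link whose `q` parameter is `from:jack "just setting up my twttr"`, URL-encoded.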
- pytesseract
- TweetCapture (and its requirements)
- Twitter API access (for generating and annotating screenshots)
- requests
- sklearn
- pandas
- Pillow
- set up config.yaml with your local paths and bearer token
- unpack the training data to use it
- run inference.py with an image path and get a search link back (CLI implemented)
>inference.py
usage: inference.py [-h] [-p] [-t] [input_img]
positional arguments:
input_img path to a .png image of a tweet screenshot. If none is given, it will run as -t
optional arguments:
-h, --help show this help message and exit
-p, --preds prints predicted name, author, text, source
-t, --test ignores input and tries a random screenshot from the tests folder
- run screenshot_handling to generate and annotate more data (CLI not available)
- run training.py to train the model (CLI not available)
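The inference.py usage shown above can be approximated with a small argparse setup. This is a sketch reconstructed from the help text only, not the script's real source; `build_parser` and `resolve_args` are hypothetical names, and falling back to test mode when no image is given follows the "If none is given, it will run as -t" note.

```python
import argparse


def build_parser():
    """Parser mirroring: inference.py [-h] [-p] [-t] [input_img]"""
    parser = argparse.ArgumentParser(prog="inference.py")
    parser.add_argument(
        "input_img",
        nargs="?",  # optional positional, so the script can run without a path
        default=None,
        help="path to a .png image of a tweet screenshot",
    )
    parser.add_argument(
        "-p", "--preds", action="store_true",
        help="prints predicted name, author, text, source",
    )
    parser.add_argument(
        "-t", "--test", action="store_true",
        help="ignores input and tries a random screenshot from the tests folder",
    )
    return parser


def resolve_args(argv=None):
    """Parse arguments and fall back to test mode when no image is given."""
    args = build_parser().parse_args(argv)
    if args.input_img is None:
        args.test = True
    return args
```

With this sketch, `resolve_args([])` behaves like `-t`, while `resolve_args(["-p", "shot.png"])` keeps the given path and enables prediction printing.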