
A template project that combines BLIP image captioning and facial feature extraction to emulate a simple web-based photo manager.

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).


Hobby project; everything is still in progress. Reuse at your own risk!

Implemented features

- Image indexing (recursive album creation and loading to database)
- Image captioning with multiple GPUs
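
As a rough illustration of the two implemented features, the sketch below walks a photo directory recursively, builds one album record per folder that contains images, and then splits a list of images round-robin so that each GPU worker could caption its own share. It is not the project's actual code: the function names `index_albums` and `shard`, the record fields, and the extension list are all assumptions made for this example, and only the standard library is used.

```python
import os
from pathlib import Path

# Assumed set of image extensions; the real project may accept others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic"}


def index_albums(root):
    """Walk `root` recursively; return one record per directory that holds images."""
    root = Path(root)
    albums = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        images = sorted(
            name for name in filenames
            if Path(name).suffix.lower() in IMAGE_EXTENSIONS
        )
        if images:
            albums.append({
                # Album name is the path relative to the root ("." for the root itself).
                "album": Path(dirpath).relative_to(root).as_posix(),
                "images": images,
            })
    return albums


def shard(items, num_workers):
    """Split `items` round-robin into `num_workers` lists, e.g. one per GPU."""
    return [items[i::num_workers] for i in range(num_workers)]
```

Each record returned by `index_albums` is a plain dictionary, so it could be stored with a call such as pymongo's `insert_many`. With two GPUs, `shard(paths, 2)` gives each worker every second image.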

Upcoming features

- Image face annotation
- Face index



Python dependencies

mamba create -n aialbum transformers python=3.10 pytorch fairscale dask-mongo torchaudio pytorch-cuda=11.7 iopath cudatoolkit=11.7 -c pytorch -c nvidia -c iopath -c conda-forge

mamba install rapids=23.02 -c rapidsai -c conda-forge -c nvidia
pip install "fastapi[all]" pillow pillow-heif einops spacy pycocoevalcap cryptography==38.0.4 motor pymongo pyyaml networkx omegaconf timm decord opencv-python webdataset jupyterlab torchvision
pip install tensorflow
pip install gdown

Installation on Raspberry Pi 4 B (8 GB)

mamba create -n aialbum transformers python=3.10 pytorch dask-mongo iopath -c pytorch -c iopath -c conda-forge

pip install fairscale torchaudio "fastapi[all]" pillow pillow-heif einops spacy pycocoevalcap cryptography==38.0.4 motor pymongo pyyaml networkx omegaconf timm opencv-python webdataset jupyterlab torchvision

Post install

python -m spacy download en_core_web_sm

Building DLIB library



Docker for MongoDB

export UID=$(id -u) 
export GID=$(id -g)

docker-compose up -d mongo

Running the dev server

uvicorn server.__main__:create_app --factory --reload


python -m server
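
The `--factory` flag in the uvicorn command above tells uvicorn that `create_app` is a function to call, not an application object: uvicorn calls it once and serves whatever ASGI application it returns. The project's own `create_app` is not shown here and presumably builds a FastAPI app. The sketch below is a dependency-free stand-in that returns a bare ASGI callable, only to show the shape of the factory pattern; the response body and the request counter are made up for illustration.

```python
import json


def create_app():
    """Build and return a new ASGI application (hypothetical stand-in)."""
    state = {"requests": 0}  # state that belongs to this app instance only

    async def app(scope, receive, send):
        if scope["type"] != "http":
            return  # this sketch only handles plain HTTP requests
        state["requests"] += 1
        body = json.dumps({"status": "ok", "requests": state["requests"]}).encode()
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"application/json")],
        })
        await send({"type": "http.response.body", "body": body})

    return app
```

Saved as a file named `sketch.py` (a hypothetical name), this could be served with `uvicorn sketch:create_app --factory`. Because configuration and state are created inside the factory, each call yields an independent app, which keeps reloads and tests isolated from one another.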

Clean DB entries

python server/clean.py