Finetune-LLAVA-NEXT

This repository contains code for fine-tuning the LLaVA-1.6-Mistral-7B (LLaVA-NeXT) multimodal LLM.


Fine-Tune LLAVA Repository

This repository demonstrates how to fine-tune LLaVA for document-understanding tasks such as parsing data and extracting JSON information from images. It provides guidance on handling different datasets and fine-tuning the model effectively.

Repository Structure

Notebooks

  • data_exploration/
    Contains notebooks for exploring the Cord-V2 and DocVQA datasets.

  • fine-tuning/
    Includes:

    • A notebook for fine-tuning LLAVA 1.6 7B
    • A notebook for testing the fine-tuned model
  • test_model/
    Contains multiple notebooks for testing:

    • LLAVA 1.5 7B and 13B
    • LLAVA 1.6 7B, 13B, and 34B
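The testing notebooks load one of these checkpoints and run image-question inference. A minimal sketch of that flow, assuming the Hugging Face `llava-hf/llava-v1.6-mistral-7b-hf` checkpoint and its Mistral-style prompt template (the exact model IDs and prompts used in this repository's notebooks may differ):

```python
# Minimal inference sketch for LLaVA 1.6 (Mistral 7B). The checkpoint ID and
# prompt template below are assumptions based on the public llava-hf releases,
# not taken from this repository's notebooks.

def build_prompt(question: str) -> str:
    """LLaVA-1.6-Mistral checkpoints expect a Mistral [INST] template with
    an <image> placeholder where the vision tokens are inserted."""
    return f"[INST] <image>\n{question} [/INST]"

if __name__ == "__main__":
    import torch
    from PIL import Image
    from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

    model_id = "llava-hf/llava-v1.6-mistral-7b-hf"  # assumed checkpoint
    processor = LlavaNextProcessor.from_pretrained(model_id)
    model = LlavaNextForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    image = Image.open("data/sample.png")  # any sample image from data/
    inputs = processor(
        text=build_prompt("Extract the receipt contents as JSON."),
        images=image,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    print(processor.decode(output[0], skip_special_tokens=True))
```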

Source Code

  • src/
    Contains a Streamlit app to showcase the performance of the fine-tuned model.

    To run the dashboard:

    1. In Terminal 1:
      python src/serve_model.py
    2. In Terminal 2:
      streamlit run src/app.py

    Open the dashboard at http://localhost:8501/ and upload any of the 20 sample images in the data folder to view the results.
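    The split between serve_model.py and app.py suggests a simple client-server design: the first terminal hosts the model behind a local HTTP endpoint, and the Streamlit app forwards the uploaded image to it. A hedged sketch of the client side, assuming a hypothetical JSON endpoint at http://localhost:5000/predict and illustrative field names (the real port and payload schema live in src/serve_model.py):

```python
import base64

def make_payload(image_bytes: bytes, question: str) -> dict:
    """Package an uploaded image and question as a JSON-safe payload.
    The field names here are illustrative; match them to serve_model.py."""
    return {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "question": question,
    }

if __name__ == "__main__":
    import requests
    import streamlit as st

    st.title("Fine-tuned LLaVA demo")
    uploaded = st.file_uploader("Upload a sample image", type=["png", "jpg"])
    if uploaded is not None:
        payload = make_payload(uploaded.read(), "Extract the fields as JSON.")
        # Hypothetical endpoint; the real URL is defined in src/serve_model.py.
        resp = requests.post("http://localhost:5000/predict", json=payload, timeout=120)
        st.json(resp.json())
```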

Installation

  1. Install dependencies from requirements.txt:

    pip install -r requirements.txt
  2. Install additional requirements:

    pip install git+https://github.com/huggingface/transformers.git
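Installing transformers from source is presumably required because LLaVA-NeXT support is recent. If you would rather pin a stable release, the LlavaNext classes first shipped around version 4.39; treat that floor as an assumption and verify it against the transformers changelog. A small version-gate sketch:

```python
def meets_minimum(installed: str, minimum: str = "4.39.0") -> bool:
    """Compare dotted version strings numerically, ignoring non-numeric
    suffixes such as '.dev0'. 4.39.0 is an assumed floor for LlavaNext
    support; confirm against the transformers release notes."""
    def parse(v: str) -> tuple:
        parts = []
        for piece in v.split(".")[:3]:
            digits = "".join(ch for ch in piece if ch.isdigit())
            parts.append(int(digits) if digits else 0)
        return tuple(parts)
    return parse(installed) >= parse(minimum)

if __name__ == "__main__":
    import transformers
    if not meets_minimum(transformers.__version__):
        raise SystemExit(
            "transformers is too old for LLaVA-NeXT; "
            "reinstall from source as shown above."
        )
```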

Repository URL

Clone this repository using:

git clone https://github.com/Farzad-R/Finetune-LLAVA-NEXT.git

Additional Resources