/picturebook-gen

Generate picture books by analyzing text on each page and generating an image based on important parts of the text using pre-trained GPT models and Computer Vision techniques. Designed for easy integration into a Flask web app.

Primary LanguageCSS

Picture Book Generator

This project aims to create a picture book by analyzing pages from a book and generating images based on the important text on each page. We will use a combination of Natural Language Processing (NLP) and Computer Vision techniques to achieve this.

Project Structure

Data Collection

The first step in the project is to collect the pages from the book. This can be done manually, and then we will use Optical Character Recognition (OCR) to convert the book pages into text. Once we have the text for each page, we can use it to generate images.

Text Analysis

Next, we will analyze the text on each page using NLP techniques. We will use a pre-trained NLP model, such as GPT-3, to analyze the text and extract the important parts of the text. Once we have identified the important text, we can convert it into something that [text to image api] AI can understand.

Image Generation

The final step is to generate images based on the important text extracted from the book pages. We will use Computer Vision techniques to generate images based on the text. We will be using GPT3.5 to generate prompts for [text to image api] to execute.

Technologies and Packages

Here are some technologies and packages that can be used for the project:

(bound to change)

  • Programming Language: Python
  • NLP Libraries: GPT-4
  • Computer Vision Libraries: [text to image api]
  • OCR Libraries: Tesseract OCR
  • Web Framework: Flask
  • Database: SQLite

Github Integration

To keep track of the project and collaborate with others, we will use Github. We will create a Github repository for the project and use Git for version control. We will also use Github Issues to track tasks and milestones, and Github Actions for continuous integration and deployment.

Getting Started

First, clone the repository: git clone git@github.com:LLRHook/picturebook-gen.git

Then create a virtual environment: cd repo_name python3 -m venv venv

Activate the environment: .\venv\Scripts\activate

Install the required packages: pip install -r requirements.txt

Then set up the environment variables in the .env file