As a parent, I want to provide my kid with a fun and engaging activity during the holidays, so that they stay occupied and motivated to draw.
This project is a Python application that allows users to upload a drawing, which is then processed to generate a story based on the image. The story is then converted into an audio file, and a new image is generated using AI. The application utilizes:
- Streamlit for creating the web interface.
- Azure Vision API for image captioning.
- gTTS for converting the story text to audio.
- OpenAI for generating the story.
- DeepAI for generating a new image.
- Clone the repository:
git clone https://github.com/harmeetsokhi/story-generator.git
cd story-generator
- Install the required dependencies:
poetry install
- Set up the necessary credentials and API keys for Azure Vision API 4.0, OpenAI, and DeepAI.
- Run the application:
poetry run streamlit run story_app.py
Before running the application, make sure to set up the necessary configurations:
- Set the following environment variables:
export VISION_ENDPOINT='YOUR_AZURE_VISION_API_ENDPOINT with https'
export VISION_ENDPOINT_2='YOUR_AZURE_VISION_API_ENDPOINT without https and without any /'
export VISION_KEY='YOUR_AZURE_VISION_API_KEY'
export OPEN_API_KEY='YOUR_OPENAI_API_KEY'
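A missing or empty variable usually only shows up as an API error partway through a run, so it can help to check the configuration up front. The following is a minimal sketch, not part of the project: the helper names `missing_env_vars` and `endpoint_host` are hypothetical, and the idea that `VISION_ENDPOINT_2` is simply the host part of `VISION_ENDPOINT` is an assumption inferred from the placeholder descriptions above.

```python
import os
from urllib.parse import urlsplit

# Variable names as listed in this README.
REQUIRED_VARS = ("VISION_ENDPOINT", "VISION_ENDPOINT_2", "VISION_KEY", "OPEN_API_KEY")


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or blank.

    `environ` defaults to os.environ; passing a dict makes the check testable.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def endpoint_host(endpoint):
    """Return the host of a full URL, e.g. to derive VISION_ENDPOINT_2.

    Assumption: `endpoint` includes the scheme (https://...). Without a
    scheme, urlsplit cannot find the host and this returns an empty string.
    """
    return urlsplit(endpoint).netloc
```

Calling `missing_env_vars()` before starting the app returns an empty list when everything is configured, and otherwise returns the names that still need to be exported.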
- Once the application is running, your web browser will open with the app.
- The web interface will be displayed, providing a file upload option.
- Click on the "Upload File" button and select the child's drawing you want to process.
- The application will send the image to Azure Vision API for image captioning, and retrieve the generated caption.
- The caption is then used as a prompt for the OpenAI model to generate a story based on the image.
- The generated story is converted into an audio file.
- The DeepAI model is then utilized to create a new image based on the story.
- The resulting story and image are displayed on the web interface.
- Repeat the process to generate stories and images from different drawings.
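The steps above form a simple chain: drawing, caption, story, then audio and a new image. The sketch below shows that flow only; it is not the code in `story_app.py`. The names `StoryResult` and `run_pipeline` are hypothetical, and each stage is passed in as a plain function so the real Azure Vision, OpenAI, gTTS, and DeepAI calls can be swapped in or replaced with stubs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StoryResult:
    caption: str    # caption returned by the image captioning stage
    story: str      # story generated from the caption
    audio: bytes    # audio rendering of the story
    image_url: str  # location of the newly generated image


def run_pipeline(
    drawing: bytes,
    caption_image: Callable[[bytes], str],
    write_story: Callable[[str], str],
    synthesize_audio: Callable[[str], bytes],
    generate_image: Callable[[str], str],
) -> StoryResult:
    """Run the four stages in the order described in the usage steps."""
    caption = caption_image(drawing)   # drawing -> caption
    story = write_story(caption)       # caption -> story
    audio = synthesize_audio(story)    # story -> audio
    image_url = generate_image(story)  # story -> new image
    return StoryResult(caption, story, audio, image_url)
```

Keeping the stages as injected functions means the chain can be exercised with stand-in lambdas, without API keys or network access, before wiring in the real services.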
This project utilizes the following technologies:
- Streamlit - For creating the web application.
- Azure Vision API - For image captioning.
- OpenAI - For generating the story.
- DeepAI - For generating a new image.
If you have any questions, suggestions, or feedback, please feel free to contact:
- Name: Harmeet Sokhi