/voicedraw-generative-ai

A collection of examples showcasing diverse forms and models within generative artificial intelligence.

Primary language: Python

VoiceDraw

  1. VoiceDraw: An artificial intelligence project that generates images and voice output from voice and text inputs.

  2. Components:

    • recorder.py: Records voice input.
    • painter.py: Converts text input into images.
    • transcriptor.py: Converts recorded audio into text.
  3. Framework Structure:

    • app.py: The entry point of the application. It integrates the functionality provided by recorder.py, painter.py, and transcriptor.py.
  4. Streamlit Web Interface:

    • Uses Streamlit to provide the web interface for user interaction.
  5. Example of Multimodal Functionality:

    • Demonstrates the system working across multiple modalities at once: voice and text inputs drive image generation and voice output.
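
The way app.py ties the three modules together can be pictured as a record, transcribe, paint pipeline. The sketch below is only an illustration of that flow, not the project's actual code: the class `VoiceDrawPipeline` and the stub functions `fake_record`, `fake_transcribe`, and `fake_paint` are hypothetical names invented here, and the real recorder.py, painter.py, and transcriptor.py may expose different interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class VoiceDrawPipeline:
    """Hypothetical glue object; the real app.py may be organised differently."""

    record: Callable[[], bytes]         # stands in for recorder.py
    transcribe: Callable[[bytes], str]  # stands in for transcriptor.py
    paint: Callable[[str], bytes]       # stands in for painter.py

    def run(self) -> Tuple[str, bytes]:
        audio = self.record()            # capture voice input
        prompt = self.transcribe(audio)  # audio -> text prompt
        image = self.paint(prompt)       # text prompt -> image bytes
        return prompt, image


# Stub stages (assumptions) so the sketch runs without a microphone or a model.
def fake_record() -> bytes:
    return b"a red fox in the snow"


def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_paint(prompt: str) -> bytes:
    return f"<image of {prompt}>".encode("utf-8")


pipeline = VoiceDrawPipeline(fake_record, fake_transcribe, fake_paint)
prompt, image = pipeline.run()
print(prompt)
```

Passing the three stages in as callables keeps the sketch independent of any particular speech or image model; in a Streamlit front end, `run()` would typically be called from a button handler and its results displayed on the page.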