voice_GPT

voice_GPT is a browser-based voice assistant that combines speech recognition, a GPT language model, and text-to-speech in a single web page.

Features

  • Speech-to-text conversion
  • Text-to-speech synthesis
  • Integration with GPT models for natural language processing
  • Customizable API settings
  • Adjustable speech speed
  • Markdown rendering for assistant responses
  • Conversation history management
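
Conversation history management can be pictured as an ordered list of role-tagged messages. The sketch below is illustrative only: the function names (`createHistory`, `addMessage`, `clearHistory`) and the `{ role, content }` message shape are assumptions, not taken from the project's source.

```javascript
// Hypothetical helpers; names and message shape are assumptions for illustration.

// Create an empty conversation history.
function createHistory() {
  return [];
}

// Return a new history with one message appended (the input is not mutated).
function addMessage(history, role, content) {
  const text = String(content).trim();
  if (text === "") {
    return history; // ignore empty input
  }
  return [...history, { role, content: text }];
}

// Reset the conversation, as the "Clear" button does in the UI.
function clearHistory() {
  return createHistory();
}

let history = createHistory();
history = addMessage(history, "user", "What is the capital of France?");
history = addMessage(history, "assistant", "Paris.");
console.log(history.length); // 2
history = clearHistory();
console.log(history.length); // 0
```

Returning a new array on each call keeps earlier snapshots intact, which can make it simpler to re-render the conversation or undo a step.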

Usage

  1. Open the HTML file in a modern web browser.
  2. Enter your API Base URL and API Key in the provided input fields.
  3. Select a GPT model from the dropdown menu.
  4. Interact with the assistant using one of the following methods:
    • Type your message in the input field and click "Send" or press Enter.
    • Click the microphone button to start voice input, speak your message, and click again to stop recording.
  5. The assistant will respond with text, which can be read aloud by clicking the speaker icon next to the response.
  6. Adjust the TTS speed using the slider if needed.
  7. Use the "Clear" button to reset the conversation.
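
Steps 2 to 4 above amount to assembling an HTTP request from the base URL, the key, the model, and the conversation so far. A minimal sketch, assuming the API is OpenAI-compatible (a `POST` to `/chat/completions` with a bearer token and a `{ model, messages }` JSON body); this README does not confirm the endpoint or payload, and `buildChatRequest` and `sendChat` are hypothetical names.

```javascript
// Assumption: an OpenAI-compatible chat completions API. The endpoint path,
// headers, and payload shape are not confirmed by this README.

// Build the URL and fetch options for one chat request.
function buildChatRequest(baseUrl, apiKey, model, messages) {
  if (!baseUrl || !apiKey || !model) {
    throw new Error("API Base URL, API Key, and model are all required");
  }
  // Drop trailing slashes so the joined URL has no "//".
  const url = baseUrl.replace(/\/+$/, "") + "/chat/completions";
  return {
    url,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Send the request and return the assistant's reply text.
async function sendChat(baseUrl, apiKey, model, messages) {
  const { url, options } = buildChatRequest(baseUrl, apiKey, model, messages);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();
  // Assumed response shape: choices[0].message.content
  return data.choices[0].message.content;
}
```

Keeping request construction separate from the network call means the URL and headers can be inspected or tested without sending anything.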

Use Cases

The assistant suits scenarios such as:

  1. Personal productivity: Use it as a hands-free digital assistant for quick information retrieval, task management, or brainstorming.

  2. Accessibility: Provide an alternative interface for users with visual impairments or those who prefer voice interactions.

  3. Language learning: Practice conversations with the AI in different languages, using both text and speech inputs.

  4. Customer support: Deploy it as a first-line customer service tool that handles basic inquiries and provides information.

  5. Educational tool: Use it as an interactive tutor for various subjects, allowing students to ask questions and receive explanations verbally or in text form.

  6. Content creation: Use the model to generate ideas, outlines, or drafts for written content.

  7. Meeting assistant: Use it during meetings for real-time note-taking, summarization, or quick fact-checking.

  8. Multilingual communication: Use the model to help speakers of different languages communicate.

Requirements

  • Modern web browser with JavaScript enabled
  • Internet connection
  • Valid API key for the API that serves the chosen GPT model
  • Microphone access (for voice input feature)
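
These requirements can be checked at startup before the voice controls are enabled. The sketch below assumes the app relies on the browser's Web Speech API (`SpeechRecognition` or `webkitSpeechRecognition`, plus `speechSynthesis`) and on `navigator.mediaDevices.getUserMedia`; this README does not say which speech APIs are used, and `findMissingFeatures` is a hypothetical helper.

```javascript
// Assumption: speech features come from the browser's Web Speech API.
// `env` is a window-like object, passed in so the check also runs outside a browser.
function findMissingFeatures(env) {
  const missing = [];
  if (typeof env.fetch !== "function") {
    missing.push("fetch");
  }
  if (!env.SpeechRecognition && !env.webkitSpeechRecognition) {
    missing.push("speech recognition");
  }
  if (!env.speechSynthesis) {
    missing.push("speech synthesis");
  }
  // This only confirms the microphone API exists; the user must still grant permission.
  const media = env.navigator && env.navigator.mediaDevices;
  if (!media || typeof media.getUserMedia !== "function") {
    missing.push("microphone API");
  }
  return missing;
}

// In a browser: const missing = findMissingFeatures(window);
```

An empty result means every checked API is present; a non-empty one could be used to disable the microphone or speaker buttons and fall back to text-only chat.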

Note

Ensure that you have the necessary permissions and comply with the terms of service for the API you are using. Keep your API key secure and do not share it publicly.
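
One small way to avoid exposing the key is to mask it wherever it might be displayed or logged. This is a sketch only; `maskApiKey` is a hypothetical helper and is not part of the project.

```javascript
// Hypothetical helper: hide all but the last four characters of a key.
function maskApiKey(key) {
  if (typeof key !== "string" || key.length <= 4) {
    return "****"; // too short to reveal any part safely
  }
  return "*".repeat(key.length - 4) + key.slice(-4);
}

console.log(maskApiKey("sk-example-key-1234"));
```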