Programmable-Voice-Assistant

My Graduation Project

Primary language: TypeScript. License: MIT.

My Assistant

A customizable voice assistant for the desktop.

ElectronJS, Angular, SASS, PrimeNG, Angular Material, Rasa Framework, YAML, Webhook, Multitenancy, Model Linguistic Feature, Google Calendar API, Google Cloud, Redis, Django, Async Channels, PostgreSQL, Daphne

Your support is very much appreciated! Star the project on GitHub.

📹 Video Demo:

Project.Demo.mp4

🎉 About the Project

MA stands for My Assistant. The aim of this project is to develop a programmable voice assistant desktop application that provides users with a highly customizable and extensible interface to interact with their devices. Our goal is to provide a voice assistant that can be tailored to the needs and preferences of individual users, and that is optimized for desktop and laptop computers.
To achieve this aim, we have set the following objectives:

  • Develop a voice assistant that is highly customizable and extensible, allowing users to add their own commands and actions based on their needs and preferences.
  • Optimize the voice assistant for desktop and laptop computers, providing users with a convenient and intuitive interface to interact with their devices.
  • Provide users with a seamless and integrated experience by enabling the voice assistant to interact with other tools and applications on the desktop.
  • Ensure the voice assistant is secure and respects user privacy by implementing robust data privacy and security measures.

📷 Screenshots

Screenshots: My Commands, Create Command, Marketplace, Tray.

🎤 What is a Programmable Voice Assistant

A programmable voice assistant is a customizable virtual voice assistant. Rather than limiting users to preset options, it gives them more control and flexibility over its features and functionality. Users can define their own automation scenarios and workflows and craft new commands tailored to their needs and preferences. Beyond these customization options, it also supports the features of a traditional voice assistant.

🔍 Why Programmable Voice Assistant

  • Customization: Customize the assistant's behavior and capabilities to suit individual needs and workflows.
  • Flexibility: Design complex automation scenarios and workflows beyond predefined actions.
  • Extensibility: Integrate with external services and APIs to enhance functionality.
  • Open-source Community: Benefit from community-contributed resources for expanded capabilities.
  • Privacy and Security: Host locally for data control and end-to-end encryption.
  • Learning and Exploration: Gain insights into AI and voice-based interaction systems through hands-on experience.

✍️ History Of Voice Assistants

The idea of voice assistants has been around for decades: Bell Laboratories introduced the first voice recognition system in the 1950s. However, it was not until the late 1990s that voice software began to gain popularity, with the introduction of IBM’s “ViaVoice” and Dragon Systems’ “Dragon NaturallySpeaking”. These early systems were limited by their inability to recognize natural language, their high cost, and the need for specialized hardware. In recent years, voice assistants have become increasingly prevalent due to the widespread adoption of smartphones and the emergence of smart speakers. Apple’s Siri, Amazon’s Alexa, and Google Assistant are some of the most popular voice assistants today. They let users interact with their devices using natural language and perform tasks such as setting reminders, playing music, and controlling smart home devices.

❎ The Problem with Traditional Voice Assistants

  • Fixed, Limited Automation Options:
    • Traditional voice assistants provide a predefined set of automation options that are often limited and generic in nature. These assistants offer a restricted range of actions or tasks that can be performed, limiting their usefulness in addressing diverse user needs. Users are confined to the predefined set of commands and actions, without the ability to tailor or expand the assistant's capabilities to match their specific requirements.
  • Lack of Customization:

    • Another drawback of traditional voice assistants is the lack of customization options. Users have limited control over modifying or enhancing the assistant's features to align with their preferences and unique needs. The inability to personalize or customize the assistant's behavior hinders its ability to adapt to individual users' workflows or specific requirements, limiting its overall utility.

✅ The Solution with My Assistant

Our voice assistant addresses the limited customization of traditional voice assistants by giving users extensive customization and personalization options. The key features of our solution include:

  • User-Crafted Automation Scenarios:

    • Our voice assistant empowers users to create their own automation scenarios and complex workflows, tailored to their specific needs. Users have the flexibility to define custom commands and actions, enabling them to automate repetitive tasks and streamline their workflows effectively.
  • Easy-to-Use Interface:

    • We offer an intuitive and user-friendly interface that simplifies the process of creating custom commands. Users can easily set up simple phrases or triggers that activate the desired automation, without the need for advanced technical knowledge.
  • Commands Library:

    • To further enhance customization options, our voice assistant includes a comprehensive Commands library. Users can access a collection of pre-built automation commands created by both other users and our core team. This allows users to reuse existing commands, leverage community-contributed automations, and easily expand the capabilities of their voice assistant.

⭐ Features

  • Account Creation and Login:
    • Users can create an account securely to access personalized features, command management, and interaction with the application.
    • The system allows users to log in with their credentials, maintaining user authentication throughout the session.
  • Create New Commands:
    • Users can create new commands by providing metadata and uploading the required files.
    • Metadata information includes command name, description, parameters, patterns, script, script type, dependency file, and command icon.
    • Uploaded files (script, dependency, icon) are validated, saved, and linked to the command.
  • Edit Existing Commands:
    • Users can edit existing commands by modifying their metadata or uploading new files.
    • The system updates the command accordingly, including retraining the user model or regenerating the executable.
  • Delete Commands:
    • Users can easily delete their commands, and the system handles necessary cleanup tasks, such as removing the executable file and updating the user model.
  • Command Approval Workflow:
    • Users can submit their commands for approval by an admin to make them available in the marketplace.
    • The admin reviews and approves/rejects the command, updating its visibility accordingly.
    • Users receive feedback on the approval status through the notification service.
  • My Command Table:
    • Users can view a table displaying all commands they own, with options to edit and delete each command.
    • The table is visually organized and user-friendly, supporting sorting and filtering options.
  • Marketplace Command Installation:
    • Users can seamlessly install commands from the marketplace.
    • The command is added to the user's installed commands list, and the corresponding executable file is downloaded.
    • The system handles all necessary tasks, such as updating the user model and installing dependencies.
    • Users receive feedback on the installation status through the notification service.
  • Uninstall Installed Commands:
    • Users can uninstall commands they no longer need, and the system handles confirmation and cleanup tasks.
    • The command is removed from the user's installed commands list, and the corresponding executable file is deleted.
    • The system handles all necessary tasks, such as updating the user model and removing dependencies.
    • Users receive feedback on the uninstallation status through the notification service.
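
As a rough illustration of the metadata listed under Create New Commands, the sketch below models a command as a TypeScript type and checks a few obvious constraints before upload. The field names, the allowed script types, and the validation rules are assumptions made for this example; the project's actual schema may differ.

```typescript
// Hypothetical command metadata, based on the fields named in the feature list.
// The concrete script types below are assumptions, not the project's real options.
type ScriptType = "python" | "javascript" | "shell";

interface CommandParameter {
  name: string;
  type: "string" | "number";
}

interface CommandMetadata {
  name: string;
  description: string;
  parameters: CommandParameter[];
  patterns: string[];       // example phrases that should trigger the command
  scriptFile: string;       // path of the uploaded script
  scriptType: ScriptType;
  dependencyFile?: string;  // optional dependency file
  iconFile?: string;        // optional command icon
}

// Returns a list of human-readable problems; an empty list means the checks passed.
function validateCommand(cmd: CommandMetadata): string[] {
  const errors: string[] = [];
  if (cmd.name.trim() === "") errors.push("name is required");
  if (cmd.patterns.length === 0) errors.push("at least one pattern is required");
  if (cmd.scriptFile.trim() === "") errors.push("script file is required");
  const names = cmd.parameters.map((p) => p.name);
  const hasDuplicate = names.some((n, i) => names.indexOf(n) !== i);
  if (hasDuplicate) errors.push("parameter names must be unique");
  return errors;
}

// Usage: a made-up "open a website" command.
const openWebsite: CommandMetadata = {
  name: "open_website",
  description: "Opens a website in the default browser",
  parameters: [{ name: "site", type: "string" }],
  patterns: ["open github", "go to youtube"],
  scriptFile: "open_website.py",
  scriptType: "python",
};
console.log(validateCommand(openWebsite)); // prints []
```

Returning a list of messages instead of throwing on the first problem lets a form show every issue at once, which suits a create-command screen.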

🔍 How My Assistant Works

  • User Interaction: Users interact with the assistant through a desktop app with a user-friendly interface.
  • Voice Input: Users can record voice commands using the app's microphone feature or enter text commands if they prefer.
  • Speech-to-Text (STT) Conversion: The recorded voice commands are sent to the Speech-to-Text engine, which converts the audio input into text.
  • Natural Language Processing (NLP): The text input is processed by the Natural Language Processing (NLP) module, powered by the Rasa framework. The NLP module extracts intent and entities from the user's input, understanding the user's request.
  • Command Mapping: The NLP module maps the user's intent to specific commands available in the system, determining the appropriate action to be taken.
  • Command Execution: Based on the command mapping, the system executes the corresponding action or task, such as opening an application, performing a specific operation, or retrieving information.
  • Text-to-Speech (TTS) Conversion: Upon completing the requested task, the response is sent to the Text-to-Speech engine, converting the text into an audible response.
  • Response Playback: The voice assistant plays back the response to the user, providing real-time feedback on the executed action.
  • Customization and Personalization: The assistant stands out by letting users create, edit, and manage their own commands, adding a high level of customization and personalization to the user experience.
  • Integration with Marketplace: The app features a marketplace where users can browse and install commands created by others, extending the assistant's capabilities through community-contributed resources.
  • Approval Workflow: Users can submit their custom commands for admin approval. The admin reviews and approves or rejects the command, updating its visibility accordingly.
  • Data Security and Privacy: The assistant prioritizes data security and privacy. The application is self-hosted so that user data remains on the user's device, and end-to-end encryption is applied for secure interactions.
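
The steps above can be read as a single pipeline. The following sketch wires them together with injected stages, so the flow can be followed without real Speech-to-Text, Rasa, or Text-to-Speech services. `PipelineStages`, `handleUtterance`, and the other names are hypothetical and do not come from the project's code.

```typescript
// Illustrative only: each stage is passed in, so none of this depends on a real service.
interface ParsedIntent {
  intent: string;
  entities: Record<string, string>;
}

interface PipelineStages {
  speechToText(audio: Uint8Array): Promise<string>;
  parse(text: string): Promise<ParsedIntent>;
  execute(intent: string, entities: Record<string, string>): Promise<string>;
  textToSpeech(text: string): Promise<Uint8Array>;
}

interface AssistantReply {
  transcript: string;
  responseText: string;
  responseAudio: Uint8Array;
}

async function handleUtterance(
  input: Uint8Array | string,
  stages: PipelineStages
): Promise<AssistantReply> {
  // Typed text skips the STT step; recorded audio is transcribed first.
  const transcript =
    typeof input === "string" ? input : await stages.speechToText(input);
  // NLP: extract the intent and its entities from the text.
  const { intent, entities } = await stages.parse(transcript);
  // Command mapping and execution are folded into one injected stage here.
  const responseText = await stages.execute(intent, entities);
  // TTS: turn the textual response into audio for playback.
  const responseAudio = await stages.textToSpeech(responseText);
  return { transcript, responseText, responseAudio };
}
```

Passing the stages in as an object keeps the orchestration testable with stubs and makes it easy to swap one engine for another.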

Workflow:

  • The core components of the system are the Desktop App, which serves as the user-facing interface, and the API, which acts as the central component handling communication between various components and external services. The NLP Manager is responsible for natural language processing, while the Executable Builder generates executable files for the commands. The system also integrates with Google's Speech-to-Text and Text-to-Speech APIs for voice-based interactions.
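
To make the NLP Manager's role more concrete, here is a minimal sketch of turning a parse result into a command call. It assumes the result has the shape returned by Rasa's HTTP `/model/parse` endpoint (an intent with a confidence score plus a list of entities). The `toCommandCall` helper, the confidence threshold, and the idea of matching intent names to installed command names are illustrative assumptions; the project may exchange this data differently, for example over a socket channel.

```typescript
// Subset of the Rasa parse result used here; entity values are simplified to strings.
interface RasaParseResult {
  text: string;
  intent: { name: string; confidence: number };
  entities: { entity: string; value: string }[];
}

interface CommandCall {
  command: string;
  args: { [name: string]: string };
}

// Hypothetical mapping step: returns null when the intent is too uncertain
// or does not correspond to a command the user has installed.
function toCommandCall(
  parsed: RasaParseResult,
  installedCommands: string[],
  minConfidence = 0.6
): CommandCall | null {
  if (parsed.intent.confidence < minConfidence) return null;
  if (installedCommands.indexOf(parsed.intent.name) === -1) return null;
  const args: { [name: string]: string } = {};
  for (const e of parsed.entities) {
    args[e.entity] = e.value; // a later entity with the same name overwrites an earlier one
  }
  return { command: parsed.intent.name, args };
}
```

Returning `null` instead of guessing gives the caller a clear point at which to ask the user to rephrase.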

🛠️ Tech Stack and Tools

The tools used in this project.

| Tool | Description |
| --- | --- |
| ElectronJS | A framework for building cross-platform desktop applications using web technologies. |
| Angular | Platform for building dynamic web applications. |
| SASS | CSS preprocessor for creating scalable and maintainable styles. |
| PrimeNG | UI component library to enhance the visual and interactive aspects of the application. |
| Angular Material | UI component library that follows Google's Material Design guidelines. |
| Rasa Framework | Framework for natural language processing to understand user commands and interactions. |
| YAML | YAML library used for automating the training process. |
| Webhook | A way for two applications to communicate with each other by sending HTTP requests. |
| Multitenancy | A way to allow multiple users to share the same application without interfering with each other. |
| Model Linguistic Feature | A feature that is used to represent the linguistic content of a text. |
| Google Calendar API | API for interacting with Google Calendar to schedule events. |
| Google Cloud | Cloud platform used for hosting and deploying the application. |
| Redis | In-memory data store used for caching and performance optimization. |
| Django | Web framework used for the backend server and database management. |
| Async Channels | A library that allows you to create asynchronous communication channels in Django. |
| PostgreSQL | Relational database management system used for data storage. |
| Daphne | ASGI server used to deploy Django applications. |
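
The Rasa and YAML entries above mention automating the training process. One plausible way to do that, sketched here under assumptions, is to generate Rasa NLU training data from each command's patterns. The output follows Rasa's documented YAML training-data layout (`nlu` entries with an `intent` and an `examples` block), but the `CommandTrainingData` type and the `toRasaNluYaml` function are hypothetical. The string is built by hand to keep the example dependency-free, whereas the project itself reportedly uses a YAML library.

```typescript
// Hypothetical input: one entry per command, with its example utterances.
interface CommandTrainingData {
  intent: string;      // assumed to be a YAML-safe identifier, e.g. "open_website"
  patterns: string[];  // single-line utterances, optionally annotated as [value](entity)
}

// Builds a Rasa NLU training file as a string.
// Assumes patterns contain no line breaks; a real implementation should escape or reject them.
function toRasaNluYaml(commands: CommandTrainingData[], version = "3.1"): string {
  const lines: string[] = [`version: "${version}"`, "nlu:"];
  for (const cmd of commands) {
    lines.push(`- intent: ${cmd.intent}`);
    lines.push("  examples: |");
    for (const pattern of cmd.patterns) {
      lines.push(`    - ${pattern}`);
    }
  }
  return lines.join("\n") + "\n";
}

// Usage with a made-up command:
console.log(
  toRasaNluYaml([
    { intent: "open_website", patterns: ["open [github](site)", "go to [youtube](site)"] },
  ])
);
// prints:
// version: "3.1"
// nlu:
// - intent: open_website
//   examples: |
//     - open [github](site)
//     - go to [youtube](site)
```

Regenerating this file whenever a command is created, edited, installed, or removed would be one way to keep a per-user model in sync with that user's commands.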

🪜 Source Code Directory Structure

A quick look at the top-level files and directories:

.
├── electronApp
│   ├── build
│   ├── CommandManger
│   ├── DB
│   │   ├── models
│   │   └── queries
│   ├── scriptRunner
│   ├── stt
│   ├── textToScript
│   │   └── models
│   ├── tray
│   └── tts
├── src
│   ├── app
│   │   ├── auth
│   │   │   ├── _helper
│   │   │   ├── interface
│   │   │   ├── pipes
│   │   │   │   └── only-one-error
│   │   │   ├── register-component
│   │   │   ├── services
│   │   │   │   ├── auth-service
│   │   │   │   └── not-match-validation
│   │   │   └── user-card
│   │   ├── core
│   │   │   └── services
│   │   │       ├── electron
│   │   │       └── notification
│   │   ├── recorder
│   │   │   ├── components
│   │   │   │   ├── audio-visualizer
│   │   │   │   ├── chat
│   │   │   │   ├── home-page
│   │   │   │   └── mic
│   │   │   └── services
│   │   │       ├── rasa
│   │   │       │   └── rasa.socket
│   │   │       ├── stt
│   │   │       └── tts
│   │   ├── scripts-table
│   │   │   ├── components
│   │   │   │   ├── command-management
│   │   │   │   │   ├── abstract-commands
│   │   │   │   │   ├── create-command-form
│   │   │   │   │   │   └── parameter-field
│   │   │   │   │   ├── edit-command-form
│   │   │   │   │   ├── installed-commands
│   │   │   │   │   │   └── installed-commands-service
│   │   │   │   │   └── my-commands
│   │   │   │   │       └── my-command-service
│   │   │   │   └── marketplace-component
│   │   │   │       ├── card-preview
│   │   │   │       └── command-card
│   │   │   ├── interfaces
│   │   │   └── services
│   │   ├── shared
│   │   │   ├── components
│   │   │   │   ├── google-token
│   │   │   │   ├── loader
│   │   │   │   ├── modal
│   │   │   │   ├── notifications
│   │   │   │   │   ├── interfaces
│   │   │   │   │   ├── notification-card
│   │   │   │   │   └── notification-list
│   │   │   │   └── sidebar
│   │   │   ├── directives
│   │   │   │   └── webview
│   │   │   └── snackbar-service
│   │   └── tray
│   ├── assets
│   │   ├── fonts
│   │   ├── i18n
│   │   └── icons
│   └── environments
├── stt
│   └── temp
└── test-files

    1. electronApp: contains the source code for the Electron app.
    2. src: contains the source code for the Angular app.
      • app: contains the source code for the app.
        • auth: contains the source code for the authentication module.
        • core: contains the source code for the core module.
        • recorder: contains the source code for the recorder module and its components {audio-visualizer, chat, home-page, mic}.
        • scripts-table: contains the source code for the scripts-table module and its components {command-management, marketplace-component}.
        • shared: contains the source code for the shared module and its components {google-token, loader, modal, notifications, sidebar}.
        • tray: contains the source code for the tray module.
      • assets: contains the assets (e.g. fonts, i18n files, icons).
      • environments: contains the environment variables.
    3. stt: contains the source code for the speech-to-text module.
    4. test-files: contains the test files.

🤔 Usage

    1. Install the voice assistant application on your desktop or device.
    2. Launch the application and create a new account, or log in with your credentials.
    3. Customize your voice assistant by creating new commands. Provide metadata such as name, description, and patterns for each command.
       Example: Command to open a website
    4. Edit existing commands to modify their metadata or update the associated files.
    5. Uninstall commands you no longer need with a single click, then confirm the deletion.
    6. Submit custom commands for admin approval through the interface.
    7. Open your personalized command table, where all your commands are listed for easy management.
    8. Browse the marketplace to discover and install commands created by other users or the core team.
    9. Use the voice chat component to interact with the assistant through voice commands.
    10. If you encounter any issues with voice commands, use the text input in the chat component instead.
    11. Enjoy the flexibility and customization options of your new voice assistant!

🚴‍♂️ Getting Started

🟡 Prerequisites

📦 Package Manager

🔌 This project uses npm as its package manager. Update it to the latest version with:

npm install npm@latest -g

💻 Angular CLI: Install the Angular Command Line Interface (CLI) globally on your system by running the following command in your terminal or command prompt:

npm install -g @angular/cli

🌐 Electron: Install ElectronJS, which is used for building cross-platform desktop applications. You can install it globally via npm:

npm install -g electron

📦 PrimeNG: Install PrimeNG, a collection of rich UI components for Angular, via npm:

npm install primeng --save
npm install primeicons --save

⚙️ Configuration: Before running the application, you may need to configure some settings. Please refer to the configuration files or documentation provided with the project.

💻 Running the App: Now that you've installed the necessary dependencies and configured the project, you can run the app using the Angular CLI:

ng serve

🎉 Congratulations! You have successfully set up the project and can now explore and interact with the customizable voice assistant application. Happy coding!

🔑 Environment Requirements

To run this project, you will need the following:

  • Google Cloud bucket with the following files:
    - google-credentials.json
    - google-token.json

🔧 Run for Development

  • Clone the repository
git clone https://github.com/your-username/your-repo.git
  • Install dependencies
npm install
  • Run the app
npm run start

⛑️ Future Work

  • Add Integration with Third-Party Services: Automate interactions with online platforms using voice commands.
  • Expanded Language Support: Let users write custom commands in their preferred programming language.
  • Workflow and Visualization: Create workflows with multiple commands and intuitive visualization.
  • Multilingual Support: Include Arabic and more languages for broader accessibility.
  • Integration with LLM Models: Improve natural language understanding with LLM models.

♥️ Community

The MA community can be found on:

Where you can ask questions, suggest new ideas, and get support.

🐣 Contributors


Mohamed Zaky

💻

Ahmad Eid

💻

Ahmad Bedeir

💻

⚠️ License

Licensed under the GPL-v3 License.