ToyGeniusLab invites kids into a world where they can create and personalize AI-powered toys. By blending technology with imaginative play, we not only empower young minds to explore their creativity but also help them become comfortable with harnessing AI, fostering tech skills in a fun and interactive way.
- 🎨 Customizable AI Toys: Kids design their toy's personality and interactions.
- 📚 Educational: A hands-on introduction to AI, programming, and technology.
- 💡 Open-Source: A call to the community for ongoing enhancement of software and 3D-printed parts.
- 🤖 Future Enhancements: Plans to add servos, displays, and more for a truly lifelike toy experience.
- 🔊 Enhanced Audio Detection: Improved silence detection and audio processing for better interactions.
- 🎯 Debug Mode: Detailed feedback about audio levels and device status for easier troubleshooting.
- Python 3.x
- OpenAI API key
- Eleven Labs API key
- FFmpeg (`brew install ffmpeg`)
- MPV (`brew install mpv`)
- Required Python packages (see `requirements.txt`)
```bash
# Clone the repository
git clone https://github.com/sidu/toygeniuslab.git
cd toygeniuslab

# Install requirements
pip install -r requirements.txt

# Install system dependencies (macOS)
brew install ffmpeg mpv
```
Set up your API keys as environment variables:
```bash
# OpenAI API Key
export OPENAI_API_KEY="your-api-key-here"

# Eleven Labs API Key
export ELEVEN_API_KEY="your-eleven-api-key-here"
```
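Before launching a character, you can quickly confirm both keys are visible to Python. This is just a minimal sanity check; the variable names match the exports above:

```python
import os

# Verify the API keys this project reads from the environment
for key in ("OPENAI_API_KEY", "ELEVEN_API_KEY"):
    print(key, "is set" if os.environ.get(key) else "is MISSING")
```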
The easiest way to get started is to use our example agent implementation:

```bash
python character_agent.py --config configs/ghost.yaml
```
This creates an interactive agent that:
- Manages the character's lifecycle
- Handles continuous listening and response cycles
- Provides visual feedback during interactions
- Manages conversation state
Try different character configurations from the `configs/` directory or create your own!
The system now includes enhanced audio debugging features:
- Displays available audio devices
- Shows real-time audio levels
- Provides detailed silence detection feedback
- Reports ambient noise levels
To optimize audio detection:
- Run the program to see audio device information
- Monitor the debug output for audio levels
- Adjust the silence threshold in your config if needed (see the sketch below)
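For example, if the debug output shows ambient noise hovering near the default threshold, you might raise the silence-related values in your character's YAML config. The numbers below are illustrative starting points, not tuned defaults, and the comments sketch each knob's likely effect:

```yaml
silence_threshold: 0.02                        # raise if background noise triggers false "speech"
ambient_noise_level_threshold_multiplier: 2.5  # scales the measured ambient baseline
silence_count_threshold: 40                    # quiet frames required before recording stops
```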
Before running the project, make sure a portable Bluetooth microphone and speaker are connected to your computer and selected as the default input and output devices. For the best experience, we recommend a mini Bluetooth speaker/mic combo such as the LEICEX Mini Speaker from Amazon (~$10).
- Connect your Bluetooth microphone and speaker to your computer following the manufacturer's instructions (a quick Python check to verify devices follows this list).
- On Windows:
  - Right-click the Speaker icon in the taskbar and select "Open Sound settings."
  - Under the "Input" section, select your Bluetooth microphone from the dropdown.
  - Under the "Output" section, select your Bluetooth speaker from the dropdown.
- On macOS:
  - Open "System Preferences" and click on "Sound."
  - Go to the "Input" tab and select your Bluetooth microphone.
  - Go to the "Output" tab and select your Bluetooth speaker.
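If you want to double-check which devices Python sees, the `sounddevice` package offers a quick way. This is an assumed convenience, not necessarily what the project uses internally; the built-in debug mode also prints device info on startup:

```python
import sounddevice as sd

# List every audio device and show which ones are the current defaults
print(sd.query_devices())
print("Default (input, output) device indices:", sd.default.device)
```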
- Download and print the Mario template.
- After pairing a Bluetooth speaker/microphone with your computer, insert it into the paper toy.
- Run the AI toy program with `python pet.py mario.yaml` in your terminal. Get ready for interactive fun!
- Begin by downloading the blank template. You can digitally color it or use markers and crayons for a hands-on approach. You can also grab a slightly edited version of it from our repo here (it has a blank face for more creative options).
- Insert a Bluetooth speaker/microphone into your custom-designed toy, ensuring it's paired with your computer first.
- Make a copy of an existing toy's config by running `cp mario.yaml mytoy.yaml`.
- Update the `system_prompt` property in `mytoy.yaml` according to the personality you want your toy to have (a sketch follows this list).
- Optionally, update the `voice_id` property in `mytoy.yaml` with the ID of the voice you'd like your toy to have from ElevenLabs.io.
- Activate your AI toy by running `python pet.py mytoy.yaml` in your terminal. Enjoy your creation's company!
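As a sketch, the two properties you'll usually touch in `mytoy.yaml` look like this (the prompt text and voice ID below are placeholders, not real values):

```yaml
system_prompt: "You are Sparky, a cheerful robot dog who loves telling kids riddles."
voice_id: "your-eleven-labs-voice-id"
```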
Caught a fun moment with your AI toy? We'd love to see it! Share your experiences and creative toy designs on social media using the hashtag #ToyGeniusLab. Let's spread the joy and inspiration far and wide!
Love ToyGeniusLab? Give us a ⭐ on GitHub to stay connected and receive updates on new features, enhancements, and community contributions. Your support helps us grow and inspire more creative minds!
We're dreaming big for ToyGeniusLab's next steps and welcome your brilliance to bring these ideas to life. Here's what's on our horizon:
- More pets
- Solid local end-to-end execution: local LLM, fast local transcription, and local TTS
- Stable Diffusion-based generation of custom pets
- Latency improvements
- Interruption handling
- Vision reasoning, with local VLLM support
- Servos for movement
- 3D printable characters
- "Pet in a box" (Raspberry-Pi)
Help shape ToyGeniusLab's tomorrow: Raise PRs for innovative features or spark conversations in our Discussions. 🌟
Overview of how the toy works.
The project consists of two main components:
- AICharacter (`ai_character.py`)
  - Core character capabilities (speech, vision, thinking)
  - Audio processing and silence detection
  - Visual animation (optional)
  - LLM integration (GPT-4, Groq, Ollama)
  - Text-to-speech via ElevenLabs
- Character Agent (`character_agent.py`)
  - Creates and manages AICharacter instances
  - Implements the interaction loop
  - Handles user feedback
  - Manages conversation flow
  - Provides progress indicators
The two work together in a simple loop: the agent uses AICharacter to listen for the user, think up a response, and speak it.
You can create your own agent implementation using the `AICharacter` class. Here's a basic example:
```python
import yaml

from ai_character import AICharacter


class AICharacterAgent:
    def __init__(self, config_path):
        # Initialize character with configuration
        self.character = AICharacter(config=self.load_config(config_path))
        # Add callback for speaking state changes
        self.character.add_speaking_callback(self.speaking_state_changed)

    def load_config(self, config_path):
        # Load the character's YAML configuration
        # (one minimal implementation of the helper the example assumes)
        with open(config_path) as f:
            return yaml.safe_load(f)

    def speaking_state_changed(self, is_speaking):
        if is_speaking:
            print("\nSpeaking", end='', flush=True)
        else:
            print("\nSpeech finished!")

    def run(self):
        while True:
            # Listen for user input
            user_input = self.character.listen()
            # Get AI response
            response = self.character.think_response(user_input)
            # Speak the response
            self.character.speak(response)
```
Alternatively, you can construct an `AICharacter` directly from a config dictionary:

```python
character = AICharacter(config={
    'sampling_rate': 44100,
    'num_channels': 1,
    'dtype': 'float32',
    'silence_threshold': 0.01,
    'system_prompt': 'Your character prompt here',
    # ... other config options
}, debug=False)
```
- `listen()`: Records audio until silence is detected and returns the transcribed text.

  ```python
  user_input = character.listen()  # Returns transcribed text or None
  ```

- `think_response(user_input)`: Generates an AI response based on the user input.

  ```python
  response = character.think_response("Hello!")  # Returns AI-generated response
  ```

- `speak(text)`: Converts text to speech and animates the character (if images are provided).

  ```python
  character.speak("Hello, I'm your AI character!")
  ```
You can also register a callback that fires whenever the speaking state changes:

```python
def on_speaking_changed(is_speaking):
    print("Character is speaking:" if is_speaking else "Character finished speaking")

character.add_speaking_callback(on_speaking_changed)
```
- `get_speaking_state()`: Returns whether the character is currently speaking
- `cleanup()`: Properly closes resources when done (see the pattern below)
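Since the agent's `run()` loop runs until interrupted, a minimal shutdown pattern using the example agent class above might look like this (assuming Ctrl-C is how you stop it):

```python
agent = AICharacterAgent("configs/ghost.yaml")
try:
    agent.run()
except KeyboardInterrupt:
    pass
finally:
    # Release audio and other resources held by the character
    agent.character.cleanup()
```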
```yaml
sampling_rate: 44100
num_channels: 1
dtype: "float32"
silence_threshold: 0.01
ambient_noise_level_threshold_multiplier: 2.0
silence_count_threshold: 30
max_file_size_bytes: 10485760
enable_lonely_sounds: false
enable_squeak: false
system_prompt: "You are a friendly AI character..."
voice_id: "your-eleven-labs-voice-id"
model: "gpt-4-vision-preview"
character_closed_mouth: "assets/closed.png"  # Optional: for visual animation
character_open_mouth: "assets/open.png"      # Optional: for visual animation
enable_vision: true
greetings:
  - "Hello!"
  - "Hi there!"
lonely_sounds:
  - "sound1.mp3"
  - "sound2.mp3"
```
See `character_agent.py` for a complete implementation example.
MIT