GPT-4o for Windows, macOS, and Ubuntu
Documentation · Explore the capabilities »
Hi, this is an alternative project that brings the ChatGPT macOS app experience to Windows and Linux, built as a fresh and stable codebase. For now you can easily install it as a Python library, but we are preparing a pipeline to provide native install scripts (.exe).
Powered by Upsonic Tiger 🐅, a function hub for LLM agents.
*Python 3.9 or higher is required
```console
pip install 'gpt-computer-assistant[base]'
```

To run gpt-computer-assistant, simply type:

```console
computerassistant
```
GCA currently supports fully local text-to-speech with a Microsoft open-source model. To enable and use it, run this command:
```console
pip3 install 'gpt-computer-assistant[local_tts]'
```

After that, go to the LLM settings section and select microsoft_local in the TTS combobox.
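If you want to check that a Microsoft open-source TTS model runs locally on your machine before wiring it into GCA, here is a minimal sketch using the SpeechT5 checkpoints from Hugging Face. This is an assumption about the kind of model involved, not necessarily the exact one GCA uses, and it needs the transformers, datasets, torch, and soundfile packages:

```python
import torch
import soundfile as sf
from datasets import load_dataset
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

# Load the SpeechT5 text-to-speech model and vocoder (Microsoft open-source checkpoints)
processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

# SpeechT5 needs a speaker embedding; this uses a sample x-vector from a public dataset
embeddings = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embedding = torch.tensor(embeddings[7306]["xvector"]).unsqueeze(0)

# Synthesize a short sentence and write it to a wav file
inputs = processor(text="Local text to speech is working.", return_tensors="pt")
speech = model.generate_speech(inputs["input_ids"], speaker_embedding, vocoder=vocoder)
sf.write("tts_check.wav", speech.numpy(), samplerate=16000)
```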
GCA currently supports fully local speech-to-text with the OpenAI Whisper tiny model. To enable and use it, run these commands:
```console
pip3 install 'gpt-computer-assistant[local_stt]'
```

Installing ffmpeg:

```console
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# on Arch Linux
sudo pacman -S ffmpeg

# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg

# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg

# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg
```

After that, go to the LLM settings section and select openai_whisper_local in the STT combobox.
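To sanity-check the local speech-to-text pieces outside of GCA, here is a minimal sketch with the openai-whisper package (assumed here to be the dependency behind openai_whisper_local); sample.wav is a placeholder for any audio file you have:

```python
import shutil
import whisper  # openai-whisper package

# ffmpeg must be on PATH for Whisper to decode audio files
assert shutil.which("ffmpeg") is not None, "ffmpeg not found on PATH"

# Download and load the tiny model, the same size GCA's local STT option uses
model = whisper.load_model("tiny")

# Transcribe a short local recording (replace sample.wav with your own file)
result = model.transcribe("sample.wav")
print(result["text"])
```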
We have added Pvporcupine integration. To use it, you need to install an additional library:
```console
pip3 install 'gpt-computer-assistant[wakeword]'
```

After that, please enter your Pvporcupine API key and enable the wake word feature.
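If you want to confirm your Pvporcupine access key is valid before enabling the feature, here is a minimal sketch using the pvporcupine package directly; the keyword below is just one of Picovoice's built-ins, not anything GCA-specific:

```python
import pvporcupine

# Create a detector with your access key and a built-in keyword to verify the key works
porcupine = pvporcupine.create(
    access_key="YOUR_PVPORCUPINE_ACCESS_KEY",  # placeholder: paste your own key
    keywords=["picovoice"],
)
print("Sample rate:", porcupine.sample_rate)
print("Frame length:", porcupine.frame_length)
porcupine.delete()
```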
You can also create crewai agents and use them inside the gpt-computer-assistant GUI and tools:
```console
pip3 install 'gpt-computer-assistant[base]'
pip3 install 'gpt-computer-assistant[agentic]'
```

```python
from gpt_computer_assistant import Agent, start

manager = Agent(
    role='Project Manager',
    goal='understands project needs and assist coder',
    backstory="""You're a manager at a large company.""",
)

coder = Agent(
    role='Senior Python Coder',
    goal='writing python scripts and copying to clipboard',
    backstory="""You're a python developer at a large company.""",
)

start()
```

Now you are able to add custom tools that run in the agentic infra and assistant processes.
```python
from gpt_computer_assistant import Tool, start

@Tool
def sum_tool(first_number: int, second_number: int) -> int:
    """Useful for when you need to sum two numbers together."""
    return first_number + second_number

start()
```

Now you can use your GPT Computer Assistant remotely! The GUI stays active; there are just a few steps:
```console
pip3 install 'gpt-computer-assistant[base]'
pip3 install 'gpt-computer-assistant[api]'
```

Start the assistant with the API enabled:

```console
computerassistant --api
```

Then control it from another Python process:

```python
from gpt_computer_assistant.remote import remote
output = remote.input("Hi, how are you today?", screen=False, talk=False)
print(output)
remote.just_screenshot()
remote.talk("TTS test")
# Other Functionalities
remote.reset_memory()
remote.profile("default")
remote.enable_predefined_agents()
remote.disable_predefined_agents()
remote.enable_online_tools()
remote.disable_online_tools()
# Custom tools
remote.install_library("numpy")
@remote.custom_tool
def hobbies():
    """Returns hobbies."""
    import numpy  # installed above with remote.install_library("numpy")
    return "Tennis, volleyball, and swimming."
# Create an operation, it will inform the user with top bar animation
with remote.operation("Scanning")
remote.wait(5)
color_name = remote.ask("What is your favorite color")
remote.set_background_color(255, 255, 255)
remote.set_opacity(200)
remote.set_border_radius(3)
remote.collapse()
remote.expand()
```
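Putting a few of these calls together, a small remote workflow could look like the sketch below. It only uses the calls shown above and assumes `computerassistant --api` is already running:

```python
from gpt_computer_assistant.remote import remote

# Ask the user a question through the assistant UI
name = remote.ask("What should I call you?")

# Show a top-bar operation while the assistant works
with remote.operation("Preparing a greeting"):
    remote.wait(2)
    answer = remote.input(f"Write a one-line greeting for {name}", screen=False, talk=False)

# Read the result out loud
remote.talk(answer)
```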
| Feature | Status | Target Release |
|---|---|---|
| Clear Chat History | Completed | Q2 2024 |
| Long Audios Support (Split 20mb) | Completed | Q2 2024 |
| Text Inputs | Completed | Q2 2024 |
| Just Text Mode (Mute Speech) | Completed | Q2 2024 |
| Added profiles (Different Chats) | Completed | Q2 2024 |
| More Feedback About Assistant Status | Completed | Q2 2024 |
| Local Model Vision and Text (With Ollama, and vision models) | Completed | Q2 2024 |
| Our Customizable Agent Infrastructure | Completed | Q2 2024 |
| Supporting Groq Models | Completed | Q2 2024 |
| Adding Custom Tools | Completed | Q2 2024 |
| Click on something on the screen (text and icon) | Completed | Q2 2024 |
| New UI | Completed | Q2 2024 |
| Native Applications, exe, dmg | Failed (Agentic Infra libraries not supported for now) | Q2 2024 |
| Collaborative Speaking with Different Voice Models on Long Responses | Completed | Q2 2024 |
| Auto Stop Recording When You Complete Talking | Completed | Q2 2024 |
| Wake Word | Completed | Q2 2024 |
| Continuous Conversations | Completed | Q2 2024 |
| Adding More Capabilities on Device | Completed | Q2 2024 |
| Local TTS | Completed | Q3 2024 |
| Local STT | Completed | Q3 2024 |
| Tray Menu | Completed | Q3 2024 |
| Global Hotkey | On the way | Q3 2024 |
| DeepFace Integration (Facial Recognition) | Planned | Q3 2024 |
At this time we have many infrastructure elements in place. We simply aim to provide everything that is already in the ChatGPT app.
| Capability | Status |
|---|---|
| Local LLM with Vision (Ollama) | OK |
| Local text-to-speech | OK |
| Local speech-to-text | OK |
| Screen Read | OK |
| Click on a Text or Icon on the screen | OK |
| Move to a Text or Icon on the screen | OK |
| Typing Something | OK |
| Pressing Any Key | OK |
| Scrolling | OK |
| Microphone | OK |
| System Audio | OK |
| Memory | OK |
| Open and Close App | OK |
| Open a URL | OK |
| Clipboard | OK |
| Search Engines | OK |
| Writing and running Python | OK |
| Writing and running SH | OK |
| Using your Telegram Account | OK |
| Knowledge Management | OK |
| Add more tools | ? |
If you enable it, your assistant will work with these teams:
| Team Name | Status |
|---|---|
| search_on_internet_and_report_team | OK |
| generate_code_with_aim_team_ | OK |
| Add your own one | ? |
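These teams appear to be what `remote.enable_predefined_agents()` from the remote interface toggles; here is a minimal sketch, assuming the API server is running:

```python
from gpt_computer_assistant.remote import remote

# Switch the predefined agent teams listed above on (and off again if needed)
remote.enable_predefined_agents()
# remote.disable_predefined_agents()
```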




