GPTComputerAutomation

GPTComputerAutomation is a tool that leverages OpenAI's GPT-4o model to analyze screenshots and UI elements, automating mouse clicks and typing actions on your computer.

[Full demo on x.com](https://twitter.com/Charles12509909/status/1796541659628638587

Features

UI Element Capture: Recursively captures visible, clickable UI elements with their coordinates.
Screenshot Functionality: Takes and encodes screenshots for analysis.
GPT Integration: Uses GPT-4o to intelligently determine the next click or typing action based on UI analysis.
Automation Execution: Performs the determined actions on your computer, mimicking user interaction.

Getting Started

Prerequisites

Before you begin, ensure you have met the following requirements:

Python 3.x installed on your machine
Pip (Python package installer)
Git (for cloning the repository)

Installation

Clone the Repository

git clone https://github.com/CK92149/GPTComputerAutomation.git
cd GPTComputerAutomation
```sh

Install the Dependencies
```
pip install -r requirements.txt
```sh
```
Set Up Environment Variables

Create a .env file in the root directory and add your OpenAI API key:
```
API_KEY=your_api_key_here
```sh
```

Usage

Run the Application
```
python main.py
```sh
```
Interact with the GUI
- Enter your request in the input field.
- Click "Perform Action" to let GPT analyze the UI elements and determine the next action.

Example Workflow

Capture UI Elements

The tool captures visible and clickable UI elements on the screen, saving them to a file (ui_elements.txt).
Take a Screenshot

A screenshot is taken and saved as screenshot.png, then encoded to base64 for analysis.
Analyze with GPT

The base64-encoded screenshot and UI elements are sent to the GPT model to determine the most likely next action.
Perform the Action

The determined action (mouse click or typing) is performed on your computer.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

If you want to contact me, you can reach me on x.com @Charles12509909

nullbio/GPTComputerAutomation