computer-use
There are 107 repositories under the computer-use topic.
bytedance/UI-TARS-desktop
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
trycua/cua
Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).
web-infra-dev/midscene
Your AI Operator for Web, Android, Automation & Testing.
bytebot-ai/bytebot
Bytebot is a self-hosted AI desktop agent that automates computer tasks through natural language commands, operating within a containerized Linux desktop environment.
simular-ai/Agent-S
Agent S: an open agentic framework that uses computers like a human
Upsonic/Upsonic
Agent Framework For Fintech
A9T9/RPA
Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.
e2b-dev/open-computer-use
AI computer use powered by open source LLMs and E2B Desktop Sandbox
showlab/ShowUI
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
trycua/acu
A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.
OpenAdaptAI/OpenAdapt
Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
zai-org/CogAgent
An open-sourced end-to-end VLM-based GUI Agent
deedy/mac_computer_use
A fork of Anthropic Computer Use that you can run on Mac computers to give Claude and other AI models autonomous access to your computer.
microsoft/WindowsAgentArena
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
instavm/clickclickclick
A framework to enable autonomous Android and computer use with any LLM (local or remote)
suitedaces/computer-agent
Desktop app powered by Claude’s computer use capability to control your computer
baryhuang/mcp-remote-macos-use
The only general AI agent that does NOT require an extra API key, giving you full control of your local and remote macOS machines from the Claude Desktop app
OS-Agent-Survey/OS-Agent-Survey
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).
BrowserOperator/browser-operator-core
Browser Operator - The AI browser with a built-in Multi-Agent platform! Open source alternative to ChatGPT Atlas, Perplexity Comet, Dia and Microsoft CoPilot Edge Browser
cyberdesk-hq/cyberdesk
Open source virtual desktops for AI agents
LLmHub-dev/open-computer-use
The Open Framework for autonomous virtual computer agents at scale, fully open-source, safe, auditable, and production-ready.
cuga-project/cuga-agent
CUGA is an open-source generalist agent for the enterprise, supporting complex task execution on web and APIs, OpenAPI/MCP integrations, composable architecture, reasoning modes, and policy-aware features.
aditya-nadkarni/spongecake
Spongecake is the easiest way to launch computer use agents.
BIGPPWONG/EdgeBox
A fully-featured, GUI-powered local LLM Agent sandbox with complete MCP protocol support. Features both a CLI and a full desktop environment, enabling AI agents to operate browsers, terminals, and other desktop applications just like humans. Based on E2B OSS code.
bilalonur/awesome-llm-os
A curated list of awesome resources, tools, research papers, and projects related to the concept of Large Language Model Operating Systems (LLM-OS).
777genius/os-ai-computer-use
AI controls your OS. OS AI Computer Use, OS and API agnostic. Currently runs on the Anthropic (Claude) API. Desktop app ready.
chatsci/Aeiva
A general AI agent framework that can be adapted to various tasks and environments.
jeffrey-zang/opus
On-device computer use agent that runs fully in the background 🪄
open-compass/MMBench-GUI
Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent in a hierarchical manner across multiple platforms, including Windows, Linux, macOS, iOS, Android and Web.
openmule/gacua
The World's First Out-of-the-Box Computer Use Agent Powered by Gemini-CLI @openmule
TurixAI/TuriX-CUA
This is the official website for TuriX Computer-use-Agent
AB498/computer-control-mcp
MCP server that provides computer control capabilities, like mouse, keyboard, OCR, etc. using PyAutoGUI, RapidOCR, ONNXRuntime. Similar to 'computer-use' by Anthropic. With Zero External Dependencies.
TongUI-agent/TongUI-agent
Release of code, datasets and model for our work TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials
presidio-oss/factif-ai
AI-powered computer control for automated testing. Factifai uses vision models (Claude, GPT-4o, Gemini) to interact with applications naturally - clicking, typing, and verifying results just like a human would.
lvqq/intelli-browser
✨ Use natural language to control your browser, powered by LLM and playwright
reidbarber/webmarker
Mark web pages for use with vision-language models