computer-use
There are 52 repositories under the computer-use topic.
bytedance/UI-TARS-desktop
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
web-infra-dev/midscene
Your AI Operator for Web, Android, Automation & Testing.
trycua/cua
Cua is Docker for Computer-Use AI Agents
nanobrowser/nanobrowser
Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.
Upsonic/Upsonic
The most reliable AI agent framework that supports MCP.
bytebot-ai/bytebot
Bytebot is a self-hosted AI desktop agent that automates computer tasks through natural language commands, operating within a containerized Linux desktop environment.
simular-ai/Agent-S
Agent S: an open agentic framework that uses computers like a human
A9T9/RPA
Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.
e2b-dev/open-computer-use
AI computer use powered by open source LLMs and E2B Desktop Sandbox
showlab/ShowUI
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
trycua/acu
A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.
OpenAdaptAI/OpenAdapt
Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
THUDM/CogAgent
An open-sourced end-to-end VLM-based GUI Agent
deedy/mac_computer_use
A fork of Anthropic Computer Use that you can run on Mac computers to give Claude and other AI models autonomous access to your computer.
microsoft/WindowsAgentArena
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
inclusionAI/AWorld
Build, evaluate and train General Multi-Agent Assistance with ease
suitedaces/computer-agent
Desktop app powered by Claude’s computer use capability to control your computer
BandarLabs/clickclickclick
A framework to enable autonomous Android and computer use using any LLM (local or remote)
baryhuang/mcp-remote-macos-use
The only general AI agent that does NOT require an extra API key, giving you full control of your local and remote macOS machines from the Claude Desktop app
OS-Agent-Survey/OS-Agent-Survey
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).
aditya-nadkarni/spongecake
Spongecake is the easiest way to launch computer use agents.
bilalonur/awesome-llm-os
A curated list of awesome resources, tools, research papers, and projects related to the concept of Large Language Model Operating Systems (LLM-OS).
chatsci/Aeiva
A general AI agent framework that can be adapted to various tasks and environments.
Optexity/ComputerGYM
Foundation Model Training Using Human Demonstrations
presidio-oss/factif-ai
AI-powered computer control for automated testing. Factifai uses vision models (Claude, GPT-4o, Gemini) to interact with applications naturally - clicking, typing, and verifying results just like a human would.
lvqq/intelli-browser
✨ Use natural language to control your browser, powered by LLM and playwright
reidbarber/webmarker
Mark web pages for use with vision-language models
SALT-NLP/PopupAttack
Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups
pnmartinez/simple-computer-use
Open source implementation for computer use, using light OCR models and LLMs. Get Android app in link below.
philfung/computer-use
Try Computer Use on your Mac with a few clicks
philfung/awesome-computer-use
Curated resources about automated GUI computer-use via LLMs. Highly opinionated, focus is on quality vs quantity.
ArchiveBox/abx-spec-behaviors
🧩 Proposal to allow user scripts like "expand comments", "hide popups", "fill out this form", etc. to be reusable across pure browser environments, puppeteer, playwright, extensions, AI tools, and many other contexts with minimal adjustment.
pokemonlabs/iris
This is the CRUD backend for our QA test application
Justmalhar/claude-ubuntu-os
Claude Computer Use API with Ubuntu that enables Claude to interact with and automate desktop environments. It allows seamless command execution through VNC or noVNC, enhancing productivity with secure, containerized workflows in GitHub Codespaces.
SawyerHood/computer-use-extension
This is OpenAI's computer use hooked up to a Chrome extension.
webhiveos/WebHive
Meet WebHive, the AI-powered browser that takes care of tasks for you. No more endless clicks: tell it what you need, and it gets it done.