web-agent

There are 28 repositories under the web-agent topic.

  • Alibaba-NLP/DeepResearch

    Tongyi Deep Research, the Leading Open-source Deep Research Agent

    Language: Python
  • normal-computing/fuji-web

    Fuji is an AI agent that lives in your browser's sidepanel. You can now get tasks done online with a single command!

    Language: TypeScript
  • allenai/lumos

    Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs"

    Language: Python
  • OS-Agent-Survey/OS-Agent-Survey

    This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).

  • hcompai/surfer-h-cli

    Run Surfer-H agents powered by Holo1 using the Surfer-H-CLI. Includes example tasks, scripts, and configurations.

    Language: TypeScript
  • thuml/RLVR-World

    Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934

    Language: Python
  • CursorTouch/Web-Navigator

    Web-Navigator is an agent for web browsing and scraping websites.

    Language: Python
  • PathOnAIOrg/LiteWebAgent

    [NAACL2025] LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications

    Language: Python
  • PathOnAIOrg/LiteMultiAgent

    The Library for LLM-based multi-agent applications

    Language: Python
  • FractalAIResearchLabs/Fathom-DeepResearch

    Fathom-DeepResearch: Unlocking Long Horizon Information Retrieval And Synthesis For SLMs

    Language: Python
  • nottelabs/open-operator-evals

    Open-source benchmark evaluating the performance of web operators and agents

    Language: Python
  • zorazrw/agent-skill-induction

    Agent Skill Induction: "Inducing Programmatic Skills for Agentic Tasks"

    Language: Python
  • WadeYin9712/UI-Simulator

    Code for 🌍 UI-Simulator: LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training

    Language: Python
  • anaishowland/Captr_MacOS

    Screen recording and computer interaction capture tool that records keyboard/mouse input, screen video, DOM snapshots, and accessibility trees. Perfect for creating datasets to train and evaluate computer-use AI models.

    Language: Python
  • anaishowland/Captr_Windows

    Screen recording and computer interaction capture tool that records keyboard/mouse input, screen video, DOM snapshots, and accessibility trees. Perfect for creating datasets to train and evaluate computer-use AI models.

    Language: Python
  • anaishowland/agent-CE

    Agent-CE is a containerized continuous evaluation (CE) platform for web browsing agents. It provides production-ready Docker images and CI/CD pipelines for running and evaluating multiple agent frameworks including Browser Use, Notte, Anthropic Computer Use, and OpenAI Computer Use.

    Language: Python
  • anaishowland/dataset_creation

    Python scripts for generating and categorizing web browsing tasks for benchmark datasets

    Language: Python
  • anaishowland/llm-judge-psai

    Evaluation system for computer-use agents that uses LLMs to assess agent performance on web browsing and interaction tasks. This judge system reads screenshots, agent trajectories, and final results to provide detailed scoring and feedback.

    Language: Python
  • anaishowland/neurosim

    Neurosim is a Python framework for building, running, and evaluating AI agent systems. It provides core primitives for agent evaluation, cloud storage integration, and an LLM-as-a-judge system for automated scoring.

    Language: Python
  • avrtt/avrtt.github.io

    My React/Gatsby agent-powered all-in-one website and DS/ML course platform designed and written completely from scratch & with love

    Language: MDX
  • qunash/web_copilot_ai

    AI-powered Chrome side panel assistant that understands natural language and performs real actions in your browser.

    Language: TypeScript
  • ThiruvarankanM/WebAgent_AI

    A web application that summarizes the content of any public web page using advanced AI language models.

    Language: Python
  • anaishowland/computeruse-data-psai

    This dataset contains 3,167 completed tasks of human-computer interactions captured with video, screenshots, DOM snapshots, and detailed interaction events. Created by Paradigm Shift AI for advancing computer use AI agent research.

    Language: Python
  • gotois/ProstoDiary_bot

    🤖 Curator of semantic transport user stories

    Language: JavaScript
  • mzieos/SpidyCrawler

    🤖SpidyCrawler - Synthetic Web Traffic Agent & Anti-Detection.

    Language: Python
  • PranavMishra17/Flow-Planner

    An AI-powered multi-agent system that automatically captures, documents, and visualizes step-by-step UI workflows for any web application. Powered by Gemini planning, Playwright automation, and Claude vision validation.

    Language: HTML
  • sebber1140/Captr_MacOS

    🎥 Capture screen recordings and interactions on macOS, including inputs and accessibility data, to create datasets for AI model training and evaluation.

    Language: Python