Pinned Repositories
atomacos
Automated Testing on macOS
Awesome-LLM-Reasoning
Papers and resources on reasoning in Large Language Models (LLMs), including Chain-of-Thought, Instruction-Tuning, and Multimodality.
OmniMCP
OmniMCP uses Microsoft OmniParser and Model Context Protocol (MCP) to provide AI models with rich UI context and powerful interaction capabilities.
OmniParser
OpenAdapt
Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large Language Models (LLMs), Large Action Models (LAMs), Large Multimodal Models (LMMs), and Visual Language Models (VLMs).
OpenAdapt.web
OpenSanitizer
A privacy-focused module for detecting and scrubbing PII/PHI from screen data and user actions.
PydanticPrompt
A simple library to document Pydantic models for structured LLM outputs using standard Python docstrings.
pynput
Sends virtual input commands
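For example, a minimal sketch of synthesizing input with pynput (coordinates and text here are illustrative):

```python
from pynput.keyboard import Controller as KeyboardController
from pynput.mouse import Button, Controller as MouseController

keyboard = KeyboardController()
mouse = MouseController()

# Move the pointer and click (coordinates are illustrative).
mouse.position = (100, 200)
mouse.click(Button.left, 1)

# Type a string as synthetic key events.
keyboard.type("Hello from pynput")
```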
SoM
Set-of-Mark Prompting for LMMs
OpenAdapt.AI's Repositories
OpenAdaptAI/OpenAdapt
Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large Language Models (LLMs), Large Action Models (LAMs), Large Multimodal Models (LMMs), and Visual Language Models (VLMs).
OpenAdaptAI/OmniMCP
OmniMCP uses Microsoft OmniParser and Model Context Protocol (MCP) to provide AI models with rich UI context and powerful interaction capabilities.
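A minimal sketch of the general MCP-server pattern using the official `mcp` Python SDK; the server name, tool, and return value below are illustrative, not OmniMCP's actual interface:

```python
from mcp.server.fastmcp import FastMCP

# Hypothetical server exposing UI context to a model client.
mcp = FastMCP("ui-context-demo")

@mcp.tool()
def describe_ui() -> str:
    """Return a textual description of the current screen (stubbed here)."""
    return "window: Example App; buttons: [OK, Cancel]"

if __name__ == "__main__":
    mcp.run()
```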
OpenAdaptAI/SoM
Set-of-Mark Prompting for LMMs
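Set-of-Mark prompting overlays numbered marks on image regions so an LMM can refer to them by ID. A minimal sketch of the marking step with Pillow (the boxes are illustrative; the real pipeline derives them from a segmentation model):

```python
from PIL import Image, ImageDraw

# Illustrative region boxes; SoM derives these from a segmentation model.
boxes = [(30, 40, 120, 90), (150, 60, 260, 140)]

image = Image.open("screenshot.png").convert("RGB")
draw = ImageDraw.Draw(image)
for mark_id, (x0, y0, x1, y1) in enumerate(boxes, start=1):
    draw.rectangle((x0, y0, x1, y1), outline="red", width=2)
    draw.text((x0 + 2, y0 + 2), str(mark_id), fill="red")

image.save("screenshot_marked.png")
# The LMM is then prompted to answer in terms of the numbered marks.
```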
OpenAdaptAI/OpenAdapt.web
OpenAdaptAI/PydanticPrompt
A simple library to document Pydantic models for structured LLM outputs using standard Python docstrings.
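The underlying pattern, sketched with plain Pydantic (this is not PydanticPrompt's actual API, just the idea of turning model documentation into prompt text):

```python
from pydantic import BaseModel, Field

class Invoice(BaseModel):
    """An invoice extracted from a document."""
    vendor: str = Field(description="Name of the issuing vendor")
    total: float = Field(description="Grand total in USD")

# Feed the JSON schema (including field descriptions) to an LLM
# so it returns structured output matching the model.
schema = Invoice.model_json_schema()
prompt = f"Extract an invoice as JSON matching this schema:\n{schema}"
```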
OpenAdaptAI/OmniParser
OpenAdaptAI/OpenSanitizer
A privacy-focused module for detecting and scrubbing PII/PHI from screen data and user actions.
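A minimal sketch of the scrubbing pattern (the regexes and placeholder tokens are illustrative; a production sanitizer would cover far more PII/PHI categories):

```python
import re

# Illustrative patterns; a real sanitizer covers many more PII/PHI types.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scrub(text: str) -> str:
    """Replace detected PII with category placeholders."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text

print(scrub("Contact jane@example.com, SSN 123-45-6789"))
# -> Contact <EMAIL>, SSN <SSN>
```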
OpenAdaptAI/Awesome-LLM-Reasoning
Papers and resources on reasoning in Large Language Models (LLMs), including Chain-of-Thought, Instruction-Tuning, and Multimodality.
OpenAdaptAI/CogVLM
A state-of-the-art open visual language model | multimodal pretrained model
OpenAdaptAI/OpenAdaptVault
Archival snapshot of OpenAdapt: Open Source Generative Process Automation (Generative RPA) with foundational AI models (Large Language Models (LLMs), Large Action Models (LAMs), Large Multimodal Models (LMMs), and Visual Language Models (VLMs)). Preserves legacy features and prior implementations for reference.
OpenAdaptAI/llama-agentic-system
Agentic components of the Llama Stack APIs
OpenAdaptAI/openadapt-gitbook
OpenAdaptAI/OpenAdapter
Effortless Deployment and Integration for SOTA Screenshot Parsing and Action Models
OpenAdaptAI/OpenReflector
OpenReflector links the Anthropic Computer Use container to a Windows or Mac desktop, using OpenAdapt and WebSockets for real-time, two-way mirroring of actions and commands.
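A minimal sketch of the relay idea with the `websockets` library (the endpoint and message handling are illustrative, not OpenReflector's actual protocol):

```python
import asyncio
import websockets

async def mirror(websocket):
    # Echo each action/command back; a real relay would forward it
    # to the paired desktop or container connection instead.
    async for message in websocket:
        await websocket.send(message)

async def main():
    async with websockets.serve(mirror, "localhost", 8765):
        await asyncio.Future()  # run until cancelled

asyncio.run(main())
```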
OpenAdaptAI/prismer
The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
OpenAdaptAI/.github
OpenAdaptAI/Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
OpenAdaptAI/app
A desktop application that enables end-users to automate their workflows with OpenAdapt
OpenAdaptAI/grok-1
Grok open release
OpenAdaptAI/Janus
Janus-Series: Unified Multimodal Understanding and Generation Models
OpenAdaptAI/OmniMCP.web
Web interface for OmniMCP.
OpenAdaptAI/omniparser-api
Self-hosted version of Microsoft's OmniParser Image-to-text model
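Calling a self-hosted instance might look like the following sketch; the URL, route, and response shape are hypothetical, so check the repo for the actual API:

```python
import requests

# Hypothetical endpoint and payload; see the repo for the real API.
with open("screenshot.png", "rb") as f:
    response = requests.post(
        "http://localhost:8000/parse",
        files={"image": f},
    )
response.raise_for_status()
print(response.json())
```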
OpenAdaptAI/open-r1-multimodal
A fork to add multimodal model training to open-r1
OpenAdaptAI/OpenCUA
OpenCUA: Open Foundations for Computer-Use Agents
OpenAdaptAI/Qwen2.5-VL
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
OpenAdaptAI/R1-V
Witness the aha moment of VLM with less than $3.
OpenAdaptAI/Self-Rewarding-Language-Models
Work by the Oxen.ai community to reproduce the Self-Rewarding Language Models paper from Meta AI.
OpenAdaptAI/UI-TARS-desktop
A GUI agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
OpenAdaptAI/ultralytics
NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
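Basic usage, per the Ultralytics docs (the input image is illustrative; pretrained weights download on first use):

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")         # pretrained nano model
results = model("screenshot.png")  # run inference on one image
for result in results:
    print(result.boxes)            # detected bounding boxes
```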
OpenAdaptAI/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
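Basic transcription with the openai/whisper package (the audio file name is illustrative):

```python
import whisper

model = whisper.load_model("base")      # downloads weights on first use
result = model.transcribe("audio.mp3")  # illustrative input file
print(result["text"])
```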