Pinned Repositories
Awesome-Segment-Anything
A collection of projects, papers, and source code for Meta AI's Segment Anything Model (SAM) and related studies.
cube-studio
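A cloud-native, one-stop machine learning platform: multi-tenancy, data assets, online notebook development, drag-and-drop task pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, inference serving, multi-cluster scheduling, multiple project groups and resource groups, edge computing, real-time training of large models, and an AI app store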
CVprojects
computer vision projects | fun AI projects related to computer vision (Python, C++)
DLTA-AI
Data Labeling, Tracking and Annotation with AI
FastSAM
Fast Segment Anything
Fay
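Fay is a complete open-source project that includes the Fay controller and digital human models, which can be flexibly combined for different application scenarios: virtual streamer, live on-site sales, product shopping guide, voice assistant, remote voice assistant, digital human interaction, digital human interviewer and psychological assessment, Jarvis, Her. This is an open-source project, not a product trial!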
fay-ue5
A UE5 project that can connect to the Fay digital human
Grounded-Segment-Anything
Marrying Grounding DINO with Segment Anything & Stable Diffusion & BLIP - Automatically Detect, Segment and Generate Anything with Image and Text Inputs
video_pipe_c
A plugin-oriented framework for video structuring.
ymir
YMIR, a streamlined model development product.
hsaigroup's Repositories
hsaigroup/Fay
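Fay is a complete open-source project that includes the Fay controller and digital human models, which can be flexibly combined for different application scenarios: virtual streamer, live on-site sales, product shopping guide, voice assistant, remote voice assistant, digital human interaction, digital human interviewer and psychological assessment, Jarvis, Her. This is an open-source project, not a product trial!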
hsaigroup/allenai_molmo_wrapper
As you can see, an allenai molmo wrapper
hsaigroup/bolt.new-any-llm
Prompt, run, edit, and deploy full-stack web applications using any LLM you want!
hsaigroup/clean-ui
Simple UI for Llama-3.2-11B-Vision & Molmo-7B-D
hsaigroup/cog-molmo-7b-d
Replicate Cog for allenai/Molmo-7B-D-0924
hsaigroup/ComfyUI-Manager
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, this extension provides a hub feature and convenience functions to access a wide range of information within ComfyUI.
hsaigroup/ComfyUI-Molmo
Generate detailed image descriptions and analysis using Molmo models in ComfyUI.
hsaigroup/ComfyUI-PixtralLlamaMolmoVision
For loading and running Pixtral models
hsaigroup/Computer-Vision-Server
An API server for Molmo 7B - Describe web pages or computer screenshots and point to elements using Molmo 7B - a multimodal vision model which can describe real and virtual images and point at objects
hsaigroup/computer_use_ootb
An out-of-the-box (OOTB) version of Anthropic Claude Computer Use for Windows and macOS
hsaigroup/DesktopAI
A common framework for voice and text interactions with LLMs
hsaigroup/Echo-Voice-Cloning-Soundboard
Echo - Voice Cloning Soundboard for Call of Duty, MW3, Black Ops,
hsaigroup/Kmars.ai_AI_Image_Analyzer
AI Image Analyzer for ollama, mistral.rs, and molmo on a Mac M2 Max (Screen Capture Analyzer, Camera Capture Analyzer)
hsaigroup/llama.cpp
LLM inference in C/C++
hsaigroup/molmo-7b-bnb-4bit
4-bit bitsandbytes quants of the best 7B VLMs
hsaigroup/Molmo-Colaboratory-Sample
A sample for trying out allenai/Molmo on Colaboratory
hsaigroup/Molmo-Finetune
An open-source implementation for fine-tuning Molmo-7B-D and Molmo-7B-O by allenai.
hsaigroup/molmo-serve
hsaigroup/molmo_ai
hsaigroup/molmotest
tests using molmo ai
hsaigroup/obs-urlsource
OBS plugin to fetch data from a URL or file, connect to an API or AI service, parse responses and display text, image or audio on scene
hsaigroup/Project_Miao
Let's raise an AI cat with its own dedicated memory together!
hsaigroup/SAM_Molmo_Whisper
An integration of Segment Anything Model, Molmo, and Whisper to segment objects using voice and natural language.
hsaigroup/scenic
Scenic: A Jax Library for Computer Vision Research and Beyond
hsaigroup/SenseVoice
Multilingual Voice Understanding Model
hsaigroup/Streamer-Sales
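Streamer-Sales 销冠 (Sales Champion): a sales-streamer LLM 🛒🎁 that, given a product's features, presents the product from the angle of stimulating users' desire to buy. 🚀⭐ Includes a detailed data generation pipeline ❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG retrieval-augmented generation 📚, TTS text-to-speech 🔊, digital human generation 🦸, an Agent that queries real-time information from the web 🌐, ASR speech-to-text 🎙️, a frontend built on the Vue ecosystem 🍍, a FastAPI backend 🗝️, and Docker-compose packaging and deployment 🐋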
hsaigroup/T-Rex
[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
hsaigroup/TEN-Agent
TEN Agent is the world’s first real-time multimodal agent integrated with the OpenAI Realtime API, RTC, and features weather checks, web search, vision, and RAG capabilities.
hsaigroup/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
hsaigroup/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs