ymote's Stars
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
open-webui/open-webui
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
2noise/ChatTTS
A generative speech model for daily dialogue.
OpenBMB/ChatDev
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
facebookresearch/segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
naver/dust3r
DUSt3R: Geometric 3D Vision Made Easy
frdel/agent-zero
Agent Zero AI framework
huggingface/speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT-4o
OpenNMT/CTranslate2
Fast inference engine for Transformer models
ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
davabase/whisper_real_time
Real time transcription with OpenAI Whisper.
nerfstudio-project/gsplat
CUDA accelerated rasterization of gaussian splatting
ufal/whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
naver/mast3r
Grounding Image Matching in 3D with MASt3R
MeetKai/functionary
Chat language model that can use tools and interpret the results
security-union/videocall-rs
Teleconference system written in Rust
merveenoyan/smol-vision
Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜
xverse-engine/XV3DGS-UEPlugin
An Unreal Engine 5 (UE5) based plugin aiming to provide real-time visualization, management, editing, and scalable hybrid rendering of Gaussian Splatting models.
ngxson/wllama
WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
Sharrnah/whispering
Whispering Tiger - OpenAI's whisper (and other models) with OSC and Websocket support. Allowing live transcription / translation in VRChat and Overlays in most Streaming Applications
OutofAi/2D-Gaussian-Splatting
A 2D Gaussian Splatting paper for no obvious reasons. Enjoy!
pointrix-project/pointrix
A differentiable point-based rendering framework.
zyc00/Point-SAM
Point-SAM: This is the official repository of "Point-SAM: Promptable 3D Segmentation Model for Point Clouds". We provide code for running our demo and links to download checkpoints.
xlang-foundation/xlang
A next-generation dynamic and high-performance language for AI and IoT with built-in distributed computing ability.
flomesh-io/fsm
Lightweight service mesh for Kubernetes East-West and North-South traffic management; uses eBPF for layer 4 and the Pipy proxy for layer 7 traffic management, and supports multi-cluster networking.
OpenGVLab/EgoExoLearn
[CVPR 2024] Data and benchmark code for the EgoExoLearn dataset
palpo-matrix-server/palpo
openwallet-foundation-labs/tsp