davidqiu1993
Founder @ Stealth Startup (Robotics + AGI)
Guangzhou, China
davidqiu1993's Stars
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
comfyanonymous/ComfyUI
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
oobabooga/text-generation-webui
A Gradio web UI for Large Language Models.
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
2noise/ChatTTS
A generative speech model for daily dialogue.
rasbt/LLMs-from-scratch
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
fishaudio/Bert-VITS2
vits2 backbone with multilingual-bert
Farama-Foundation/Gymnasium
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
GT-RIPL/Awesome-LLM-Robotics
A comprehensive list of papers using large language/multi-modal models for Robotics/RL, including papers, codes, and related websites
z-x-yang/Segment-and-Track-Anything
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.
KevinWang676/Bark-Voice-Cloning
Bark Voice Cloning and Voice Cloning for Chinese Speech
LLaVA-VL/LLaVA-NeXT
JakobEngel/dso
Direct Sparse Odometry
google-research/big_vision
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
jianchang512/stt
Voice Recognition to Text Tool / An offline, locally run speech-to-text service that outputs JSON, timestamped SRT subtitles, and plain text
petercorke/robotics-toolbox-python
Robotics Toolbox for Python
uzh-rpg/rpg_svo
Semi-direct Visual Odometry
zubair-irshad/Awesome-Implicit-NeRF-Robotics
A comprehensive list of Implicit Representations and NeRF papers relating to Robotics/RL domain, including papers, codes, and related websites
ChenyangQiQi/FateZero
[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
leggedrobotics/ocs2
Optimal Control for Switched Systems
OpenRobotLab/EmbodiedScan
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
seasalt-ai/snowboy
DNN based hotword and wake word detection toolkit (model generation included)
vlmaps/vlmaps
[ICRA2023] Implementation of Visual Language Maps for Robot Navigation
2Dou/watermarker
Add text watermarks to images with a Python script
allenai/procthor
🏘️ Scaling Embodied AI by Procedurally Generating Interactive 3D Houses
leggedrobotics/perceptive_mpc
Code for "Perceptive Model Predictive Control for Continuous Mobile Manipulation"
shiyoung77/OVIR-3D
This is the official repository for OVIR-3D: Open-Vocabulary 3D Instance Retrieval Without Training on 3D Data. (CoRL'23)
licksylick/AutoTrackAnything
AutoTrackAnything is a universal, flexible and interactive tool for automatic object tracking over thousands of frames. It is built on XMem, YOLOv8 and MobileSAM (Segment Anything) and can track anything that YOLOv8 detects.
fishmarch/ORB-SLAM3-Dense