Tigerdwgth's Stars

littlecodersh/ItChat
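A complete and graceful API for WeChat: a personal-account interface, a WeChat bot, and a command-line WeChat client. A custom personal-account bot takes about thirty lines.
Language: Python · 25.7k · 5.6k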
BlinkDL/RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.
Language: Python · 12.7k · 868
GAIR-NLP/O1-Journey
O1 Replication Journey: A Strategic Progress Report – Part I
1.5k · 43
lfranke/vr_splatting
9
shibhansh/loss-of-plasticity
Demonstrations of Loss of Plasticity and Implementation of Continual Backpropagation
Language:Python21241
octo-models/octo
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
Language:Python914169
jayLEE0301/vq_bet_official
Official code for "Behavior Generation with Latent Actions" (ICML 2024 Spotlight)
Language:Python11513
google-deepmind/open_x_embodiment
Language:Jupyter Notebook89762
tonyzhaozh/act
Language:Python784187
Shaka-Labs/ACT
Action Chunking Transformer implementation for low cost robot
Language:Jupyter Notebook21032
mlzxy/arp
Autoregressive Policy for Robot Learning
Language:Python805
CleanDiffuserTeam/CleanDiffuser
CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making
Language:Jupyter Notebook40237
YanjieZe/3D-Diffusion-Policy
[RSS 2024] 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations
Language:Python54952
bytedance/GR-1
Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"
Language:Python1919
jy0205/LaVIT
LaVIT: Empower the Large Language Model to Understand and Generate Visual Content
Language:Jupyter Notebook54329
tyshiwo1/Accelerating-T2I-AR-with-SJD
Implementation of Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Language:Python25
Alpha-VLLM/Lumina-mGPT
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
Language:Python50722
apoorvumang/prompt-lookup-decoding
Language:Jupyter Notebook47323
hyx1999/SAM-Decoding
Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton
Language:Python111
FasterDecoding/REST
REST: Retrieval-Based Speculative Decoding, NAACL 2024
Language:C17911
cvg/NoPoSplat
No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
Language:Python52319
collabora/WhisperFusion
WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.
Language: Python · 1.6k · 111
haksorus/gsplatloc
GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization
Language:Jupyter Notebook653
AudioLLMs/AudioLLM
Audio Large Language Models
1498
fixie-ai/ultravox
A fast multimodal LLM for real-time voice
Language: Python · 1.5k · 96
THUDM/GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English spoken dialogue model
Language: Python · 2.4k · 189
dcharatan/pixelsplat
[CVPR 2024 Oral, Best Paper Runner-Up] Code for "pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction" by David Charatan, Sizhe Lester Li, Andrea Tagliasacchi, and Vincent Sitzmann
Language:Python91963
Lightning-AI/litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Language: Python · 10.8k · 1.1k
TEN-framework/TEN-Agent
TEN Agent is a world-class multimodal AI agent integrated with the OpenAI Realtime API, RTC, and features weather checks, web search, vision, and RAG.
Language: Python · 1.9k · 213
ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Language: Python · 2.6k · 177