sleepwalkeryw's Stars

facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language: Jupyter Notebook 47.2k 305 6645.6k
lllyasviel/Fooocus
Focus on prompting and generating
Language: Python 40.9k 314 1.5k 5.7k
2noise/ChatTTS
A generative speech model for daily dialogue.
Language: Python 31.7k 185 5383.4k
deepinsight/insightface
State-of-the-art 2D and 3D Face Analysis Project
Language: Python 23.2k 514 2.5k 5.4k
HumanSignal/label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Language: JavaScript 18.9k 177 2.2k 2.4k
cvat-ai/cvat
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
Language: Python 12.4k 184 4.2k 3k
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Language: Python 12.3k 101 556864
PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Language: Python 11.1k 181 1.9k 1.8k
THU-MIG/yolov10
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
Language: Python 9.8k 49 408949
01-ai/Yi
A series of large language models trained from scratch by developers @01-ai
Language: Jupyter Notebook 7.6k 106 291471
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Language: Python 4.9k 49 442374
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
Language: Python 4.9k 31 520413
andrewyng/translation-agent
Language: Python 4.7k 51 15538
PaddlePaddle/PaddleRec
Recommendation Algorithm: a large-scale recommendation algorithm library covering classic and recent recommender-system algorithms, including LR, Wide&Deep, DSSM, TDM, MIND, Word2Vec, Bert4Rec, DeepWalk, SSR, AITM, DSIN, SIGN, IPREC, GRU4Rec, Youtube_dnn, NCF, GNN, FM, FFM, DeepFM, DCN, DIN, DIEN, DLRM, MMOE, PLE, ESMM, ESCMM, MAML, xDeepFM, DeepFEFM, NFM, AFM, RALM, DMR, GateNet, NAML, DIFM, Deep Crossing, PNN, BST, AutoInt, FGCNN, FLEN, Fibinet, ListWise, DeepRec, ENSFM, TiSAS, and AutoFIS, along with classic recommender-system datasets such as Criteo and MovieLens.
Language: Python 4.3k 194 217721
THUDM/VisualGLM-6B
Chinese and English multimodal conversational language model
Language: Python 4.1k 41 350416
AiuniAI/Unique3D
Official implementation of Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image
Language: Python 3k 36 102233
modelscope/swift
ms-swift: Use PEFT or Full-parameter to finetune 300+ LLMs or 50+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
Language: Python 2.7k 19 758245
TMElyralab/MuseTalk
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
Language: Python 2.6k 49 191319
InternLM/InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Language: Python 2.5k 43 387154
deepseek-ai/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Language: Python 2k 19 46192
cambrian-mllm/cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Language: Python 1.7k 21 66113
apple/ml-4m
4M: Massively Multimodal Masked Modeling
Language: Python 1.6k 33 2192
ShareGPT4Omni/ShareGPT4Video
[NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Language: Python 1.3k 32 3644
Fictionarry/ER-NeRF
[ICCV'23] Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis
Language: Python 1k 16 164134
microsoft/RecAI
Bridging LLM and Recommender System.
Language: Jupyter Notebook 559 13 1152
docker/welcome-to-docker
Language: JavaScript 442 5 61.5k
fpgaminer/joytag
The JoyTag Image Tagging Model
Language: Python 424 14 1726
SkyworkAI/Vitron
A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Language: Python 296 12 1519
yuhaozhang7/NGD-SLAM
NGD-SLAM: Towards Real-Time Dynamic SLAM without GPU.
Language: C++ 92 3 413
zhusleep/fastbm25
A fast Python BM25 algorithm implemented with an inverted index
Language: Python 43 1 111