Siri-2001

Siri-2001's Stars

labmlai/annotated_deep_learning_paper_implementations
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
Language: Python 55.3k 452 1325.7k
Vision-CAIR/MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Language: Python 25.4k 218 4612.9k
Mintplex-Labs/anything-llm
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
Language: JavaScript 25k 196 1.6k 2.5k
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language: Python 19.9k 156 1.5k 2.2k
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
Language: Python 12k 122 7081k
liguodongiot/llm-action
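This project aims to share the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications in production).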
Language: HTML 9.8k 82 21962
xszyou/Fay
Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent versions for diverse applications like virtual shopping guides, broadcasters, assistants, waiters, teachers, and voice or text-based mobile assistants.
9k 110 1111.8k
ymcui/Chinese-LLaMA-Alpaca-2
Phase 2 of the Chinese LLaMA-2 & Alpaca-2 LLM project, plus 64K long-context models
Language: Python 7.1k 79 389580
lonePatient/awesome-pretrained-chinese-nlp-models
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models
Language: Python 4.8k 91 12473
DAMO-NLP-SG/Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Language: Python 2.8k 32 157255
lucidrains/naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch
Language: Python 1.3k 53 3199
Jamie-Stirling/RetNet
An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
Language: Python 1.2k 13 26100
jin-s13/COCO-WholeBody
ECCV2020 paper "Whole-Body Human Pose Estimation in the Wild"
Language: Python 758 25 4472
cschenxiang/DRSformer
Learning A Sparse Transformer Network for Effective Image Deraining (CVPR 2023)
Language: Python 260 6 2514
wholebody3d/wholebody3d
Official repository of Human3.6M 3D WholeBody (H3WB) dataset
Language: Python 249 10 348
syncdoth/RetNet
Huggingface compatible implementation of RetNet (Retentive Networks, https://arxiv.org/pdf/2307.08621.pdf) including parallel, recurrent, and chunkwise forward.
Language: Jupyter Notebook 226 5 3124
woodfrog/floor-sp
Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path, ICCV 2019
Language: Python 134 9 1827
Awexander/audioWhisper
Listen to any audio stream on your machine and print out the transcribed or translated audio.
Language: Python 111 4 614
junyangwang0410/AMBER
An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation
Language: Python 92 1 32
udon-universe/stable-diffusion-webui-extension-templates
a template of stable-diffusion-webui extension
Language: Python 72 3 322
wanng-ide/VQA_to_multimodal_survey
Update 2020
70 2 06
Holipori/MIMIC-Diff-VQA
Language: Python 52 1 34
DrZiji/VecFloorSeg
Source code repo for VectorFloorSeg: Two-Stream Graph Attention Network for Vectorized Roughcast Floorplan Segmentation
Language: Python 38 5 76
imatge-upc/slt_how2sign_wicv2023
Sign Language Translation for Instructional Videos - CVPR WiCV 2023
Language: Python 33 10 1210
sharif1093/py_floor_plan_segmenter
A Python package to segment cluttered 2D floor plans based on down-sampling.
Language: Python 27 3 22
heyudage/VoiceTyping
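Real-time text input by voice (speaking). Built as a secondary development on top of the PaddleSpeech project; supports deployment in offline, air-gapped environments and GPU inference. The client currently supports Windows only.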
Language: Python 25 1 13
ByZ0e/Glance-Focus
This repo contains source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (Accepted in NeurIPS 2023)
Language: Python 21 2 106
bipashasen/INR-V-VideoGenerationSpace
The Official Implementation for INR-V: A Continuous Representation Space for Video-based Generative Tasks
Language: Python 14 2 11
sor8sh/Room-Segmentation
Automatic Room Segmentation
Language: Python 14 1 02
wangcunxiang/Graph-aS-Tokens
Language: Python 8 2 01