WasedaMagina's Stars
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
naklecha/llama3-from-scratch
llama3 implemented one matrix multiplication at a time
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
LargeWorldModel/LWM
Large World Model -- modeling text and video with million-token context
CompVis/taming-transformers
Taming Transformers for High-Resolution Image Synthesis
yizhongw/self-instruct
Aligning pretrained language models with instruction data generated by themselves.
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
jingyi0000/VLM_survey
Collection of AWESOME vision-language models for vision tasks
EvolvingLMMs-Lab/lmms-eval
Accelerating the development of large multimodal models (LMMs) with the one-click evaluation module lmms-eval.
baaivision/Emu
Emu Series: Generative Multimodal Models from BAAI
apple/ml-4m
4M: Massively Multimodal Masked Modeling
FoundationVision/LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
markus-perl/ffmpeg-build-script
The FFmpeg build script provides an easy way to build a static FFmpeg on OSX and Linux with non-free codecs included.
tencent-ailab/persona-hub
Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"
luogen1996/LaVIN
[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
soCzech/TransNetV2
TransNet V2: Shot Boundary Detection Neural Network
jianghaojun/Awesome-Parameter-Efficient-Transfer-Learning
A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.
mira-space/MiraData
Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"
thuanz123/enhancing-transformers
An unofficial implementation of both ViT-VQGAN and RQ-VAE in PyTorch
IDEA-Research/MotionLLM
[arXiv 2024] MotionLLM: Understanding Human Behaviors from Human Motions and Videos
ZrrSkywalker/MathVerse
[ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
VT-NLP/MultiInstruct
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
icoz69/StableLLAVA
Official repo for StableLLAVA
janghyuncho/DECOLA
Code release for "Language-conditioned Detection Transformer"
yuangpeng/dreambench_plus
Official code implementation of DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
longvideobench/LongVideoBench
[NeurIPS '24 D&B] Official dataloader and evaluation scripts for LongVideoBench.
jihaonew/MM-Instruct
MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment
CLUEbenchmark/SuperCLUE-Role
SuperCLUE-Role: a Chinese-native role-playing evaluation benchmark