jiaojile's Stars
siyuanliii/masa
Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything
Facico/Chinese-Vicuna
Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model — a low-resource Chinese LLaMA+LoRA solution, with a structure modeled on Alpaca
lllyasviel/ControlNet-v1-1-nightly
Nightly release of ControlNet 1.1
CompVis/stable-diffusion
A latent text-to-image diffusion model
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
GAIR-NLP/O1-Journey
O1 Replication Journey: A Strategic Progress Report – Part I
yfzhang114/Awesome-Multimodal-Large-Language-Models
Reading notes about Multimodal Large Language Models, Large Language Models, and Diffusion Models
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
DefTruth/CUDA-Learn-Notes
📚150+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
advimman/lama
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
Mikubill/sd-webui-controlnet
WebUI extension for ControlNet
exx8/differential-diffusion
noahcao/OC_SORT
[CVPR2023] The official repo for OC-SORT: Observation-Centric SORT on video Multi-Object Tracking. OC-SORT is simple, online and robust to occlusion/non-linear motion.
THUDM/CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
Holistic-Motion2D/Tender
The official code for Tender
EvolvingLMMs-Lab/lmms-eval
Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o — an open-source multimodal dialogue model approaching GPT-4o performance
ShareGPT4Omni/ShareGPT4Video
[NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
open-compass/opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
mbzuai-oryx/VideoGPT-plus
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
Ahnsun/merlin
[ECCV2024] Official code implementation of Merlin: Empowering Multimodal LLMs with Foresight Minds
DAMO-NLP-SG/VideoLLaMA2
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
BradyFU/Video-MME
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
Alpha-VLLM/LLaMA2-Accessory
An Open-source Toolkit for LLM Development
magic-research/PLLaVA
Official repository for the paper PLLaVA
JialianW/GRiT
GRiT: A Generative Region-to-text Transformer for Object Understanding (https://arxiv.org/abs/2212.00280)
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"