shijian2001's Stars
microsoft/autogen
A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour
karpathy/LLM101n
LLM101n: Let's build a Storyteller
unslothai/unsloth
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
princeton-vl/infinigen
Infinite Photorealistic Worlds using Procedural Generation
huggingface/diffusion-models-class
Materials for the Hugging Face Diffusion Models Course
Eladlev/AutoPrompt
A framework for prompt tuning using Intent-based Prompt Calibration
Alpha-VLLM/Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
open-compass/VLMEvalKit
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
NJU-PCALab/RAG-Diffusion
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement 🔥
zjysteven/lmms-finetune
A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v, etc.
CVMI-Lab/SyntheticData
Is synthetic data from generative models ready for image recognition?
Yushi-Hu/tifa
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
autogenhub/autogen
A programming framework for agentic AI. Discord: https://discord.gg/pAbnFJrkgZ
JieyuZ2/TaskMeAnything
[NeurIPS 2024] A task generation and model evaluation system for multimodal language models.
IntelLabs/lvlm-interpret
kkontheway/Auditor-Playground
shijian2001/VQAPromptBench
A Benchmark for VQA prompt sensitivity
LinxinS97/captain_agent_demo
Official implementation of Captain Agent
Benchmark-Dysca/Dysca
Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs
paulosalem/time-blender
A programmatic and compositional time series generator.
shijian2001/AttrSyn
A pipeline for attributed synthetic image generation and selection
JieyuZ2/MathVerse
JieyuZ2/scene-graph-utils
A utility library for programmatically generating captions and QAs from scene graphs