zeyuanyin's Stars
princeton-nlp/MeZO
[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
MrKai77/Loop
Window management made elegant.
google-deepmind/penzai
A JAX research toolkit for building, editing, and visualizing neural networks.
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
meta-llama/llama3
The official Meta Llama 3 GitHub site
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
webdataset/webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
facebookresearch/pycls
Codebase for Image Classification Research, written in PyTorch.
Farama-Foundation/Gymnasium
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
deepseek-ai/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
pytorch/torchtune
A Native-PyTorch Library for LLM Fine-tuning
penghao-wu/vstar
PyTorch Implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs"
open-compass/MMBench
Official Repo of "MMBench: Is Your Multi-modal Model an All-around Player?"
microsoft/DeepSpeedExamples
Example models using DeepSpeed
huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
huggingface/trl
Train transformer language models with reinforcement learning.
xai-org/grok-1
Grok open release
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
eric-mitchell/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
wkentaro/gdown
Google Drive Public File Downloader when Curl/Wget Fails
tiangolo/full-stack-fastapi-template
Full stack, modern web application template. Using FastAPI, React, SQLModel, PostgreSQL, Docker, GitHub Actions, automatic HTTPS and more.
sanbuphy/llm-vision-datasets
Collection of image and video datasets for generative AI and multimodal visual AI
TRI-ML/vlm-evaluation
VLM Evaluation: Benchmark for VLMs, spanning text generation tasks from VQA to Captioning
mit-han-lab/llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
NVlabs/VILA
VILA - a multi-image visual language model with training, inference and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
lucidrains/x-clip
A concise but complete implementation of CLIP with various experimental improvements from recent papers