JerryWei1985's Stars
datawhalechina/learn-nlp-with-transformers
We want to create a repo to illustrate the usage of transformers in Chinese
google-deepmind/open_x_embodiment
wdndev/llm_interview_note
Mainly records knowledge and interview questions for large language model (LLM) algorithm (application) engineers
enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
xinntao/Real-ESRGAN
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
thu-ml/SageAttention
Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
lyogavin/airllm
AirLLM 70B inference with a single 4GB GPU
VectorSpaceLab/OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
ostris/ai-toolkit
Various AI scripts. Mostly Stable Diffusion stuff.
Oneflow-Inc/oneflow
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
siliconflow/onediff
OneDiff: An out-of-the-box acceleration library for diffusion models.
baaivision/Emu3
Next-Token Prediction is All You Need
xdit-project/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
lucidrains/autoregressive-diffusion-pytorch
Implementation of Autoregressive Diffusion in PyTorch
sihyun-yu/REPA
Official PyTorch Implementation of Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
gpt-omni/mini-omni
Open-source multimodal large language model that can hear and talk while thinking. It features real-time, end-to-end speech input and streaming audio output for conversation.
3b1b/manim
Animation engine for explanatory math videos
GAIR-NLP/O1-Journey
O1 Replication Journey: A Strategic Progress Report – Part I
TianxingChen/Embodied-AI-Guide
A Chinese-language guide to embodied AI
kyutai-labs/moshi
kohya-ss/sd-scripts
dvgodoy/PyTorchStepByStep
Official repository of my book: "Deep Learning with PyTorch Step-by-Step: A Beginner's Guide"
VinAIResearch/LFM
Official PyTorch implementation of the paper: Flow Matching in Latent Space
gle-bellier/flow-matching
Annotated Flow Matching paper
XLabs-AI/x-flux
Linaqruf/kohya-trainer
Adapted from https://note.com/kohya_ss/n/nbf7ce8d80f29 for easier cloning
ChaofWang/Awesome-Super-Resolution
Collect super-resolution related papers, data, repositories
dvlab-research/ControlNeXt
Controllable video and image generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA