siwooyong's Stars
tianyi-lab/Reflection_Tuning
[ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
affjljoo3581/GPT2
PyTorch Implementation of OpenAI GPT-2
bbaaii/DreamDiffusion
Implementation of “DreamDiffusion: Generating High-Quality Images from Brain EEG Signals”
prajwalsingh/EEG2Image
EEG2IMAGE: Image Reconstruction from EEG Brain Signals. [ICASSP 2023]
Ki6an/fastT5
⚡ Boost inference speed of T5 models by 5x and reduce the model size by 3x.
Zheng-Chong/CatVTON
CatVTON is a simple and efficient virtual try-on diffusion model with 1) a lightweight network (899.06M parameters in total), 2) parameter-efficient training (49.57M trainable parameters), and 3) simplified inference (< 8 GB VRAM at 1024×768 resolution).
mit-han-lab/llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
openvinotoolkit/openvino
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
casper-hansen/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization, with a 2x speedup during inference.
Vahe1994/AQLM
Official PyTorch repository for "Extreme Compression of Large Language Models via Additive Quantization" (https://arxiv.org/pdf/2401.06118.pdf) and "PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression" (https://arxiv.org/abs/2405.14852)
huggingface/optimum-quanto
A PyTorch quantization backend for Optimum
IST-DASLab/gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
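For context, the baseline that GPTQ improves on is naive round-to-nearest quantization. The sketch below is that naive baseline only, written as a minimal per-tensor symmetric INT8 quantizer in numpy (function names are mine, not the repo's API); GPTQ itself goes further by minimizing layer-wise output error with second-order (Hessian) information, which is what makes 3-4 bit quantization accurate.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor round-to-nearest INT8 quantization: scale so
    that max |w| maps to 127, then round each weight to the nearest level."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map the integer codes back to approximate float weights."""
    return q.astype(np.float32) * scale
```

With round-to-nearest, the per-weight error is bounded by half a quantization step (scale / 2); GPTQ trades this simple per-weight guarantee for a lower error on the layer's actual outputs.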
Project-MONAI/MONAI
AI Toolkit for Healthcare Imaging
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
microsoft/human-pose-estimation.pytorch
An official implementation of our ECCV 2018 paper "Simple Baselines for Human Pose Estimation and Tracking" (https://arxiv.org/abs/1804.06208)
pytorch/TensorRT
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
NVIDIA/TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, sparsity, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
openvinotoolkit/nncf
Neural Network Compression Framework for enhanced OpenVINO™ inference
pytorch/ao
PyTorch native quantization and sparsity for training and inference
NVIDIA/TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
megvii-research/FQ-ViT
[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
artidoro/qlora
QLoRA: Efficient Finetuning of Quantized LLMs
clovaai/CutMix-PyTorch
Official PyTorch implementation of the CutMix regularizer
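CutMix itself is simple enough to sketch from the paper's description: cut a random patch from one image, paste it into another, and mix the labels in proportion to the patch area. A minimal numpy version (function names are mine, not the repo's API):

```python
import numpy as np

def rand_bbox(h, w, lam, rng):
    """Sample a box whose target area fraction is (1 - lam), clipped to the image."""
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(h * cut_ratio), int(w * cut_ratio)
    cy, cx = rng.integers(h), rng.integers(w)
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    return y1, y2, x1, x2

def cutmix(x_a, x_b, y_a, y_b, alpha=1.0, rng=None):
    """Paste a random patch of x_b into x_a; mix labels by the patch's area."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    h, w = x_a.shape[-2:]
    y1, y2, x1, x2 = rand_bbox(h, w, lam, rng)
    x = x_a.copy()
    x[..., y1:y2, x1:x2] = x_b[..., y1:y2, x1:x2]
    lam_adj = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)  # actual area after clipping
    return x, lam_adj * y_a + (1.0 - lam_adj) * y_b
```

Note that the label weight is recomputed from the clipped box, so the label mix always matches the pixels actually pasted.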
ds-wook/web-ctr-prediction
🏆 1st-place solution in a web ad CTR prediction competition 🏆
baaivision/EVA
EVA Series: Visual Representation Fantasies from BAAI
Locutusque/TPU-Alignment
Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free
KindXiaoming/pykan
Kolmogorov-Arnold Networks
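The core idea behind KANs is to put learnable univariate functions on the network's edges instead of fixed activations on its nodes. As a rough conceptual sketch only (the names are mine, and pykan's real implementation uses trainable B-spline bases, not this API), one such layer in numpy:

```python
import numpy as np

def kan_layer(x, grid, coef):
    """One KAN-style layer: each edge (i -> j) applies its own learnable
    1-D function to x[i], here a piecewise-linear function given by its
    values coef[i, j] at the knot positions in grid; output j sums over i."""
    n_in, n_out, _ = coef.shape
    out = np.zeros(n_out)
    for i in range(n_in):
        for j in range(n_out):
            out[j] += np.interp(x[i], grid, coef[i, j])
    return out
```

Training then adjusts the per-edge knot values (coef) by gradient descent, rather than a weight matrix plus a shared activation.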
IRCSS/MatrixVFX
A real-time Matrix VFX shader in Unity 3D
rrmina/fast-neural-style-pytorch
Fast Neural Style Transfer implementation in PyTorch :art: :art: :art: