siwooyong's Stars
tianyi-lab/Reflection_Tuning
[ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
affjljoo3581/GPT2
PyTorch Implementation of OpenAI GPT-2
bbaaii/DreamDiffusion
Implementation of “DreamDiffusion: Generating High-Quality Images from Brain EEG Signals”
prajwalsingh/EEG2Image
EEG2IMAGE: Image Reconstruction from EEG Brain Signals. [ICASSP 2023]
Ki6an/fastT5
⚡ Boost inference speed of T5 models by 5x and reduce the model size by 3x.
Zheng-Chong/CatVTON
CatVTON is a simple and efficient virtual try-on diffusion model with 1) a lightweight network (899.06M parameters in total), 2) parameter-efficient training (49.57M trainable parameters), and 3) simplified inference (< 8 GB VRAM at 1024×768 resolution).
mit-han-lab/llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
openvinotoolkit/openvino
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
casper-hansen/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization, with a 2x speedup during inference.
Vahe1994/AQLM
Official PyTorch repository for "Extreme Compression of Large Language Models via Additive Quantization" (https://arxiv.org/pdf/2401.06118.pdf) and "PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression" (https://arxiv.org/abs/2405.14852)
huggingface/optimum-quanto
A PyTorch quantization backend for Optimum
IST-DASLab/gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
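For context, the baseline that GPTQ improves on is naive round-to-nearest quantization. The sketch below is that naive baseline only, written as a minimal per-tensor symmetric INT8 quantizer in numpy (function names are mine, not the repo's API); GPTQ itself goes further by minimizing layer-wise output error with second-order (Hessian) information, which is what makes 3-4 bit quantization accurate.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor round-to-nearest INT8 quantization: scale so
    that max |w| maps to 127, then round each weight to the nearest level."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map the integer codes back to approximate float weights."""
    return q.astype(np.float32) * scale
```

With round-to-nearest, the per-weight error is bounded by half a quantization step (scale / 2); GPTQ trades this simple per-weight guarantee for a lower error on the layer's actual outputs.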
Project-MONAI/MONAI
AI Toolkit for Healthcare Imaging
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
microsoft/human-pose-estimation.pytorch
An official implementation of our ECCV 2018 paper "Simple Baselines for Human Pose Estimation and Tracking" (https://arxiv.org/abs/1804.06208)
pytorch/TensorRT
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
NVIDIA/TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, sparsity, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
openvinotoolkit/nncf
Neural Network Compression Framework for enhanced OpenVINO™ inference
pytorch/ao
PyTorch native quantization and sparsity for training and inference
NVIDIA/TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
megvii-research/FQ-ViT
[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
artidoro/qlora
QLoRA: Efficient Finetuning of Quantized LLMs
clovaai/CutMix-PyTorch
Official PyTorch implementation of the CutMix regularizer
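CutMix itself is simple enough to sketch from the paper's description: cut a random patch from one image, paste it into another, and mix the labels in proportion to the patch area. A minimal numpy version (function names are mine, not the repo's API):

```python
import numpy as np

def rand_bbox(h, w, lam, rng):
    """Sample a box whose target area fraction is (1 - lam), clipped to the image."""
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(h * cut_ratio), int(w * cut_ratio)
    cy, cx = rng.integers(h), rng.integers(w)
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    return y1, y2, x1, x2

def cutmix(x_a, x_b, y_a, y_b, alpha=1.0, rng=None):
    """Paste a random patch of x_b into x_a; mix labels by the patch's area."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    h, w = x_a.shape[-2:]
    y1, y2, x1, x2 = rand_bbox(h, w, lam, rng)
    x = x_a.copy()
    x[..., y1:y2, x1:x2] = x_b[..., y1:y2, x1:x2]
    lam_adj = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)  # actual area after clipping
    return x, lam_adj * y_a + (1.0 - lam_adj) * y_b
```

Note that the label weight is recomputed from the clipped box, so the label mix always matches the pixels actually pasted.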
ds-wook/web-ctr-prediction
🏆 1st-place solution in a web ad CTR prediction competition 🏆
baaivision/EVA
EVA Series: Visual Representation Fantasies from BAAI
Locutusque/TPU-Alignment
Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free
KindXiaoming/pykan
Kolmogorov-Arnold Networks
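The core idea behind KANs is to put learnable univariate functions on the network's edges instead of fixed activations on its nodes. As a rough conceptual sketch only (the names are mine, and pykan's real implementation uses trainable B-spline bases, not this API), one such layer in numpy:

```python
import numpy as np

def kan_layer(x, grid, coef):
    """One KAN-style layer: each edge (i -> j) applies its own learnable
    1-D function to x[i], here a piecewise-linear function given by its
    values coef[i, j] at the knot positions in grid; output j sums over i."""
    n_in, n_out, _ = coef.shape
    out = np.zeros(n_out)
    for i in range(n_in):
        for j in range(n_out):
            out[j] += np.interp(x[i], grid, coef[i, j])
    return out
```

Training then adjusts the per-edge knot values (coef) by gradient descent, rather than a weight matrix plus a shared activation.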
IRCSS/MatrixVFX
A real-time Matrix VFX shader in Unity 3D
rrmina/fast-neural-style-pytorch
Fast Neural Style Transfer implementation in PyTorch :art: :art: :art: