sjjeong94's Stars
bayesian-optimization/BayesianOptimization
A Python implementation of global optimization with Gaussian processes.
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
nicksypark/rope-triton
AGI-Edgerunners/LLM-Agents-Papers
A repo listing papers related to LLM-based agents
pytorch-labs/gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
haoliuhl/ringattention
Transformers with Arbitrarily Large Context
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
unslothai/unsloth
Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
jcpeterson/openwebtext
Open clone of OpenAI's unreleased WebText dataset scraper. This version uses pushshift.io files instead of the API for speed.
Lyken17/pytorch-OpCounter
Count the MACs / FLOPs of your PyTorch model.
karpathy/ng-video-lecture
meta-llama/llama
Inference code for Llama models
rayleizhu/BiFormer
[CVPR 2023] Official code release of our paper "BiFormer: Vision Transformer with Bi-Level Routing Attention"
karpathy/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
veritross/studiosr
PyTorch library to accelerate super-resolution research
Zdafeng/SwinFIR
microsoft/Llama-2-Onnx
XPixelGroup/HAT
CVPR 2023 - Activating More Pixels in Image Super-Resolution Transformer; arXiv - HAT: Hybrid Attention Transformer for Image Restoration
zhengchen1999/NTIRE2023_ImageSR_x4
Solution of the NTIRE 2023 Challenge on Image Super-Resolution (x4)
cszn/KAIR
Image Restoration Toolbox (PyTorch). Training and testing codes for DPIR, USRNet, DnCNN, FFDNet, SRMD, DPSR, BSRGAN, SwinIR
JingyunLiang/VRT
VRT: A Video Restoration Transformer (official repository)
Tangshitao/MVDiffusion
MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion, NeurIPS 2023 (spotlight)
LeapLabTHU/DAT
Repository of Vision Transformer with Deformable Attention (CVPR 2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
karpathy/llama2.c
Inference Llama 2 in one file of pure C
ggerganov/whisper.cpp
Port of OpenAI's Whisper model in C/C++
ggerganov/llama.cpp
LLM inference in C/C++
LAION-AI/Open-Assistant
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.