rebel-jueonpark's Stars
bytedance/flux
A fast communication-overlapping library for tensor parallelism on GPUs.
bloomberg/memray
Memray is a memory profiler for Python
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
llvm/llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
triton-inference-server/tensorrtllm_backend
The Triton TensorRT-LLM Backend
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
pytorch-labs/triton-cpu
An experimental CPU backend for Triton (https://github.com/openai/triton)
tenstorrent/tt-buda
Tenstorrent TT-BUDA Repository
tenstorrent/tt-metal
🤘 TT-NN operator library, and TT-Metalium low level kernel programming model.
open-mpi/ompi
Open MPI main development repository
nod-ai/SHARK-Studio
SHARK Studio -- Web UI for SHARK+IREE High Performance Machine Learning Distribution
fmtlib/fmt
A modern formatting library
gcc-mirror/gcc
nod-ai/techtalks
facebookresearch/fairscale
PyTorch extensions for high performance and large scale training.
microsoft/triton-shared
Shared Middle-Layer for Triton Compilation
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
intel/mlir-extensions
Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain.
triton-inference-server/pytorch_backend
The Triton backend for the PyTorch TorchScript models.
modularml/mojo
The Mojo Programming Language
mlc-ai/docs
The documents for TVM Unity
huggingface/optimum
🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
llvm/torch-mlir
The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.
octoml/relax
A fork of tvm/unity
gabime/spdlog
Fast C++ logging library.
plaidml/plaidml
PlaidML is a framework for making deep learning work everywhere.
triton-lang/triton
Development repository for the Triton language and compiler
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
ise-uiuc/neuri-artifact
Artifact for ESEC/FSE'23 paper "NeuRI: Diversifying DNN Generation via Inductive Rule Inference"