Pinned Repositories
cutlass-fork
CUDA Templates for Linear Algebra Subroutines
FlagGems
FlagGems is an operator library for large language models implemented in Triton Language.
intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms ⚡
lc0
SYCL work on the Leela Chess Zero (lc0) chess engine
llama.cpp
Port of Facebook's LLaMA model in C/C++
llvm
Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
TensorRT-LLM
TensorRT-LLM provides an easy-to-use Python API for defining Large Language Models (LLMs) and building TensorRT engines with state-of-the-art optimizations for efficient inference on NVIDIA GPUs. It also includes components for creating Python and C++ runtimes that execute those TensorRT engines.
triton
Development repository for the Triton language and compiler
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
xetla
Intel Xe Templates for Linear Algebra (XeTLA), a SYCL-based C++ template library for Intel GPUs
KateBlueSky's Repositories
KateBlueSky/intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms ⚡
KateBlueSky/lc0
SYCL work on the Leela Chess Zero (lc0) chess engine
KateBlueSky/llama.cpp
Port of Facebook's LLaMA model in C/C++
KateBlueSky/llvm
Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
KateBlueSky/TensorRT-LLM
TensorRT-LLM provides an easy-to-use Python API for defining Large Language Models (LLMs) and building TensorRT engines with state-of-the-art optimizations for efficient inference on NVIDIA GPUs. It also includes components for creating Python and C++ runtimes that execute those TensorRT engines.
KateBlueSky/triton
Development repository for the Triton language and compiler
KateBlueSky/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
KateBlueSky/xetla
Intel Xe Templates for Linear Algebra (XeTLA), a SYCL-based C++ template library for Intel GPUs