Pinned Repositories
cutlass
CUDA Templates for Linear Algebra Subroutines
DirectML
DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.
kubeedge
Kubernetes Native Edge Computing Framework (project under CNCF)
LLaMA-Factory
Unify Efficient Fine-Tuning of 100+ LLMs
llama.cpp
LLM inference in C/C++
lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
mlrun
Machine Learning automation and tracking
neural-speed
An innovative library for efficient LLM inference via low-bit quantization
intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms ⚡
anthony-intel's Repositories
anthony-intel/kubeedge
Kubernetes Native Edge Computing Framework (project under CNCF)
anthony-intel/llama.cpp
LLM inference in C/C++
anthony-intel/cutlass
CUDA Templates for Linear Algebra Subroutines
anthony-intel/DirectML
DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.
anthony-intel/LLaMA-Factory
Unify Efficient Fine-Tuning of 100+ LLMs
anthony-intel/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
anthony-intel/mlrun
Machine Learning automation and tracking
anthony-intel/neural-speed
An innovative library for efficient LLM inference via low-bit quantization