Pinned Repositories
lm-evaluation-harness
A framework for few-shot evaluation of language models.
intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offers SOTA compression techniques for LLMs; runs LLMs efficiently on Intel platforms ⚡
neural-speed
An innovative library for efficient LLM inference via low-bit quantization
llama.cpp
Port of Facebook's LLaMA model in C/C++
neural-compressor
Intel® Neural Compressor (formerly Intel® Low Precision Optimization Tool) provides unified APIs for network compression techniques, such as low-precision quantization, sparsity, pruning, and knowledge distillation, across different deep learning frameworks to achieve optimal inference performance.
pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
intellinjun's Repositories
intellinjun/intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offers SOTA compression techniques for LLMs; runs LLMs efficiently on Intel platforms ⚡
intellinjun/llama.cpp
Port of Facebook's LLaMA model in C/C++
intellinjun/neural-compressor
Intel® Neural Compressor (formerly Intel® Low Precision Optimization Tool) provides unified APIs for network compression techniques, such as low-precision quantization, sparsity, pruning, and knowledge distillation, across different deep learning frameworks to achieve optimal inference performance.
intellinjun/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration