saberlililily's Stars
microsoft/BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
neuralmagic/AutoFP8
Azure/MS-AMP
Microsoft Automatic Mixed Precision Library
OpenPPL/ppq
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
usyd-fsalab/fp6_llm
Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5).
lucidrains/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
project-numina/aimo-progress-prize
NVIDIA/TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
DD-DuDa/TensorRT-in-Action
TensorRT-in-Action is a GitHub repository providing code examples for using TensorRT, with accompanying Jupyter Notebooks.
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
IntelLabs/FP8-Emulation-Toolkit
PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware.
DefTruth/CUDA-Learn-Notes
📚 150+ Tensor/CUDA Core kernels: ⚡️ flash-attention-mma, ⚡️ hgemm with WMMA, MMA, and CuTe (98%~100% of cuBLAS TFLOPS).
aredden/flux-fp8-api
Flux diffusion model implementation using quantized FP8 matmul; the remaining layers use faster half-precision accumulation, making it ~2x faster on consumer devices.
intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
itayhubara/BinaryNet.pytorch
Binarized Neural Network (BNN) implementation for PyTorch
cooooorn/Pytorch-XNOR-Net
XNOR-Net, with binary GEMM and binary conv2d kernels; supports both CPU and GPU.
jiecaoyu/XNOR-Net-PyTorch
PyTorch Implementation of XNOR-Net
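As context for these XNOR-Net and BNN repositories, here is a minimal sketch (mine, not code from any of these projects) of the XNOR-popcount trick that binary GEMM kernels build on: with weights and activations constrained to {-1, +1} and bit-packed (1 -> +1, 0 -> -1), a length-n dot product reduces to n - 2 * popcount(a XOR b). The helper name binary_dot is hypothetical.

def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two {-1,+1} vectors of length n, packed as bits (1 -> +1, 0 -> -1)."""
    mask = (1 << n) - 1
    # One XOR plus one population count replaces n multiply-accumulates.
    return n - 2 * bin((a_bits ^ b_bits) & mask).count("1")

# Example: a = 0b1011 -> [+1, +1, -1, +1] (LSB first), b = 0b1101 -> [+1, -1, +1, +1]
assert binary_dot(0b1011, 0b1101, 4) == 0  # (+1) + (-1) + (-1) + (+1) = 0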
peiswang/SiBNN
awai54st/LUTNet
WangXuan95/BSV_Tutorial_cn
A comprehensive Chinese-language tutorial on Bluespec SystemVerilog (BSV), covering advanced features such as scheduling, FIFO dataflow, and polymorphism, and demonstrating BSV's advantages over traditional Verilog development.
rayleizhu/vllm-ra
[ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts
kyspyridon/FP_Adder
A single-cycle and a 2-stage pipelined floating-point adder designed in Verilog according to the IEEE-754 format. The project targets a Xilinx Zedboard; detachable 7-segment displays were used to test the implementation on the actual hardware.
shahsaumya00/Floating-Point-Adder
32-bit pipelined binary floating-point adder using the IEEE-754 single-precision format, in Verilog
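For context on these IEEE-754 adder projects, a minimal Python sketch (not taken from either repository) of the single-precision field split such an adder works with: 1 sign bit, 8 exponent bits (bias 127), and 23 fraction bits; the adder aligns the smaller operand's significand by the exponent difference before adding, then renormalizes and rounds. The helper name fp32_fields is hypothetical.

import struct

def fp32_fields(x: float):
    """Decode an IEEE-754 single-precision value into (sign, biased exponent, significand)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF          # biased exponent (bias 127)
    fraction = bits & 0x7FFFFF              # 23-bit fraction
    significand = (1 << 23) | fraction      # implicit leading 1 for normal numbers
    return sign, exponent, significand

# Example: 1.5 is encoded as 0x3FC00000
print(fp32_fields(1.5))  # (0, 127, 12582912), i.e. significand 0xC00000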
erihsu/INT_FP_MAC
INT8 & FP16 multiplier accumulator (MAC) design with UVM verification completed.
DongbeomSon/fp16MAC
JulianKemmerer/PipelineC
A C-like hardware description language (HDL) adding high-level synthesis (HLS)-like automatic pipelining as a language construct/compiler feature.
google/xls
XLS: Accelerated HW Synthesis
dawsonjon/fpu
Synthesisable IEEE-754 floating-point library in Verilog
robfinch/Float
Floating-point code in SystemVerilog
openhwgroup/cvfpu
Parametric floating-point unit with support for standard RISC-V formats and operations as well as transprecision formats.