tairenpiao's Stars
Xilinx/brevitas
Brevitas: neural network quantization in PyTorch
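A minimal sketch of Brevitas in use: its quantized layers are drop-in replacements for their torch.nn counterparts (the 4-bit widths here are arbitrary example choices):

```python
import torch.nn as nn
from brevitas.nn import QuantConv2d, QuantReLU

# Quantized layers slot in where nn.Conv2d / nn.ReLU would go.
model = nn.Sequential(
    QuantConv2d(3, 16, kernel_size=3, weight_bit_width=4),  # 4-bit weights
    QuantReLU(bit_width=4),                                 # 4-bit activations
)
```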
intel/auto-round
Advanced quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
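A rough sketch of the quantization flow, assuming the AutoRound API shown in the repo's README (the model name and output path are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "facebook/opt-125m"  # placeholder model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Tune the weight rounding via signed gradient descent, then export.
autoround = AutoRound(model, tokenizer, bits=4, group_size=128)
autoround.quantize()
autoround.save_quantized("./opt-125m-4bit")
```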
huggingface/optimum-quanto
A PyTorch quantization backend for Optimum
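A minimal sketch of quanto's workflow, int8 weight-only quantization of a Transformers model (the model name is a placeholder):

```python
from transformers import AutoModelForCausalLM
from optimum.quanto import quantize, freeze, qint8

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder
quantize(model, weights=qint8)  # tag modules for int8 weight quantization
freeze(model)                   # materialize the quantized weights
```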
ollama/ollama
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
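A minimal sketch using the companion Python client, assuming a local Ollama server is running and the model tag has already been pulled:

```python
import ollama  # companion Python client; assumes `ollama serve` is up

response = ollama.chat(
    model="llama3.2",  # any locally pulled model tag
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
```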
NVIDIA/TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, and distillation. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
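A rough sketch of post-training quantization with the modelopt package, assuming the mtq.quantize entry point and the INT8_DEFAULT_CFG preset name; the model and calibration data are stand-ins:

```python
import torch
import torch.nn as nn
import modelopt.torch.quantization as mtq

model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))
calib_data = [torch.randn(8, 16) for _ in range(4)]  # stand-in calibration set

def forward_loop(m):
    # Run calibration batches through the model to collect ranges.
    for batch in calib_data:
        m(batch)

# Insert fake-quant ops and calibrate with a preset int8 config.
model = mtq.quantize(model, mtq.INT8_DEFAULT_CFG, forward_loop)
```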
meta-llama/llama3
The official Meta Llama 3 GitHub site
Nota-NetsPresso/netspresso-trainer
A library for training, compressing, and deploying computer vision models (including ViT) on edge devices
OpenBMB/ChatDev
Create customized software from a natural language idea (through LLM-powered multi-agent collaboration)
ZhangGe6/onnx-modifier
A tool to modify ONNX models visually, based on Netron and Flask.
microsoft/onnxruntime
ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator
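A minimal inference sketch with the Python API (the model path and input shape are placeholders):

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder input
outputs = session.run(None, {input_name: x})  # None = return all outputs
```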
daquexian/onnx-simplifier
Simplify your ONNX model
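A minimal sketch of the Python API (the file paths are placeholders; a `onnxsim input.onnx output.onnx` CLI also exists):

```python
import onnx
from onnxsim import simplify

model = onnx.load("model.onnx")        # placeholder path
model_sim, check = simplify(model)     # fold constants, fuse redundant ops
assert check, "simplified model failed validation"
onnx.save(model_sim, "model_sim.onnx")
```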
bazelbuild/bazel
A fast, scalable, multi-language, and extensible build system
django/django
The Web framework for perfectionists with deadlines.
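A minimal view-plus-URLconf sketch; it assumes the standard app layout of a Django project, so it runs inside a project created with `django-admin startproject` rather than standalone:

```python
# views.py
from django.http import HttpResponse

def index(request):
    return HttpResponse("Hello, world.")

# urls.py -- routes the site root to the view above
from django.urls import path

urlpatterns = [path("", index)]
```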
quic/aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
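A rough sketch of quantization simulation, assuming the aimet_torch 1.x QuantizationSimModel API; the model, input shape, and calibration pass are stand-ins:

```python
import torch
import torch.nn as nn
from aimet_torch.quantsim import QuantizationSimModel

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()).eval()  # stand-in model
dummy_input = torch.randn(1, 3, 32, 32)

sim = QuantizationSimModel(model, dummy_input=dummy_input)

def forward_pass(model, _):
    model(dummy_input)  # stand-in for a real calibration loop

sim.compute_encodings(forward_pass, forward_pass_callback_args=None)
# sim.model now simulates quantized inference
```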
mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
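A sketch of the quick-start flow, assuming the MLCEngine API and a prebuilt quantized model from the mlc-ai Hugging Face org:

```python
from mlc_llm import MLCEngine

model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"  # prebuilt weights
engine = MLCEngine(model)

# OpenAI-style streaming chat completion.
for chunk in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is ML compilation?"}],
    model=model,
    stream=True,
):
    for choice in chunk.choices:
        print(choice.delta.content, end="", flush=True)

engine.terminate()
```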
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
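A sketch of the high-level LLM API mentioned above, assuming the quick-start style entry point (the model name is a placeholder; engine build happens under the hood):

```python
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")  # placeholder model
params = SamplingParams(max_tokens=32)

for output in llm.generate(["Hello, my name is"], params):
    print(output.outputs[0].text)
```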
PINTO0309/spo4onnx
Simple tool for partial optimization of ONNX models. It further optimizes, often by several tens of percent, models that onnx-optimizer and onnxsim cannot fully optimize, in particular models containing Einsum and OneHot.
PINTO0309/onnx2tf
Self-created tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.
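A minimal conversion sketch via the Python API, assuming the keyword names from the repo's README (the paths are placeholders; a CLI of the same name also exists):

```python
import onnx2tf

# Converts an NCHW ONNX graph to an NHWC TensorFlow SavedModel,
# avoiding the redundant Transpose ops onnx-tensorflow would insert.
onnx2tf.convert(
    input_onnx_file_path="model.onnx",   # placeholder path
    output_folder_path="saved_model",
)
```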
Tencent/ncnn
ncnn is a high-performance neural network inference framework optimized for the mobile platform
lutzroeder/netron
Visualizer for neural network, deep learning and machine learning models
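Besides the desktop and browser apps, Netron ships a small Python package; a minimal sketch (the model path is a placeholder):

```python
import netron

# Serves the model graph on a local port and opens it in the browser.
netron.start("model.onnx")  # placeholder path
```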
rust-lang/rust
Empowering everyone to build reliable and efficient software.
tensorflow/tensorflow
An Open Source Machine Learning Framework for Everyone
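A minimal Keras training sketch on random stand-in data (shapes and sizes are arbitrary):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(16,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

x = tf.random.normal((64, 16))                            # stand-in features
y = tf.random.uniform((64,), maxval=4, dtype=tf.int32)    # stand-in labels
model.fit(x, y, epochs=1, verbose=0)
```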
apache/tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
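A sketch of compiling an ONNX model for CPU via the Relay frontend (the model path, input name, and shape are placeholders; newer TVM is migrating toward Relax, but Relay illustrates the flow):

```python
import onnx
import tvm
from tvm import relay

onnx_model = onnx.load("model.onnx")          # placeholder path
shape_dict = {"input": (1, 3, 224, 224)}      # placeholder input name/shape
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)  # compile for CPU
```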
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
NVIDIA/FasterTransformer
Transformer-related optimizations, including BERT and GPT
triton-lang/triton
Development repository for the Triton language and compiler
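The canonical vector-add kernel, a minimal example of what Triton code looks like (requires a CUDA GPU; the block size is an arbitrary choice):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offs = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n                      # guard the tail of the array
    x = tl.load(x_ptr + offs, mask=mask)
    y = tl.load(y_ptr + offs, mask=mask)
    tl.store(out_ptr + offs, x + y, mask=mask)

x = torch.rand(1024, device="cuda")
y = torch.rand(1024, device="cuda")
out = torch.empty_like(x)
add_kernel[(triton.cdiv(x.numel(), 256),)](x, y, out, x.numel(), BLOCK=256)
```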
IDEA-Research/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
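A sketch of open-set detection following the repo's inference helpers; the config, checkpoint, and image paths are placeholders, and the thresholds mirror the README defaults:

```python
from groundingdino.util.inference import load_model, load_image, predict

model = load_model(
    "groundingdino/config/GroundingDINO_SwinT_OGC.py",  # config from the repo
    "weights/groundingdino_swint_ogc.pth",              # downloaded checkpoint
)
image_source, image = load_image("image.jpg")           # placeholder image

boxes, logits, phrases = predict(
    model=model,
    image=image,
    caption="chair . person .",  # text prompt: categories separated by dots
    box_threshold=0.35,
    text_threshold=0.25,
)
```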
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
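A minimal point-prompt sketch following the README; the checkpoint path is a placeholder and the image is a stand-in (SAM expects an RGB uint8 HWC array):

```python
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in RGB image
predictor.set_image(image)

# Prompt with a single foreground point (x, y).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),
    point_labels=np.array([1]),
)
```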
optuna/optuna
A hyperparameter optimization framework
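The canonical minimal example: define an objective over a trial's suggested parameters and let the study search (the quadratic is a toy objective):

```python
import optuna

def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2  # minimized at x = 2

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=100)
print(study.best_params)
```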
microsoft/nni
An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyperparameter tuning.
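A sketch of the trial-side API: a script launched by the NNI experiment manager pulls parameters from the tuner and reports a metric back (the training step here is a stand-in):

```python
import nni

# Inside a trial script launched by an NNI experiment:
params = nni.get_next_parameter()   # hyperparameters chosen by the tuner
lr = params.get("lr", 0.01)

accuracy = 1.0 - lr                 # stand-in for real training/evaluation
nni.report_final_result(accuracy)   # feed the metric back to the tuner
```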