Pinned Repositories
onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
onnx
Open standard for machine learning interoperability
Amuse
.NET application for Stable Diffusion. Leveraging OnnxStack, Amuse seamlessly integrates many Stable Diffusion capabilities within the .NET ecosystem.
bert
TensorFlow code and pre-trained models for BERT
ByteTransformer
Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052
CNTK
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
DemoFusion
Let us democratise high-resolution generation! (arXiv 2023)
diffusers
🤗 Diffusers: experiments with diffusion ONNX models
Faster-Diffusion
Stable-Diffusion-WebUI-OnnxRuntime
Extension for Automatic1111's Stable Diffusion WebUI that uses the ONNX Runtime CUDA execution provider to deliver high-performance results on NVIDIA GPUs.
tianleiwu's Repositories
tianleiwu/Amuse
.NET application for Stable Diffusion. Leveraging OnnxStack, Amuse seamlessly integrates many Stable Diffusion capabilities within the .NET ecosystem.
tianleiwu/Stable-Diffusion-WebUI-OnnxRuntime
Extension for Automatic1111's Stable Diffusion WebUI that uses the ONNX Runtime CUDA execution provider to deliver high-performance results on NVIDIA GPUs.
tianleiwu/bert
TensorFlow code and pre-trained models for BERT
tianleiwu/ByteTransformer
Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052
tianleiwu/CNTK
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
tianleiwu/DemoFusion
Let us democratise high-resolution generation! (arXiv 2023)
tianleiwu/diffusers
🤗 Diffusers: experiments with diffusion ONNX models
tianleiwu/Faster-Diffusion
tianleiwu/gdrivedl
Google Drive Download Python Script
tianleiwu/inference
Reference implementations of inference benchmarks
tianleiwu/onnx
Open Neural Network Exchange
tianleiwu/segment-anything
ONNX Runtime support for SAM
tianleiwu/TensorRT
NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.
tianleiwu/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.
tianleiwu/tutorials
Tutorials for creating and using ONNX models
tianleiwu/libflash_attn
Standalone Flash Attention v2 kernel without libtorch dependency
tianleiwu/onnx-modifier
A tool to modify ONNX models visually, based on Netron and Flask.
tianleiwu/onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
tianleiwu/optimum
🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools
tianleiwu/OrtMultiThreadCSharp
Test ORT with multithreading
tianleiwu/TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
tianleiwu/unsloth
2-5X faster, 70% less memory QLoRA & LoRA fine-tuning