pmixer's Stars
NVIDIA/open-gpu-kernel-modules
NVIDIA Linux open GPU kernel module source
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
CVCUDA/CV-CUDA
CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.
umlet/umlet
Free UML Tool for Fast UML Diagrams
HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese
This is a Chinese translation of the CUDA programming guide
kourgeorge/arxiv-style
A LaTeX style and template for paper preprints (based on NIPS style)
triton-inference-server/tensorrtllm_backend
The Triton TensorRT-LLM Backend
Tlntin/Qwen-TensorRT-LLM
bytedance/effective_transformer
Running BERT without Padding
pmixer/SASRec.pytorch
PyTorch (1.6+) implementation of https://github.com/kang205/SASRec
daadaada/turingas
Assembler for NVIDIA Volta and Turing GPUs
NVIDIA/GMAT
A toolkit showing GPU's all-round capability in video processing
TrojanXu/onnxparser-trt-plugin-sample
A sample for onnxparser working with trt user defined plugins for TRT7.0
stasi009/NumpyWDL
Implement Wide & Deep algorithm by using NumPy
YellowOldOdd/SDBI
Simple Dynamic Batching Inference
chaytonmin/DeepMVS
3D reconstruction project with MVSNets for depth inferring.
NVIDIA-Merlin/HierarchicalKV
HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of HierarchicalKV is to store key-value feature-embeddings on high-bandwidth memory (HBM) of GPUs and in host memory. It also can be used as a generic key-value storage.
megvii-research/TreeEnergyLoss
[CVPR2022] Tree Energy Loss: Towards Sparsely Annotated Semantic Segmentation
caojiangxia/BiGI
[WSDM 2021] Bipartite Graph Embedding via Mutual Information Maximization
yuekaizhang/Triton-ASR-Client
ASR client for Triton ASR Service
claws-lab/petgen
A PyTorch implementation of the ACM SIGKDD 2021 paper titled "PETGEN: Personalized Text Generation Attack on Deep Sequence Embedding-based Classification Models"
leimao/ONNX-Python-Examples
ONNX Python Examples
EdVince/whisper-trtllm
Whisper in TensorRT-LLM
claws-lab/DAIN
Code for the ACM CIKM 2021 paper "Influence-guided Data Augmentation for Neural Tensor Completion"
cunxi1992/turtle_American_shield
Turtle graphics tutorial that draws two attractive patterns: Captain America's shield and a pattern made up of 360 squares.
TrojanXu/GTC_S21736_materials
Extended materials for GTC2020 talk S21736
pmixer/TiSASRec.debug
Based on https://github.com/JiachengLi1995/TiSASRec, this replaces negative-sampling-based evaluation with all-item evaluation and tries to improve ranking over all items.
DC-Shi/cudaNppSample