Kevin's Stars (66)
altair199797/LowFormer
pytorch-labs/gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
wangzyon/NVIDIA_SGEMM_PRACTICE
Step-by-step optimization of CUDA SGEMM
BBuf/how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
karpathy/llm.c
LLM training in simple, raw C/CUDA
flashlight/flashlight
A C++ standalone library for machine learning
Parskatt/RoMa
[CVPR 2024] RoMa: Robust Dense Feature Matching; RoMa is the robust dense feature matcher capable of estimating pixel-dense warps and reliable certainties for almost any image pair.
ulrichstern/cuda-convnet
Alex Krizhevsky's original code from Google Code
fabio-sim/LightGlue-ONNX
ONNX-compatible LightGlue: Local Feature Matching at Light Speed. Supports TensorRT, OpenVINO
bytedance/effective_transformer
Running BERT without Padding
ggerganov/llama.cpp
LLM inference in C/C++
verlab/accelerated_features
Implementation of XFeat (CVPR 2024). Do you need robust and fast local feature extraction? You are in the right place!
pybind/pybind11
Seamless operability between C++11 and Python
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
NVIDIA/MatX
An efficient C++17 GPU numerical computing library with Python-like syntax
DaiShiResearch/TransNeXt
[CVPR 2024] Code release for TransNeXt model
mit-han-lab/efficientvit
EfficientViT is a new family of vision models for efficient high-resolution vision.
cuda-mode/resource-stream
CUDA related news and material links
bug-developer021/YOLOV5_optimization_on_triton
Compare multiple optimization methods on Triton to improve model service performance
triple-Mu/YOLOv8-TensorRT
YOLOv8 accelerated with TensorRT!
zjhellofss/KuiperInfer
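A great project for campus recruiting (fall and spring hiring) and internships! Build a high-performance deep learning inference library from scratch, with support for inference on models such as llama2, Unet, Yolov5, and Resnet. Implement a high-performance deep learning inference library step by step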
torchpipe/torchpipe
Serving Inside Pytorch
yzhao062/anomaly-detection-resources
Anomaly detection related books, papers, videos, and toolboxes
openvinotoolkit/anomalib
An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.
apple/ml-fastvit
This repository contains the official implementation of the research paper, "FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization" ICCV 2023
kornia/kornia
Geometric Computer Vision Library for Spatial AI
Lightning-AI/torchmetrics
Torchmetrics - Machine learning metrics for distributed, scalable PyTorch applications.
Tlntin/trt2023
daquexian/onnx-simplifier
Simplify your onnx model