rog93's Stars
llvm/torch-mlir
The Torch-MLIR project aims to provide first-class support from the PyTorch ecosystem to the MLIR ecosystem.
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
PaddleJitLab/CUDATutorial
A self-learning tutorial for CUDA High Performance Programming.
Tony-Tan/CUDA_Freshman
microsoft/triton-shared
Shared Middle-Layer for Triton Compilation
meta-llama/llama3
The official Meta Llama 3 GitHub site
databricks/dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
VikParuchuri/surya
OCR, layout analysis, reading order, table recognition in 90+ languages
facebookresearch/nougat
Implementation of Nougat: Neural Optical Understanding for Academic Documents
xai-org/grok-1
Grok open release
DefTruth/CUDA-Learn-Notes
📚150+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
WongKinYiu/yolov9
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
triton-lang/triton
Development repository for the Triton language and compiler
mistralai/mistral-inference
Official inference library for Mistral models
NVIDIA-AI-IOT/CUDA-PointPillars
A project demonstrating how to use CUDA-PointPillars to process point cloud data from lidar.
pigirons/cpufp
A CPU tool for benchmarking peak floating-point performance
ytongbai/LVM
thuml/depyf
depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.
DRosemei/RoMe
pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
karpathy/llama2.c
Inference Llama 2 in one file of pure C
THUDM/ChatGLM2-6B
ChatGLM2-6B: An Open Bilingual Chat LLM
bytedance/ByteMLPerf
AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware.
XingangPan/DragGAN
Official Code for DragGAN (SIGGRAPH 2023)
THUDM/GLM-130B
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
THUDM/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
huawei-noah/Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Charmve/Surface-Defect-Detection
📈 The largest industrial defect detection database and paper collection to date. Continuously summarizes important open-source datasets and key papers in the field of surface defect research.
google-research/tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
SaoYan/DnCNN-PyTorch
PyTorch implementation of the TIP2017 paper "Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising"