rog93

rog93's Stars

llvm/torch-mlir
The Torch-MLIR project aims to provide first-class support from the PyTorch ecosystem to the MLIR ecosystem.
Language:C++1.4k515
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Language:Python2.7k217
PaddleJitLab/CUDATutorial
A self-learning tutorial for CUDA high-performance programming.
Language:JavaScript29533
Tony-Tan/CUDA_Freshman
Language:Cuda2.3k444
microsoft/triton-shared
Shared Middle-Layer for Triton Compilation
Language:MLIR21448
meta-llama/llama3
The official Meta Llama 3 GitHub site
Language:Python27.7k3.2k
databricks/dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
Language:Python2.5k239
VikParuchuri/surya
OCR, layout analysis, reading order, table recognition in 90+ languages
Language:Python14.9k951
facebookresearch/nougat
Implementation of Nougat Neural Optical Understanding for Academic Documents
Language:Python9.1k578
xai-org/grok-1
Grok open release
Language:Python49.8k8.3k
DefTruth/CUDA-Learn-Notes
📚150+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
Language:Cuda1.8k188
WongKinYiu/yolov9
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
Language:Python9.1k1.5k
triton-lang/triton
Development repository for the Triton language and compiler
Language:C++13.8k1.7k
mistralai/mistral-inference
Official inference library for Mistral models
Language:Jupyter Notebook9.8k871
NVIDIA-AI-IOT/CUDA-PointPillars
A project demonstrating how to use CUDA-PointPillars to process point cloud data from lidar.
Language:Python543159
pigirons/cpufp
A CPU tool for benchmarking peak floating-point performance
Language:Assembly514126
ytongbai/LVM
Language:Python1.8k55
thuml/depyf
depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.
Language:Python53015
DRosemei/RoMe
Language:Python22526
pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Language:Python85.3k23k
karpathy/llama2.c
Inference Llama 2 in one file of pure C
Language:C17.6k2.1k
THUDM/ChatGLM2-6B
ChatGLM2-6B: An Open Bilingual Chat LLM (open-source bilingual dialogue language model)
Language:Python15.8k1.9k
bytedance/ByteMLPerf
AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware.
Language:Python21364
XingangPan/DragGAN
Official Code for DragGAN (SIGGRAPH 2023)
Language:Python35.8k3.5k
THUDM/GLM-130B
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
Language:Python7.7k608
THUDM/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model (open-source bilingual dialogue language model)
Language:Python40.9k5.2k
huawei-noah/Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Language:Python3k628
Charmve/Surface-Defect-Detection
📈 Currently the largest collection of industrial defect detection datasets and papers. Continually summarizes important open-source datasets and key papers in the field of surface defect research.
Language:Python3.3k540
google-research/tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
27.7k2.3k
SaoYan/DnCNN-PyTorch
PyTorch implementation of the TIP2017 paper "Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising"
Language:Python413118