xrsrke's Stars
meta-llama/llama3
The official Meta Llama 3 GitHub site
huggingface/lerobot
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
lucidrains/x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
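For flavor, a minimal decoder-only setup along the lines of the repo's README (the hyperparameter values here are illustrative, not recommendations):

```python
import torch
from x_transformers import TransformerWrapper, Decoder

# Decoder-only language model; dim/depth/heads are arbitrary toy values.
model = TransformerWrapper(
    num_tokens=20000,
    max_seq_len=1024,
    attn_layers=Decoder(dim=512, depth=6, heads=8),
)
logits = model(torch.randint(0, 20000, (1, 1024)))  # (1, 1024, 20000)
```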
pytorch/ao
PyTorch native quantization and sparsity for training and inference
mit-han-lab/smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
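The core trick is simple enough to sketch: migrate activation outliers into the weights with a per-channel scale s_j = max|X_j|^α / max|W_j|^(1−α), which leaves the layer's output mathematically unchanged. A minimal sketch of that idea (`smooth_scales` is a hypothetical helper, not the repo's API; α = 0.5 is the paper's default):

```python
import torch

def smooth_scales(act_amax, weight, alpha=0.5):
    # Per-input-channel smoothing factor from the SmoothQuant paper:
    # s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)
    w_amax = weight.abs().amax(dim=0)  # per-input-channel max over output channels
    return (act_amax.pow(alpha) / w_amax.pow(1 - alpha)).clamp(min=1e-5)

# Toy example: X @ W.T with an outlier in activation channel 0.
X = torch.randn(4, 8); X[:, 0] *= 50.0   # activation outlier
W = torch.randn(16, 8)                   # Linear weight (out_features, in_features)
s = smooth_scales(X.abs().amax(dim=0), W)
X_s, W_s = X / s, W * s                  # equivalent, but X_s is easier to quantize
assert torch.allclose(X @ W.T, X_s @ W_s.T, atol=1e-3)
```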
pytorch/FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
google-research/federated
A collection of Google research projects related to Federated Learning and Federated Analytics.
ironjr/grokfast
Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients"
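The method fits in a few lines: keep an EMA of each parameter's gradient (the slow component) and add an amplified copy back before the optimizer step. A sketch of the EMA variant following the paper's algorithm (`alpha` and `lamb` defaults follow the paper's reported values, but check the repo):

```python
import torch

def gradfilter_ema(model, grads=None, alpha=0.98, lamb=2.0):
    # EMA low-pass filter over gradients; the amplified slow component
    # is added back into p.grad before optimizer.step().
    if grads is None:
        grads = {n: p.grad.detach().clone() for n, p in model.named_parameters()
                 if p.grad is not None}
    for n, p in model.named_parameters():
        if p.grad is not None:
            grads[n] = alpha * grads[n] + (1 - alpha) * p.grad.detach()
            p.grad = p.grad + lamb * grads[n]
    return grads

# Usage in a training loop (names are illustrative):
#   ema = None
#   loss.backward()
#   ema = gradfilter_ema(model, ema)   # between backward() and step()
#   optimizer.step(); optimizer.zero_grad()
```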
FranxYao/Long-Context-Data-Engineering
Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context"
apple/ml-sigma-reparam
usyd-fsalab/fp6_llm
Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5).
pytorch-labs/float8_experimental
This repository contains the experimental PyTorch native float8 training UX
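The general recipe such float8 training builds on is dynamic scaling: rescale each tensor so its absolute max fills the format's range, cast, and carry the scale alongside. A minimal per-tensor sketch using the `torch.float8_e4m3fn` dtype (PyTorch ≥ 2.1; `to_float8` and `E4M3_MAX` are my own names, not this repo's actual UX):

```python
import torch

E4M3_MAX = 448.0  # largest finite value representable in torch.float8_e4m3fn

def to_float8(x: torch.Tensor):
    # Per-tensor dynamic scale so the max value maps near E4M3_MAX.
    scale = E4M3_MAX / x.abs().amax().clamp(min=1e-12)
    x_fp8 = (x * scale).clamp(-E4M3_MAX, E4M3_MAX).to(torch.float8_e4m3fn)
    return x_fp8, scale

x = torch.randn(4, 4)
x_fp8, scale = to_float8(x)
x_back = x_fp8.to(torch.float32) / scale   # dequantize for inspection
print((x - x_back).abs().max())            # small quantization error
```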
pytorch-labs/applied-ai
Applied AI experiments and examples for PyTorch
nbasyl/LLM-FP4
The official implementation of the EMNLP 2023 paper LLM-FP4
jundaf2/INT8-Flash-Attention-FMHA-Quantization
albanD/subclass_zoo
Qualcomm-AI-research/FP8-quantization
athms/mad-lab
A MAD laboratory to improve AI architecture designs 🧪
wimh966/outlier_suppression
The official PyTorch implementation of the NeurIPS 2022 (spotlight) paper "Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models"
ROCm/aotriton
Ahead of Time (AOT) Triton Math Library
google-deepmind/asyncdiloco
Qualcomm-AI-research/outlier-free-transformers
arogozhnikov/adamw_bfloat16
AdamW optimizer for bfloat16 models in PyTorch 🔥.
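Why a special AdamW for bfloat16 at all: with only 8 mantissa bits, bf16 silently drops updates much smaller than a parameter's magnitude, as the snippet below shows. Remedies in this vein keep extra compensation state or round stochastically; see the repo for its particular fix.

```python
import torch

# bfloat16 spacing near 1.0 is 2**-8 ≈ 0.0039, so a 1e-3 update vanishes.
w = torch.tensor(1.0, dtype=torch.bfloat16)
print(w + 1e-3 == w)  # tensor(True): the update rounds away entirely
```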
thu-ml/Jetfire-INT8Training
graphcore-research/pytorch-tensor-tracker
Flexibly track outputs and grad-outputs of torch.nn.Module.
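Under the hood this kind of tracker rests on PyTorch's hook machinery. A generic sketch of that mechanism (not this library's API), capturing per-module outputs and grad-outputs by name:

```python
import torch
import torch.nn as nn

def attach_trackers(model: nn.Module, store: dict):
    # Record each submodule's output and grad-output under its name.
    for name, mod in model.named_modules():
        if not name:  # skip the root module itself
            continue
        mod.register_forward_hook(
            lambda m, inp, out, n=name: store.setdefault(n, {}).update(out=out.detach()))
        mod.register_full_backward_hook(
            lambda m, gin, gout, n=name: store.setdefault(n, {}).update(grad_out=gout[0].detach()))

store = {}
net = nn.Sequential(nn.Linear(4, 4), nn.ReLU())
attach_trackers(net, store)
net(torch.randn(2, 4)).sum().backward()
print(store['0']['out'].shape, store['0']['grad_out'].shape)  # both (2, 4)
```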
AmericanPresidentJimmyCarter/test-torch-bfloat16-vit-training
honglu2875/thing
Catch your tensors in one program and quietly send them to another live Python session.
carsonpo/octomul
Reasonably fast (compared to cuBLAS) and relatively simple int8 tensor core GEMM
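The contract such a kernel implements is int8 × int8 with int32 accumulation. A pure-PyTorch reference that serves as a correctness oracle (not the repo's kernel; integer matmul runs on CPU):

```python
import torch

# Reference int8 GEMM: accumulate products in int32 to avoid overflow.
a = torch.randint(-128, 127, (64, 32), dtype=torch.int8)
b = torch.randint(-128, 127, (32, 64), dtype=torch.int8)
ref = a.to(torch.int32) @ b.to(torch.int32)   # (64, 64) int32 result
```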
huggingface/bench_cluster
Krishnateja244/Vanishing_Gradient
This repository helps in understanding the vanishing gradient problem through visualization
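A minimal version of the effect it visualizes: in a deep sigmoid MLP, per-layer gradient norms shrink toward the input, since each sigmoid contributes a derivative of at most 0.25.

```python
import torch
import torch.nn as nn

# 10 sigmoid blocks; gradient norms decay from the last layer to the first.
layers = [nn.Sequential(nn.Linear(32, 32), nn.Sigmoid()) for _ in range(10)]
net = nn.Sequential(*layers)
net(torch.randn(8, 32)).sum().backward()
for i, block in enumerate(net):
    g = block[0].weight.grad.norm().item()
    print(f"layer {i:2d} grad norm: {g:.2e}")  # norms shrink toward layer 0
```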