MARD1NO's Stars
google/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
Oneflow-Inc/diffusers
LetheSec/HuggingFace-Download-Accelerator
Use HuggingFace's official download tool for high-speed downloads from mirror sites.
ziqi-jin/movie-yourself
A system for making your own movies with large video generative AI models like Sora
jpli02/LandmarkConv
Efficient Convolutional Module for Semantic Understanding
QwenLM/Qwen1.5
Qwen1.5 is the improved version of Qwen, the large language model series developed by Qwen team, Alibaba Cloud.
Tele-AI/Telechat
allenai/OLMo
Modeling, training, eval, and inference code for OLMo
pytorch/cppdocs
PyTorch C++ API Documentation
NVIDIA/nvcomp
Repository for nvCOMP docs and examples. nvCOMP is a library for fast lossless compression/decompression on the GPU that can be downloaded from https://developer.nvidia.com/nvcomp.
pytorch-labs/float8_experimental
This repository contains the experimental PyTorch native float8 training UX
pprp/Vision-Mamba-CIFAR10
LeiWang1999/mlc-benchmark
LeiWang1999/cutlass
CLUEbenchmark/SuperCLUE-Math6
SuperCLUE-Math6: Exploring a new generation of native Chinese multi-turn, multi-step mathematical reasoning datasets
OpenNLPLab/lightning-attention
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
ColfaxResearch/cutlass-kernels
sustcsonglin/flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
horseee/DeepCache
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
AIoT-MLSys-Lab/Efficient-LLMs-Survey
[TMLR 2024] Efficient Large Language Models: A Survey
AXERA-TECH/pulsar2-docs-en
The docs repository for Pulsar2, AXera's 2nd-generation AI toolchain for SoCs such as AX650A, AX650N, AX630C, and AX620Q
openai/weak-to-strong
pytorch/torchsnapshot
A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind.
radarFudan/Awesome-state-space-models
Collection of papers on state-space models
hahnyuan/ASVD4LLM
Activation-aware Singular Value Decomposition for Compressing Large Language Models
sneaxiy/AAdiffTools
microsoft/superbenchmark
A validation and profiling tool for AI infrastructure
pytorch-labs/gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
Dao-AILab/causal-conv1d
Causal depthwise conv1d in CUDA, with a PyTorch interface
bobby-he/simplified_transformers