MARD1NO's Stars
QwenLM/qwen.cpp
C++ implementation of Qwen-LM
apple/ml-ferret
RulinShao/LightSeq
Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers
pytorch/torchdistx
Torch Distributed Experimental
Oneflow-Inc/faster-chatglm-6b
mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
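The core idea behind attention sinks can be sketched as a KV-cache eviction policy: always keep the first few "sink" tokens plus a sliding window of the most recent tokens. A minimal illustrative sketch (function name and default sizes are assumptions, not the repo's API):

```python
def attention_sink_keep(seq_len, n_sink=4, window=8):
    """Return the token positions kept in the KV cache under a toy
    attention-sink policy: the first n_sink tokens are always retained
    (they act as attention sinks), plus a sliding window of the most
    recent `window` tokens. Illustrative only; streaming-llm's real
    cache management is more involved."""
    if seq_len <= n_sink + window:
        return list(range(seq_len))
    recent = list(range(seq_len - window, seq_len))
    return list(range(n_sink)) + recent
```

With the defaults, cache size stays bounded at 12 entries no matter how long generation runs.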
SkunkworksAI/hydra-moe
dlsyscourse/hw2
AlibabaResearch/flash-llm
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
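The kernel family Flash-LLM optimizes can be illustrated with a plain CSR sparse matrix-vector product, the dense-compute bottleneck that unstructured sparsity replaces. A toy reference version (not the repo's CUDA implementation):

```python
import numpy as np

def sparse_matvec(values, cols, row_ptr, x):
    """Toy CSR sparse matrix-vector product: only nonzero weights are
    stored (values), with their column indices (cols) and per-row
    offsets (row_ptr). Illustrative of what an unstructured-sparsity
    inference kernel computes, not how Flash-LLM implements it."""
    y = np.zeros(len(row_ptr) - 1)
    for r in range(len(y)):
        for k in range(row_ptr[r], row_ptr[r + 1]):
            y[r] += values[k] * x[cols[k]]
    return y
```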
facebookresearch/fairseq2
FAIR Sequence Modeling Toolkit 2
krahets/hello-algo
"Hello Algo" (《Hello 算法》): a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. Simplified and Traditional Chinese editions are updated in sync; an English version is in progress.
huawei-noah/Efficient-Computing
Efficient computing methods developed by Huawei Noah's Ark Lab
Mellanox/nccl-rdma-sharp-plugins
RDMA and SHARP plugins for nccl library
mlc-ai/mlc-ai.github.io
Azure/msccl-executor-nccl
minitorch/quizzes
Class quizzes for minitorch and an auto-grader.
irfanICMLL/structure_knowledge_distillation
The official code for the paper "Structured Knowledge Distillation for Semantic Segmentation" (CVPR 2019 oral), with extensions to other tasks.
leptonai/leptonai
A Pythonic framework to simplify AI service building
punica-ai/punica
Serving multiple LoRA-finetuned LLMs as one
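The multi-LoRA serving idea can be sketched in a few lines: every request in a batch shares the base weight, while each row selects its own low-rank adapter. A NumPy sketch under assumed shapes (names are illustrative, not punica's API):

```python
import numpy as np

def batched_lora_forward(x, W, loras, ids):
    """Toy multi-LoRA batched forward: the base GEMM x @ W is shared
    across the batch, and each request j adds its own low-rank update
    x[j] @ A_i @ B_i, where adapter i is chosen per request via `ids`.
    Illustrative sketch only; punica fuses this into custom kernels."""
    base = x @ W                                     # shared base GEMM
    delta = np.stack([x[j] @ loras[i][0] @ loras[i][1]
                      for j, i in enumerate(ids)])   # per-request LoRA
    return base + delta
```

The point of the design is that the expensive base GEMM is computed once for the whole batch; only the cheap rank-r updates differ per request.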
ziplab/efficient-stable-diffusion
TIGER-AI-Lab/MAmmoTH
Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" (ICLR 2024)
THUDM/MathGLM
Official PyTorch implementation of MathGLM
FasterDecoding/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
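The verification step behind multi-head drafting can be sketched as: accept the longest prefix of the drafted tokens that matches what the base model would have produced greedily. A toy sketch (the callable stands in for a real model; not Medusa's actual interface):

```python
def verify_draft(draft, base_next_token, prefix):
    """Toy speculative-decoding verification: walk the drafted tokens
    and accept the longest prefix agreeing with the base model's greedy
    choice at each position. `base_next_token` maps a token sequence to
    its next token (a stand-in for a real LM). Illustrative only."""
    accepted = []
    ctx = list(prefix)
    for tok in draft:
        if base_next_token(ctx) != tok:
            break
        accepted.append(tok)
        ctx.append(tok)
    return accepted
```

Accepted tokens cost one base-model pass in total rather than one pass each, which is where the speedup comes from.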
getgridea/gridea
✍️ A static blog writing client
wjakob/nanobind
nanobind: tiny and efficient C++/Python bindings
baichuan-inc/Baichuan2
A series of large language models developed by Baichuan Intelligent Technology
bojone/bytepiece
A purer tokenizer with a higher compression rate
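Byte-level tokenization of the kind bytepiece targets can be illustrated with a greedy longest-match encoder over raw bytes; single bytes always encode, so no input ever fails. A toy sketch (bytepiece itself trains a unigram model rather than matching greedily):

```python
def greedy_byte_tokenize(data, vocab):
    """Toy byte-level tokenizer: greedily match the longest vocabulary
    piece at each position; fall back to the single byte when nothing
    matches, so any byte string can be encoded. Illustrative only --
    bytepiece uses a trained unigram model, not greedy matching."""
    tokens, i = [], 0
    max_len = max(len(v) for v in vocab)
    while i < len(data):
        for l in range(min(max_len, len(data) - i), 0, -1):
            piece = data[i:i + l]
            if piece in vocab or l == 1:
                tokens.append(piece)
                i += l
                break
    return tokens
```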
openppl-public/ppl.llm.kernel.cuda
softmax1/Flash-Attention-Softmax-N
CUDA and Triton implementations of Flash Attention with SoftmaxN.
yester31/Cutlass_EX
A study of NVIDIA CUTLASS