zyxie's Stars
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
scrapy/scrapy
Scrapy, a fast, high-level web crawling & scraping framework for Python.
livekit/python-sdks
LiveKit real-time and server SDKs for Python
google/gemma.cpp
Lightweight, standalone C++ inference engine for Google's Gemma models.
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
prometheus/prometheus
The Prometheus monitoring system and time series database.
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default & custom datasets for applications such as summarization and Q&A, and a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama 3 for WhatsApp & Messenger.
hidet-org/hidet
An open-source efficient deep learning framework/compiler, written in python.
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
huggingface/text-generation-inference
Large Language Model Text Generation Inference
FasterDecoding/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
google/aqt
ptillet/triton-llvm-releases
weaviate/weaviate
Weaviate is an open-source vector database that stores both objects and vectors, combining vector search and structured filtering with the fault tolerance and scalability of a cloud-native database.
karpathy/llama2.c
Inference Llama 2 in one file of pure C
ggerganov/ggml
Tensor library for machine learning
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
facebookincubator/AITemplate
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
ggerganov/llama.cpp
LLM inference in C/C++
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
twitter/the-algorithm
Source code for Twitter's Recommendation Algorithm
dmlc/xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Dask, Flink and DataFlow.
horovod/horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
apache/mxnet
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, JavaScript and more.