Pinned Repositories
tvm
Open deep learning compiler stack for CPU, GPU, and specialized accelerators
gloo
Collective communications library with various primitives for multi-machine training.
3D-ResNets-PyTorch
3D ResNets for Action Recognition (CVPR 2018)
AlfredWorkflow.com
A public collection of Alfred workflows.
alphatensor
FasterTransformer
Transformer-related optimization, including BERT and GPT
flash-attention
Fast and memory-efficient exact attention
ao
PyTorch native quantization and sparsity for training and inference
pytorch
Tensors and dynamic neural networks in Python with strong GPU acceleration
petrex's Repositories
petrex/Auto-GPT
An experimental open-source attempt to make GPT-4 fully autonomous.
petrex/FasterTransformer
Transformer-related optimization, including BERT and GPT
petrex/flash-attention
Fast and memory-efficient exact attention
petrex/pytorch
Tensors and dynamic neural networks in Python with strong GPU acceleration
petrex/triton
Development repository for the Triton language and compiler
petrex/tvm
Open deep learning compiler stack for CPU, GPU, and specialized accelerators
petrex/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
petrex/ao
torchao: PyTorch Architecture Optimization (AO). A repository to host AO techniques and performant kernels that work with PyTorch.
petrex/aotriton
Ahead of Time (AOT) Triton Math Library
petrex/ChatDev
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
petrex/composable_kernel
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
petrex/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
petrex/gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
petrex/gpt-researcher
GPT-based autonomous agent that does comprehensive online research on any given topic
petrex/human-eval
Code for the paper "Evaluating Large Language Models Trained on Code"
petrex/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
petrex/llama
Inference code for LLaMA models
petrex/llama-recipes
Examples and recipes for the Llama 2 model
petrex/llama.cpp
Port of Facebook's LLaMA model in C/C++
petrex/llm.c
LLM training in simple, raw C/CUDA
petrex/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
petrex/onnx
Open Neural Network Exchange
petrex/OpenChatKit
petrex/sglang
SGLang is a fast serving framework for large language models and vision language models.
petrex/superblock
A block-oriented training approach for inference-time optimization.
petrex/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
petrex/torchchat
Run PyTorch LLMs locally on servers, desktop and mobile
petrex/torchtitan
A native PyTorch library for large model training
petrex/torchtune
A native PyTorch library for LLM fine-tuning
petrex/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.