petrex

Accelerating Generative AI / LLM

Mountain View, California

Pinned Repositories

tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Language: Python 11.8k 376 3.4k 3.5k
gloo
Collective communications library with various primitives for multi-machine training.
Language: C++ 1.2k 61 117 303
3D-ResNets-PyTorch
3D ResNets for Action Recognition (CVPR 2018)
Language: Python 0 1 0 0
AlfredWorkflow.com
A public Collection of Alfred Workflows.
Language: Python 0 1 0 0
alphatensor
Language: Python 0 0 0 0
FasterTransformer
Transformer related optimization, including BERT, GPT
Language: C++ 0 0 0 0
flash-attention
Fast and memory-efficient exact attention
Language: Python 0 0 0 0
ao
PyTorch native quantization and sparsity for training and inference
Language: Python 1.6k 41 299 177
pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Language: Python 84.4k 1.7k 46.8k 22.7k

petrex's Repositories

petrex/Auto-GPT
An experimental open-source attempt to make GPT-4 fully autonomous.
Language: Python 0 0 0 0
petrex/FasterTransformer
Transformer related optimization, including BERT, GPT
Language: C++ 0 0 0 0
petrex/flash-attention
Fast and memory-efficient exact attention
Language: Python 0 0 0 0
petrex/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Language: Python 0 3 0 0
petrex/triton
Development repository for the Triton language and compiler
Language: C++ 0 0 0 0
petrex/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Language: Python 0 1 0 0
petrex/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python 0 0 0 0
petrex/ao
torchao: PyTorch Architecture Optimization (AO). A repository to host AO techniques and performant kernels that work with PyTorch.
Language: Python
petrex/aotriton
Ahead of Time (AOT) Triton Math Library
Language: Python
petrex/ChatDev
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
Language: Shell 0 0
petrex/composable_kernel
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
Language: C++ 0 0
petrex/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Language: Python 0 0
petrex/gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
Language: Python 0 0
petrex/gpt-researcher
GPT based autonomous agent that does online comprehensive research on any given topic
Language: Python 0 0
petrex/human-eval
Code for the paper "Evaluating Large Language Models Trained on Code"
Language: Python 0 0
petrex/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Language: Python 0 0
petrex/llama
Inference code for LLaMA models
Language: Python 0 0
petrex/llama-recipes
Examples and recipes for Llama 2 model
Language: Python 0 0
petrex/llama.cpp
Port of Facebook's LLaMA model in C/C++
petrex/llm.c
LLM training in simple, raw C/CUDA
petrex/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Language: Python 0 0
petrex/onnx
Open Neural Network Exchange
Language: Python 1 0
petrex/OpenChatKit
Language: Python 0 0
petrex/sglang
SGLang is a fast serving framework for large language models and vision language models.
petrex/superblock
A block oriented training approach for inference time optimization.
Language: Python
petrex/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Language: C++ 0 0
petrex/torchchat
Run PyTorch LLMs locally on servers, desktop and mobile
petrex/torchtitan
A native PyTorch Library for large model training
petrex/torchtune
A Native-PyTorch Library for LLM Fine-tuning
Language: Python 0 0
petrex/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Language: Python 0 0