inferentia
There are 10 repositories under the inferentia topic.
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
aphrodite-engine/aphrodite-engine
Large-scale LLM inference engine
aws-samples/foundation-model-benchmarking-tool
Foundation model benchmarking tool. Run any model on any AWS platform and benchmark performance across instance types and serving stack options.
aws-solutions-library-samples/guidance-for-machine-learning-inference-on-aws
This Guidance demonstrates how to deploy a machine learning inference architecture on Amazon Elastic Kubernetes Service (Amazon EKS). It addresses the basic implementation requirements, as well as ways to pack thousands of unique PyTorch deep learning (DL) models into a scalable architecture and evaluate their performance.
aws-samples/aws-inferentia-huggingface-workshop
CMP314: Optimizing NLP models with Amazon EC2 Inf1 instances in Amazon SageMaker
aws-samples/awsome-fmops
Collection of best practices, reference architectures, examples, and utilities for foundation model development and deployment on AWS.
daekeun-ml/aws-inferentia
This repository provides an easy, hands-on way to get started with AWS Inferentia. A demonstration of the hands-on labs is available in the AWS Innovate 2023 - AIML Edition session.
DarkSector/inf1-sentence-transformers
Sentence Transformers on EC2 Inf1
windson/inferentia-deployments
Deploy Large Models on AWS Inferentia (Inf2) instances.
yahavb/coldstart-recs-on-aws-trainium
End-to-end solution for cold-start recommendations using vLLM, DeepSeek Llama (8B & 70B), and FAISS on AWS Trainium (Trn1) with the Neuron SDK and NeuronX Distributed. Includes LLM-based interest expansion, embedding comparisons (T5 & SentenceTransformers), and scalable retrieval workflows.
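The retrieval step in a workflow like this can be sketched without the full stack. The example below is a minimal, hypothetical illustration that swaps FAISS for a brute-force NumPy cosine-similarity search; the function names and toy vectors are illustrative and not taken from the repository, and in the real pipeline the embeddings would come from T5 or SentenceTransformers encoders.

```python
import numpy as np

def build_index(embeddings: np.ndarray) -> np.ndarray:
    """Normalize embeddings so a dot product equals cosine similarity."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.clip(norms, 1e-12, None)

def search(index: np.ndarray, query: np.ndarray, k: int = 3) -> list:
    """Return indices of the k items most similar to the query."""
    q = query / max(np.linalg.norm(query), 1e-12)
    scores = index @ q
    return np.argsort(-scores)[:k].tolist()

# Toy catalog embeddings (stand-ins for encoder outputs).
items = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
index = build_index(items)

# Query embedding for an expanded user interest.
hits = search(index, np.array([1.0, 0.05, 0.0]), k=2)
print(hits)  # indices of the two closest items
```

A library like FAISS replaces the brute-force `index @ q` scan with approximate nearest-neighbor structures, which is what makes the retrieval scalable to large catalogs.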