bfontain

Databricks, Mountain View

Pinned Repositories

composer
Supercharge Your Model Training
Language: Python 1 0 0 0
flash-attention
Fast and memory-efficient exact attention
Language: Python 0 0
llm-foundry
LLM training code for MosaicML foundation models
Language: Python 0 0
neuronx-distributed
Language: Python 0 0 0 0
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python 0 0 0 0
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Language: C++ 9k 97 2.1k 1k
models
Models and examples built with TensorFlow
Language: Python 77.3k 2.7k 7.3k 45.7k
tensorflow
An Open Source Machine Learning Framework for Everyone
Language: C++ 187k 7.5k 40.1k 74.4k
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python 32.7k 272 5.7k 5k

bfontain's Repositories

bfontain/composer
Supercharge Your Model Training
Language: Python 1 0 0 0
bfontain/flash-attention
Fast and memory-efficient exact attention
Language: Python 0 0
bfontain/llm-foundry
LLM training code for MosaicML foundation models
Language: Python 0 0
bfontain/neuronx-distributed
Language: Python 0 0 0 0
bfontain/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python 0 0 0 0