jalola's Stars
codecrafters-io/build-your-own-x
Master programming by recreating your favorite technologies from scratch.
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
meta-llama/llama
Inference code for Llama models
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
karpathy/llm.c
LLM training in simple, raw C/CUDA
karpathy/llama2.c
Inference Llama 2 in one file of pure C
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
meta-llama/codellama
Inference code for CodeLlama models
microsoft/onnxruntime
ONNX Runtime: a cross-platform, high-performance ML inferencing and training accelerator
visenger/awesome-mlops
A curated list of references for MLOps
huggingface/text-generation-inference
Large Language Model Text Generation Inference
NVIDIA/TensorRT-LLM
TensorRT-LLM provides an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines containing state-of-the-art optimizations for efficient inference on NVIDIA GPUs. It also includes components for creating Python and C++ runtimes that execute those engines.
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
allegroai/clearml
ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
daquexian/onnx-simplifier
Simplify your ONNX model
karpathy/ng-video-lecture
facebookresearch/esm
Evolutionary Scale Modeling (esm): Pretrained language models for proteins
meta-llama/PurpleLlama
Set of tools to assess and improve LLM security.
GoogleCloudPlatform/ml-design-patterns
Source code accompanying O'Reilly book: Machine Learning Design Patterns
ELS-RD/transformer-deploy
Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
onnx/onnxmltools
ONNXMLTools enables conversion of models to ONNX
triton-inference-server/tensorrtllm_backend
The Triton TensorRT-LLM Backend
Azure/MS-AMP
Microsoft Automatic Mixed Precision Library
triton-inference-server/model_analyzer
Triton Model Analyzer is a CLI tool for understanding the compute and memory requirements of models served by Triton Inference Server.
Snowflake-Labs/sfquickstarts
Follow along with our tutorials to get you up and running with Snowflake.
microsoft/onnxruntime-extensions
A specialized pre- and post-processing library for ONNX Runtime
ramonhagenaars/jsons
🐍 A Python lib for (de)serializing Python objects to/from JSON
microsoft/DigiFace1M
premAI-io/benchmarks
🕹️ Performance comparison of MLOps engines, frameworks, and languages on mainstream AI models.
kamyu104/GoogleKickStart-2022
🏃 Python3 Solutions of All 32 Problems in GKS 2022