Lauler's Stars
stanfordnlp/dspy
DSPy: The framework for programming—not prompting—language models
Lightning-AI/litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
outlines-dev/outlines
Structured Text Generation
jasonppy/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
pytorch/torchtune
PyTorch native post-training library
mindee/doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
pytorch/torchtitan
A native PyTorch Library for large model training
unum-cloud/usearch
Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
unum-cloud/ucall
Web Serving and Remote Procedure Calls at 50x lower latency and 70x higher bandwidth than FastAPI, implementing JSON-RPC & REST over io_uring ☎️
xhluca/bm25s
Fast lexical search implementing BM25 in Python using NumPy, Numba and SciPy
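For reference, the Okapi BM25 scoring that bm25s implements can be sketched in plain Python (this is not the bm25s API; the k1 and b values below are just the common defaults):

```python
import math

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against query_terms with Okapi BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # document frequency of each query term
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for d in docs:
        s = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            tf = d.count(t)
            s += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores
```

Libraries like bm25s get their speed by precomputing these term statistics into sparse matrices rather than looping per document as above.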
huggingface/lighteval
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
oh-my-ocr/text_renderer
allenai/papermage
library supporting NLP and CV research on scientific papers
mlfoundations/open_lm
A repository for research on medium-sized language models.
huggingface/cosmopedia
NVIDIA/JAX-Toolbox
JAX-Toolbox
huggingface/diarizers
rwitten/HighPerfLLMs2024
ymy-k/Hi-SAM
[TPAMI'24] Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation
foundation-model-stack/fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash attention v2.
Kipok/NeMo-Skills
A pipeline to improve skills of large language models
ppaanngggg/layoutreader
A faster LayoutReader model based on LayoutLMv3 that sorts OCR bounding boxes into reading order.
zh460045050/VQGAN-LC
EleutherAI/improved-t5
Experiments toward training a new and improved T5
AnswerDotAI/bert24
siyan-zhao/prepacking
The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models"
h4pZ/h4rch
My Arch dotfiles
nikvaessen/w2v2-batch-size
Code for the paper "The effect of batch size on contrastive self-supervised speech representation learning"
swiss-ai/nanotron
Minimalistic large language model 3D-parallelism training
swerik-project/the-swedish-parliament-corpus
A repository for managing public, versioned releases of the Swedish Parliament Corpus.