Pinned Repositories
alpa
Training and serving large-scale neural networks with auto parallelization.
bairblog.github.io
BigLittleDecoder
[NeurIPS'23] Speculative Decoding with Big Little Decoder
ColossalAI
Making big AI models cheaper, easier, and more scalable
DiskANN
Graph-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search
dspy
DSPy: The framework for programming—not prompting—foundation models
Megatron-LM-Benchmarks
Benchmarks of NVIDIA's Megatron-LM
TinyAgent
TinyAgent: Function Calling at the Edge!
lotus
LOTUS: The semantic query engine - process data with LMs as easily as writing pandas code
TAG-Bench
TAG: Table-Augmented Generation
sidjha1's Repositories
sidjha1/Megatron-LM-Benchmarks
Benchmarks of NVIDIA's Megatron-LM
sidjha1/alpa
Training and serving large-scale neural networks with auto parallelization.
sidjha1/bairblog.github.io
sidjha1/BigLittleDecoder
[NeurIPS'23] Speculative Decoding with Big Little Decoder
sidjha1/ColossalAI
Making big AI models cheaper, easier, and more scalable
sidjha1/DiskANN
Graph-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search
sidjha1/dspy
DSPy: The framework for programming—not prompting—foundation models
sidjha1/lmql
A programming language for large language models.
sidjha1/TinyAgent
TinyAgent: Function Calling at the Edge!
sidjha1/NeuralDB
Database Reasoning Over Text project for ACL paper
sidjha1/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
sidjha1/SqueezeLLM
SqueezeLLM: Dense-and-Sparse Quantization
sidjha1/SqueezeLLM-gradients
sidjha1/tensorflow-alpa
sidjha1/zero_scrolls
Running inference on the ZeroSCROLLS benchmark