mmkamani7
Senior Research Scientist at AMD. Working on Efficient Gen AI, Federated Learning, Distributed Optimization, Model Compression, and Edge AI.
AMD · Bellevue, WA
mmkamani7's Stars
ggerganov/llama.cpp
LLM inference in C/C++
tinygrad/tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
karpathy/llama2.c
Inference Llama 2 in one file of pure C
apple/ml-stable-diffusion
Stable Diffusion with Core ML on Apple Silicon
ml-explore/mlx
MLX: An array framework for Apple silicon
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
huggingface/candle
Minimalist ML framework for Rust
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
state-spaces/mamba
Mamba SSM architecture
nlpxucan/WizardLM
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
Plachtaa/VALL-E-X
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io
abetlen/llama-cpp-python
Python bindings for llama.cpp
EleutherAI/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
TimDettmers/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
Lightning-AI/lit-gpt
Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
CarperAI/trlx
A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF)
mosaicml/llm-foundry
LLM training code for Databricks foundation models
microsoft/LMOps
General technology for enabling AI capabilities with LLMs and MLLMs
srush/Tensor-Puzzles
Solve puzzles. Improve your PyTorch.
huggingface/swift-coreml-diffusers
Swift app demonstrating Core ML Stable Diffusion
autodistill/autodistill
Images to inference with no labeling (use foundation models to train supervised models).
tomaarsen/attention_sinks
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
jxzhangjhu/Awesome-LLM-Uncertainty-Reliability-Robustness
Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models
codalab/codalab-competitions
CodaLab Competitions
m-bain/frozen-in-time
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
philipturner/metal-flash-attention
FlashAttention (Metal Port)
google/fedjax
FedJAX is a JAX-based open-source library for Federated Learning simulations that emphasizes ease of use in research.
menzHSE/cv-ml-lecture-notebooks
Computer Vision and Machine Learning Jupyter Notebooks for Educational Purposes