mmkamani7
Senior Research Scientist at AMD. Working on Efficient Gen AI, Federated Learning, Distributed Optimization, Model Compression, and Edge AI.
AMD · Bellevue, WA
mmkamani7's Stars
ggerganov/llama.cpp
LLM inference in C/C++
tinygrad/tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
karpathy/llama2.c
Inference Llama 2 in one file of pure C
apple/ml-stable-diffusion
Stable Diffusion with Core ML on Apple Silicon
ml-explore/mlx
MLX: An array framework for Apple silicon
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
huggingface/candle
Minimalist ML framework for Rust
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
state-spaces/mamba
Mamba SSM architecture
nlpxucan/WizardLM
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
Plachtaa/VALL-E-X
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io
abetlen/llama-cpp-python
Python bindings for llama.cpp
EleutherAI/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
TimDettmers/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
Lightning-AI/lit-gpt
Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
CarperAI/trlx
A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF)
mosaicml/llm-foundry
LLM training code for Databricks foundation models
microsoft/LMOps
General technology for enabling AI capabilities with LLMs and MLLMs
srush/Tensor-Puzzles
Solve puzzles. Improve your PyTorch.
huggingface/swift-coreml-diffusers
Swift app demonstrating Core ML Stable Diffusion
autodistill/autodistill
Images to inference with no labeling (use foundation models to train supervised models).
tomaarsen/attention_sinks
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
jxzhangjhu/Awesome-LLM-Uncertainty-Reliability-Robustness
Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models
codalab/codalab-competitions
CodaLab Competitions
m-bain/frozen-in-time
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
philipturner/metal-flash-attention
FlashAttention (Metal Port)
google/fedjax
FedJAX is a JAX-based open-source library for Federated Learning simulations that emphasizes ease of use in research.
menzHSE/cv-ml-lecture-notebooks
Computer Vision and Machine Learning Jupyter Notebooks for Educational Purposes