kashif
Principal Research Scientist working on Deep Learning, Time Series Forecasting, Reinforcement Learning and HPC.
Berlin, Germany
kashif's Stars
argmaxinc/WhisperKit
Swift native on-device speech recognition with Whisper for Apple Silicon
huggingface/text-embeddings-inference
A blazing fast inference solution for text embeddings models
amazon-science/chronos-forecasting
Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting
FinanceData/FinanceDataReader
Financial data reader
jmtomczak/intro_dgm
"Deep Generative Modeling": Introductory Examples
huggingface/nanotron
Minimalistic large language model 3D-parallelism training
stanfordnlp/pyreft
ReFT: Representation Finetuning for Language Models
uclaml/SPIN
The official implementation of Self-Play Fine-Tuning (SPIN)
Efficient-Large-Model/VILA
VILA - a multi-image visual language model with training, inference and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)
urchade/GLiNER
Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024
SalesforceAIResearch/uni2ts
Unified Training of Universal Time Series Forecasting Transformers
facebookresearch/generative-recommenders
Repository hosting code used to reproduce results in "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152, ICML'24).
xfactlab/orpo
Official repository for ORPO
allenai/reward-bench
RewardBench: the first evaluation tool for reward models.
louaaron/Score-Entropy-Discrete-Diffusion
[ICML 2024 Oral] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)
spcl/QuaRot
Code for QuaRot, end-to-end 4-bit inference for large language models.
AmeenAli/HiddenMambaAttn
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
SalesforceAIResearch/DiffusionDPO
Code for "Diffusion Model Alignment Using Direct Preference Optimization"
ESA-PhiLab/Major-TOM
Expandable Datasets for Earth Observation
felipemaiapolo/tinyBenchmarks
Evaluating LLMs with fewer examples
robertvacareanu/llm4regression
Examining how large language models (LLMs) perform across various synthetic regression tasks when given (input, output) examples in their context, without any parameter updates
apple/ml-4m
4M: Massively Multimodal Masked Modeling (NeurIPS 2023 Spotlight)
vwxyzjn/summarize_from_feedback_details
zhaoyu-li/DL4TP
A Survey on Deep Learning for Theorem Proving
google/codex
Data compression in JAX
Asap7772/understanding-rlhf
ZhaolinGao/REBEL
fbarez/Interpreting-Context-Look-ups
hohe12ly/lag-llama
Shawn-Guo-CN/Alignment_with_Huggingface