Pinned Repositories
algorithm-distillation-from-conversations
Algorithm Distillation + Pretraining Language Models with Human Preferences + Chat
attention_with_linear_biases
bitsandbytes
8-bit CUDA functions for PyTorch
ChatCombinedDatahandling
codeclippy_postprocessing
Based on https://github.com/huggingface/transformers/blob/main/examples/research_projects/codeparrot/scripts, but edited to do just one thing
Compact-Transformers
[Preprint] Escaping the Big Data Paradigm with Compact Transformers, 2021
grimoire-exploration
Diving into LLMs like they're a grimoire
llama
Inference code for LLaMA models
lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
pulse
PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models
dmahan93's Repositories
dmahan93/lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
dmahan93/llama
Inference code for LLaMA models
dmahan93/grimoire-exploration
Diving into LLMs like they're a grimoire
dmahan93/ChatCombinedDatahandling
dmahan93/pulse
PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models
dmahan93/algorithm-distillation-from-conversations
Algorithm Distillation + Pretraining Language Models with Human Preferences + Chat
dmahan93/attention_with_linear_biases
dmahan93/bitsandbytes
8-bit CUDA functions for PyTorch
dmahan93/codeclippy_postprocessing
Based on https://github.com/huggingface/transformers/blob/main/examples/research_projects/codeparrot/scripts, but edited to do just one thing
dmahan93/Compact-Transformers
[Preprint] Escaping the Big Data Paradigm with Compact Transformers, 2021
dmahan93/ELM
Evolution Through Large Models Implementation
dmahan93/mesh-transformer-jax
Model parallel transformers in JAX and Haiku
dmahan93/pyIesorPhysics
Python version of IeSOR
dmahan93/reward-modeling
dmahan93/taming-transformers
Taming Transformers for High-Resolution Image Synthesis
dmahan93/token-shift-gpt
Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing
dmahan93/decontamination
This repository contains code for cleaning your training data of benchmark data to help combat data snooping.
dmahan93/emoggoth
Generate your favorite emoji shoggoth!
dmahan93/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
dmahan93/QDSyntheticData
dmahan93/RGB
dmahan93/sft
dmahan93/tclx
A repository for transformer critique learning and generation
dmahan93/tinypar
dmahan93/toolformer
dmahan93/TransformerLengthExtension
dmahan93/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
dmahan93/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs