StellaAthena
Democratizing language models and understanding how they work
Booz Allen Hamilton, EleutherAI
Pinned Repositories
gpt-neo
An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
pythia
The hub for EleutherAI's work on interpretability and learning dynamics
the-pile
egnn-pytorch
Implementation of E(n)-Equivariant Graph Neural Networks, in Pytorch
fractal-ml
Fun stuff with fractal machine learning
gpt-neo
An implementation of model parallel GPT2 & GPT3-like models, with the ability to scale up to full GPT3 sizes (and possibly more!), using the mesh-tensorflow library.
OpenPrompt
An Open-Source Framework for Prompt-Learning.
starter-hugo-academic
transformer-memorization
StellaAthena's Repositories
StellaAthena/transformer-memorization
StellaAthena/OpenPrompt
An Open-Source Framework for Prompt-Learning.
StellaAthena/starter-hugo-academic
StellaAthena/lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
StellaAthena/mesh-transformer-jax
Model parallel transformers in JAX and Haiku
StellaAthena/RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.
StellaAthena/StellaAthena
GitHub README
StellaAthena/trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
StellaAthena/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
StellaAthena/BIG-bench
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
StellaAthena/city-circuits
StellaAthena/client
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
StellaAthena/DeeperSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
StellaAthena/eleuther.ai
StellaAthena/huggingface.js
Utilities to use the Hugging Face Hub API
StellaAthena/llama
Inference code for LLaMA models
StellaAthena/magma
MAGMA - a GPT-style multimodal model that can understand any combination of images and language
StellaAthena/metaseq
Repo for external large-scale work
StellaAthena/ML_SageMaker_Studies
Case studies, examples, and exercises for learning to deploy ML models using AWS SageMaker.
StellaAthena/moss.rb
A plagiarism detection engine based on Stanford's MOSS (Measure of Software Similarity)
StellaAthena/mtg
Collection of data science and machine learning projects for Magic: the Gathering
StellaAthena/point-transformer-pytorch
Implementation of the Point Transformer layer, in Pytorch
StellaAthena/promptsource
Toolkit for collecting and applying templates of prompting instances
StellaAthena/speak-memory
Code and data to support "Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4"
StellaAthena/t-zero
Reproduce results and replicate training for T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)
StellaAthena/taming-transformers
Taming Transformers for High-Resolution Image Synthesis
StellaAthena/TopographicVAE
Official implementation of the paper "Topographic VAEs learn Equivariant Capsules"
StellaAthena/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
StellaAthena/Why-Has-Predicting-Downstream-Capabilities-Remained-Elusive
Code for Preprint: Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?
StellaAthena/women-tech-speakers-organizers
A list of women tech speakers & organizers. Add yourself or others by submitting a PR! PS if you do add someone, make sure to tell them! :) #fempire