ilyalasy's Stars
karpathy/llm.c
LLM training in simple, raw C/CUDA
teacherpeterpan/Logic-LLM
The project page for "LOGIC-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning"
IBM/ModuleFormer
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models (MoLM) ranging in scale from 4 billion to 8 billion parameters.
EleutherAI/elk
Keeping language models honest by directly eliciting knowledge encoded in their activations.
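For intuition, here is a generic activation-probing sketch: fit a linear classifier on hidden states to recover a "truth" direction. This is the simpler supervised variant, not the repo's unsupervised CCS method, and the activations below are random stand-ins.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-ins for hidden states; in practice these come from a chosen transformer layer.
rng = np.random.default_rng(0)
acts_true = rng.normal(loc=+0.5, size=(200, 64))    # activations for "true" statements
acts_false = rng.normal(loc=-0.5, size=(200, 64))   # activations for "false" statements

X = np.vstack([acts_true, acts_false])
y = np.array([1] * 200 + [0] * 200)

probe = LogisticRegression(max_iter=1000).fit(X, y)  # a linear direction in activation space
print("probe accuracy:", probe.score(X, y))
```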
princeton-nlp/SWE-agent
[NeurIPS 2024] SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges.
microsoft/LMOps
General technology for enabling AI capabilities with LLMs and MLLMs
noamgat/lm-format-enforcer
Enforce the output format (JSON Schema, regex, etc.) of a language model
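A minimal, self-contained sketch of the constrained-decoding idea behind such enforcers (this is illustrative only, not this library's API): at each step, only continuations that keep the partial output extendable to a valid completion are allowed.

```python
# Toy format enforcer: real implementations work over tokenizer vocabularies
# and track a JSON Schema / regex automaton instead of a fixed list.
ALLOWED = ['{"answer": "yes"}', '{"answer": "no"}']

def allowed_next_chars(prefix: str) -> set[str]:
    """Characters that keep `prefix` a prefix of some valid full output."""
    return {s[len(prefix)] for s in ALLOWED
            if s.startswith(prefix) and len(s) > len(prefix)}

def constrained_greedy_decode(score) -> str:
    """`score(prefix, ch)` stands in for the model's next-token preference."""
    out = ""
    while True:
        options = allowed_next_chars(out)
        if not options:            # no legal continuation -> output is complete
            return out
        out += max(options, key=lambda ch: score(out, ch))

# Even a "model" that always prefers 'y' is forced to emit valid JSON.
print(constrained_greedy_decode(lambda prefix, ch: 1.0 if ch == "y" else 0.0))
```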
SakanaAI/evolutionary-model-merge
Official repository of Evolutionary Optimization of Model Merging Recipes
lucidrains/soft-moe-pytorch
Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch
IST-DASLab/qmoe
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
sergree/matchering
🎚️ Open Source Audio Matching and Mastering
mlfoundations/task_vectors
Editing Models with Task Arithmetic
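The core idea is small enough to sketch: a task vector is the element-wise difference between fine-tuned and pre-trained weights, and tasks can be added (or negated to "forget") by arithmetic on those vectors. A minimal PyTorch sketch over state dicts; the tensors and the scaling coefficient are illustrative.

```python
import torch

def task_vector(pretrained: dict, finetuned: dict) -> dict:
    """tau = theta_finetuned - theta_pretrained, per parameter tensor."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def apply_task_vectors(pretrained: dict, vectors: list, scale: float = 1.0) -> dict:
    """theta_new = theta_pretrained + scale * sum(tau_i); negate a tau to remove a task."""
    merged = {k: v.clone() for k, v in pretrained.items()}
    for tau in vectors:
        for k in merged:
            merged[k] += scale * tau[k]
    return merged

# Tiny illustration with made-up tensors standing in for real state dicts.
base = {"w": torch.zeros(2, 2)}
ft_a = {"w": torch.ones(2, 2)}
print(apply_task_vectors(base, [task_vector(base, ft_a)], scale=0.5)["w"])
```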
arcee-ai/mergekit
Tools for merging pretrained large language models.
pjlab-sys4nlp/llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
lucidrains/st-moe-pytorch
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch
ilyalasy/moe-routing
Analysis of token routing for different implementations of Mixture of Experts
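For context, a minimal top-2 router of the kind these analyses inspect; shapes and names are illustrative, not any specific repo's code.

```python
import torch
import torch.nn.functional as F

def top2_route(x: torch.Tensor, w_gate: torch.Tensor):
    """x: (tokens, d_model), w_gate: (d_model, n_experts).
    Returns the chosen expert indices and normalized routing weights per token."""
    logits = x @ w_gate                         # (tokens, n_experts)
    weights, experts = logits.topk(2, dim=-1)   # keep the 2 highest-scoring experts per token
    weights = F.softmax(weights, dim=-1)        # renormalize over the chosen experts
    return experts, weights

x = torch.randn(8, 16)          # 8 tokens with 16-dim hidden states
w_gate = torch.randn(16, 4)     # 4 experts
experts, weights = top2_route(x, w_gate)
print(experts)                  # which experts each token is dispatched to
```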
casper-hansen/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
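For intuition, a sketch of plain group-wise 4-bit round-to-nearest quantization. AWQ additionally rescales salient weight channels using activation statistics before quantizing; that step is omitted here, so this is only the baseline idea.

```python
import torch

def quantize_int4_groupwise(w: torch.Tensor, group_size: int = 128):
    """Symmetric round-to-nearest 4-bit quantization with one scale per group of columns.
    Returns integer codes in [-8, 7] plus the per-group scales needed to dequantize."""
    rows, cols = w.shape
    w = w.reshape(rows, cols // group_size, group_size)
    scales = (w.abs().amax(dim=-1, keepdim=True) / 7.0).clamp_min(1e-8)
    q = torch.clamp(torch.round(w / scales), -8, 7)
    return q.reshape(rows, cols), scales

def dequantize(q: torch.Tensor, scales: torch.Tensor, group_size: int = 128):
    rows, cols = q.shape
    q = q.reshape(rows, cols // group_size, group_size)
    return (q * scales).reshape(rows, cols)

w = torch.randn(4, 256)
q, s = quantize_int4_groupwise(w)
print((w - dequantize(q, s)).abs().mean())   # average quantization error
```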
koxudaxi/datamodel-code-generator
Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.
karpathy/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
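The BPE training loop itself fits in a few lines; a minimal sketch in the spirit of that repo, operating on raw byte ids (helper names are illustrative).

```python
from collections import Counter

def train_bpe(ids: list, num_merges: int):
    """Repeatedly merge the most frequent adjacent pair of ids into a new token id."""
    merges = {}                                  # (id, id) -> new id
    next_id = 256                                # byte values occupy 0..255
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        pair = pairs.most_common(1)[0][0]        # most frequent adjacent pair
        merges[pair] = next_id
        # replace every occurrence of the pair with the new token id
        out, i = [], 0
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
                out.append(next_id)
                i += 2
            else:
                out.append(ids[i])
                i += 1
        ids = out
        next_id += 1
    return ids, merges

ids, merges = train_bpe(list("low lower lowest".encode("utf-8")), num_merges=5)
print(merges)
```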
XueFuzhao/OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
THU-KEG/KEPLER
Source code for TACL paper "KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation".
facebookresearch/LAMA
LAnguage Model Analysis
hadasah/btm
deepseek-ai/DeepSeek-MoE
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
OFA-Sys/Ditto
A self-alignment method and benchmark for role-play. Resources for "Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment".
dojoteef/storium-gpt2
Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Loop Story Generation"
fabrahman/char-centric-story
Codebase for character-centric story understanding
BerriAI/litellm
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
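Typical usage is a single OpenAI-style call routed to any backend; a minimal sketch (the model string and credentials are placeholders, and exact response fields can vary by provider and version).

```python
import os
from litellm import completion

os.environ["OPENAI_API_KEY"] = "sk-..."   # provider credentials are read from env vars

response = completion(
    model="gpt-4o-mini",                  # swap for another provider's model string
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)   # responses follow the OpenAI schema
```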
ianarawjo/ChainForge
An open-source visual programming environment for battle-testing prompts to LLMs.
c32168/dyntamic
Generate pydantic models from JSON Schema
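The underlying idea can be sketched with pydantic's own `create_model`; this is a generic illustration of runtime model generation from a flat JSON Schema, not dyntamic's actual interface.

```python
from typing import Optional
from pydantic import create_model

# Map JSON Schema primitive types to Python types (subset, for illustration).
TYPE_MAP = {"string": str, "integer": int, "number": float, "boolean": bool}

def model_from_schema(schema: dict):
    """Build a pydantic model class at runtime from a flat JSON Schema object."""
    required = set(schema.get("required", []))
    fields = {}
    for name, prop in schema.get("properties", {}).items():
        py_type = TYPE_MAP[prop["type"]]
        if name in required:
            fields[name] = (py_type, ...)            # Ellipsis marks a required field
        else:
            fields[name] = (Optional[py_type], None)
    return create_model(schema.get("title", "DynamicModel"), **fields)

User = model_from_schema({
    "title": "User",
    "required": ["name"],
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
})
print(User(name="Ada", age=36))
```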