Heepo
Machine Learning | Large Language Models | NLP | Search | Recommendation
Beijing University of Posts and Telecommunications, Beijing
Heepo's Stars
simoninithomas/Deep_reinforcement_learning_Course
Implementations from the free course Deep Reinforcement Learning with TensorFlow and PyTorch
tencent-ailab/persona-hub
Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"
openai/simple-evals
karpathy/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
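The idea behind BPE tokenization is simple enough to sketch: repeatedly find the most frequent adjacent pair of token ids and replace it with a new id. A minimal illustration follows; the function name is illustrative, not minbpe's actual API.

```python
from collections import Counter

def bpe_train(text, num_merges):
    """Learn BPE merges over raw UTF-8 bytes (sketch of the core idea)."""
    ids = list(text.encode("utf-8"))
    merges = {}          # (a, b) -> new token id
    next_id = 256        # byte values occupy 0..255
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        pair = max(pairs, key=pairs.get)   # most frequent adjacent pair
        merges[pair] = next_id
        # replace every occurrence of the pair with the new id
        out, i = [], 0
        while i < len(ids):
            if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
                out.append(next_id)
                i += 2
            else:
                out.append(ids[i])
                i += 1
        ids = out
        next_id += 1
    return merges, ids
```

Decoding is the reverse walk: expand each learned id back into its pair until only bytes remain.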
westlake-repl/Recommendation-Systems-without-Explicit-ID-Features-A-Literature-Review
Paper List of Pre-trained Foundation Recommender Models
LargeWorldModel/LWM
labmlai/annotated_deep_learning_paper_implementations
🧑🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), GANs (CycleGAN, StyleGAN2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
allenai/fm-cheatsheet
Website for hosting the Open Foundation Models Cheat Sheet.
EleutherAI/cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
WLiK/LLM4Rec-Awesome-Papers
A list of awesome papers and resources of recommender system on large language model (LLM).
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
karpathy/llm.c
LLM training in simple, raw C/CUDA
SafeAILab/EAGLE
Official Implementation of EAGLE-1 and EAGLE-2
declare-lab/instruct-eval
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
qinyiwei/InfoBench
openai/transformer-debugger
huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
dora-rs/dora
DORA (Dataflow-Oriented Robotic Application) is middleware designed to streamline and simplify the creation of AI-based robotic applications. It offers low-latency, composable, and distributed dataflow capabilities. Applications are modeled as directed graphs, also referred to as pipelines.
OFA-Sys/InsTag
InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning
google/active-learning
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
NUS-HPC-AI-Lab/OpenDiT
OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference
databricks/dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
arcee-ai/mergekit
Tools for merging pretrained large language models.
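The simplest merge mergekit supports is a weighted linear average of parameter tensors. A plain-Python sketch of that idea, with parameters as lists of floats rather than the torch tensors the real tool operates on:

```python
def linear_merge(state_dicts, weights):
    """Weighted average of per-parameter values across models
    (sketch of the 'linear' merge method; not mergekit's API)."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights should sum to 1"
    merged = {}
    for name in state_dicts[0]:
        merged[name] = [
            sum(w * sd[name][i] for sd, w in zip(state_dicts, weights))
            for i in range(len(state_dicts[0][name]))
        ]
    return merged
```

More sophisticated methods in the toolkit (e.g. task arithmetic or TIES) additionally resolve sign conflicts and sparsify parameter deltas before combining them.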
RLHFlow/RLHF-Reward-Modeling
Recipes to train reward model for RLHF.
PKU-Alignment/beavertails
BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
allenai/reward-bench
RewardBench: the first evaluation tool for reward models.
ContextualAI/HALOs
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
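The DPO objective at the heart of such libraries is compact: penalize the model when the implicit reward margin between the chosen and rejected response shrinks. A minimal single-pair sketch (inputs are summed log-probabilities; hyperparameter `beta` scales the margin):

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair:
    -log sigmoid(beta * [(log pi_w - log ref_w) - (log pi_l - log ref_l)])."""
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    logits = beta * margin
    return -math.log(1.0 / (1.0 + math.exp(-logits)))  # -log sigmoid(logits)
```

With a zero margin the loss is log 2; as the policy's preference for the chosen response grows relative to the reference model, the loss falls toward zero. KTO, ORPO, and the other HALOs swap in different functions of this same implicit reward.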
abacusai/smaug
stanfordnlp/dspy
DSPy: The framework for programming—not prompting—foundation models