crazyofapple's Stars
Re-Align/just-eval
A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.
HITsz-TMG/UMOE-Scaling-Unified-Multimodal-LLMs
Code for the paper "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
miso-belica/jusText
Heuristic-based boilerplate removal tool
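For orientation, a minimal Python sketch of jusText's documented usage; the URL is a placeholder and requests is only used to fetch the page:

    import requests
    import justext

    # Fetch an HTML page (placeholder URL) and strip boilerplate such as
    # navigation, ads, and footers using jusText's heuristics.
    response = requests.get("https://example.com/some-article.html")
    paragraphs = justext.justext(response.content, justext.get_stoplist("English"))
    for paragraph in paragraphs:
        if not paragraph.is_boilerplate:
            print(paragraph.text)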
princeton-nlp/USACO
Can Language Models Solve Olympiad Programming?
XuezheMax/megalodon
Reference implementation of the Megalodon 7B model
HITsz-TMG/ICL-State-Vector
thunlp/UltraChat
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)
WJMacro/ContinualMT
A Continual Learning framework for Neural Machine Translation
tatsu-lab/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
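A hedged Python sketch of running alpaca_eval on a file of model generations; it assumes the package exposes evaluate at the top level and accepts a path to a JSON file with "instruction" and "output" fields, as the project docs describe, and the file path here is a placeholder:

    # Assumes `pip install alpaca-eval` and OPENAI_API_KEY set for the GPT-based annotator.
    from alpaca_eval import evaluate  # assumed top-level export, per the project docs

    evaluate(
        model_outputs="outputs.json",          # placeholder: your model's generations
        annotators_config="alpaca_eval_gpt4",  # built-in annotator config name
    )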
voidism/DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
stanfordnlp/pyreft
ReFT: Representation Finetuning for Language Models
Hritikbansal/dove
stanfordnlp/string2string
String-to-String Algorithms for Natural Language Processing
hiyouga/LLaMA-Factory
Unified, efficient fine-tuning of 100+ LLMs
likenneth/othello_world
Emergent world representations: Exploring a sequence model trained on a synthetic task
tatsu-lab/alpaca_farm
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
HKUNLP/icl-ceil
[ICML 2023] Code for our paper “Compositional Exemplars for In-context Learning”.
xai-org/grok-1
Grok open release
jihoontack/MAC
Online Adaptation of Language Models with a Memory of Amortized Contexts
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
deeplearning-wisc/args
IBM/ModuleFormer
ModuleFormer is a MoE-based architecture with two types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based language models (MoLM) ranging from 4 billion to 8 billion parameters.
UIC-Liu-Lab/CPT
[EMNLP 2022] Continual Training of Language Models for Few-Shot Learning
thunlp/ELLE
allenai/OLMo
Modeling, training, eval, and inference code for OLMo
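A hedged sketch of running inference on the released OLMo checkpoint through Hugging Face transformers rather than the repository's own training code; it assumes the published allenai/OLMo-7B model ID and a transformers version that can load it (older versions needed the companion hf_olmo package):

    # Not the OLMo training/eval entry point; just a quick way to run inference
    # on the released checkpoint via Hugging Face transformers (assumption: the
    # installed transformers version supports OLMo or trust_remote_code is allowed).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-7B", trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B", trust_remote_code=True)

    inputs = tokenizer("Language modeling is ", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))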
nathanhu0/CaMeLS
Codebase for Context-aware Meta-learned Loss Scaling (CaMeLS). https://arxiv.org/abs/2305.15076.
gmftbyGMFTBY/Rep-Dropout
[NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective
joeljang/continual-knowledge-learning
[ICLR 2022] Towards Continual Knowledge Learning of Language Models
EnnengYang/Awesome-Forgetting-in-Deep-Learning
A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning. arXiv:2307.09218.