intfloat's Stars
meta-llama/llama
Inference code for Llama models
THUDM/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
meta-llama/codellama
Inference code for CodeLlama models
Stability-AI/StableLM
StableLM: Stability AI Language Models
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods on single- or multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A, plus a number of candidate inference solutions (e.g., HF TGI, vLLM) for local or cloud deployment. Includes demo apps showcasing Meta Llama for WhatsApp & Messenger.
openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
QwenLM/Qwen
The official repo of Qwen (通义千问), the chat & pretrained large language model proposed by Alibaba Cloud.
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
facebookresearch/ImageBind
ImageBind: One Embedding Space to Bind Them All
openlm-research/open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
CarperAI/trlx
A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF)
mosaicml/llm-foundry
LLM training code for Databricks foundation models
mlfoundations/open_flamingo
An open-source framework for training large multimodal models.
THUDM/AgentBench
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
openai/prm800k
800,000 step-level correctness labels on LLM solutions to MATH problems
THUDM/WebGLM
WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)
guyulongcs/Awesome-Deep-Learning-Papers-for-Search-Recommendation-Advertising
Awesome Deep Learning papers for industrial Search, Recommendation, and Advertising, covering Embedding, Matching, Ranking (CTR/CVR prediction), Post-Ranking, Large Models (Generative Recommendation, LLMs), Transfer Learning, Reinforcement Learning, and more.
CStanKonrad/long_llama
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
allenai/mmc4
MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.
Victorwz/LongMem
Official implementation of our NeurIPS 2023 paper "Augmenting Language Models with Long-Term Memory".
jzbjyb/FLARE
Forward-Looking Active REtrieval-augmented generation (FLARE)
facebookresearch/Sphere
Web-scale retrieval for knowledge-intensive NLP
princeton-nlp/ALCE
[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627
LAION-AI/Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on a massive set of synthetic instructions to perform many millions of tasks
huggingface/OBELICS
Code used for the creation of OBELICS, an open, massive and curated collection of interleaved image-text web documents, containing 141M documents, 115B text tokens and 353M images.
thu-coai/PICL
Code for the ACL 2023 paper: Pre-Training to Learn in Context