If the sudden arrival of LLMs has left you feeling down, consider reading "Choose Your Weapon: Survival Strategies for Depressed AI Academics" in the repo root. The following content is continuously updated; Star to keep updated~
- Open-source LLMs
- Instruction-tuning and RLHF data, plus training frameworks
- Prompt and LLM papers organized by sub-direction
- AIGC applications
- Prompt guides and tutorials
- Commentary on ChatGPT and AGI
- Demystifying Prompts series 1. Tuning-free prompts: GPT2 & GPT3 & LAMA & AutoPrompt
- Demystifying Prompts series 2. Frozen prompts, fine-tuned LM: T5 & PET & LM-BFF
- Demystifying Prompts series 3. Frozen LM, fine-tuned prompts: Prefix-tuning & Prompt-tuning & P-tuning
- Demystifying Prompts series 4. Upgraded instruction tuning: Flan/T0/InstructGPT/TKInstruct
- Demystifying Prompts series 5. APE + SELF = a code implementation of automated instruction-set construction
- Demystifying Prompts series 6. LoRA instruction tuning in fine detail; stay calm, one hour really is not enough~
- Demystifying Prompts series 7. Preference alignment with RLHF: comparing OpenAI, DeepMind and Anthropic
- Demystifying Prompts series 8. Supporting ultra-long inputs without training: knowledge base & Unlimiformer & PCW & NBCE
- Demystifying Prompts series 9. Complex reasoning: chain-of-thought basics and advanced tricks
- ChatGPT application 1. MakeInstruction: building instruction samples with zero human annotation
- ChatGPT application 2. A simple reproduction of ChatPDF
- List of commercially usable LLMs
- CMU's open-source chatbot evaluation app: ChatGPT > Vicuna > others; training in dialogue settings seems to matter
- Berkeley's LLM arena leaderboard, with a quasi-Chinese sub-leaderboard: GPT-4 is, unsurprisingly, firmly in first place; GPT-4 > Claude > GPT-3.5 > Vicuna > others
- Z-Bench Chinese evaluation by ZhenFund: coding usability of domestic Chinese models is still relatively low, the models are roughly on par with each other, and both ChatGLM releases show clear improvement
- Open LLM Leaderboard: a leaderboard of LLMs evaluated on four EleutherAI evaluation sets
- Chain-of-thought evaluation: leaderboards for complex problems such as GSM8k and MATH
Model link | Model description |
---|---|
Google Bard | Google's Bard arrived late but it arrived; you can now join the waitlist |
Claude | Claude, ChatGPT's biggest competitor, is also open for applications; unlimited trial inside Slack |
Falcon | Falcon from the UAE's Technology Innovation Institute, trained on 1 trillion very high-quality tokens; 1B, 7B and 40B versions are open-sourced and free for commercial use! The deep-pocketed sponsors apparently consider money a small-minded concern |
LLaMA | Meta's open-source foundation LLM, ranging from 7B to 65B parameters |
MPT | MosaicML's open-source pretrained + instruction-tuned model, commercially usable, supports ultra-long inputs of up to 84k tokens |
RedPajama | The RedPajama project first open-sourced its pretraining data and then released 3B and 7B pretrained + instruction-tuned models |
ChatLLaMA | LLaMA fine-tuned with RLHF |
Alpaca | Stanford's open-source model, obtained by fine-tuning 7B LLaMA on 52k instruction samples |
Alpaca-lora | LLaMA fine-tuned with LoRA |
Dromedary | IBM self-aligned model with the LLaMA base |
Vicuna | Open-sourced by former Alpaca members and others: LLaMA-13B instruction-tuned on ShareGPT data; the project also proposed using GPT-4 to judge model quality |
koala | LLaMA fine-tuned on open instruction sets such as Alpaca and HC3 plus ChatGPT data such as ShareGPT; ranks high on leaderboards |
ColossalChat | HPC-AI Tech's open-source LLaMA + RLHF fine-tune |
MiniGPT4 | Vicuna + BLIP2 for text-vision fusion |
StackLLama | LLaMA trained with Stack Exchange data + SFT + RL |
Cerebras | Cerebras open-sourced seven models from 0.1B to 13B, with everything from pretraining data to weights released |
PaLM-E | Google's multimodal large model: the 540B PaLM language model combined with a 22B ViT vision model yields the 562B PaLM-E, a new breakthrough for robotics applications |
Dolly-v2 | Commercially usable 7B instruction-tuned open-source model, fine-tuned from a Pythia base (the original Dolly was fine-tuned from GPT-J-6B) |
OpenChatKit | From the Together team: a GPT-NeoX-20B fine-tune plus a 6B moderation model for filtering |
MetaLM | Microsoft's open-source large-scale self-supervised pretrained model |
Amazon Titan | Amazon's own large model offered on AWS |
OPT-IML | Meta's replication of GPT-3, up to 175B, though quality falls short of GPT-3 |
Bloom | From BigScience, the largest at 176B |
BloomZ | From BigScience, fine-tuned from Bloom |
Galactica | Similar in scale to Bloom, but trained specifically for the scientific domain |
T0 | From BigScience, 3B~11B models instruction-tuned from T5 |
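
Most of the open checkpoints above load through the Hugging Face transformers library with the same handful of lines. A minimal sketch follows, assuming a causal-LM checkpoint; the model name below is a placeholder, not a real repository:

```python
# Minimal sketch: load an open-source causal LM and run instruction-style generation.
# The checkpoint name is a placeholder; swap in whichever model you are licensed to use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "your-org/your-open-llm-7b"  # placeholder, not a real checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

prompt = "Instruction: Briefly explain what instruction tuning is.\nResponse:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```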
Model link | Model description |
---|---|
ChatGLM | Tsinghua's open-source Chinese-English bilingual dialogue model, trained with code data, instruction tuning and RLHF. A 130B model the same size as GLM below is still in development. Tried it and it exceeded expectations! |
Moss | Restoring Fudan's good name! Open-sourced all pretraining and instruction-tuning data and models; commercially usable |
Wombat-7B | DAMO Academy's open-source model aligned with RRHF instead of reinforcement learning, Alpaca base |
TigerBot | TigerBot open-sourced 7B and 180B models along with pretraining and fine-tuning corpora |
Chinese-LLaMA-Alpaca | LLaMA instruction-tuned for Chinese, from Harbin Institute of Technology |
Luotuo | LLaMA (and ChatGLM) instruction-tuned for Chinese |
文心一言 | ERNIE Bot; got an invitation code and tried it. Persona quality is noticeably weaker, but the results are not bad at all, domestic pride! The commercial terms do contain quite a few one-sided clauses though |
通义千问 | Alibaba's LLM (Tongyi Qianwen), open for applications |
星火 | iFLYTEK Spark; genuinely strong at math |
Aquila | BAAI's open-source 7B large model, free for commercial use |
Baichuan | Baichuan Intelligence's open-source 7B large model, free for commercial use |
BiLLa | Three-stage training: LLaMA vocabulary expansion plus continued pretraining, SFT mixing pretraining and task data 1:1, then SFT on instruction samples |
Phoenix | CUHK's open-source Phoenix and Chimera LLMs, Bloom base, 40+ languages supported |
OpenBuddy | Multilingual dialogue fine-tune of LLaMA |
Guanaco | LLaMA-7B base, fine-tuned on the 52K Alpaca data plus 534K multilingual instruction samples |
ziya | IDEA Research continued pretraining on 7B/13B LLaMA, followed by SFT + RM + PPO + HFTT + COHFT + RBRS |
Chinese-Vicuna | LLaMA-7B base, trained with BELLE + Guanaco data |
Linly | LLaMA-7B base, trained on seven instruction-tuning datasets including BELLE, Guanaco, pCLUE, Firefly, CSL and News Commentary |
Firefly | A 2.6B Chinese model aimed at improving Chinese writing and classical Chinese; full training code still to be released, only the model is available for now |
Baize | LLaMA fine-tuned on 100k self-chat dialogue samples |
BELLE | Chinese optimization of open-source models using ChatGPT-generated data |
Chatyuan | The earliest domestic open-source dialogue model after ChatGPT appeared; T5 architecture, a derivative of PromptCLUE below |
PromptCLUE | Multi-task prompt language model |
PLUG | Large model released by Alibaba DAMO Academy; a download link is provided after you submit an application |
CPM2.0 | CPM 2.0 released by BAAI |
GLM | Tsinghua's Chinese-English bilingual 130B pretrained model |
Model link | Model description |
---|---|
MedPalm | Built by Google from Flan-PaLM via prompt/instruction tuning on multiple types of medical QA data; also introduces the MultiMedQA benchmark |
ChatDoctor | Instruction-tuned on 110K real doctor-patient dialogue samples plus 5K ChatGPT-generated samples |
Huatuo Med-ChatGLM | Chinese medical instruction data built from a medical knowledge graph plus ChatGPT, and multi-turn QA data built from medical literature plus ChatGPT |
Chinese-vicuna-med | Chinese-Vicuna fine-tuned on the cMedQA2 data |
OpenBioMed | Tsinghua AIR's open-source lightweight BioMedGPT: a knowledge graph plus a multimodal pretrained model covering 20+ biomedical research areas |
DoctorGLM | GLM fine-tuned on ChatDoctor + MedDialog + CMD multi-turn dialogues and single-turn instruction samples |
MedicalGPT-zh | QA generated by ChatGPT from a self-built medical database, plus scenario dialogues constructed with SELF across 16 settings |
PMC-LLaMA | LLaMA fine-tuned on medical papers |
NHS-LLM | Fine-tuned on ChatGPT-generated medical QA and dialogues |
LawGPT-zh | 52k single-turn QA obtained by cleaning the CrimeKgAssitant dataset with ChatGPT, plus scenario QA generated with ChatGPT from roughly 9k core articles of PRC law, plus knowledge QA pairs built by ChatGPT from legal texts |
LawGPT | LLaMA base + vocabulary expansion + continued pretraining + QA instruction tuning built from legal provisions |
Lawyer Llama | Instruction-tuned on a legal instruction dataset: consultations + legal exams + dialogues |
LexiLaw | Instruction-tuned on a legal instruction dataset: QA + explanations of book concepts + statute content |
FinChat.io | Trained on up-to-date financial data, earnings-call transcripts, quarterly and annual reports, investment books and more |
OpenGPT | A framework for generating domain-LLM instruction samples and fine-tuning |
乾元BigBang金融2亿模型 | 0.2B financial model: financial-domain pretraining plus task fine-tuning |
度小满千亿金融大模型 | Du Xiaoman's 100B-scale financial model: financial + Chinese pretraining and fine-tuning on top of Bloom-176B |
bondGPT | GPT-4 applied to the bond-market niche; applications open |
IndexGPT | JPMorgan's generative investment advisor, still in development |
Tool description | Link |
---|---|
langchain: LLM toolkit | https://github.com/hwchase17/langchain
BMTTools: from Tsinghua, similar to langchain | https://github.com/OpenBMB/BMTools
BabyAGI: self-executing LLM agent | https://github.com/yoheinakajima/babyagi
AutoGPT: self-executing LLM agent | https://github.com/Torantulino/Auto-GPT
Jarvis: framework in which a large model orchestrates small models; giving small models a future! | https://github.com/search?q=jarvis
LLM-ToolMaker: lets LLMs build their own tools | https://github.com/FMInference/FlexGen
Gorilla: an LLM that calls a large number of APIs | https://github.com/ShishirPatil/gorilla
wenda: Wenda, integrates search with smaller models for knowledge injection | https://github.com/l15y/wenda
WorkGPT: similar to AutoGPT | https://github.com/team-openpm/workgpt
Deep-KE: knowledge extraction by intelligently parsing data with LLMs | https://github.com/zjunlp/DeepKE
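
BabyAGI, AutoGPT and the other agent tools above all revolve around one loop: the LLM plans, picks a tool, observes the result, and repeats. Below is a minimal conceptual sketch of that loop, not the API of any repository listed here; `call_llm` and the tool registry are hypothetical stand-ins:

```python
# Conceptual sketch of a self-executing LLM agent loop (BabyAGI / AutoGPT style).
# call_llm() is a hypothetical helper standing in for any chat-completion API.
from typing import Callable, Dict

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API here")

TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda q: f"(search results for: {q})",
    "calculator": lambda expr: str(eval(expr)),  # demo only, eval is unsafe
}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = f"Goal: {goal}\n"
    for _ in range(max_steps):
        decision = call_llm(
            history + "Reply as 'TOOL: <name> | INPUT: <input>' or 'FINISH: <answer>'."
        )
        if decision.startswith("FINISH:"):
            return decision[len("FINISH:"):].strip()
        # parse "TOOL: <name> | INPUT: <input>" into its two fields
        name, tool_input = [p.split(":", 1)[1].strip() for p in decision.split("|")]
        observation = TOOLS.get(name, lambda x: "unknown tool")(tool_input)
        history += f"{decision}\nObservation: {observation}\n"
    return "Stopped after max_steps without a final answer."
```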
- OpenAI Cookbook: usage examples for OpenAI models ⭐
- Workaround for the blocked OpenAI API: set up a proxy on Tencent Cloud; personally tested, works very well and is easy even for the less hands-on
- PromptPerfect: fight magic with magic; feed in a raw prompt and the model optimizes it for a target model. After trying it I went a little quiet. It can target different prompt-consuming models such as Diffusion, ChatGPT and DALL·E
- ClickPrompt: a prompt-boosting tool that generates instructions for Diffusion, ChatGPT and more; requires an OpenAI key
- ChatGPT ShortCut: prompt examples for all kinds of scenarios, very comprehensive; give it a like after using it! ⭐
- Full ChatGPT Prompts + Resources: prompt examples for various scenarios, different from the ones above
- learning Prompt: a very comprehensive prompt-engineering tutorial and collection of practical applications, including many advanced scenarios where LLMs drive agents ⭐
- The art of asking chatgpt for high quality answers: how to write prompts, now in book form; the link is the Chinese translation and it leans toward basic usage
- Prompt-Engineer-Guide: an aggregated tutorial in the same vein as learning Prompt (the two even cite each other); its topic index is better organized ⭐
- OpenAI application guide: a purely application-oriented roundup
- AI navigation: a site aggregating applications including but not limited to ChatGPT; updated quickly, I discovered a few new gems there
- AI Alignment Forum: a forum discussing the latest papers and viewpoints on alignment topics such as RLHF
- cognosys: the most popular web-based AutoGPT; how to put it, after trying it I nearly laughed my jaw off. No spoilers, try it and you'll see
- godmode: an AutoGPT that needs human interaction at every step
- agentgpt: a basic AutoGPT
- New Bing: requires access from outside the mainland, otherwise it redirects to the domestic Bing; waitlist application required ⭐
- Perplexity.ai: also needs a VPN. A ChatGPT-powered search engine that arguably does the job better than Bing, adding related recommendations and follow-up questions on top ⭐
- BingGPT: open-source desktop client for New Bing; chat history can be exported
- DocsGPT: a general recipe for turning ChatGPT's open-domain QA into closed-domain QA; suited to vertical-domain QA scenarios and to building customized chatbots ⭐
- langchain-ChatGLM: local knowledge-base QA built on ChatGLM, similar to DocsGPT above but locally deployable (see the retrieval-QA sketch after this list) ⭐
- ChatPDF: the domestic ChatPDF; after you upload a PDF it suggests the top 5 likely questions about the article, then answers and retrieves from the document conversationally. 30,000 characters read in 10 seconds
- ChatDoc: an upgraded ChatPDF that adds table parsing plus proper citation indexing with jump-to-source and highlighting of the corresponding passage. Ha, I'm tempted to build one myself
- ChatPaper: automatically downloads the latest arXiv papers for a given keyword and summarizes them; can be tried on Hugging Face!
- OpenRead: aimed at paper writing and reading; helps generate literature reviews and offers NotionAI-style smart Markdown for writing
- researchgpt: similar to ChatPDF; supports arXiv paper download and conversational extraction of a paper's key points after loading
- BriefGPT: daily arXiv updates with paper summaries and keyword extraction to help researchers keep up with the latest work; nice UI, by the way
- ChatGPT-academic: yet another gradio-based bundle of paper polishing, summarization and related features
- feishu-chatgpt: ChatGPT for Feishu; like 365 Copilot it integrates many components, quite complete!
- ChatMind: generates mind maps with ChatGPT; fine for general topics, but for a specific book it just makes things up. Combining it with retrieval-based reading feels like it would shine~
- Shell: a ChatGPT-based AI English chat tool, a spoken-practice assistant
- AI Topiah: Lingxin AI character chat; exchanged a few lines with Luffy, some chuunibyou spirit is definitely still burning in there
- chatbase: emotional role-play chat, haven't tried it yet
- Vana: virtual DNA, build a virtual you through chatting! Flashy concept
- WriteSonic: AI writing that supports both dialogue and targeted creation such as ad copy and product descriptions; web search support is the highlight, Chinese supported
- copy.ai: WriteSonic competitor; the highlight is that every sentence carries a source link like an academic citation, and everything can be copied into the Markdown editor on the right with one click. Super handy! ⭐
- NotionAI: smart Markdown; tried it, and it lives up to the hype! While writing you call the AI with a command to polish, expand, retrieve content or suggest ideas
- Jasper: same as above, they are all competitors, ha
- copy.down: Chinese marketing copy generation, targeted creation only, supports keyword-to-copy generation
- ChatExcel: control Excel calculations with instructions; a bit redundant if you already know Excel, somewhat useful if you don't
- ChatPPT: building PPTs with ChatGPT
- BibiGPT: one-click summarization of Bilibili video content, multimodal documents
- Microsoft 365 Copilot: Microsoft Office fully integrated with GPT-4 for smart PPT, Excel and Word; no link yet. Essentially the family-bundle version of the open-source ideas above
- Google Workspace: Google's office suite with AI services across the board; no way to try it yet.
- Copilot: paid, mind you
- Fauxpilot: local open-source alternative to Copilot
- CodeGeeX: a domestic alternative, haven't tried it yet
- Codeium: Copilot alternative with a free tier and plugins for many editors
- Wolverine: a Python script that lets code debug itself
- dreamstudio.ai: the pioneer, Stable Diffusion, comes with a trial quota
- midjourney: the pioneer, artistic styles above all
- Dall.E: and that completes the big three
- ControlNet: adds controllability to image generation
- GFPGAN: photo restoration
- Visual ChatGPT: Microsoft's image-enabled ChatGPT; image generation, editing and QA through dialogue ⭐
- gemo.ai: multimodal chatbot covering text, image and video generation
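
The document-QA tools above (DocsGPT, langchain-ChatGLM, ChatPDF and friends) share one recipe: chunk the document, embed the chunks, retrieve the most similar ones for a question, and stuff them into the prompt. A rough sketch under those assumptions; `embed` and `call_llm` are hypothetical placeholders rather than any specific project's API:

```python
# Rough sketch of retrieval-augmented document QA (DocsGPT / langchain-ChatGLM style).
# embed() and call_llm() are hypothetical placeholders for an embedding model and an LLM.
import numpy as np

def embed(text: str) -> np.ndarray:
    raise NotImplementedError("plug in a sentence-embedding model here")

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API here")

def chunk(doc: str, size: int = 500) -> list[str]:
    return [doc[i:i + size] for i in range(0, len(doc), size)]

def answer(doc: str, question: str, top_k: int = 3) -> str:
    chunks = chunk(doc)
    chunk_vecs = np.stack([embed(c) for c in chunks])
    q_vec = embed(question)
    # cosine similarity between the question and every chunk
    sims = chunk_vecs @ q_vec / (np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q_vec))
    context = "\n".join(chunks[i] for i in np.argsort(-sims)[:top_k])
    return call_llm(f"Answer using only the context below.\nContext:\n{context}\nQuestion: {question}")
```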
- OpenAI ChatGPT Intro
- OpenAI InstructGPT intro
- AllenAI's analysis of ChatGPT's abilities: How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources ⭐
- Hugging Face's analysis of ChatGPT's abilities: The techniques behind ChatGPT: RLHF, IFT, CoT, Red teaming, and more
- Stephen Wolfram's analysis: What Is ChatGPT Doing and Why Does It Work?
- Roundup of ChatGPT commentary and analysis
- MIT Technology Review interview with OpenAI engineers
- AGI: history and current state
- Zhang Junlin, The Road to AGI: Essentials of Large Language Model (LLM) Technology
- Zhihu answer: What technical optimizations or breakthroughs come with OpenAI's release of GPT-4?
- The difficulties of catching up with ChatGPT, and the available substitutes
- Compression is generalization, generalization is intelligence
- Transcript of Lu Qi's latest talk: My Worldview on Large Models | Episode 14
- https://github.com/dongguanting/In-Context-Learning_PaperList
- https://github.com/thunlp/PromptPapers
- https://github.com/Timothyxxx/Chain-of-ThoughtsPapers
- A Survey of Large Language Models
- Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing ⭐
- Paradigm Shift in Natural Language Processing
- Pre-Trained Models: Past, Present and Future
- What Language Model Architecture and Pretraining Objective Works Best for Zero-Shot Generalization? ⭐
- Towards Reasoning in Large Language Models: A Survey
- Reasoning with Language Model Prompting: A Survey ⭐
- An Overview on Language Models: Recent Developments and Outlook ⭐
- LARGER LANGUAGE MODELS DO IN-CONTEXT LEARNING DIFFERENTLY
- Evidence of Meaning in Language Models Trained on Programs
- Sparks of Artificial General Intelligence: Early experiments with GPT-4
- How does in-context learning work? A framework for understanding the differences from traditional supervised learning
- Why can GPT learn in-context? Language Model Secretly Perform Gradient Descent as Meta-Optimizers ⭐
- Emergent Abilities of Large Language Models ⭐
- Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? ⭐
- Can Explanations Be Useful for Calibrating Black Box Models
- IS CHATGPT A GENERAL-PURPOSE NATURAL LANGUAGE PROCESSING TASK SOLVER?
- Can Large Language Models Infer Causation from Correlation?
- Holistic Evaluation of Language Models
- Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
- GPT2: Language Models are Unsupervised Multitask Learners
- GPT3: Language Models are Few-Shot Learners ⭐
- LAMA: Language Models as Knowledge Bases?
- AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts
- T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
- PET-TC(a): Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference ⭐
- PET-TC(b): PET-SuperGLUE, It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
- GenPET: Few-Shot Text Generation with Natural Language Instructions
- LM-BFF: Making Pre-trained Language Models Better Few-shot Learners ⭐
- ADEPT: Improving and Simplifying Pattern Exploiting Training
- Prefix-tuning: Optimizing continuous prompts for generation
- Prompt-tuning: The power of scale for parameter-efficient prompt tuning ⭐
- P-tuning: GPT Understands, Too ⭐
- WARP: Word-level Adversarial ReProgramming
- P-tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
- PTR: Prompt Tuning with Rules for Text Classification
- PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains
- LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS ⭐
- LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning
- Parameter-Efficient Transfer Learning for NLP
- INTRINSIC DIMENSIONALITY EXPLAINS THE EFFECTIVENESS OF LANGUAGE MODEL FINE-TUNING
- GLM-130B: AN OPEN BILINGUAL PRE-TRAINED MODEL
- LLaMA: Open and Efficient Foundation Language Models
- PaLM: Scaling Language Modeling with Pathways
- PaLM 2 Technical Report
- GPT-4 Technical Report
- Flan: FINETUNED LANGUAGE MODELS ARE ZERO-SHOT LEARNERS ⭐
- Flan-T5: Scaling Instruction-Finetuned Language Models
- Instruct-GPT: Training language models to follow instructions with human feedback ⭐
- T0: MULTITASK PROMPTED TRAINING ENABLES ZERO-SHOT TASK GENERALIZATION
- Natural Instructions: Cross-Task Generalization via Natural Language Crowdsourcing Instructions
- Tk-INSTRUCT: SUPER-NATURALINSTRUCTIONS: Generalization via Declarative Instructions on 1600+ NLP Tasks
- Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
- INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models
- LaMDA: Language Models for Dialog Applications
- Sparrow: Improving alignment of dialogue agents via targeted human judgements ⭐
- BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage
- How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
- Basics & advanced usage (a minimal CoT prompt sketch follows this topic's paper list)
- [zero-shot-COT] Large Language Models are Zero-Shot Reasoners ⭐
- [few-shot COT] Chain of Thought Prompting Elicits Reasoning in Large Language Models ⭐
- SELF-CONSISTENCY IMPROVES CHAIN OF THOUGHT REASONING IN LANGUAGE MODELS
- LEAST-TO-MOST PROMPTING ENABLES COMPLEX REASONING IN LARGE LANGUAGE MODELS ⭐
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models
- Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models
- Domain-specific CoT
- Solving Quantitative Reasoning Problems with Language Models
- COMPLEXITY-BASED PROMPTING FOR MULTI-STEP REASONING
- Solving math word problems with process- and outcome-based feedback
- CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
- T-SciQ: Teaching Multimodal Chain-of-Thought Reasoning via Large Language Model Signals for Science Question Answering
- LEARNING PERFORMANCE-IMPROVING CODE EDITS
- Analysis of how and why CoT works
- Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters ⭐
- TEXT AND PATTERNS: FOR EFFECTIVE CHAIN OF THOUGHT IT TAKES TWO TO TANGO
- Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective
- Distilling CoT into smaller models
- Specializing Smaller Language Models towards Multi-Step Reasoning ⭐
- Teaching Small Language Models to Reason
- Large Language Models are Reasoning Teachers
- Distilling Reasoning Capabilities into Smaller Language Models
- others
- OlaGPT: Empowering LLMs With Human-like Problem-Solving Abilities
- Decomposed Prompting: A Modular Approach for Solving Complex Tasks
- Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions
- Challenging BIG-Bench tasks and whether chain-of-thought can solve them
- SHOW YOUR WORK: SCRATCHPADS FOR INTERMEDIATE COMPUTATION WITH LANGUAGE MODELS
- STaR: Self-Taught Reasoner, Bootstrapping Reasoning With Reasoning
- AUTOMATIC CHAIN OF THOUGHT PROMPTING IN LARGE LANGUAGE MODELS
- Large Language Models Can Self-Improve
- Active Prompting with Chain-of-Thought for Large Language Models
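
For reference, the zero-shot and few-shot chain-of-thought papers at the top of this list boil down to prompt templates along these lines (the arithmetic examples are made up for illustration):

```python
# Sketch of the two basic chain-of-thought prompt formats from the papers above.

# Zero-shot CoT (Kojima et al.): append a reasoning trigger to the question.
zero_shot_cot = (
    "Q: A farmer has 15 apples, gives away 6, then buys 4 more. How many now?\n"
    "A: Let's think step by step."
)

# Few-shot CoT (Wei et al.): show worked reasoning in the demonstrations.
few_shot_cot = (
    "Q: Tom has 3 boxes with 4 pens each. How many pens?\n"
    "A: Each box has 4 pens and there are 3 boxes, so 3 * 4 = 12. The answer is 12.\n\n"
    "Q: A farmer has 15 apples, gives away 6, then buys 4 more. How many now?\n"
    "A:"
)
```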
- DeepMind
- Teaching language models to support answers with verified quotes
- Sparrow: Improving alignment of dialogue agents via targeted human judgements ⭐
- OpenAI
- PPO: Proximal Policy Optimization Algorithms ⭐ (see the objective sketch after this paper list)
- Deep Reinforcement Learning from Human Preferences
- Fine-Tuning Language Models from Human Preferences
- Learning to summarize from human feedback
- InstructGPT: Training language models to follow instructions with human feedback ⭐
- Scaling Laws for Reward Model Overoptimization ⭐
- Anthropic
- A General Language Assistant as a Laboratory for Alignment
- Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors and Lessons Learned
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback ⭐
- Constitutional AI: Harmlessness from AI Feedback ⭐
- Pretraining Language Models with Human Preferences
- AllenAI, RL4LM: Is Reinforcement Learning (Not) for Natural Language Processing? Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
- RRHF: Rank Responses to Align Language Models with Human Feedback without tears
- PRM: Let's Verify Step by Step
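
As a quick reference for the shared machinery behind these RLHF papers, here is the textbook form of the KL-penalized reward and the PPO clipped objective; the notation is generic rather than any single paper's exact formulation (β is the KL coefficient, ε the clipping range):

```latex
% Reward used during RLHF fine-tuning: reward-model score minus a KL penalty to the SFT policy
r(x, y) = r_{\mathrm{RM}}(x, y) - \beta \,\log\frac{\pi_{\theta}(y \mid x)}{\pi_{\mathrm{SFT}}(y \mid x)}

% PPO clipped surrogate objective used to optimize the policy
L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\!\left[\min\left(r_t(\theta)\,\hat{A}_t,\;
  \mathrm{clip}\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t\right)\right],
\qquad r_t(\theta) = \frac{\pi_{\theta}(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
```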
- Toolformer: Language Models Can Teach Themselves to Use Tools ⭐
- MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning
- ReAct: SYNERGIZING REASONING AND ACTING IN LANGUAGE MODELS ⭐
- Self-ask: MEASURING AND NARROWING THE COMPOSITIONALITY GAP IN LANGUAGE MODELS ⭐
- PAL: Program-aided Language Models
- HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
- OpenAGI: When LLM Meets Domain Experts
- Tool Learning with Foundation Models
- Tool Maker: Large Language Models as Tool Maker
- Gorilla: Large Language Model Connected with Massive APIs ⭐
- Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
- ART: Automatic multi-step reasoning and tool-use for large language models
- Generated Knowledge Prompting for Commonsense Reasoning
- Evaluating Verifiability in Generative Search Engines
- Mind2Web: Towards a Generalist Agent for the Web
- ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models
- REPLUG: Retrieval-Augmented Black-Box Language Models
- TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs
- WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences ⭐
- WebGPT: Browser-assisted question-answering with human feedback
- APE: LARGE LANGUAGE MODELS ARE HUMAN-LEVEL PROMPT ENGINEERS ⭐
- SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions ⭐ (see the bootstrap-loop sketch after this list)
- iPrompt: Explaining Data Patterns in Natural Language via Interpretable Autoprompting
- Flipped Learning: Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
- Fairness-guided Few-shot Prompting for Large Language Models
- Instruction induction: From few examples to natural language task descriptions.
- Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
- SELF-QA: Unsupervised Knowledge Guided Language Model Alignment
- GPT Self-Supervision for a Better Data Annotator
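
The instruction-construction papers above (APE, SELF-INSTRUCT, Unnatural Instructions and others) mostly iterate the same bootstrap: sample a few instructions from a seed pool, ask the model to write new ones, filter duplicates, and grow the pool. A hedged sketch of that loop; `call_llm` is a hypothetical placeholder, and the duplicate filter is a crude stand-in for the ROUGE-based filtering used in SELF-INSTRUCT:

```python
# Hedged sketch of a SELF-INSTRUCT style bootstrap loop.
# call_llm() is a hypothetical placeholder; the dedup rule is a crude stand-in
# for the ROUGE-based filtering used in the paper.
import random

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API here")

def bootstrap_instructions(seed_pool: list[str], rounds: int = 10) -> list[str]:
    pool = list(seed_pool)
    for _ in range(rounds):
        demos = "\n".join(f"- {ins}" for ins in random.sample(pool, k=min(4, len(pool))))
        raw = call_llm(
            "Here are some task instructions:\n" + demos +
            "\nWrite 5 new, diverse task instructions, one per line."
        )
        for line in raw.splitlines():
            cand = line.strip("- ").strip()
            # keep only non-trivial candidates that are not exact duplicates of the pool
            if len(cand) > 10 and all(cand.lower() != p.lower() for p in pool):
                pool.append(cand)
    return pool
```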
- BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
- Galactica: A Large Language Model for Science
- PubMed GPT: A Domain-specific large language model for biomedical text ⭐
- BloombergGPT: A Large Language Model for Finance
- ChatDoctor: Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge
- Med-PaLM: Large Language Models Encode Clinical Knowledge [V1, V2] ⭐
- Augmented Large Language Models with Parametric Knowledge Guiding
- XuanYuan 2.0: A Large Chinese Financial Chat Model with Hundreds of Billions Parameters
- Parallel Context Windows for Large Language Models
- Structured Prompting: Scaling In-Context Learning to 1,000 Examples
- Su Jianlin, NBCE: Extending LLM Context Length with Naive Bayes ⭐ (see the derivation sketch after this list)
- Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
- Unlimiformer: Long-Range Transformers with Unlimited Length Input
- Scaling Transformer to 1M tokens and beyond with RMT
- RECURRENTGPT: Interactive Generation of (Arbitrarily) Long Text
- TRAIN SHORT, TEST LONG: ATTENTION WITH LINEAR BIASES ENABLES INPUT LENGTH EXTRAPOLATION ⭐
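
For the NBCE entry above, the core idea is naive Bayes over context windows: assume the chunks S_1,…,S_n are conditionally independent given the continuation T, and the long-context posterior then factorizes into per-window predictions the model can already compute. A sketch of that basic derivation (the blog post layers pooling and tuning refinements on top):

```latex
% Assume the context chunks S_1,\dots,S_n are conditionally independent given the target T.
P(T \mid S_1,\dots,S_n) \propto P(S_1,\dots,S_n \mid T)\,P(T)
  = P(T)\prod_{i=1}^{n} P(S_i \mid T)
  \propto \frac{\prod_{i=1}^{n} P(T \mid S_i)}{P(T)^{\,n-1}}

% In log space: combine n short-context predictions and subtract the unconditional prior.
\log P(T \mid S_1,\dots,S_n) = \sum_{i=1}^{n} \log P(T \mid S_i) - (n-1)\log P(T) + \text{const}
```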
- Less but higher-quality data brings qualitative gains
- LIMA: Less Is More for Alignment ⭐
- Maybe Only 0.5% Data is Needed: A Preliminary Exploration of Low Training Data Instruction Tuning
- Textbooks Are All You Need
- Others
- BELLE: Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases
- Baize: Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
- A Comparative Study between Full-Parameter and LoRA-based Fine-Tuning on Chinese Instruction Data for Large LM
- Exploring ChatGPT’s Ability to Rank Content: A Preliminary Study on Consistency with Human Preferences
- Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation
- Trusting Your Evidence: Hallucinate Less with Context-aware Decoding ⭐
- SELF-REFINE: ITERATIVE REFINEMENT WITH SELF-FEEDBACK ⭐
- PROMPTING GPT-3 TO BE RELIABLE
- Enhancing Self-Consistency and Performance of Pre-Trained Language Models through Natural Language Inference
- On the Advance of Making Language Models Better Reasoners
- Progressive-Hint Prompting Improves Reasoning in Large Language Models
- ASK ME ANYTHING: A SIMPLE STRATEGY FOR PROMPTING LANGUAGE MODELS ⭐
- Calibrate Before Use: Improving Few-Shot Performance of Language Models
- In-Context Instruction Learning
- LEARNING PERFORMANCE-IMPROVING CODE EDITS
- Boosting Theory-of-Mind Performance in Large Language Models via Prompting
- InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
- Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
- PaLM-E: An Embodied Multimodal Language Model