Collection of papers and related works for Large Language Models (ChatGPT, GPT-3, Codex etc.).
This repository is contributed by the following contributors.
- Organizers: Guilin Qi (漆桂林), Xiaofang Qi (戚晓芳)
- Paper Collectors: Zafar Ali, Sheng Bi (毕胜), Yongrui Chen (陈永锐), Zizhuo Chen (陈孜卓), Xinbang Dai (戴鑫邦), Huan Gao (高桓), Shilong Hu (胡世龙), Jiaqi Li (李嘉琦), Dehai Min (闵德海), Yiming Tan (谭亦鸣), Tongtong Wu (吴桐桐), Songlin Zhai (翟松林), Yuxin Zhang (张裕欣)
- Maintainers: Runzhe Wang (王润哲), Shenyu Zhang (张沈昱)
The automation script of this repo is powered by Auto-Bibfile.
- [Overview] -- Homepage
- -- Summary
- -- Author
- -- Techniques
- -- Published Time
- -- Published Venue
- A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning,
Hallucination, and Interactivity,
by Yejin Bang, Samuel Cahyawijaya, Nayeon Lee, Wenliang Dai, Dan Su, Bryan Wilie, Holy Lovenia, Ziwei Ji et al.本文提出了一个使用公开数据集定量评估交互式LLM(如ChatGPT)的框架。我们使用涵盖8个不同的常见NLP应用任务的21个数据集对ChatGPT进行了广泛的技术评估。我们基于这些数据集和一个新设计的多模态数据集评估了ChatGPT的多任务、多语言和多模态方面。
- Is ChatGPT a General-Purpose Natural Language Processing Task Solver?,
by Qin, Chengwei, Zhang, Aston, Zhang, Zhuosheng, Chen, Jiaao, Yasunaga, Michihiro and Yang, Diyi - Holistic Evaluation of Language Models,
by Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan et al. - Evaluating the Text-to-SQL Capabilities of Large Language Models,
by Nitarshan Rajkumar, Raymond Li and Dzmitry Bahdanau - Are Visual-Linguistic Models Commonsense Knowledge Bases?,
by Hsiu-Yu Yang and Carina Silberer - Evaluating Large Language Models Trained on Code,
by Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Pond'e de Oliveira Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda et al. - GLGE: A New General Language Generation Evaluation Benchmark,
by Dayiheng Liu, Yu Yan, Yeyun Gong, Weizhen Qi, Hang Zhang, Jian Jiao, Weizhu Chen, Jie Fu et al. - Evaluating Pre-Trained Models for User Feedback Analysis in Software
Engineering: A Study on Classification of App-Reviews,
by Mohammad Abdul Hadi and Fatemeh H. Fard - Evaluation of Text Generation: A Survey,
by Asli Celikyilmaz, Elizabeth Clark and Jianfeng Gao - Neural Language Generation: Formulation, Methods, and Evaluation,
by Cristina Garbacea and Qiaozhu Mei - BERTScore: Evaluating Text Generation with BERT,
by Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger and Yoav Artzi
- A Survey for In-context Learning,
by Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu et al. - Explanation Selection Using Unlabeled Data for In-Context Learning,
by Xi Ye and Greg Durrett - In-Context Learning with Many Demonstration Examples,
by Mukai Li, Shansan Gong, Jiangtao Feng, Yiheng Xu, Jun Zhang, Zhiyong Wu and Lingpeng Kong - Meta-learning via Language Model In-context Tuning,
by Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis and He He - MetaICL: Learning to Learn In Context,
by Sewon Min, Mike Lewis, Luke Zettlemoyer and Hannaneh Hajishirzi - Selective Annotation Makes Language Models Better Few-Shot Learners,
by Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf et al. - Improving In-Context Few-Shot Learning via Self-Supervised Training,
by Mingda Chen, Jingfei Du, Ramakanth Pasunuru, Todor Mihaylov, Srini Iyer, Veselin Stoyanov and Zornitsa Kozareva - Instruction Induction: From Few Examples to Natural Language Task
Descriptions,
by Or Honovich, Uri Shaham, Samuel R. Bowman and Omer Levy - Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot
Prompt Order Sensitivity,
by Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel and Pontus Stenetorp - What Makes Good In-Context Examples for GPT-3?,
by Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin and Weizhu Chen - Learning To Retrieve Prompts for In-Context Learning,
by Ohad Rubin, Jonathan Herzig and Jonathan Berant - Active Example Selection for In-Context Learning,
by Yiming Zhang, Shi Feng and Chenhao Tan - Self-Generated In-Context Learning: Leveraging Auto-regressive Language
Models as a Demonstration Generator,
by Hyuhng Joon Kim, Hyunsoo Cho, Junyeob Kim, Taeuk Kim, Kang Min Yoo and Sang-goo Lee - Measuring Convergence Inertia: Online Learning in Self-adaptive Systems
with Context Shifts,
by Elvin Alberts and Ilias Gerostathopoulos - An Explanation of In-context Learning as Implicit Bayesian Inference,
by Sang Michael Xie, Aditi Raghunathan, Percy Liang and Tengyu Ma - Rethinking the Role of Demonstrations: What Makes In-Context Learning
Work?,
by Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi and Luke Zettlemoyer - The Impact of Symbolic Representations on In-context Learning for
Few-shot Reasoning,
by Hanlin Zhang, Yi-Fan Zhang, Li Erran Li and Eric P. Xing - Response Generation with Context-Aware Prompt Learning,
by Xiaodong Gu, Kang Min Yoo and Sang-Woo Lee
- Large Language Models Can Be Easily Distracted by Irrelevant Context,
by Freda Shi, Xinyun Chen, Kanishka Misra, Nathan Scales, David Dohan, Ed H. Chi, Nathanael Sch"arli and Denny Zhou - Finetuned Language Models are Zero-Shot Learners,
by Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai et al. - LaMDA: Language Models for Dialog Applications,
by Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos et al. - Scaling Instruction-Finetuned Language Models,
by Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang et al. - Super-NaturalInstructions: Generalization via Declarative Instructions
on 1600+ NLP Tasks,
by Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Atharva Naik, Arjun Ashok, Arut Selvan Dhanasekaran et al. - Self-Instruct: Aligning Language Model with Self Generated Instructions,
by Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi and Hannaneh Hajishirzi
- Knowledge-enhanced Neural Machine Reasoning: A Review,
by Tanmoy Chowdhury, Chen Ling, Xuchao Zhang, Xujiang Zhao, Guangji Bai, Jian Pei, Haifeng Chen and Liang Zhao - Deep Bidirectional Language-Knowledge Graph Pretraining,
by Michihiro Yasunaga, Antoine Bosselut, Hongyu Ren, Xikun Zhang, Christopher D. Manning, Percy Liang and Jure LeskovecTODO: Update URL when formally published
- A Survey on Knowledge-Enhanced Pre-trained Language Models,
by Chaoqi Zhen, Yanlei Shang, Xiangyu Liu, Yifei Li, Yong Chen and Dell ZhangTODO: Update URL when formally published
- Review of Knowledge-Enhanced Pre-trained Language Models,
by Yi, HAN, Linbo, QIAO, Dongsheng, LI and Xiangke, LIAO - Mind the Knowledge Gap: A Survey of Knowledge-enhanced Dialogue
Systems,
by Sagi Shaier, Lawrence Hunter and Katharina Kann - Knowledge-based Review Generation by Coherence Enhanced Text Planning,
by Junyi Li, Wayne Xin Zhao, Zhicheng Wei, Nicholas Jing Yuan and Ji-Rong Wen - A Three-Stage Learning Framework for Low-Resource Knowledge-Grounded
Dialogue Generation,
by Shilei Liu, Xiaofeng Zhao, Bochao Li, Feiliang Ren, Longhui Zhang and Shujuan Yin - Ask what's missing and what's useful: Improving Clarification Question
Generation using Global Knowledge,
by Bodhisattwa Prasad Majumder, Sudha Rao, Michel Galley and Julian J. McAuley - ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language
Understanding and Generation,
by Yu Sun, Shuohuan Wang, Shikun Feng, Siyu Ding, Chao Pang, Junyuan Shang, Jiaxiang Liu, Xuyi Chen et al. - KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation,
by Wenhu Chen, Yu Su, Xifeng Yan and William Yang Wang - A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation,
by Jian Guan, Fei Huang, Minlie Huang, Zhihao Zhao and Xiaoyan Zhu - Knowledge-Enhanced Personalized Review Generation with Capsule Graph
Neural Network,
by Junyi Li, Siqing Li, Wayne Xin Zhao, Gaole He, Zhicheng Wei, Nicholas Jing Yuan and Ji-Rong Wen - MEGATRON-CNTRL: Controllable Story Generation with External Knowledge
Using Large-Scale Language Models,
by Peng Xu, Mostofa Patwary, Mohammad Shoeybi, Raul Puri, Pascale Fung, Anima Anandkumar and Bryan Catanzaro - Zero-shot Learning with Semantic Output Codes,
by Mark Palatucci, Dean Pomerleau, Geoffrey E. Hinton and Tom M. Mitchell
- Compressing Pre-trained Models of Code into 3 MB,
by Jieke Shi, Zhou Yang, Bowen Xu, Hong Jin Kang and David Lo - Preparing lessons: Improve knowledge distillation with better supervision, [Code]
by Tiancheng Wen, Shenqi Lai and Xueming Qian - Distilling Knowledge Learned in BERT for Text Generation,
by Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu and Jingjing Liu - Improved Knowledge Distillation via Teacher Assistant, [Code]
by Seyed-Iman Mirzadeh, Mehrdad Farajtabar, Ang Li, Nir Levine, Akihiro Matsukawa and Hassan Ghasemzadeh - Regularizing Class-Wise Predictions via Self-Knowledge Distillation, [Code]
by Sukmin Yun, Jongjin Park, Kimin Lee and Jinwoo Shin - Relational Knowledge Distillation, [Code]
by Wonpyo Park, Dongju Kim, Yan Lu and Minsu Cho - Revisit Knowledge Distillation: a Teacher-free Framework, [Code]
by Li Yuan, Francis E. H. Tay, Guilin Li, Tao Wang and Jiashi Feng - Knowledge Distillation via Route Constrained Optimization, [Code]
by Xiao Jin, Baoyun Peng, Yichao Wu, Yu Liu, Jiaheng Liu, Ding Liang, Junjie Yan and Xiaolin Hu - Improving Generalization and Robustness with Noisy Collaboration in
Knowledge Distillation, [Code]
by Elahe Arani, Fahad Sarfraz and Bahram Zonooz - Distilling Task-Specific Knowledge from BERT into Simple Neural
Networks, [Code]
by Raphael Tang, Yao Lu, Linqing Liu, Lili Mou, Olga Vechtomova and Jimmy Lin - The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, [Code]
by Jonathan Frankle and Michael Carbin - Born-Again Neural Networks, [Code]
by Tommaso Furlanello, Zachary Chase Lipton, Michael Tschannen, Laurent Itti and Anima Anandkumar - Paying More Attention to Attention: Improving the Performance of Convolutional
Neural Networks via Attention Transfer, [Code]
by Sergey Zagoruyko and Nikos Komodakis - Mean teachers are better role models: Weight-averaged consistency
targets improve semi-supervised deep learning results, [Code]
by Antti Tarvainen and Harri Valpola - Deep Mutual Learning, [Code]
by Ying Zhang, Tao Xiang, Timothy M. Hospedales and Huchuan Lu - Deep Model Compression: Distilling Knowledge from Noisy Teachers, [Code]
by Bharat Bhusan Sau and Vineeth N. Balasubramanian - Distilling the Knowledge in a Neural Network, [Code]
by Geoffrey E. Hinton, Oriol Vinyals and Jeffrey Dean
- Crawling the Internal Knowledge-Base of Language Models,
by Roi Cohen, Mor Geva, Jonathan Berant and Amir Globerson本文提出一种从语言模型中提取结构化知识图谱的方法;使用专门设计的提示来控制提取过程中的精度和召回率;在GPT-3上进行了评估,显示了高精确度的结果。 TODO: Update URL when formally published
- Generative Knowledge Graph Construction: A Review,
by Hongbin Ye, Ningyu Zhang, Hui Chen and Huajun Chen
- ThoughtSource: A central hub for large language model reasoning
data,
by Simon Ott, Konstantin Hebenstreit, Valentin Li'evin, Christoffer Egeberg Hother, Milad Moradi, Maximilian Mayrhauser, Robert Praas, Ole Winther et al. - Knowledge-enhanced Neural Machine Reasoning: A Review,
by Tanmoy Chowdhury, Chen Ling, Xuchao Zhang, Xujiang Zhao, Guangji Bai, Jian Pei, Haifeng Chen and Liang Zhao - Large Language Models are Versatile Decomposers: Decompose Evidence
and Questions for Table-based Reasoning,
by Yunhu Ye, Binyuan Hui, Min Yang, Binhua Li, Fei Huang and Yongbin Li - Specializing Smaller Language Models towards Multi-Step Reasoning,
by Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal and Tushar Khot - Improved logical reasoning of language models via differentiable symbolic programming,
by Zhang, Hanlin, Li, Ziyang, Huang, Jiani, Naik, Mayur and Xing, Eric - LILA: A Unified Benchmark for Mathematical Reasoning,
by Swaroop Mishra, Matthew Finlayson, Pan Lu, Leonard Tang, Sean Welleck, Chitta Baral, Tanmay Rajpurohit, Oyvind Tafjord et al. - Maieutic Prompting: Logically Consistent Reasoning with Recursive
Explanations,
by Jaehun Jung, Lianhui Qin, Sean Welleck, Faeze Brahman, Chandra Bhagavatula, Ronan Le Bras and Yejin Choi - MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text
Generation,
by Swarnadeep Saha, Xinyan Velocity Yu, Mohit Bansal, Ramakanth Pasunuru and Asli Celikyilmaz - Program of Thoughts Prompting: Disentangling Computation from Reasoning
for Numerical Reasoning Tasks,
by Wenhu Chen, Xueguang Ma, Xinyi Wang and William W. Cohen - The Impact of Symbolic Representations on In-context Learning for
Few-shot Reasoning,
by Hanlin Zhang, Yi-Fan Zhang, Li Erran Li and Eric P. Xing - Towards Reasoning in Large Language Models: A Survey,
by Jie Huang and Kevin Chen-Chuan Chang - UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical
Expression,
by Jiaqi Chen, Tong Li, Jinghui Qin, Pan Lu, Liang Lin, Chongyu Chen and Xiaodan Liang - Least-to-Most Prompting Enables Complex Reasoning in Large Language
Models,
by Denny Zhou, Nathanael Sch"arli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Olivier Bousquet et al. - Rationale-Augmented Ensembles in Language Models,
by Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc V. Le, Ed H. Chi and Denny Zhou
- Multimodal Chain-of-Thought Reasoning in Language Models,
by Zhuosheng Zhang, Aston Zhang, Mu Li, Hai Zhao, George Karypis and Alex Smola - Rethinking with Retrieval: Faithful Large Language Model Inference,
by Hangfeng He, Hongming Zhang and Dan Roth本文通过用GPT-3在三个复杂的推理任务:常识推理,时间推理和表格推理上进行大量实验来评估RR的有效性。结果表明,RR可以产生更忠实的解释,并提高LLM的性能。TODO: Update URL when formally published
- Instruction Induction: From Few Examples to Natural Language Task
Descriptions,
by Or Honovich, Uri Shaham, Samuel R. Bowman and Omer Levy - Iteratively Prompt Pre-trained Language Models for Chain of Thought,
by Boshi Wang, Xiang Deng and Huan Sun - Complexity-Based Prompting for Multi-Step Reasoning,
by Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark and Tushar Khot - Measuring and Narrowing the Compositionality Gap in Language Models,
by Ofir Press, Muru Zhang, Sewon Min, Ludwig Schmidt, Noah A. Smith and Mike Lewis - Automatic Chain of Thought Prompting in Large Language Models,
by Zhuosheng Zhang, Aston Zhang, Mu Li and Alex Smola - Chain of Thought Prompting Elicits Reasoning in Large Language Models,
by Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed H. Chi, Quoc Le and Denny Zhou - Self-Consistency Improves Chain of Thought Reasoning in Language Models,
by Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc V. Le, Ed H. Chi and Denny Zhou - Text and Patterns: For Effective Chain of Thought, It Takes Two to
Tango,
by Aman Madaan and Amir Yazdanbakhsh - Towards Understanding Chain-of-Thought Prompting: An Empirical Study
of What Matters,
by Boshi Wang, Sewon Min, Xiang Deng, Jiaming Shen, You Wu, Luke Zettlemoyer and Huan Sun - PaLM: Scaling Language Modeling with Pathways,
by Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung et al. - LAMBADA: Backward Chaining for Automated Reasoning in Natural Language,
by Seyed Mehran Kazemi, Najoung Kim, Deepti Bhatia, Xin Xu and Deepak Ramachandran
- Specializing Smaller Language Models towards Multi-Step Reasoning,
by Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal and Tushar Khot - MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text
Generation,
by Swarnadeep Saha, Xinyan Velocity Yu, Mohit Bansal, Ramakanth Pasunuru and Asli Celikyilmaz - Rationale-Augmented Ensembles in Language Models,
by Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc V. Le, Ed H. Chi and Denny Zhou
- LILA: A Unified Benchmark for Mathematical Reasoning,
by Swaroop Mishra, Matthew Finlayson, Pan Lu, Leonard Tang, Sean Welleck, Chitta Baral, Tanmay Rajpurohit, Oyvind Tafjord et al. - Program of Thoughts Prompting: Disentangling Computation from Reasoning
for Numerical Reasoning Tasks,
by Wenhu Chen, Xueguang Ma, Xinyi Wang and William W. Cohen
- Improved logical reasoning of language models via differentiable symbolic programming,
by Zhang, Hanlin, Li, Ziyang, Huang, Jiani, Naik, Mayur and Xing, Eric - Maieutic Prompting: Logically Consistent Reasoning with Recursive
Explanations,
by Jaehun Jung, Lianhui Qin, Sean Welleck, Faeze Brahman, Chandra Bhagavatula, Ronan Le Bras and Yejin Choi - The Impact of Symbolic Representations on In-context Learning for
Few-shot Reasoning,
by Hanlin Zhang, Yi-Fan Zhang, Li Erran Li and Eric P. Xing - UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical
Expression,
by Jiaqi Chen, Tong Li, Jinghui Qin, Pan Lu, Liang Lin, Chongyu Chen and Xiaodan Liang
- Fairness and accuracy in horizontal federated learning,
by Wei Huang, Tianrui Li, Dexian Wang, Shengdong Du, Junbo Zhang and Tianqiang Huang - Federated Learning Meets Multi-Objective Optimization,
by Zeou Hu, Kiarash Shaloudegi, Guojun Zhang and Yaoliang Yu - From distributed machine learning to federated learning: a survey,
by Ji Liu, Jizhou Huang, Yang Zhou, Xuhong Li, Shilei Ji, Haoyi Xiong and Dejing Dou - Meta-Learning Based Knowledge Extrapolation for Knowledge Graphs in
the Federated Setting,
by Mingyang Chen, Wen Zhang, Zhen Yao, Xiangnan Chen, Mengxiao Ding, Fei Huang and Huajun Chen - Mitigating Biases in Student Performance Prediction via Attention-Based
Personalized Federated Learning,
by Yun-Wei Chu, Seyyedali Hosseinalipour, Elizabeth Tenorio, Laura M. Cruz Castro, Kerrie A. Douglas, Andrew Lan and Christopher G. Brinton - Pretrained Models for Multilingual Federated Learning,
by Orion Weller, Marc Marone, Vladimir Braverman, Dawn J. Lawrie and Benjamin Van Durme - Rethinking Architecture Design for Tackling Data Heterogeneity in
Federated Learning,
by Liangqiong Qu, Yuyin Zhou, Paul Pu Liang, Yingda Xia, Feifei Wang, Ehsan Adeli, Li Fei-Fei and Daniel L. Rubin - FedBERT: When Federated Learning Meets Pre-training,
by Yuanyishu Tian, Yao Wan, Lingjuan Lyu, Dezhong Yao, Hai Jin and Lichao Sun - Where to Begin? On the Impact of Pre-Training and Initialization in
Federated Learning,
by John Nguyen, Jianyu Wang, Kshitiz Malik, Maziar Sanjabi and Michael Rabbat - Ditto: Fair and Robust Federated Learning Through Personalization,
by Tian Li, Shengyuan Hu, Ahmad Beirami and Virginia Smith - Fine-tuning is Fine in Federated Learning,
by Gary Cheng, Karan N. Chadha and John C. Duchi - Federated Learning with Fair Averaging,
by Zheng Wang, Xiaoliang Fan, Jianzhong Qi, Chenglu Wen, Cheng Wang and Rongshan Yu - Collaborative Fairness in Federated Learning,
by Lingjuan Lyu, Xinyi Xu, Qian Wang and Han Yu - Federated Visual Classification with Real-World Data Distribution,
by Tzu-Ming Harry Hsu, Hang Qi and Matthew Brown
- Distributed Training of Knowledge Graph Embedding Models using Ray,
by Nasrullah Sheikh, Xiao Qin, Yaniv Gur and Berthold Reinwald - Distributed Learning With Sparsified Gradient Differences,
by Yicheng Chen, Rick S. Blum, Martin Tak'ac and Brian M. Sadler - Graph Attention Neural Network Distributed Model Training,
by Esmaeilzadeh, Armin, Zadeh Nojoo Kambar, Mina Esmail and Heidari, Maryam - Elastic Deep Learning Using Knowledge Distillation with Heterogeneous
Computing Resources,
by Daxiang Dong, Ji Liu, Xi Wang, Weibao Gong, An Qin, Xingjian Li, Dianhai Yu, Patrick Valduriez et al. - GRACE: A Compressed Communication Framework for Distributed Machine
Learning,
by Hang Xu, Chen-Yu Ho, Ahmed M. Abdelmoniem, Aritra Dutta, El Houcine Bergou, Konstantinos Karatsenidis, Marco Canini and Panos Kalnis - Load Balancing Optimization for Transformer in Distributed Environment,
by Delu Ma, Zhou Lei, Shengbo Chen and Peng Wang - DistDGL: Distributed Graph Neural Network Training for Billion-Scale
Graphs,
by Da Zheng, Chao Ma, Minjie Wang, Jinjing Zhou, Qidong Su, Xiang Song, Quan Gan, Zheng Zhang et al. - PyTorch Distributed: Experiences on Accelerating Data Parallel Training,
by Shen Li, Yanli Zhao, Rohan Varma, Omkar Salpekar, Pieter Noordhuis, Teng Li, Adam Paszke, Jeff Smith et al. - Ray: A Distributed Framework for Emerging AI Applications,
by Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang et al.
- Selective Annotation Makes Language Models Better Few-Shot Learners,
by Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf et al. - Selective Data Acquisition in the Wild for Model Charging,
by Chengliang Chai, Jiabin Liu, Nan Tang, Guoliang Li and Yuyu Luo
- On Robustness of Prompt-based Semantic Parsing with Large Pre-trained
Language Model: An Empirical Study on Codex,
by Terry Yue Zhuo, Zhuang Li, Yujin Huang, Yuan-Fang Li, Weiqing Wang, Gholamreza Haffari and Fatemeh Shiri - CodeT5Mix: A Pretrained Mixture of Encoder-decoder Transformers for Code Understanding and Generation,
by Wang, Yue, Le, Hung, Gotmare, Akhilesh Deepak, Li, Junnan and Hoi, Steven - CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code,
by Zhou, Shuyan, Alon, Uri, Agarwal, Sumit and Neubig, Graham - Code4Struct: Code Generation for Few-Shot Structured Prediction from
Natural Language,
by Xingyao Wang, Sha Li and Heng Ji - Language Models of Code are Few-Shot Commonsense Learners,
by Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang and Graham Neubig - When Neural Model Meets NL2Code: A Survey,
by Daoguang Zan, Bei Chen, Fengji Zhang, Dianjie Lu, Bingchao Wu, Bei Guan, Yongji Wang and Jian-Guang Lou - Evaluating the Text-to-SQL Capabilities of Large Language Models,
by Nitarshan Rajkumar, Raymond Li and Dzmitry Bahdanau - An extensive study on pre-trained models for program understanding
and generation,
by Zhengran Zeng, Hanzhuo Tan, Haotian Zhang, Jing Li, Yuqun Zhang and Lingming Zhang - CodeRL: Mastering Code Generation through Pretrained Models and Deep
Reinforcement Learning,
by Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese and Steven C. H. Hoi - CoditT5: Pretraining for Source Code and Natural Language Editing,
by Jiyang Zhang, Sheena Panthaplackel, Pengyu Nie, Junyi Jessy Li and Milos Gligoric - Compressing Pre-trained Models of Code into 3 MB,
by Jieke Shi, Zhou Yang, Bowen Xu, Hong Jin Kang and David Lo - Diet code is healthy: simplifying programs for pre-trained models
of code,
by Zhaowei Zhang, Hongyu Zhang, Beijun Shen and Xiaodong Gu - NatGen: generative pre-training by "naturalizing" source code,
by Saikat Chakraborty, Toufique Ahmed, Yangruibo Ding, Premkumar T. Devanbu and Baishakhi Ray - Jigsaw: Large Language Models meet Program Synthesis,
by Naman Jain, Skanda Vaidyanath, Arun Shankar Iyer, Nagarajan Natarajan, Suresh Parthasarathy, Sriram K. Rajamani and Rahul Sharma - Natural Attack for Pre-trained Models of Code,
by Zhou Yang, Jieke Shi, Junda He and David Lo - Automatic Generation of Programming Exercises and Code Explanations
Using Large Language Models,
by Sami Sarsa, Paul Denny, Arto Hellas and Juho Leinonen - Evaluating Large Language Models Trained on Code,
by Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Pond'e de Oliveira Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda et al. - CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models
for Code Understanding and Generation,
by Yue Wang, Weishi Wang, Shafiq R. Joty and Steven C. H. Hoi - CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding
and Generation,
by Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin B. Clement, Dawn Drain et al. - Unified Pre-training for Program Understanding and Generation,
by Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray and Kai-Wei Chang - Traceability Transformed: Generating more Accurate Links with Pre-Trained
BERT Models,
by Jinfeng Lin, Yalin Liu, Qingkai Zeng, Meng Jiang and Jane Cleland-Huang - IntelliCode compose: code generation using transformer,
by Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu and Neel Sundaresan - Multi-task Learning based Pre-trained Language Model for Code Completion,
by Fang Liu, Ge Li, Yunfei Zhao and Zhi Jin
- CODE-MVP: Learning to Represent Source Code from Multiple Views
with Contrastive Pre-Training,
by Xin Wang, Yasheng Wang, Yao Wan, Jiawei Wang, Pingyi Zhou, Li Li, Hao Wu and Jin Liu - UniXcoder: Unified Cross-Modal Pre-training for Code Representation,
by Daya Guo, Shuai Lu, Nan Duan, Yanlin Wang, Ming Zhou and Jian Yin - AST-Probe: Recovering abstract syntax trees from hidden representations
of pre-trained language models,
by Jos'e Antonio Hern'andez L'opez, Martin Weyssow, Jes'us S'anchez Cuadrado and Houari A. Sahraoui - GraphCodeBERT: Pre-training Code Representations with Data Flow,
by Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan et al. - CLSEBERT: Contrastive Learning for Syntax Enhanced Code Pre-Trained
Model,
by Xin Wang, Yasheng Wang, Pingyi Zhou, Fei Mi, Meng Xiao, Yadao Wang, Li Li, Xiao Liu et al.
- CIRCLE: continual repair across programming languages,
by Wei Yuan, Quanjun Zhang, Tieke He, Chunrong Fang, Nguyen Quoc Viet Hung, Xiaodong Hao and Hongzhi Yin - Detect-Localize-Repair: A Unified Framework for Learning to Debug
with CodeT5,
by Nghi Bui, Yue Wang and Steven C. H. Hoi - Multi-view Pre-trained Model for Code Vulnerability Identification,
by Xuxiang Jiang, Yinhao Xiao, Jun Wang and Wei Zhang - Towards JavaScript program repair with Generative Pre-trained Transformer
(GPT-2),
by M'ark Lajk'o, Viktor Csuvik and L'aszl'o Vid'acs - Applying CodeBERT for Automated Program Repair of Java Simple Bugs,
by Ehsan Mashhadi and Hadi Hemmati - A model with iterative trials for correcting logic errors in source code,
by Matsumoto, Taku, Watanobe, Yutaka and Nakamura, Keita - DeepDebug: Fixing Python Bugs Using Stack Traces, Backtranslation,
and Code Skeletons,
by Dawn Drain, Colin B. Clement, Guillermo Serrato and Neel Sundaresan
- AUGER: automatically generating review comments with pre-training
models,
by Lingwei Li, Li Yang, Huaxi Jiang, Jun Yan, Tiejian Luo, Zihan Hua, Geng Liang and Chun Zuo - Automating code review activities by large-scale pre-training,
by Zhiyu Li, Shuai Lu, Daya Guo, Nan Duan, Shailesh Jannu, Grant Jenks, Deep Majumder, Jared Green et al. - Bridging Pre-trained Models and Downstream Tasks for Source Code Understanding,
by Deze Wang, Zhouyang Jia, Shanshan Li, Yue Yu, Yun Xiong, Wei Dong and Xiangke Liao - Using Pre-Trained Models to Boost Code Review Automation,
by Rosalia Tufano, Simone Masiero, Antonio Mastropaolo, Luca Pascarella, Denys Poshyvanyk and Gabriele Bavota - What Do They Capture? - A Structural Analysis of Pre-Trained Language
Models for Source Code,
by Yao Wan, Wei Zhao, Hongyu Zhang, Yulei Sui, Guandong Xu and Hai Jin
- Deep Learning Meets Software Engineering: A Survey on Pre-Trained
Models of Source Code,
by Changan Niu, Chuanyi Li, Bin Luo and Vincent Ng - Do Pre-trained Language Models Indeed Understand Software Engineering
Tasks?,
by Yao Li, Tao Zhang, Xiapu Luo, Haipeng Cai, Sen Fang and Dawei Yuan - Evaluating Pre-Trained Models for User Feedback Analysis in Software
Engineering: A Study on Classification of App-Reviews,
by Mohammad Abdul Hadi and Fatemeh H. Fard - What do pre-trained code models know about code?,
by Anjan Karmakar and Romain Robbes - Sentiment analysis for software engineering: How far can pre-trained transformer models go?,
by Zhang, Ting, Xu, Bowen, Thung, Ferdian, Haryono, Stefanus Agus, Lo, David and Jiang, Lingxiao
- News Summarization and Evaluation in the Era of GPT-3,
by Tanya Goyal, Junyi Jessy Li and Greg Durrett - Fine-Grained Controllable Text Generation Using Non-Residual Prompting,
by Fredrik Carlsson, Joey "Ohman, Fangyu Liu, Severine Verlinden, Joakim Nivre and Magnus Sahlgren - The survey: Text generation models in deep learning,
by Touseef Iqbal and Shaima Qureshi - FewshotQA: A simple framework for few-shot learning of question
answering tasks using pre-trained text-to-text models,
by Rakesh Chada and Pradeep Natarajan - Controllable Open-ended Question Generation with A New Question
Type Ontology,
by Shuyang Cao and Lu Wang - All NLP Tasks Are Generation Tasks: A General Pretraining Framework,
by Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang and Jie Tang - Smelting Gold and Silver for Improved Multilingual AMR-to-Text Generation,
by Leonardo F. R. Ribeiro, Jonas Pfeiffer, Yue Zhang and Iryna Gurevych - PRAL: A Tailored Pre-Training Model for Task-Oriented Dialog Generation,
by Jing Gu, Qingyang Wu, Chongruo Wu, Weiyan Shi and Zhou Yu - DialogBERT: Discourse-Aware Response Generation via Learning to Recover
and Rank Utterances,
by Xiaodong Gu, Kang Min Yoo and Jung-Woo Ha - Response Generation with Context-Aware Prompt Learning,
by Xiaodong Gu, Kang Min Yoo and Sang-Woo Lee - DYPLOC: Dynamic Planning of Content Using Mixed Language Models
for Text Generation,
by Xinyu Hua, Ashwin Sreevatsa and Lu Wang - Latent Reasoning for Low-Resource Question Generation,
by Xinting Huang, Jianzhong Qi, Yu Sun and Rui Zhang - JointGT: Graph-Text Joint Representation Learning for Text Generation
from Knowledge Graphs,
by Pei Ke, Haozhe Ji, Yu Ran, Xin Cui, Liwei Wang, Linfeng Song, Xiaoyan Zhu and Minlie Huang - A Distributional Approach to Controlled Text Generation,
by Muhammad Khalifa, Hady Elsahar and Marc Dymetman - TextBox: A Unified, Modularized, and Extensible Framework for Text
Generation,
by Junyi Li, Tianyi Tang, Gaole He, Jinhao Jiang, Xiaoxuan Hu, Puzhao Xie, Zhipeng Chen, Zhuohao Yu et al. - Few-shot Knowledge Graph-to-Text Generation with Pretrained Language
Models,
by Junyi Li, Tianyi Tang, Wayne Xin Zhao, Zhicheng Wei, Nicholas Jing Yuan and Ji-Rong Wen - Knowledge-based Review Generation by Coherence Enhanced Text Planning,
by Junyi Li, Wayne Xin Zhao, Zhicheng Wei, Nicholas Jing Yuan and Ji-Rong Wen - Prefix-Tuning: Optimizing Continuous Prompts for Generation,
by Xiang Lisa Li and Percy Liang - GLGE: A New General Language Generation Evaluation Benchmark,
by Dayiheng Liu, Yu Yan, Yeyun Gong, Weizhen Qi, Hang Zhang, Jian Jiao, Weizhu Chen, Jie Fu et al. - A Three-Stage Learning Framework for Low-Resource Knowledge-Grounded
Dialogue Generation,
by Shilei Liu, Xiaofeng Zhao, Bochao Li, Feiliang Ren, Longhui Zhang and Shujuan Yin - VECO: Variable and Flexible Cross-lingual Pre-training for Language
Understanding and Generation,
by Fuli Luo, Wei Wang, Jiahao Liu, Yijia Liu, Bin Bi, Songfang Huang, Fei Huang and Luo Si - Ask what's missing and what's useful: Improving Clarification Question
Generation using Global Knowledge,
by Bodhisattwa Prasad Majumder, Sudha Rao, Michel Galley and Julian J. McAuley - ZmBART: An Unsupervised Cross-lingual Transfer Framework for Language
Generation,
by Kaushal Kumar Maurya, Maunendra Sankar Desarkar, Yoshinobu Kano and Kumari Deepshikha - A Plug-and-Play Method for Controlled Text Generation,
by Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell and Roger Wattenhofer - Structural Adapters in Pretrained Language Models for AMR-to-Text
Generation,
by Leonardo F. R. Ribeiro, Yue Zhang and Iryna Gurevych - Towards Table-to-Text Generation with Numerical Reasoning,
by Lya Hulliyyatus Suadaa, Hidetaka Kamigaito, Kotaro Funakoshi, Manabu Okumura and Hiroya Takamura - ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language
Understanding and Generation,
by Yu Sun, Shuohuan Wang, Shikun Feng, Siyu Ding, Chao Pang, Junyuan Shang, Jiaxiang Liu, Xuyi Chen et al. - Progressive Generation of Long Text with Pretrained Language Models,
by Bowen Tan, Zichao Yang, Maruan Al-Shedivat, Eric P. Xing and Zhiting Hu - Consistency and Coherency Enhanced Story Generation,
by Wei Wang, Piji Li and Hai-Tao Zheng - Structure-Aware Pre-Training for Table-to-Text Generation,
by Xinyu Xing and Xiaojun Wan - AugNLG: Few-shot Natural Language Generation using Self-trained Data
Augmentation,
by Xinnuo Xu, Guoyin Wang, Young-Bum Kim and Sungjin Lee - DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling,
by Lanqing Xue, Kaitao Song, Duocai Wu, Xu Tan, Nevin L. Zhang, Tao Qin, Wei-Qiang Zhang and Tie-Yan Liu - FastSeq: Make Sequence Generation Faster,
by Yu Yan, Fei Hu, Jiusheng Chen, Nikhil Bhendawade, Ting Ye, Yeyun Gong, Nan Duan, Desheng Cui et al. - A Simple and Efficient Multi-Task Learning Approach for Conditioned
Dialogue Generation,
by Yan Zeng and Jian-Yun Nie - DSGPT: Domain-Specific Generative Pre-Training of Transformers for
Text Generation in E-commerce Title and Review Summarization,
by Xueying Zhang, Yunjiang Jiang, Yue Shang, Zhaomeng Cheng, Chi Zhang, Xiaochuan Fan, Yun Xiao and Bo Long - Language Models are Few-Shot Learners,
by Brown, Tom B, Mann, Benjamin, Ryder, Nick, Subbiah, Melanie, Kaplan, Jared, Dhariwal, Prafulla, Neelakantan, Arvind, Shyam, Pranav et al. - PLATO: Pre-trained Dialogue Generation Model with Discrete Latent
Variable,
by Siqi Bao, Huang He, Fan Wang, Hua Wu and Haifeng Wang - Evaluation of Text Generation: A Survey,
by Asli Celikyilmaz, Elizabeth Clark and Jianfeng Gao - KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation,
by Wenhu Chen, Yu Su, Xifeng Yan and William Yang Wang - Distilling Knowledge Learned in BERT for Text Generation,
by Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu and Jingjing Liu - Logic2Text: High-Fidelity Natural Language Generation from Logical
Forms,
by Zhiyu Chen, Wenhu Chen, Hanwen Zha, Xiyou Zhou, Yunkai Zhang, Sairam Sundaresan and William Yang Wang - Cross-Lingual Natural Language Generation via Pre-Training,
by Zewen Chi, Li Dong, Furu Wei, Wenhui Wang, Xian-Ling Mao and Heyan Huang - Plug and Play Language Models: A Simple Approach to Controlled Text
Generation,
by Sumanth Dathathri, Andrea Madotto, Janice Lan, Jane Hung, Eric Frank, Piero Molino, Jason Yosinski and Rosanne Liu - Neural Language Generation: Formulation, Methods, and Evaluation,
by Cristina Garbacea and Qiaozhu Mei - TableGPT: Few-shot Table-to-Text Generation with Table Structure Reconstruction
and Content Matching,
by Heng Gong, Yawei Sun, Xiaocheng Feng, Bing Qin, Wei Bi, Xiaojiang Liu and Ting Liu - A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation,
by Jian Guan, Fei Huang, Minlie Huang, Zhihao Zhao and Xiaoyan Zhu - Have Your Text and Use It Too! End-to-End Neural Data-to-Text Generation
with Semantic Fidelity,
by Hamza Harkous, Isabel Groves and Amir Saffari - Reformulating Unsupervised Style Transfer as Paraphrase Generation,
by Kalpesh Krishna, John Wieting and Mohit Iyyer - BART: Denoising Sequence-to-Sequence Pre-training for Natural Language
Generation, Translation, and Comprehension,
by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov and Luke Zettlemoyer - Knowledge-Enhanced Personalized Review Generation with Capsule Graph
Neural Network,
by Junyi Li, Siqing Li, Wayne Xin Zhao, Gaole He, Zhicheng Wei, Nicholas Jing Yuan and Ji-Rong Wen - Rigid Formats Controlled Text Generation,
by Piji Li, Haisong Zhang, Xiaojiang Liu and Shuming Shi - UniViLM: A Unified Video and Language Pre-Training Model for Multimodal
Understanding and Generation,
by Huaishao Luo, Lei Ji, Botian Shi, Haoyang Huang, Nan Duan, Tianrui Li, Xilin Chen and Ming Zhou - GPT-too: A Language-Model-First Approach for AMR-to-Text Generation,
by Manuel Mager, Ram'on Fernandez Astudillo, Tahira Naseem, Md. Arafat Sultan, Young-Suk Lee, Radu Florian and Salim Roukos - Few-shot Natural Language Generation for Task-Oriented Dialog,
by Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Michael Zeng and Jianfeng Gao - PlotMachines: Outline-Conditioned Generation with Dynamic Plot State
Tracking,
by Hannah Rashkin, Asli Celikyilmaz, Yejin Choi and Jianfeng Gao - Investigating Pretrained Language Models for Graph-to-Text Generation,
by Leonardo F. R. Ribeiro, Martin Schmitt, Hinrich Sch"utze and Iryna Gurevych - Leveraging Pre-trained Checkpoints for Sequence Generation Tasks,
by Sascha Rothe, Shashi Narayan and Aliaksei Severyn - T3: Tree-Autoencoder Constrained Adversarial Text Generation for
Targeted Attack,
by Boxin Wang, Hengzhi Pei, Boyuan Pan, Qian Chen, Shuohang Wang and Bo Li - MEGATRON-CNTRL: Controllable Story Generation with External Knowledge
Using Large-Scale Language Models,
by Peng Xu, Mostofa Patwary, Mohammad Shoeybi, Raul Puri, Pascale Fung, Anima Anandkumar and Bryan Catanzaro - StyleDGPT: Stylized Response Generation with Pre-trained Language
Models,
by Ze Yang, Wei Wu, Can Xu, Xinnian Liang, Jiaqi Bai, Liran Wang, Wei Wang and Zhoujun Li - Generalized Conditioned Dialogue Generation Based on Pre-trained Language
Model,
by Yan Zeng and Jian-Yun Nie - BERTScore: Evaluating Text Generation with BERT,
by Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger and Yoav Artzi - DIALOGPT : Large-Scale Generative Pre-training for Conversational
Response Generation,
by Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu et al. - Language Models are Unsupervised Multitask Learners,
by Radford, Alec, Wu, Jeffrey, Child, Rewon, Luan, David, Amodei, Dario and Sutskever, Ilya - Unified Language Model Pre-training for Natural Language Understanding
and Generation,
by Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou et al. - Large-Scale Transfer Learning for Natural Language Generation,
by Sergey Golovanov, Rauf Kurbanov, Sergey I. Nikolenko, Kyryl Truskovskyi, Alexander Tselousov and Thomas Wolf - CTRL: A Conditional Transformer Language Model for Controllable
Generation,
by Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, Caiming Xiong and Richard Socher - Improving Neural Story Generation by Targeted Common Sense Grounding,
by Huanru Henry Mao, Bodhisattwa Prasad Majumder, Julian J. McAuley and Garrison W. Cottrell - MASS: Masked Sequence to Sequence Pre-training for Language Generation,
by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu and Tie-Yan Liu - Generating Wikipedia by Summarizing Long Sequences,
by Peter J. Liu, Mohammad Saleh, Etienne Pot, Ben Goodrich, Ryan Sepassi, Lukasz Kaiser and Noam Shazeer - Improving language understanding by generative pre-training,
by Radford, Alec, Narasimhan, Karthik, Salimans, Tim, Sutskever, Ilya and others
- Controllable Open-ended Question Generation with A New Question
Type Ontology,
by Shuyang Cao and Lu Wang - A Distributional Approach to Controlled Text Generation,
by Muhammad Khalifa, Hady Elsahar and Marc Dymetman - A Plug-and-Play Method for Controlled Text Generation,
by Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell and Roger Wattenhofer - Plug and Play Language Models: A Simple Approach to Controlled Text
Generation,
by Sumanth Dathathri, Andrea Madotto, Janice Lan, Jane Hung, Eric Frank, Piero Molino, Jason Yosinski and Rosanne Liu - Rigid Formats Controlled Text Generation,
by Piji Li, Haisong Zhang, Xiaojiang Liu and Shuming Shi - CTRL: A Conditional Transformer Language Model for Controllable
Generation,
by Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, Caiming Xiong and Richard Socher
- Learning to Prompt for Continual Learning,
by Zifeng Wang, Zizhao Zhang, Chen-Yu Lee, Han Zhang, Ruoxi Sun, Xiaoqi Ren, Guolong Su, Vincent Perot et al. - A Continual Learning Survey: Defying Forgetting in Classification
Tasks,
by Matthias De Lange, Rahaf Aljundi, Marc Masana, Sarah Parisot, Xu Jia, Ales Leonardis, Gregory G. Slabaugh and Tinne Tuytelaars - ELLE: Efficient Lifelong Pre-training for Emerging Data,
by Yujia Qin, Jiajie Zhang, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun and Jie Zhou - Lifelong Pretraining: Continually Adapting Language Models to Emerging
Corpora,
by Xisen Jin, Dejiao Zhang, Henghui Zhu, Wei Xiao, Shang-Wen Li, Xiaokai Wei, Andrew O. Arnold and Xiang Ren - Towards Continual Reinforcement Learning: A Review and Perspectives,
by Khimya Khetarpal, Matthew Riemer, Irina Rish and Doina Precup - Continual Pre-training of Language Models for Math Problem Understanding
with Syntax-Aware Memory Network,
by Zheng Gong, Kun Zhou, Xin Zhao, Jing Sha, Shijin Wang and Ji-Rong Wen
- On Robustness of Prompt-based Semantic Parsing with Large Pre-trained
Language Model: An Empirical Study on Codex,
by Terry Yue Zhuo, Zhuang Li, Yujin Huang, Yuan-Fang Li, Weiqing Wang, Gholamreza Haffari and Fatemeh Shiri - Learning to Prompt for Continual Learning,
by Zifeng Wang, Zizhao Zhang, Chen-Yu Lee, Han Zhang, Ruoxi Sun, Xiaoqi Ren, Guolong Su, Vincent Perot et al. - Do Prompt-Based Models Really Understand the Meaning of Their Prompts?,
by Albert Webson and Ellie Pavlick - Large Language Models Are Human-Level Prompt Engineers,
by Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan and Jimmy Ba - An Information-theoretic Approach to Prompt Engineering Without Ground
Truth Labels,
by Taylor Sorensen, Joshua Robinson, Christopher Michael Rytting, Alexander Glenn Shaw, Kyle Jeffrey Rogers, Alexia Pauline Delorey, Mahmoud Khalil, Nancy Fulda et al. - Demystifying Prompts in Language Models via Perplexity Estimation,
by Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith and Luke Zettlemoyer - Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with
Language Models,
by Robert L. Logan IV, Ivana Balazevic, Eric Wallace, Fabio Petroni, Sameer Singh and Sebastian Riedel - Adversarial Soft Prompt Tuning for Cross-Domain Sentiment Analysis,
by Hui Wu and Xiaodong Shi - Fine-Grained Controllable Text Generation Using Non-Residual Prompting,
by Fredrik Carlsson, Joey "Ohman, Fangyu Liu, Severine Verlinden, Joakim Nivre and Magnus Sahlgren - MSP: Multi-Stage Prompting for Making Pre-trained Language Models
Better Translators,
by Zhixing Tan, Xiangwen Zhang, Shuo Wang and Yang Liu - Noisy Channel Language Model Prompting for Few-Shot Text Classification,
by Sewon Min, Mike Lewis, Hannaneh Hajishirzi and Luke Zettlemoyer - SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer,
by Tu Vu, Brian Lester, Noah Constant, Rami Al-Rfou' and Daniel Cer - Delta Tuning: A Comprehensive Study of Parameter Efficient Methods
for Pre-trained Language Models,
by Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen et al. - Meta-Adapters: Parameter Efficient Few-shot Fine-tuning through Meta-Learning,
by Trapit Bansal, Salaheddin Alzubi, Tong Wang, Jay-Yoon Lee and Andrew McCallum - Sparse Structure Search for Delta Tuning,
by Shengding Hu, Zhen Zhang, Ning Ding, Yadao Wang, Yasheng Wang, Zhiyuan Liu and Maosong Sun - Ontology-enhanced Prompt-tuning for Few-shot Learning,
by Hongbin Ye, Ningyu Zhang, Shumin Deng, Xiang Chen, Hui Chen, Feiyu Xiong, Xi Chen and Huajun Chen - Pre-trained Language Models can be Fully Zero-Shot Learners,
by Xuandong Zhao, Siqi Ouyang, Zhiguo Yu, Ming Wu and Lei Li - Least-to-Most Prompting Enables Complex Reasoning in Large Language
Models,
by Denny Zhou, Nathanael Sch"arli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Olivier Bousquet et al. - FewshotQA: A simple framework for few-shot learning of question
answering tasks using pre-trained text-to-text models,
by Rakesh Chada and Pradeep Natarajan - The Power of Scale for Parameter-Efficient Prompt Tuning,
by Brian Lester, Rami Al-Rfou and Noah Constant - Prefix-Tuning: Optimizing Continuous Prompts for Generation,
by Xiang Lisa Li and Percy Liang
- VECO: Variable and Flexible Cross-lingual Pre-training for Language
Understanding and Generation,
by Fuli Luo, Wei Wang, Jiahao Liu, Yijia Liu, Bin Bi, Songfang Huang, Fei Huang and Luo Si - Unified Language Model Pre-training for Natural Language Understanding
and Generation,
by Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou et al. - Improving language understanding by generative pre-training,
by Radford, Alec, Narasimhan, Karthik, Salimans, Tim, Sutskever, Ilya and others
- CLIP-Event: Connecting Text and Images with Event Structures,
by Manling Li, Ruochen Xu, Shuohang Wang, Luowei Zhou, Xudong Lin, Chenguang Zhu, Michael Zeng, Heng Ji et al. - CLSEBERT: Contrastive Learning for Syntax Enhanced Code Pre-Trained
Model,
by Xin Wang, Yasheng Wang, Pingyi Zhou, Fei Mi, Meng Xiao, Yadao Wang, Li Li, Xiao Liu et al. - Less Is More: ClipBERT for Video-and-Language Learning via Sparse
Sampling,
by Jie Lei, Linjie Li, Luowei Zhou, Zhe Gan, Tamara L. Berg, Mohit Bansal and Jingjing Liu - Transformer is All You Need: Multimodal Multitask Learning with a
Unified Transformer,
by Ronghang Hu and Amanpreet Singh - Pre-training Graph Transformer with Multimodal Side Information for
Recommendation,
by Yong Liu, Susen Yang, Chenyi Lei, Guoxin Wang, Haihong Tang, Juyong Zhang, Aixin Sun and Chunyan Miao - UniViLM: A Unified Video and Language Pre-Training Model for Multimodal
Understanding and Generation,
by Huaishao Luo, Lei Ji, Botian Shi, Haoyang Huang, Nan Duan, Tianrui Li, Xilin Chen and Ming Zhou - Large-Scale Adversarial Training for Vision-and-Language Representation
Learning,
by Zhe Gan, Yen-Chun Chen, Linjie Li, Chen Zhu, Yu Cheng and Jingjing Liu - Vokenization: Improving Language Understanding with Contextualized,
Visual-Grounded Supervision,
by Hao Tan and Mohit Bansal - Integrating Multimodal Information in Large Pretrained Transformers,
by Wasifur Rahman, Md. Kamrul Hasan, Sangwu Lee, AmirAli Bagher Zadeh, Chengfeng Mao, Louis-Philippe Morency and Mohammed E. Hoque - VL-BERT: Pre-training of Generic Visual-Linguistic Representations,
by Weijie Su, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei and Jifeng Dai - VisualBERT: A Simple and Performant Baseline for Vision and Language,
by Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh and Kai-Wei Chang - ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations
for Vision-and-Language Tasks,
by Jiasen Lu, Dhruv Batra, Devi Parikh and Stefan Lee - VideoBERT: A Joint Model for Video and Language Representation Learning,
by Chen Sun, Austin Myers, Carl Vondrick, Kevin Murphy and Cordelia Schmid
- Prompting GPT-3 To Be Reliable,
by Chenglei Si, Zhe Gan, Zhengyuan Yang, Shuohang Wang, Jianfeng Wang, Jordan L. Boyd-Graber and Lijuan Wang
- DiSTRICT: Dialogue State Tracking with Retriever Driven In-Context
Tuning,
by Praveen Venkateswaran, Evelyn Duesterwald and Vatche Isahagian - Does GPT-3 Generate Empathetic Dialogues? A Novel In-Context Example
Selection Method and Automatic Evaluation Metric for Empathetic Dialogue
Generation,
by Young-Jun Lee, Chae-Gyun Lim and Ho-Jin Choi - Fusing Task-Oriented and Open-Domain Dialogues in Conversational Agents,
by Tom Young, Frank Xing, Vlad Pandelea, Jinjie Ni and Erik Cambria - GODEL: Large-Scale Pre-Training for Goal-Directed Dialog,
by Baolin Peng, Michel Galley, Pengcheng He, Chris Brockett, Lars Liden, Elnaz Nouri, Zhou Yu, Bill Dolan et al. - Mind the Knowledge Gap: A Survey of Knowledge-enhanced Dialogue
Systems,
by Sagi Shaier, Lawrence Hunter and Katharina Kann - Dialogue State Tracking with a Language Model using Schema-Driven
Prompting,
by Chia-Hsuan Lee, Hao Cheng and Mari Ostendorf - Few-Shot Bot: Prompt-Based Learning for Dialogue Systems,
by Andrea Madotto, Zhaojiang Lin, Genta Indra Winata and Pascale Fung - Action-Based Conversations Dataset: A Corpus for Building More In-Depth
Task-Oriented Dialogue Systems,
by Derek Chen, Howard Chen, Yi Yang, Alexander Lin and Zhou Yu - Fine-grained Post-training for Improving Retrieval-based Dialogue
Systems,
by Janghoon Han, Taesuk Hong, Byoungjae Kim, Youngjoong Ko and Jungyun Seo - Recent Advances in Deep Learning Based Dialogue Systems: A Systematic
Survey,
by Jinjie Ni, Tom Young, Vlad Pandelea, Fuzhao Xue, Vinay Adiga and Erik Cambria - Slot Self-Attentive Dialogue State Tracking,
by Fanghua Ye, Jarana Manotumruksa, Qiang Zhang, Shenghui Li and Emine Yilmaz - Pretraining the Noisy Channel Model for Task-Oriented Dialogue,
by Qi Liu, Lei Yu, Laura Rimell and Phil Blunsom - UBAR: Towards Fully End-to-End Task-Oriented Dialog System with
GPT-2,
by Yunyi Yang, Yunhao Li and Xiaojun Quan - End-to-End Neural Pipeline for Goal-Oriented Dialogue Systems using
GPT-2,
by DongHoon Ham, Jeong-Gwan Lee, Youngsoo Jang and Kee-Eung Kim - A Simple Language Model for Task-Oriented Dialogue,
by Ehsan Hosseini-Asl, Bryan McCann, Chien-Sheng Wu, Semih Yavuz and Richard Socher
- A Survey on Knowledge Graph-Based Recommender Systems,
by Qingyu Guo, Fuzhen Zhuang, Chuan Qin, Hengshu Zhu, Xing Xie, Hui Xiong and Qing He - Are Graph Augmentations Necessary?: Simple Graph Contrastive Learning
for Recommendation,
by Junliang Yu, Hongzhi Yin, Xin Xia, Tong Chen, Lizhen Cui and Quoc Viet Hung Nguyen - Disentangled Representations Learning for Multi-Target Cross-Domain Recommendation,
by Guo, Xiaobo, Li, Shaoshuai, Guo, Naicheng, Cao, Jiangxia, Liu, Xiaolei, Ma, Qiongxu, Gan, Runsheng and Zhao, Yunan - Rethinking Reinforcement Learning for Recommendation: A Prompt Perspective,
by Xin Xin, Tiago Pimentel, Alexandros Karatzoglou, Pengjie Ren, Konstantina Christakopoulou and Zhaochun Ren - Advances and Challenges in Conversational Recommender Systems: A
Survey,
by Chongming Gao, Wenqiang Lei, Xiangnan He, Maarten de Rijke and Tat-Seng Chua - Pre-training Graph Transformer with Multimodal Side Information for
Recommendation,
by Yong Liu, Susen Yang, Chenyi Lei, Guoxin Wang, Haihong Tang, Juyong Zhang, Aixin Sun and Chunyan Miao - Towards Hands-Free Visual Dialog Interactive Recommendation,
by Tong Yu, Yilin Shen and Hongxia Jin
- Word-Label Alignment for Event Detection: A New Perspective via
Optimal Transport,
by Amir Pouran Ben Veyseh and Thien Huu Nguyen - Learning Cross-Task Dependencies for Joint Extraction of Entities,
Events, Event Arguments, and Relations,
by Minh Van Nguyen, Bonan Min, Franck Dernoncourt and Thien Nguyen - CLIP-Event: Connecting Text and Images with Event Structures,
by Manling Li, Ruochen Xu, Shuohang Wang, Luowei Zhou, Xudong Lin, Chenguang Zhu, Michael Zeng, Heng Ji et al. - Event Causality Identification via Derivative Prompt Joint Learning,
by Shirong Shen, Heng Zhou, Tongtong Wu and Guilin Qi - Augmenting Open-Domain Event Detection with Synthetic Data from GPT-2,
by Amir Pouran Ben Veyseh, Minh Van Nguyen, Bonan Min and Thien Huu Nguyen - SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup,
by Rongzhi Zhang, Yue Yu and Chao Zhang - Exploring Pre-trained Language Models for Event Extraction and Generation,
by Sen Yang, Dawei Feng, Linbo Qiao, Zhigang Kan and Dongsheng Li
- Learning Cross-Task Dependencies for Joint Extraction of Entities,
Events, Event Arguments, and Relations,
by Minh Van Nguyen, Bonan Min, Franck Dernoncourt and Thien Nguyen - Selecting Optimal Context Sentences for Event-Event Relation Extraction,
by Hieu Man, Nghia Trung Ngo, Linh Ngo Van and Thien Huu Nguyen - Multilingual SubEvent Relation Extraction: A Novel Dataset and Structure
Induction Method,
by Viet Dac Lai, Hieu Man, Linh Ngo Van, Franck Dernoncourt and Thien Nguyen - Salience-Aware Event Chain Modeling for Narrative Understanding,
by Xiyang Zhang, Muhao Chen and Jonathan May - Joint Constrained Learning for Event-Event Relation Extraction,
by Haoyu Wang, Muhao Chen, Hongming Zhang and Dan Roth
- SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup,
by Rongzhi Zhang, Yue Yu and Chao Zhang
- Toward General Design Principles for Generative AI Applications,
by Justin D. Weisz, Michael J. Muller, Jessica He and Stephanie Houde - Unsupervised Representation Learning from Pre-trained Diffusion Probabilistic
Models,
by Zijian Zhang, Zhou Zhao and Zhijie Lin - Parsing as Pretraining,
by David Vilares, Michalina Strzyz, Anders S\ogaard and Carlos G'omez-Rodr'\iguez - Unsupervised Deep Learning via Affinity Diffusion,
by Jiabo Huang, Qi Dong, Shaogang Gong and Xiatian Zhu - Learning to detect unseen object classes by between-class attribute
transfer,
by Christoph H. Lampert, Hannes Nickisch and Stefan Harmeling
-
Awesome-ChatGPT,
ChatGPT资料汇总学习,持续更新......
-
Awesome ChatGPT Prompts,
In this repository, you will find a variety of prompts that can be used with ChatGPT.
-
ChatRWKV,
ChatRWKV is like ChatGPT but powered by my RWKV (100% RNN) language model, which is the only RNN (as of now) that can match transformers in quality and scaling, while being faster and saves VRAM. Training sponsored by Stability EleutherAI.
-
ChatGPT-Hub,
ChatGPT资源汇总
-
PaLM-rlhf-pytorch,
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture.
-
BAAI-WuDao/Data,
“悟道”项目构建了高质量的数据集,用于支撑大模型的训练和测评工作,本仓库提供所有开源数据集的链接。
-
Colossal-AI,
Colossal-AI provides a collection of parallel components for you. We aim to support you to write your distributed deep learning models just like how you write your model on your laptop. We provide user-friendly tools to kickstart distributed training and inference in a few lines.
-
Exploring Prompt Injection Attacks,
by Jose Selvi
Prompt Injection is a new vulnerability that is affecting some AI/ML models and, in particular, certain types of language models using prompt-based learning.
-
ChatGPT发展历程、原理、技术架构详解和产业未来,
by 陈巍
本文将介绍ChatGPT的特点、功能、技术架构、局限、产业应用、投资机会和未来。作者本人曾担任华为系自然语言处理( NLP )企业的首席科学家。
-
How does GPT Obtain its Ability?,
by Yao Fu
Tracing emergent abilities of language models to their sources.
-
Open source solution replicates ChatGPT training process,
Colossal-AI, as one of the hottest open-source solutions for large AI models, presents an open-source low-cost ChatGPT equivalent implementation process.
- CPM-Bee,
CPM-Bee是一个开源的双语预训练语言模型,参数量为10B,拥有十余种原生能力和强大的通用语言能力,并支持结构化输入和输出。
-
Awesome-ChatGPT,
ChatGPT资料汇总学习,持续更新......
-
Awesome ChatGPT Prompts,
In this repository, you will find a variety of prompts that can be used with ChatGPT.
-
ChatRWKV,
ChatRWKV is like ChatGPT but powered by my RWKV (100% RNN) language model, which is the only RNN (as of now) that can match transformers in quality and scaling, while being faster and saves VRAM. Training sponsored by Stability EleutherAI.
-
ChatGPT-Hub,
ChatGPT资源汇总
-
PaLM-rlhf-pytorch,
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture.
-
BAAI-WuDao/Data,
“悟道”项目构建了高质量的数据集,用于支撑大模型的训练和测评工作,本仓库提供所有开源数据集的链接。
-
Colossal-AI,
Colossal-AI provides a collection of parallel components for you. We aim to support you to write your distributed deep learning models just like how you write your model on your laptop. We provide user-friendly tools to kickstart distributed training and inference in a few lines.
-
Exploring Prompt Injection Attacks,
by Jose Selvi
Prompt Injection is a new vulnerability that is affecting some AI/ML models and, in particular, certain types of language models using prompt-based learning.
-
ChatGPT发展历程、原理、技术架构详解和产业未来,
by 陈巍
本文将介绍ChatGPT的特点、功能、技术架构、局限、产业应用、投资机会和未来。作者本人曾担任华为系自然语言处理( NLP )企业的首席科学家。
-
How does GPT Obtain its Ability?,
by Yao Fu
Tracing emergent abilities of language models to their sources.
-
Open source solution replicates ChatGPT training process,
Colossal-AI, as one of the hottest open-source solutions for large AI models, presents an open-source low-cost ChatGPT equivalent implementation process.
- CPM-Bee,
CPM-Bee是一个开源的双语预训练语言模型,参数量为10B,拥有十余种原生能力和强大的通用语言能力,并支持结构化输入和输出。
Knowledge Science and Engineering Lab is recruiting researchers! You are welcome to apply for the following positions:
- Research Assistant: Bachelor degree or above, proficient in Python/Java, familiar with machine learning espicially deep learning models.
- Postdoctoral Fellow: Doctoral research in Artificial Intelligence, published at least 3 high-quality papers.
- Lecturer, Associate Professor and Professor
If you are interested in our research and meet the above requirements, feel free to contact Prof. Guilin Qi.
知识科学与工程实验室正在招聘科研人员!欢迎申请以下岗位:
- 科研助理:本科学历以上,精通Python/Java,熟悉机器学习,特别是深度学习模型。
- 博士后:博士研究人工智能相关方向,发表至少3篇高水平论文。
- 讲师、副教授、教授等教职
如果您对我们的研究工作感兴趣并满足以上要求,欢迎您与漆桂林教授联系。