LMOps is a research initiative on fundamental research and technology for building AI products with foundation models, with a focus on general technology for enabling AI capabilities with LLMs and generative AI models.
- Better Prompts: Promptist, Extensible prompts
- Longer Context: Structured prompting, Length-Extrapolatable Transformers
- Knowledge Augmentation (TBA)
- Fundamentals: Understanding In-Context Learning
- microsoft/unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
- microsoft/torchscale: Transformers at (any) Scale
- [Paper Release] Dec, 2022: Why Can GPT Learn In-Context? Language Models Secretly Perform Finetuning as Meta Optimizers
- [Paper&Model&Demo Release] Dec, 2022: Optimizing Prompts for Text-to-Image Generation
- [Paper&Code Release] Dec, 2022: Structured Prompting: Scaling In-Context Learning to 1,000 Examples
- [Paper Release] Nov, 2022: Extensible Prompts for Language Models
Advanced technologies facilitating prompting language models.
[Paper] Optimizing Prompts for Text-to-Image Generation
- Language models serve as a prompt interface that optimizes user input into model-preferred prompts.
- Learn a language model for automatic prompt optimization via reinforcement learning.
[Paper] Structured Prompting: Scaling In-Context Learning to 1,000 Examples
- Example use cases:
  - Prepend (many) retrieved (long) documents as context in GPT.
  - Scale in-context learning to many demonstration examples.
[Paper] Extensible Prompts for Language Models
- An extensible interface that allows prompting LLMs beyond natural language for fine-grained specifications.
- Context-guided imaginary word learning for general usability.
[Paper] Why Can GPT Learn In-Context? Language Models Secretly Perform Finetuning as Meta Optimizers
- According to the demonstration examples, GPT produces meta gradients for In-Context Learning (ICL) through forward computation. ICL works by applying these meta gradients to the model through attention.
- The meta optimization process of ICL shares a dual view with finetuning that explicitly updates the model parameters with back-propagated gradients.
- We can translate optimization algorithms (such as SGD with Momentum) to their corresponding Transformer architectures.
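The dual view mentioned above can be illustrated on a single linear layer. The following is a minimal sketch, not code from the paper or this repository: it assumes an initial weight matrix `W0`, training inputs `x_i`, and error signals `e_i` (with the learning rate folded in), so that the finetuning update is the sum of outer products `e_i x_i^T`. All function names are hypothetical and chosen for illustration. Applying the updated weights to a query `q` gives the same result as `W0 q` plus unnormalized linear attention with the `e_i` as values and the `x_i` as keys.

```python
# Illustrative sketch (pure Python, no dependencies). Names are hypothetical.
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def finetuned_output(W0, errors, inputs, q):
    """Explicit update: W = W0 + sum_i outer(e_i, x_i), then return W q."""
    W = [row[:] for row in W0]  # copy so W0 is left untouched
    for e, x in zip(errors, inputs):
        for r, e_r in enumerate(e):
            for c, x_c in enumerate(x):
                W[r][c] += e_r * x_c
    return matvec(W, q)


def attention_output(W0, errors, inputs, q):
    """Dual view: W0 q + sum_i e_i * (x_i . q).

    This is linear (softmax-free) attention with values e_i,
    keys x_i, and query q, added to the zero-shot output W0 q.
    """
    out = matvec(W0, q)
    for e, x in zip(errors, inputs):
        weight = dot(x, q)  # attention score between key x_i and query q
        out = [o + weight * e_r for o, e_r in zip(out, e)]
    return out


if __name__ == "__main__":
    random.seed(0)
    d_in, d_out, n = 4, 3, 5
    W0 = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
    inputs = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(n)]
    errors = [[random.gauss(0, 1) for _ in range(d_out)] for _ in range(n)]
    q = [random.gauss(0, 1) for _ in range(d_in)]

    a = finetuned_output(W0, errors, inputs, q)
    b = attention_output(W0, errors, inputs, q)
    # The two views agree up to floating-point rounding.
    print(all(abs(x - y) < 1e-9 for x, y in zip(a, b)))
```

The sketch covers only the linear, unnormalized case. Softmax attention in a real Transformer matches this form only approximately, so this should be read as intuition for the dual view rather than a reproduction of the paper's experiments.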
Hiring: aka.ms/nlpagi
We are hiring at all levels (including FTE researchers and interns)! If you are interested in working with us on Foundation Models (aka large-scale pre-trained models) and AGI, NLP, MT, Speech, Document AI and Multimodal AI, please send your resume to fuwei@microsoft.com.
This project is licensed under the license found in the LICENSE file in the root directory of this source tree.
Microsoft Open Source Code of Conduct
For help or issues using the pre-trained models, please submit a GitHub issue.
For other communications, please contact Furu Wei (fuwei@microsoft.com).