
A comprehensive guide to understanding and implementing large language models with hands-on examples using LangChain for AIGC applications.


OpenAI Quickstart


English | 中文

This project is a one-stop learning resource for anyone interested in large language models and their application in AI-generated content (AIGC) scenarios. It combines theoretical foundations, development basics, and hands-on examples to guide you through these topics.

Features

  • Theory and Development Basics of Large Language Models: A deep dive into the inner workings of large language models such as GPT-4, including their architecture, training methods, and applications.

  • AIGC Application Development with LangChain: Hands-on examples and tutorials using LangChain to develop AIGC applications, demonstrating the practical application of large language models.

Getting Started

You can start by cloning this repository to your local machine:

git clone https://github.com/DjangoPeng/openai-quickstart.git

Then navigate to the directory and follow the individual module instructions to get started.

Schedule

Date Description Course Materials Events
Mon Jul 12 Week 1 Fundamentals of Large Models: Evolution of Theory and Technology
- An Initial Exploration of Large Models: Origin and Development
- Warm-up: Decoding Attention Mechanism
- Milestone of Transformation: The Rise of Transformer
- Taking Different Paths: The Choices of GPT and BERT
Suggested Readings:
- Attention Mechanism: Neural Machine Translation by Jointly Learning to Align and Translate
- An Attentive Survey of Attention Models
- Transformer: Attention Is All You Need
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
[Homework]
Thu Jul 16 The GPT Model Family: From Start to Present
- From GPT-1 to GPT-3.5: The Evolution
- ChatGPT: Where It Wins
- GPT-4: A New Beginning
Prompt Learning
- Chain-of-Thought (CoT): The Pioneering Work
- Self-Consistency: Multi-path Reasoning
- Tree-of-Thoughts (ToT): Continuing the Story
Suggested Readings:
- GPT-1: Improving Language Understanding by Generative Pre-training
- GPT-2: Language Models are Unsupervised Multitask Learners
- GPT-3: Language Models are Few-Shot Learners


Additional Readings:
- GPT-4: Architecture, Infrastructure, Training Dataset, Costs, Vision, MoE
- GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models
- Sparks of Artificial General Intelligence: Early experiments with GPT-4

[Homework]
Tue Jul 19 Week 2 Fundamentals of Large Model Development: OpenAI Embedding
- The Eve of Artificial General Intelligence
- "Three Worlds" and "Turing Test"
- Computer Data Representation
- Representation Learning and Embedding
Embeddings Dev 101
- Course Project: GitHub openai-quickstart
- Getting Started with OpenAI Embeddings
Suggested Readings:
- Representation Learning: A Review and New Perspectives
- Word2Vec: Efficient Estimation of Word Representations in Vector Space
- GloVe: Global Vectors for Word Representation

Additional Readings:

- Improving Distributional Similarity with Lessons Learned from Word Embeddings
- Evaluation methods for unsupervised word embeddings
[Homework]
Code:
[embedding]
Sat Jul 23 OpenAI Large Model Development and Application Practice
- OpenAI Large Model Development Guide
- Overview of OpenAI Language Models
- OpenAI GPT-4, GPT-3.5, GPT-3, Moderation
- OpenAI Token Billing and Calculation
OpenAI API Introduction and Practice
- OpenAI Models API
- OpenAI Completions API
- OpenAI Chat Completions API
- Completions vs Chat Completions
OpenAI Large Model Application Practice
- Initial Exploration of Text Completion
- Initial Exploration of Chatbots
Suggested Readings:

- OpenAI Models
- OpenAI Completions API
- OpenAI Chat Completions API
Code:
[models]
[tiktoken]
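
As a small illustration of the "OpenAI Token Billing and Calculation" topic, the sketch below estimates the cost of one request from its prompt and completion token counts, which are billed separately and often at different rates. The `Pricing` class, the `estimate_cost` helper, and the prices used are hypothetical placeholders for illustration, not actual OpenAI rates and not code from the course notebooks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-1,000-token prices in USD (placeholders, not real rates)."""
    prompt_per_1k: float
    completion_per_1k: float


def estimate_cost(prompt_tokens: int, completion_tokens: int, pricing: Pricing) -> float:
    """Estimate the cost of one request from its token counts.

    Prompt and completion tokens are priced separately, so each count is
    scaled by its own per-1,000-token rate and the two parts are summed.
    """
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    prompt_cost = prompt_tokens / 1000 * pricing.prompt_per_1k
    completion_cost = completion_tokens / 1000 * pricing.completion_per_1k
    return prompt_cost + completion_cost


# Placeholder rates chosen only to keep the arithmetic easy to follow.
example_pricing = Pricing(prompt_per_1k=0.5, completion_per_1k=1.0)
cost = estimate_cost(prompt_tokens=2000, completion_tokens=500, pricing=example_pricing)
print(f"Estimated cost: ${cost:.2f}")
```

In practice the token counts would come from a tokenizer or from the usage figures reported with an API response, and the rates from the provider's current price list.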

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated. If you have any suggestions or feature requests, please open an issue first to discuss what you would like to change.

License

This project is licensed under the terms of the Apache-2.0 License. See the LICENSE file for details.

Contact

Django Peng - pjt73651@email.com

Project Link: https://github.com/DjangoPeng/openai-quickstart