Welcome to the "Large Language Models and ChatGPT in Three Weeks" code repository! In this repo, we dive deep into understanding and utilizing large language models, with a specific focus on OpenAI's GPT-3, GPT-3.5-turbo (ChatGPT), and GPT-4.
For more, check out my Expert Playlist!
The project is divided into the following sections:
-
Introduction to Large Language Models (LLMs): We start with a brief introduction to LLMs, understanding their architecture, strengths, and potential use-cases.
-
Working with ChatGPT: ChatGPT is a version of the GPT model fine-tuned for generating conversational responses. We'll learn how to make API calls to ChatGPT and interpret the responses.
-
Latency Evaluation: We analyze and compare the latency of API calls when using hosted API services versus running models on local compute resources. This helps in making informed decisions about where to run these powerful models.
-
Cost Calculation: The code includes methods to calculate the cost of API calls based on the token usage by different models.
-
Generating Responses with OpenAI Models: We use the OpenAI's
ChatCompletion
andCompletion
methods to generate responses from prompts.
The code in this project helps you to get hands-on experience with these powerful language models, and also gives insights about factors to consider when deciding to use these models, such as cost and latency.
- Familiarity with Python
- An OpenAI API key. You can obtain it by signing up on the OpenAI website.
- Familiarity with machine learning concepts and natural language processing would be helpful, but not mandatory.
- Clone this repository to your local machine.
- Install the required Python libraries using pip:
pip install -r requirements.txt
Ensure you have set the following environment variables with your API keys or tokens:
OPENAI_API_KEY: Your OpenAI API key.
COHERE_API_KEY: Your Cohere API key (if using Cohere's services).
HF_TOKEN: Your Hugging Face token (if using Hugging Face's services).
You're all set to explore the notebooks!
This project contains several Jupyter notebooks each focusing on a specific topic. You can find them in the notebooks
directory:
-
Intro to Prompt Engineering: This notebook introduces the concept of prompt engineering, a crucial aspect of effectively using language models.
-
Making Predictions: Here, we delve into the process of making predictions with large language models and interpret their responses.
-
Cost Projecting: This notebook focuses on understanding and calculating the costs involved in using large language models. It includes functions to calculate the cost of API calls based on the token usage by different models.
-
Use Cases: In this notebook, we explore various use cases for large language models, providing practical examples of how they can be used in different scenarios.
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License. See the LICENSE file for details.