LangChain Experiments

This repository contains experiments with the LangChain library for building applications on top of large language models (LLMs). Using models such as OpenAI's GPT-3.5 Turbo (and soon GPT-4), the project shows how to build a searchable database from a YouTube video transcript, run similarity search queries against it with the FAISS library, and answer user questions from the retrieved passages.

LangChain is a comprehensive framework designed for developing applications powered by language models. It goes beyond merely calling an LLM via an API, as the most advanced and differentiated applications are also data-aware and agentic, enabling language models to connect with other data sources and interact with their environment. The LangChain framework is specifically built to address these principles.
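To make the transcript search idea from the introduction concrete, here is a minimal sketch in plain Python. It is not the code used in this repository: the bag-of-words `embed` function stands in for a real embedding model, the linear scan in `similarity_search` stands in for a FAISS index, and all function names and the sample transcript are made up for illustration.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=10, overlap=2):
    """Split text into overlapping word windows (stand-in for a text splitter)."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]


def embed(text):
    """Toy embedding: word counts. A real setup would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(query, chunks, k=2):
    """Return the k chunks most similar to the query (linear scan, no index)."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


# Hypothetical transcript text, used only to exercise the functions above.
transcript = (
    "Today we look at vector databases. "
    "A vector database stores embeddings of text chunks. "
    "Later in the video we bake sourdough bread with a long fermentation."
)
chunks = split_into_chunks(transcript)
top = similarity_search("sourdough bread", chunks, k=1)
print(top[0])
```

In a full pipeline, the retrieved chunks would be placed into the prompt so the model answers from the transcript rather than from memory.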

LangChain

The Python-specific portion of LangChain's documentation covers several main modules, each providing examples, how-to guides, reference docs, and conceptual guides. These modules include:

  1. Models: Various model types and model integrations supported by LangChain.
  2. Prompts: Prompt management, optimization, and serialization.
  3. Memory: State persistence between chain or agent calls, including a standard memory interface, memory implementations, and examples of chains and agents utilizing memory.
  4. Indexes: Combining LLMs with custom text data to enhance their capabilities.
  5. Chains: Sequences of calls, either to an LLM or a different utility, with a standard interface, integrations, and end-to-end chain examples.
  6. Agents: LLMs that make decisions about actions, observe the results, and repeat the process until completion, with a standard interface, agent selection, and end-to-end agent examples.
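As a rough illustration of how prompts, memory, and chains fit together, the sketch below wires them up in plain Python. None of these names are LangChain classes: `ToyChain` and `fake_llm` are hypothetical stand-ins, and the "model" only returns a numbered reply so the example runs without an API key.

```python
seen_prompts = []  # records every prompt the stand-in model receives


def fake_llm(prompt):
    """Stand-in for a real model call; returns a numbered placeholder reply."""
    seen_prompts.append(prompt)
    return f"reply #{len(seen_prompts)}"


class ToyChain:
    """Hypothetical chain: fill a prompt template, call the model, store the exchange."""

    template = "Conversation so far:\n{history}\nUser: {question}\nAssistant:"

    def __init__(self, llm):
        self.llm = llm
        self.memory = []  # list of (question, answer) pairs kept between calls

    def run(self, question):
        # Memory: render earlier exchanges so the model can see them.
        history = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in self.memory)
        # Prompt: fill the template with the history and the new question.
        prompt = self.template.format(history=history, question=question)
        answer = self.llm(prompt)
        self.memory.append((question, answer))
        return answer


chain = ToyChain(fake_llm)
first = chain.run("What is LangChain?")
second = chain.run("Which modules does it have?")
print(seen_prompts[1])  # the second prompt contains the first exchange
```

The point of the design is that the caller only passes a question; the chain owns the template and the state, which is the role the Prompts, Memory, and Chains modules play in the framework.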

Use Cases

With LangChain, developers can create various applications, such as customer support chatbots, automated content generators, data analysis tools, and intelligent search engines. These applications can help businesses streamline their workflows, reduce manual labor, and improve customer experiences.

Service

By selling LangChain-based applications as a service to businesses, you can provide tailored solutions to meet their specific needs. For instance, companies can benefit from customizable chatbots that handle customer inquiries, personalized content creation tools for marketing, or internal data analysis systems that use LLMs to extract insights from company data. LangChain's flexible framework makes it a practical choice for developing and deploying language model applications across industries.

Requirements

OpenAI API Models

The OpenAI API is powered by a diverse set of models with different capabilities and price points. You can also make limited customizations to OpenAI's base models for your specific use case with fine-tuning.

Installation

1. Clone the repository

git clone https://github.com/your-username/langchain-experiments.git

2. Create a Python environment

Create an environment with venv or conda, using Python 3.8 or higher (LangChain does not support older versions). Using venv:

cd langchain-experiments
python3 -m venv env
source env/bin/activate

Using conda:

cd langchain-experiments
conda create -n langchain-env python=3.8
conda activate langchain-env

3. Install the required dependencies

pip install -r requirements.txt

4. Set up the keys in a .env file

First, create a .env file in the root directory of the project. Inside the file, add your OpenAI API key:

OPENAI_API_KEY=your_api_key_here

Save the file and close it. In your Python script or Jupyter notebook, load the .env file using the following code:

from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv())

Because the variable uses the name OPENAI_API_KEY, you don't have to store the key in a separate variable and pass it to each function. The OpenAI client library and LangChain's OpenAI integrations look for an environment variable with this exact name and use its value automatically.

When needed, you can access the OPENAI_API_KEY as an environment variable:

import os
api_key = os.environ['OPENAI_API_KEY']
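Reading `os.environ['OPENAI_API_KEY']` directly raises a bare `KeyError` when the variable is missing. A small helper can give a clearer message instead. This is only a sketch: the function name `get_openai_api_key` is hypothetical, and the check for the `your_api_key_here` placeholder assumes the .env template shown above.

```python
import os


def get_openai_api_key(env=os.environ):
    """Return the API key, or raise a readable error if it is missing
    or still set to the placeholder from the .env template."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Add it to the .env file "
            "and call load_dotenv() before using the key."
        )
    return key


# Example with an explicit mapping, so the demo does not depend on your shell:
print(get_openai_api_key({"OPENAI_API_KEY": "sk-example"}))
```

Calling `get_openai_api_key()` with no arguments reads from the real environment, so it should run after `load_dotenv(find_dotenv())`.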

Now your Python environment is set up, and you can proceed with running the experiments.

Datalumina

This document is provided to you by Datalumina. We help data analysts, engineers, and scientists launch and scale a successful freelance business — $100k+ /year, fun projects, happy clients. If you want to learn more about what we do, you can visit our website and subscribe to our newsletter. Feel free to share this document with your data friends and colleagues.

Tutorials

For video tutorials on how to use the LangChain library and run experiments, visit the YouTube channel: youtube.com/@daveebbelaar