OoriData/OgbujiPT

Get a proper understanding of all the langchain bits we're using, and remove it as a requirement

Closed this issue · 5 comments

Trying to peel back some layers of abstraction, especially langchain.chains.question_answering.load_qa_chain, and possibly simplify the chat-your-docs pattern. It feels pretty straightforward, so less magic would be good.

Prepared this code for the investigation (qa_chain.py):

qa_chain.py
'''
Basically a debugger harness for a popular langchain.chains.question_answering.load_qa_chain
use-case
See: https://github.com/uogbuji/OgbujiPT/issues/11
'''

import sys
import pprint

from PyPDF2 import PdfReader
from langchain.vectorstores import Qdrant
from langchain.embeddings import SentenceTransformerEmbeddings

import langchain
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains.question_answering import load_qa_chain
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain import OpenAI  # Using the API, though on a self-hosted LLM

from ogbujipt.config import openai_emulation

langchain.verbose = False

# Replace with your LLM API hostname & port
HOST = 'http://s'
PORT = '8000'

QA_CHAIN_TYPE = 'stuff'
PDF_USER_QUESTION_PROMPT = 'Ask a question about your PDF:'
N_CTX = 2048
EMBED_CHUNK_SIZE = 500
EMBED_CHUNK_OVERLAP = 100

# https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
DOC_EMBEDDINGS_LLM = 'all-MiniLM-L6-v2'


def main():
  pdf_reader = PdfReader(sys.argv[1])

  # Collect text from pdf
  text = ''.join((page.extract_text() for page in pdf_reader.pages))

  # Split the text into chunks
  text_splitter = CharacterTextSplitter(
      separator='\n',
      chunk_size=EMBED_CHUNK_SIZE,
      chunk_overlap=EMBED_CHUNK_OVERLAP,
      length_function=len
  )
  chunks = text_splitter.split_text(text)

  # Embedding model will be downloaded from HuggingFace automatically
  embeddings = SentenceTransformerEmbeddings(model_name=DOC_EMBEDDINGS_LLM)

  # Create in-memory Qdrant instance for the embeddings
  knowledge_base = Qdrant.from_texts(
      chunks,
      embeddings,
      location=':memory:',
      collection_name='doc_chunks',
  )

  user_q = input(PDF_USER_QUESTION_PROMPT)

  # Return the "k" chunks most relevant to "user_q" as "docs"
  docs = knowledge_base.similarity_search(user_q, k=4)

  print('Top K docs')
  pprint.pprint(docs)

  callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
  openai_emulation(host=HOST, port=PORT)
  llm = OpenAI(
      temperature=0.1, callback_manager=callback_manager,
      verbose=True)
  chain = load_qa_chain(llm, chain_type=QA_CHAIN_TYPE)

  # Calculating prompt length (takes time and can optionally be removed)
  prompt_len = chain.prompt_length(docs=docs, question=user_q)
  print(f'Prompt len: {prompt_len}')

  resp = chain.run(input_documents=docs, question=user_q)
  print('LLM Response:', resp)


if __name__ == "__main__":
  main()

PDF doc I used: http://www.hasbro.com/common/instruct/monins.pdf

User question was:

What happens when you pass go?

Ask a question about your PDF:What happens when you pass go?    
Top K docs
[Document(page_content='After you have completed your play, the turn passes to the left. The\ntokens remain on the spaces occupied and proceed from that point on\nthe player’s next turn. Two or more tokens may rest on the same space\nat the same time.\nAccording to the space your token reaches, you may be entitled to\nbuy real estate or other properties — or obliged to pay rent, pay taxes,\ndraw a Chance or Community Chest card,  “Go to Jail  ® ,” etc.\nIf you throw doubles, you move your token as usual, the sum of the', metadata={}),
 Document(page_content='The Bank never “goes broke.” If the Bank runs out of money, the\nBanker may issue as much more as may be needed by writing on any\nordinary paper.THE PLAY… Starting with the Banker, each player in turn throws the\ndice. The player with the highest total starts the play: Place your token\non the corner marked “GO,” throw the dice and move your token in\nthe direction of the arrow the number of spaces indicated by the dice.\nAfter you have completed your play, the turn passes to the left. The', metadata={}),
 Document(page_content='move ahead in the usual manner on your next turn.\nYou get out of Jail by…(1) throwing doubles on any of your next\nthree turns; if you succeed in doing this you immediately move\nforward the number of spaces shown by your doubles throw; even\nthough you had thrown doubles, you do not take another turn;\n(2) using the “Get Out of Jail Free” card if you have it; (3) purchasing\nthe “Get Out of Jail Free” card from another player and playing it;', metadata={}),
 Document(page_content='(3) you throw doubles three times in succession.\nWhen you are sent to Jail you cannot collect your $200 salary in that\nmove since, regardless of where your token is on the board, you must\nmove it directly into Jail. Yours turn ends when you are sent to Jail.\nIf you are not “sent” to Jail but in the ordinary course of play land\non that space, you are “Just Visiting,” you incur no penalty, and you\nmove ahead in the usual manner on your next turn.', metadata={})]

Looks like chains.llm.LLMChain.generate is the place to set breakpoints, after all the other towering stacks of setup.

When it finally runs (return self.llm.generate_prompt(…)), the value of prompts is:

Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

After you have completed your play, the turn passes to the left. The
tokens remain on the spaces occupied and proceed from that point on
the player’s next turn. Two or more tokens may rest on the same space
at the same time.
According to the space your token reaches, you may be entitled to
buy real estate or other properties — or obliged to pay rent, pay taxes,
draw a Chance or Community Chest card,  “Go to Jail  ® ,” etc.
If you throw doubles, you move your token as usual, the sum of the

The Bank never “goes broke.” If the Bank runs out of money, the
Banker may issue as much more as may be needed by writing on any
ordinary paper.THE PLAY… Starting with the Banker, each player in turn throws the
dice. The player with the highest total starts the play: Place your token
on the corner marked “GO,” throw the dice and move your token in
the direction of the arrow the number of spaces indicated by the dice.
After you have completed your play, the turn passes to the left. The

move ahead in the usual manner on your next turn.
You get out of Jail by…(1) throwing doubles on any of your next
three turns; if you succeed in doing this you immediately move
forward the number of spaces shown by your doubles throw; even
though you had thrown doubles, you do not take another turn;
(2) using the “Get Out of Jail Free” card if you have it; (3) purchasing
the “Get Out of Jail Free” card from another player and playing it;

(3) you throw doubles three times in succession.
When you are sent to Jail you cannot collect your $200 salary in that
move since, regardless of where your token is on the board, you must
move it directly into Jail. Yours turn ends when you are sent to Jail.
If you are not “sent” to Jail but in the ordinary course of play land
on that space, you are “Just Visiting,” you incur no penalty, and you
move ahead in the usual manner on your next turn.

Question: What happens when you pass go?
Helpful Answer:

The return value is:

LLMResult(generations=[[Generation(text=' When you pass Go, you collect $200 from the Bank if it is your first time passing that space or you can choose to take a Chance card if you landed on that space after rolling doubles.', generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {'prompt_tokens': 543, 'completion_tokens': 44, 'total_tokens': 587}, 'model_name': 'text-davinci-003'}, run=RunInfo(run_id=UUID('65f5f0ee-0b83-4760-a952-1527ceec909b')))

After sleeping on this, I'm going to whip up a function, ogbujipt.prompting.basic_context_build, which does the same manipulations langchain is doing, but more transparently.
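Judging from the prompt captured in the debugger above, the 'stuff' chain seems to boil down to joining the retrieved chunks with blank lines and wrapping them in a fixed preamble and question suffix. A minimal sketch of that, assuming nothing beyond what the captured prompt shows. The name build_stuff_prompt is hypothetical and is not the actual ogbujipt.prompting.basic_context_build:

```python
# Hypothetical helper for illustration; not the actual basic_context_build.
# The preamble text is copied from the prompt captured in the debugger.
PREAMBLE = (
    "Use the following pieces of context to answer the question at the end. "
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer."
)


def build_stuff_prompt(chunks, question, preamble=PREAMBLE):
    '''Join context chunks with blank lines, then append the question.'''
    context = '\n\n'.join(chunk.strip() for chunk in chunks)
    return f'{preamble}\n\n{context}\n\nQuestion: {question}\nHelpful Answer:'


# Two shortened chunks standing in for the top-k search results
chunks = [
    'The Bank never "goes broke."',
    'When you are sent to Jail you cannot collect your $200 salary in that move.',
]
prompt = build_stuff_prompt(chunks, 'What happens when you pass go?')
print(prompt)
```

Keeping the preamble as a parameter rather than a hard-coded string would leave room for other languages or model conventions.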

One important consideration here is g11n (globalization), which LC doesn't seem to have in mind. Really, we need a separate ticket for that.

Checked in a demo/example of how you might use the new function, which also demonstrates how we can simplify the model_styles package (and maybe even eliminate it altogether). The vicuna_delimiters structure is temporarily thrown into the demo, but would instead be importable from ogbujipt.prompting. Maybe we define an ogbujipt.prompting.model_styles.py and have the various delimiter convention structures there.
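To make the idea of a delimiter convention structure concrete, here is a rough sketch of what one might look like as a plain dict. The keys, the VICUNA_STYLE name, and format_query are all hypothetical, and the USER:/ASSISTANT: markers are assumed from the commonly seen Vicuna chat convention; the real vicuna_delimiters in the demo may be shaped differently:

```python
# Hypothetical delimiter structure; the real vicuna_delimiters may differ.
VICUNA_STYLE = {
    'preamble': 'A chat between a curious user and an artificial intelligence assistant.',
    'pre_query': 'USER: ',
    'post_query': '\nASSISTANT:',
}


def format_query(query, delimiters, context=''):
    '''Wrap a query (and optional context) in a model's delimiter convention.'''
    parts = [delimiters.get('preamble', '')]
    if context:
        parts.append(context)
    parts.append(
        delimiters.get('pre_query', '') + query + delimiters.get('post_query', '')
    )
    # Skip empty pieces so a bare dict yields just the query
    return '\n\n'.join(p for p in parts if p)


print(format_query('What happens when you pass go?', VICUNA_STYLE))
```

With structures like this, supporting another model style would mostly mean adding another dict rather than another module.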

When using the OpenAI library directly there seems to be a situation where multiprocessing loses API parameters set in the main process. Something in the pickling or forking process seems to scramble things up, so that if you told it to use an OpenAI emulation LLM host, it forgets that and tries to phone corporate home. Implemented ogbujipt.async_helper.schedule_llm_call and openai_api_surrogate, which allow you to send the API parameters along to the new process, where they can be re-applied.
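The general pattern, as I understand it, is to stop relying on module-level state surviving the trip into the worker, and instead pass the settings as ordinary arguments to be re-applied on the other side. A minimal sketch under that assumption: API_SETTINGS, call_with_settings and schedule_call are hypothetical stand-ins, not the actual schedule_llm_call or openai_api_surrogate code, and no real LLM call is made:

```python
from concurrent.futures import ProcessPoolExecutor

# Stand-in for a client library's module-level settings; hypothetical
API_SETTINGS = {'api_base': 'https://default.invalid/v1', 'api_key': ''}


def call_with_settings(settings, prompt):
    '''Runs in the worker: re-apply the settings first, then do the call.'''
    API_SETTINGS.update(settings)
    # A real worker would invoke the LLM client here;
    # this one just reports which host it would have used
    return f"{API_SETTINGS['api_base']} <- {prompt}"


def schedule_call(executor, settings, prompt):
    '''Ship the settings along with the work item, not via global state.'''
    return executor.submit(call_with_settings, dict(settings), prompt)
```

From a script, this might be driven under an if __name__ == '__main__': guard with something like schedule_call(ProcessPoolExecutor(), {'api_base': 'http://localhost:8000/v1'}, 'Hello').result(). Since the settings travel as a plain dict, they pickle cleanly whichever process start method is in use.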

Misclassified 075f197. Should go here. "Tweak to the case where the final split is empty. Reinstate test case."

Confirmation via:

pytest -k "test_split_poem" test/test_text_splitter.py -svv