NanoChain is a Python library designed to provide a suite of useful tools for text processing, vector indexing, and utilizing OpenAI's GPT-3.5-turbo model to perform various tasks.
- GPT API Completion
- Summary buffer memory for conversation summarization
- Text file and PDF file loader
- Text splitting for large texts
- Vector indexing using the ChromaDB library
- ReActAgent for performing tasks using available tools
To install:
pip install nanochain
In order to use the NanoChain library, import it into your Python script as shown below:
import nanochain
# Set up OpenAI Key
export OPENAI_API_KEY='sk-...'
# or
import openai
openai.api_key = "sk-..."
The get_API_Response function is used to interact with the GPT-3.5-turbo API. It sends a prompt and returns the response from the model.
Input:
- prompt (string): The message for which you want to generate a response.
- system_prompt (string, optional): The prompt to use as the system text in the conversation history. Default is "You are a helpful assistant".
- stop (string, optional): A sequence of characters at which generation should stop.
Output:
- response (string): The generated response from GPT-3.5-Turbo.
Example usage:
response = nanochain.get_API_Response("What is the capital of France?")
print(response)
The SummaryBufferMemory class is used to store and summarize a conversation. It can be initialized with an optional word_limit parameter.
When using with GPT, you mainly pass the buffer_messages_string() and summary to GPT as the prompt.
Methods/Attributes:
- 'buffer_messages_string()' returns a string representation of all messages currently in the buffer.
- 'summarize_messages()' generates a summary of all messages in the buffer and returns it as a string. The summary is generated using an API call that summarizes conversation history in 150 words or less.
- 'add_message(role, message)' adds a new message to the buffer. The 'role' parameter is a string representing who sent the message (either "AI" or "human"), and 'message' is a string representing the content of the message.
- 'delete_last_message()' removes the last message added to the buffer.
- 'count_words(text)' counts the number of words in a given text string.
- all_messages: A list of all messages.
- summary: A string containing the summary of the conversation so far.
Example usage:
# Create a new SummaryBufferMemory object with a word limit of 500
sbm = SummaryBufferMemory(word_limit=500)
# Add some messages to the buffer
sbm.add_message("AI", "Hello, how can I help you today?")
sbm.add_message("human", "I'm having trouble with my email.")
sbm.add_message("AI", "What seems to be the problem?")
# Generate a summary of the conversation so far
summary = sbm.summarize_messages()
print(summary)
# Delete the last message added to the buffer
sbm.delete_last_message()
# Add another message
sbm.add_message("human", "I keep getting an error message when I try to send an email.")
# Check the current contents of the buffer
buffer_str = sbm.buffer_messages_string()
print(buffer_str)
The load_files_from_path function is used to load all the pdf and txt file in a directory or a single pdf/txt file. The function returns a list of tuples (file_name, content_string).
Example usage:
pdf_text = nanochain.load_files_from_path("document.pdf")
print(pdf_text[0][1])
txt_text = nanochain.load_files_from_path("document.txt")
print(txt_text[0][1])
The text_splitter function is used to split a large text into smaller sections, optionally with a specified overlap between sections.
Example usage:
sections = nanochain.text_splitter(text, n_words=500, overlap=50)
print(sections)
The vectorIndex class is used for indexing and querying text documents using the ChromaDB library.
Methods/Attributes:
- vectorIndex(documents): Initializes a new vector index object with the given documents. The documents parameter is a list of strings, where each string represents a document to be added to the index.
- add_documents(documents): Adds new documents to the index. The documents parameter is a list of strings, where each string represents a document to be added.
- query_documents(query_string, n_results=3): Executes a query against the index and returns the top n matching results as a dictionary. The query_string parameter is a string representing the query to execute, and n_results is an optional integer that specifies the maximum number of results to return (default value is 3).
- delete_documents(ids): Deletes the documents with the specified ids from the index. The ids parameter is a list of strings, where each string represents the id of a document to be deleted.
- persist(): saves the database to a local directory.
Example usage:
vector_index = nanochain.vectorIndex(documents=["Document 1", "Document 2"])
vector_index.add_documents(documents=["Document 3", "Document 4"])
results = vector_index.query_documents(query_string="Search query")
print(results)
The ReActAgent class is designed to perform tasks using available tools. You can add or remove tools, and run the agent with a given task string.
Methods/Attributes:
- ReActAgent(verbose=False, max_steps=10): Initializes a new ReActAgent object. The verbose parameter is an optional boolean that determines whether to print out the thought/action/action input/observation for each step (default is False). The max_steps parameter is an optional integer that specifies the maximum number of steps allowed before the agent stops (default is 10).
- add_tool(tool): Adds a new tool to the list of available tools. The tool parameter is a tuple with three elements: a string representing the name of the tool, a string representing the description of the tool, and a callable function that represents the implementation of the tool.
- remove_tool(tool_name): Removes a tool from the list of available tools by its name.
- get_action(task_string): Generates a string representation of the next action to take based on the current task. The task_string parameter is a string representing the input task/question.
- parse_and_execute(action_string): Parses an action string into its corresponding thought, action, action input, and observation, and executes the appropriate tool function. The action_string parameter is a string representing the action to take.
- run(task_string): Executes the entire ReAct process, taking steps until a final answer is generated or the maximum number of steps is reached. The task_string parameter is a string representing the input task/question.
Example usage:
agent = nanochain.ReActAgent(verbose=True, max_steps=10)
agent.add_tool(("New tool", "New tool description", function_name))
agent.remove_tool("New tool")
task_string = "Find the capital of France."
result = agent.run(task_string)
print(result)
The get_recursive_summary function is used to generate a summary of a text using a recursive summarization algorithm.
Example usage:
summary = nanochain.get_recursive_summary(text)