NanoChain Library Documentation

NanoChain is a Python library designed to provide a suite of useful tools for text processing, vector indexing, and utilizing OpenAI's GPT-3.5-turbo model to perform various tasks.

Features

GPT API Completion
Summary buffer memory for conversation summarization
Text file and PDF file loader
Text splitting for large texts
Vector indexing using the ChromaDB library
ReActAgent for performing tasks using available tools

Installation

To install:

pip install nanochain

Usage

In order to use the NanoChain library, import it into your Python script as shown below:

import nanochain

# Set up OpenAI Key
export OPENAI_API_KEY='sk-...'
# or
import openai
openai.api_key = "sk-..."

GPT API Completion

The get_API_Response function is used to interact with the GPT-3.5-turbo API. It sends a prompt and returns the response from the model.

Input:

prompt (string): The message for which you want to generate a response.
system_prompt (string, optional): The prompt to use as the system text in the conversation history. Default is "You are a helpful assistant".
stop (string, optional): A sequence of characters at which generation should stop.

Output:

response (string): The generated response from GPT-3.5-Turbo.

Example usage:

response = nanochain.get_API_Response("What is the capital of France?")
print(response)

SummaryBufferMemory

The SummaryBufferMemory class is used to store and summarize a conversation. It can be initialized with an optional word_limit parameter.

When using with GPT, you mainly pass the buffer_messages_string() and summary to GPT as the prompt.

Methods/Attributes:

'buffer_messages_string()' returns a string representation of all messages currently in the buffer.
'summarize_messages()' generates a summary of all messages in the buffer and returns it as a string. The summary is generated using an API call that summarizes conversation history in 150 words or less.
'add_message(role, message)' adds a new message to the buffer. The 'role' parameter is a string representing who sent the message (either "AI" or "human"), and 'message' is a string representing the content of the message.
'delete_last_message()' removes the last message added to the buffer.
'count_words(text)' counts the number of words in a given text string.
all_messages: A list of all messages.
summary: A string containing the summary of the conversation so far.

Example usage:

# Create a new SummaryBufferMemory object with a word limit of 500
sbm = SummaryBufferMemory(word_limit=500)

# Add some messages to the buffer
sbm.add_message("AI", "Hello, how can I help you today?")
sbm.add_message("human", "I'm having trouble with my email.")
sbm.add_message("AI", "What seems to be the problem?")

# Generate a summary of the conversation so far
summary = sbm.summarize_messages()
print(summary)

# Delete the last message added to the buffer
sbm.delete_last_message()

# Add another message
sbm.add_message("human", "I keep getting an error message when I try to send an email.")

# Check the current contents of the buffer
buffer_str = sbm.buffer_messages_string()
print(buffer_str)

File loader

The load_files_from_path function is used to load all the pdf and txt file in a directory or a single pdf/txt file. The function returns a list of tuples (file_name, content_string).

Example usage:

pdf_text = nanochain.load_files_from_path("document.pdf")
print(pdf_text[0][1])

txt_text = nanochain.load_files_from_path("document.txt")
print(txt_text[0][1])

Text splitting

The text_splitter function is used to split a large text into smaller sections, optionally with a specified overlap between sections.

Example usage:

sections = nanochain.text_splitter(text, n_words=500, overlap=50)
print(sections)

Vector Database

The vectorIndex class is used for indexing and querying text documents using the ChromaDB library.

Methods/Attributes:

vectorIndex(documents): Initializes a new vector index object with the given documents. The documents parameter is a list of strings, where each string represents a document to be added to the index.
add_documents(documents): Adds new documents to the index. The documents parameter is a list of strings, where each string represents a document to be added.
query_documents(query_string, n_results=3): Executes a query against the index and returns the top n matching results as a dictionary. The query_string parameter is a string representing the query to execute, and n_results is an optional integer that specifies the maximum number of results to return (default value is 3).
delete_documents(ids): Deletes the documents with the specified ids from the index. The ids parameter is a list of strings, where each string represents the id of a document to be deleted.
persist(): saves the database to a local directory.

Example usage:

vector_index = nanochain.vectorIndex(documents=["Document 1", "Document 2"])
vector_index.add_documents(documents=["Document 3", "Document 4"])
results = vector_index.query_documents(query_string="Search query")
print(results)

ReActAgent

The ReActAgent class is designed to perform tasks using available tools. You can add or remove tools, and run the agent with a given task string.

Methods/Attributes:

ReActAgent(verbose=False, max_steps=10): Initializes a new ReActAgent object. The verbose parameter is an optional boolean that determines whether to print out the thought/action/action input/observation for each step (default is False). The max_steps parameter is an optional integer that specifies the maximum number of steps allowed before the agent stops (default is 10).
add_tool(tool): Adds a new tool to the list of available tools. The tool parameter is a tuple with three elements: a string representing the name of the tool, a string representing the description of the tool, and a callable function that represents the implementation of the tool.
remove_tool(tool_name): Removes a tool from the list of available tools by its name.
get_action(task_string): Generates a string representation of the next action to take based on the current task. The task_string parameter is a string representing the input task/question.
parse_and_execute(action_string): Parses an action string into its corresponding thought, action, action input, and observation, and executes the appropriate tool function. The action_string parameter is a string representing the action to take.
run(task_string): Executes the entire ReAct process, taking steps until a final answer is generated or the maximum number of steps is reached. The task_string parameter is a string representing the input task/question.

Example usage:

agent = nanochain.ReActAgent(verbose=True, max_steps=10)
agent.add_tool(("New tool", "New tool description", function_name))
agent.remove_tool("New tool")

task_string = "Find the capital of France."
result = agent.run(task_string)
print(result)

Recursive Summary

The get_recursive_summary function is used to generate a summary of a text using a recursive summarization algorithm.

Example usage:

summary = nanochain.get_recursive_summary(text)

Troyanovsky/nano_chain

NanoChain Library Documentation

Features

Installation

Usage

GPT API Completion

SummaryBufferMemory

File loader

Text splitting

Vector Database

ReActAgent

Recursive Summary