Implement a chatbot augmented with customized database.
pip install -r requirements.txt
docker build -t llama-index-app .
docker run --rm -it llama-index-app:latest /bin/bash
- Modify
.env_example
by adding yourOPENAI_API_KEY
,ANTHROPIC_API_KEY
andLLAMA_CLOUD_API_KEY
. Change the file name to.env
- Add PDF files to ./data folder (feel free to keep/delete the existing files)
- Run base parser using
python cli.py base_parse data/[file_1.pdf] data/[file_n.pdf]
- Run llama parser using
python cli.py llama_parse data/[file_1.pdf] data/[file_n.pdf]
- We trace and evaluate the RAG application by using the Phoenix framework.
- Use URL
http://localhost:6006/
to get a decomposed view of the query engine operation. - Evaluation Standard: Retrieval quality; Generation quality; Faithfulness; Efficiency; Robustness
- Phoenix as the evaluation framework: use Chat GPT4 as a reference to decide on the correctness of RAG's answer
- How it works
- Intermediate result is also stored in folder
intermediate result
.
- File Loader
SimpleDirectoryReader
: able to adapt to different file types and select tools to extract the content based on the file extensions. For binary files such as PDF and Offices docs, it uses PyPDF and Pillow.
- Transformer
TokenTextSplitter
: Split the Document text into chunks that contain whole sentences.SummaryExtractor
: Generate Summaries of the text contained by the node. (claude-3-opus-20240229 is slow in this task)OpenAIEmbedding
: deault model trained to produce embeddings that effectively capture semantic meanings of the text.
- Build Index
VectorStoreIndex
: The workhorse in most RAG applications. Used for efficient querying because it allows for similarity searches over the embedded representations of the text.
- Similarity Search: When a query is made, the query text will be embedded and compared against the stored vectors using a similarity measure identified with cosine similarity. Cosine similarity measures the cosine of the angle between two vectors. The smaller the angle between them, the more similar they are.
- Local embedding: a well-balanced default model provided by Hugging Face.
TreeIndex
: Generated bt LLM. Useful for summarization. (not used in base parser)
- Store Index (TO-DO)
- Local store: persist and load from disk.
- Vector Database: Picone, CheomaDB, etc.
- Build QueryEngine QueryEngine is an interface that process natural language queries to generate rich responses. It often relies on one or more indexes through retrievers and can also be combined with other query engines for enhanced capabilities.
-
high-level API
-
low-level API typical steps in query process: retrieval, postprocessing, and response synthesis
- Retrievers: Browse an index and select the relevant nodes to build the necessary context. A retriever will return a
NodeWithScore
object - a structure that combines a node with an associated score. The retrival modes that can be selected is usually dependent on the index. In the case ofVectorStoreIndex
, we can choose retriever fromVectorIndexRetriever
andVectorIndexAutoRetriever
VectorIndexRetriever
: Default retriever that's used byVectorStoreIndex
. This retriever is able to narrow down the search scope of the retriever, sets query mode of the vector store, and return the nodes withtop_k
similarity. - Postprocessor: Adjust the retrieved context before that context gets injected into a prompt and sent to LLM for response synthesis. They operate by either filtering, transforming, or re-ranking nodes.p
- Synthesizer:
compact
mode: Similar torefine
mode, but with reduced number of required LLM queries.
- Retrievers: Browse an index and select the relevant nodes to build the necessary context. A retriever will return a
-
Q1 What is the formula used in basic building block The formula used in the basic building block involves a series of fully connected layers with RELU non-linearities, followed by linear projection layers. The operation of the first part of the block is described by the following equations:
- ( h'{1} = \text{RELU}(W'{1}x' + b'_{1}) )
- ( h'{2} = \text{RELU}(W'{2}h'{1} + b'{2}) )
- ( h'{3} = \text{RELU}(W'{3}h'{2} + b'{3}) )
- ( h'{4} = \text{RELU}(W'{4}h'{3} + b'{4}) )
- ( b = W'b(h'_{4}) )
- ( \theta_{f} = W'f(h'_{4}) )
These equations describe the transformations applied to the input ( x' ) through a series of layers to produce the final outputs.
-
Q2 what is the performance using Statistical method on M4 test sets?
The performance using the statistical method on the M4 test sets is summarized as follows:-
Yearly:
- OWA (Overall Weighted Average): 0.788
- M4 Rank: 8
-
Quarterly:
- OWA: 0.898
- M4 Rank: 8
-
Monthly:
- OWA: 0.905
- M4 Rank: 8
-
Others:
- OWA: 0.989
- M4 Rank: 8
-
Average:
- OWA: 0.861
These metrics indicate the performance of the statistical method across different frequencies of the M4 test sets.
-
-
Q3 How many Yearly/6 time series of Finance type are in the M4 dataset? There are 6,519 Yearly/6 time series of the Finance type in the M4 dataset.
-
Q4 What are the possible topologies for N-BEATS? The possible topologies for N-BEATS are:
- N-BEATS-DRESS
- PARALLEL
- NO-RESIDUAL
- LAST-FORWARD
- NO-RESIDUAL-LAST-FORWARD