Playing around with LangChain.
I keep a couple of terminal windows open with standard commands (this all assumes you are in the project folder):
# Follow the application log
less +F data/logs/app.log
# Follow the Ollama server log
less +F ~/.ollama/logs/server.log
To run a script I use the module syntax: `python -m rag.simple_chat`.
- Vector stores are now defined in the config file
- Each configuration entry also contains the model and the directory with the data to ingest (a config sketch follows this list)
- Use a
- Logging goes directly into a file so it doesn't interfere with my dialog in the terminal (a logging sketch follows this list).
- Following this tutorial from LangChain
- Trying to ingest the A12 documentation, ~3'800 markdown docs (an ingestion sketch follows this list).
- Trying to ingest with llama2:latest. Takes a long time:
  - 3456/19666 [30:02<2:23:58, 1.88it/s]
  - Expected time > 3 hours
- With the default embedding model ()
  - 1224/19666 [10:28<2:46:50, 1.84it/s]
  - Expected time > 3 hours
- With embedding model nomic-embed-text
  - 15 minutes! 🥰
- Which model to use for local embeddings with Ollama? In this blog entry (April 8, 2024) they give an overview.
- To save iTerm layouts together with their commands, so I can restore my screen setup: iTomate
- For a later version: Building a Confluence Q&A App with LangChain and ChatGPT. Hint: There is a ConfluenceLoader in LangChain! (A loader sketch follows this list.)
- While looking for a fast embedding model: Local Embedding done right - Medium
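
A minimal sketch of what the config file could look like; the file name, keys, and layout are my assumptions, not the actual format:

```python
# Hypothetical layout of the config file (all names are assumptions):
#
# vectorstores:
#   a12_docs:
#     model: nomic-embed-text        # embedding model served by Ollama
#     data_dir: data/a12/markdown    # directory with the data to ingest
#     persist_dir: data/chroma/a12   # where the vector store is persisted
import yaml

def load_vectorstore_config(path: str, name: str) -> dict:
    """Return the entry for one vector store from the YAML config."""
    with open(path) as f:
        return yaml.safe_load(f)["vectorstores"][name]
```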
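
The file-only logging setup can be as simple as this, assuming the app.log path from the `less +F` command above:

```python
import logging

# Everything goes to the file that `less +F` is following; no StreamHandler,
# so log output never mixes with the interactive dialog in the terminal.
logging.basicConfig(
    filename="data/logs/app.log",
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)
log = logging.getLogger("rag")
log.info("logging goes to the file, not the terminal")
```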
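
For reference, a sketch of roughly what the ingestion step looks like with LangChain, Ollama embeddings, and Chroma; the paths, loader, and chunking parameters are assumptions, not the actual pipeline:

```python
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load the markdown docs (path and glob are assumptions).
docs = DirectoryLoader("data/a12", glob="**/*.md", loader_cls=TextLoader).load()

# Split into chunks; the 19666 in the progress bars above is the chunk count.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# nomic-embed-text was the fast option; llama2:latest took > 3 hours here.
embeddings = OllamaEmbeddings(model="nomic-embed-text")
db = Chroma.from_documents(chunks, embeddings, persist_directory="data/chroma/a12")
```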
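
And since ConfluenceLoader came up: a minimal sketch of how it can be used; URL, credentials, and space key are placeholders, and the exact load() arguments may differ between LangChain versions:

```python
from langchain_community.document_loaders import ConfluenceLoader

# url, username, api_key, and space_key are placeholders.
loader = ConfluenceLoader(
    url="https://example.atlassian.net/wiki",
    username="me@example.com",
    api_key="<api-token>",
)
confluence_docs = loader.load(space_key="A12", limit=50)
```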