Pet project / 2nd Capstone project for DataTalks.Club LLM ZoomCamp'24:
RAG application built on interview preparation questions for Data Engineers and Machine Learning Engineers.
The project can be tested and deployed in a cloud virtual machine (AWS, Azure, GCP), in GitHub CodeSpaces (the easiest option, and free), or even locally with or without a GPU! It works with Ollama and ChatGPT.
For the GitHub CodeSpaces option you don't need anything extra at all - your favorite web browser plus a GitHub account is enough.
- I made an Exam Preparation Assistant as the 1st LLM RAG app project
- Then I successfully passed my Google Cloud Professional Data Engineer exam
- What's the next step? Right - Interview Preparation!
At some point we all want a new job, right? So I decided to make my 2nd LLM project, one that helps Data and ML Engineers prepare for a job interview. Why? Because technical knowledge alone, even validated by professional certification(s), is not enough to get a job. Interview questions usually come from a slightly different angle than certification exams. In addition, interviews include soft skills assessment and behavioral questions.
With the right prompts ChatGPT can be a decent interview assistant. Still, it can hallucinate on specific topics it wasn't trained well on. Or maybe you want to use your local Ollama (like me), but open models know even less?
That's where RAG comes in! RAG stands for Retrieval Augmented Generation - a way of improving the output of a large language model (LLM) by referencing your prepared knowledge base before generating a response. So instead of asking the LLM about interview topics "from scratch", a RAG-based assistant first retrieves context from your knowledge base (like QnA records) and then gives you better focused answers. This is what I use in this project.
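To make the flow concrete, here is a minimal retrieve-then-generate sketch, assuming Elasticsearch as the knowledge base and Ollama's OpenAI-compatible endpoint for generation (index name, model and prompt wording are illustrative, not the app's exact code):

```python
# Minimal RAG loop sketch (illustrative names, not the app's exact code)
from elasticsearch import Elasticsearch
from openai import OpenAI

es = Elasticsearch("http://localhost:9200")
llm = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # Ollama's OpenAI-compatible API

def rag_answer(question: str, index: str = "interview-questions") -> str:
    # 1. Retrieve the most relevant QnA records from the knowledge base
    hits = es.search(index=index, query={"match": {"question": question}}, size=5)
    context = "\n\n".join(hit["_source"]["text"] for hit in hits["hits"]["hits"])

    # 2. Ask the LLM to answer using only that context
    prompt = (
        "You are an interview preparation assistant. "
        "Answer the QUESTION using only the CONTEXT below.\n\n"
        f"CONTEXT:\n{context}\n\nQUESTION: {question}"
    )
    response = llm.chat.completions.create(
        model="phi3.5",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```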
Just imagine, you can 'talk to your data' - amazing! ✨
This is my 2nd LLM project started during LLM ZoomCamp'24.
The LLM Interview Assistant should help users prepare for data/ML job interviews. It should provide a chatbot-like interface for easily finding interview-related information without digging through guides or websites.
As in the 1st project, I'm making it universal enough that the knowledge base can be extended to other professions, of course with some adjustments to the RAG prompts.
Thanks to LLM ZoomCamp for the reason to keep learning new cool tools!
I collected questions and answers from sources like articles on respected websites for Data professionals and transformed them into a dataset. Currently it includes 2 .csv files - one with Data Engineering and one with Machine Learning topics. I plan to add more QnAs that are universal for Data-related positions. The app UI and backend are designed to be extensible.
Structure: id, question, text (=answer), position, section.
Sections are used in search and help to focus on specific topics of the interview.
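For illustration, loading and inspecting one of the knowledge-base CSVs might look like this (the file path is an assumption; the column names match the structure above):

```python
import pandas as pd

# Load one of the knowledge-base CSVs (path is illustrative)
df = pd.read_csv("data/data_engineer_questions.csv")
print(df.columns.tolist())      # ['id', 'question', 'text', 'position', 'section']
print(df["section"].unique())   # sections let the search focus on specific interview topics
```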
- Frontend:
- UI: Streamlit web application for conversational interface
- Monitoring: Grafana
- Backend:
- Python 3.11/3.12
- Docker and docker-compose for containerization
- Elasticsearch to index the interview questions-answers bank
- OpenAI-compatible API, which supports working with Ollama locally, even without a GPU
- Ollama tested with Microsoft Phi 3/3.5 and Alibaba qwen2.5:3b models; they performed better than Google Flan-T5 and Gemma 2
- you can pull and test any model from the Ollama library
- with your own OPENAI_API_KEY you can choose gpt-3.5/gpt-4o
- PostgreSQL to store asked questions, answers, evaluation (relevance) and user feedback
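As a rough illustration of the indexing side, an Elasticsearch mapping for the QnA bank could combine text fields for keyword search with a dense_vector field for semantic search (index name, field names and vector dimensions here are assumptions, not the project's exact schema):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Possible mapping for the QnA bank (names and dims are assumptions)
es.indices.create(
    index="interview-questions",
    mappings={
        "properties": {
            "question": {"type": "text"},
            "text": {"type": "text"},
            "section": {"type": "keyword"},
            "position": {"type": "keyword"},
            "question_text_vector": {"type": "dense_vector", "dims": 384},
        }
    },
)
```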
- Setup environment
- Start the app
- Interact with the app
- Monitoring
- Retrieval evaluation
- Best practices
- Fork this repo on GitHub, or use the `git clone https://github.com/dmytrovoytko/llm-interview-assistant.git` command to clone it locally, then `cd llm-interview-assistant`.
- Create a GitHub CodeSpace from the repo. ‼️ Use the 4-core, 16GB RAM machine type.
- Start the CodeSpace.
- Go to the app directory: `cd interview_assistant`
- The app works in Docker containers; you don't need to install packages locally to test it.
- Only if you want to develop the project locally, run `pip install -r requirements.txt` (the project is tested on Python 3.11/3.12).
- If you want to use the gpt-3.5/gpt-4 API, set your OPENAI_API_KEY in the `.env` file, which contains all configuration settings.
- By default the scripts in the instructions below load the Ollama/phi3.5 model. If you also want to use Ollama/phi3 or Ollama/qwen2.5:3b, uncomment the corresponding line in `ollama_pull.sh`. Similarly, you can load other Ollama models.
- Run `bash deploy.sh` to start all containers, including Elasticsearch, Ollama, PostgreSQL, Streamlit and Grafana. It takes at least a couple of minutes to download/build the corresponding images and get all services ready to serve, so you can make yourself some tea/coffee meanwhile. When new log messages stop appearing, press Enter to return to the command line.
- Run `bash init_db_es.sh` to create the PostgreSQL tables and to ingest and index the question database (a conceptual sketch of what this step does follows this list).
- Run `bash ollama_pull.sh` to pull the phi3/phi3.5 Ollama models. If you want to use other models, modify this script accordingly, then update `app.py` to add your model names.
- Finally, open the Streamlit app: switch to the PORTS tab and click on the link with port 8501 (globe icon).
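For orientation, here is a rough conceptual sketch of what the `init_db_es.sh` step above does: create the PostgreSQL table for conversations/feedback and index the QnA records into Elasticsearch (file, table, column and index names are assumptions, not the script's actual code):

```python
import pandas as pd
import psycopg2
from elasticsearch import Elasticsearch

# 1. PostgreSQL table for conversations, LLM relevance and user feedback
#    (table and column names are assumptions)
conn = psycopg2.connect(host="localhost", dbname="interview_assistant",
                        user="postgres", password="postgres")
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS conversations (
            id         TEXT PRIMARY KEY,
            question   TEXT,
            answer     TEXT,
            model      TEXT,
            relevance  TEXT,
            feedback   INTEGER,
            created_at TIMESTAMPTZ DEFAULT now()
        );
    """)

# 2. Ingest the QnA knowledge base into Elasticsearch
#    (in the real app the vector fields would also be computed with an embedding model)
es = Elasticsearch("http://localhost:9200")
df = pd.read_csv("data/data_engineer_questions.csv")
for _, row in df.iterrows():
    es.index(index="interview-questions", id=row["id"], document=row.to_dict())
```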
- Set query parameters: choose the position, model and search options (search type - text or vector; response length - small, medium or long), then enter your question.
- Press the 'Find the answer' button and wait for the response. For Ollama Phi3/qwen2.5 in a CodeSpace the response time was around a minute.
- RAG evaluation: check the relevance evaluated by an LLM (the default model used for this is defined in the `.env` file).
- Give your feedback by pressing the corresponding number of stars ⭐⭐⭐⭐⭐
  - 1-2 are negative
  - 4-5 are positive
Both types of evaluation are stored in the database and can be monitored.
- The app starts in wide mode by default. You can switch this off in the Streamlit settings (upper right corner).
You can monitor app performance in the Grafana dashboard.
- As with Streamlit, switch to the PORTS tab and click on the link with port 3000 (globe icon). After Grafana loads, use the default credentials:
- Login: "admin"
- Password: "admin"
- Click 'Dashboards' in the left pane and choose 'Interview preparation assistant'.
- Check out app performance
Run `docker compose down` in the command line to stop all services.
Don't forget to remove the downloaded images if you experimented with the project locally! Use `docker images` to list all images and `docker image rm ...` to remove those you don't need anymore.
Notebooks with text-only and vector search retrieval evaluation are in the notebooks directory.
I decided to experiment with different Ollama models to create ground truth data for the Data Engineering QnA. To be honest, the results are rather poor IMHO. I think this is partially because the QnA knowledge base is not as specific as the exam dataset, since interview recommendations are quite vague.
With some models (like Llama3.2) I just couldn't get a response in a JSON-parsable format. Surprisingly, Gemma 2 handled it well enough. The results are in the ground-truth-data.csv file. The generated questions are not relevant enough to really serve as ground truth, IMHO.
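For reference, the ground truth generation roughly follows the retrieve-evaluate pattern from LLM Zoomcamp: ask an LLM to produce several user-style questions per knowledge-base record and parse the JSON it returns (a sketch with an illustrative prompt and model; the actual notebook may differ):

```python
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # local Ollama

def generate_questions(record: dict, model: str = "gemma2") -> list[str]:
    # Ask the model for user-style questions about one QnA record, as a JSON array
    prompt = (
        "You are a candidate preparing for a data engineering interview.\n"
        "Based on the record below, formulate 5 questions a user might ask.\n"
        "Return ONLY a JSON array of strings.\n\n"
        f"question: {record['question']}\nanswer: {record['text']}"
    )
    response = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    raw = response.choices[0].message.content
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # some models wrap the JSON in extra text - exactly the parsing problem described above
        return []
```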
Nevertheless, the purpose is to learn, not to get a perfect result. So I tested MinSearch and Elasticsearch.
MinSearch:
- hit_rate 0.772, MRR 0.661
Elasticsearch:
- text only: hit_rate 0.592, MRR 0.420
- vector_knn: hit_rate 0.705, MRR 0.581
- text_vector_knn: hit_rate 0.725, MRR 0.615
- question_text_vector_knn: hit_rate 0.744, MRR 0.633
- vector_combined_knn: hit_rate 0.747, MRR 0.635
I will continue experimenting with weights and boosting.
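For reference, the hit rate and MRR numbers above are computed in the standard way over the ground-truth queries; a minimal sketch (here `relevance_total` holds, per query, a boolean list marking whether the expected document appeared at each result position):

```python
def hit_rate(relevance_total: list[list[bool]]) -> float:
    # fraction of queries where the expected document appears anywhere in the results
    return sum(any(line) for line in relevance_total) / len(relevance_total)

def mrr(relevance_total: list[list[bool]]) -> float:
    # mean reciprocal rank: 1 / rank of the first correct result, 0 if it never appears
    total = 0.0
    for line in relevance_total:
        for rank, match in enumerate(line):
            if match:
                total += 1 / (rank + 1)
                break
    return total / len(relevance_total)
```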
- Hybrid search: combining both text and vector search (Elasticsearch, encoding) - see the sketch after this list
- User query rewriting
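As a hybrid search sketch: Elasticsearch lets you combine a text match with a knn clause over an embedding field and weight them with boosts (field and index names follow the mapping sketch above, and the sentence-transformers encoder is an assumption):

```python
from elasticsearch import Elasticsearch
from sentence_transformers import SentenceTransformer

es = Elasticsearch("http://localhost:9200")
encoder = SentenceTransformer("multi-qa-MiniLM-L6-cos-v1")  # 384-dim encoder (an assumption)

def hybrid_search(question: str, size: int = 5) -> list[dict]:
    vector = encoder.encode(question).tolist()
    response = es.search(
        index="interview-questions",
        query={"match": {"question": {"query": question, "boost": 0.5}}},  # keyword part
        knn={                                                              # vector part
            "field": "question_text_vector",
            "query_vector": vector,
            "k": size,
            "num_candidates": 100,
            "boost": 0.5,
        },
        size=size,
    )
    return [hit["_source"] for hit in response["hits"]["hits"]]
```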
I plan to:
- add more questions to the knowledge database, add other positions
- test more models (like Llama 3.2)
- fine tune prompts
- experiment with weights and boost to improve retrieval metrics
Stay tuned!
Thank you for your attention and time!
- If you experience any issues while following these instructions (or something is left unclear), please add them to Issues - I'll be glad to help/fix. Your feedback, questions & suggestions are welcome as well!
- Feel free to fork and submit pull requests.
If you find this project helpful, please ⭐️star⭐️ my repo https://github.com/dmytrovoytko/llm-interview-assistant to help other people discover it!
Made with ❤️ in Ukraine 🇺🇦 Dmytro Voytko