This tool helps you carry out qualitative data analysis. 🧐 For instance, it can take your transcripts and generate codes and themes for you 💡. It also summarizes your data and helps you draw insights from it. 📊
The tool uses Large Language Models (LLMs) to generate summaries, codes, and themes, and to classify them. 🤖 The application is developed in Python.
This tool is powered by the following libraries:
- Streamlit: For the User Interface 🖥️
- LangChain: For building LLM applications 🔗
- OpenAI: The LLM provider. For now, this is the only provider integrated.
You need to have Python 3 installed on your computer; the latest version is recommended 🐍. The tool was tested with Python 3.8.10.
Rename the .streamlit/secrets_template.toml file to .streamlit/secrets.toml and edit it to add your own configuration for LangChain, LangSmith, and your OpenAI API key.
Clone the repository, install the dependencies, and start the application:
git clone
cd qualitative-data-analysis
pip install -r requirements.txt
streamlit run source/qualitative_analyse_agent.py
The usage is pretty simple. 🤓
- Upload your transcripts: You can upload your transcripts from the sidebar 📂.
- Generate transcript summaries: In the Raw data section, you can generate a summary of each transcript individually.
- Enter your research question: It will be used to generate codes and themes. ❓
- Generate codes and themes: Click the button to generate codes and themes based on your research question. 💡
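To illustrate how a research question can steer the generation of codes and themes, here is a minimal sketch that combines the question and the uploaded transcripts into a single prompt. The function name `build_coding_prompt` and the prompt wording are hypothetical and are not taken from this application; the actual prompt may differ.

```python
from typing import List


def build_coding_prompt(research_question: str, transcripts: List[str]) -> str:
    """Combine a research question and transcripts into one prompt (hypothetical wording)."""
    if not research_question.strip():
        raise ValueError("A research question is required to generate codes and themes.")
    # Number each transcript so the model can refer back to its source.
    numbered = "\n\n".join(
        f"Transcript {i}:\n{text.strip()}" for i, text in enumerate(transcripts, start=1)
    )
    return (
        "You are assisting with a qualitative data analysis.\n"
        f"Research question: {research_question.strip()}\n\n"
        f"{numbered}\n\n"
        "List the codes that answer the research question, then group them into themes."
    )


prompt = build_coding_prompt(
    "How do participants describe remote work?",
    ["I enjoy the flexibility.", "I miss my colleagues."],
)
print(prompt)
```

The resulting string would then be sent to the LLM; the sketch stops before that step so it runs without an API key.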
You can use LangSmith to monitor your application and get insights into how it is used. 📊
Edit the .streamlit/secrets.toml file and add the following lines:
[langsmith]
tracing = true
api_url = "https://api.smith.langchain.com"
api_key = "your key here"
project = "your project here"
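As a minimal sketch, assuming the application forwards the [langsmith] section to LangChain through its standard tracing environment variables, the mapping could look like the following. The function name `apply_langsmith_settings` is hypothetical, and a plain dict stands in for the section that Streamlit would expose as `st.secrets["langsmith"]`.

```python
import os
from typing import Mapping


def apply_langsmith_settings(settings: Mapping[str, object]) -> None:
    """Export the [langsmith] settings as the environment variables LangChain reads."""
    tracing = bool(settings.get("tracing", False))
    os.environ["LANGCHAIN_TRACING_V2"] = "true" if tracing else "false"
    if not tracing:
        # Nothing else is needed when tracing is disabled.
        return
    os.environ["LANGCHAIN_ENDPOINT"] = str(settings["api_url"])
    os.environ["LANGCHAIN_API_KEY"] = str(settings["api_key"])
    os.environ["LANGCHAIN_PROJECT"] = str(settings["project"])


# Assumption: in the app, this mapping would come from st.secrets["langsmith"].
example_settings = {
    "tracing": True,
    "api_url": "https://api.smith.langchain.com",
    "api_key": "your key here",
    "project": "your project here",
}
apply_langsmith_settings(example_settings)
print(os.environ["LANGCHAIN_PROJECT"])
```

Setting these variables before any chain runs is what lets LangChain send traces to the configured project.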
TODO: Show a diagram of the interactions between the libraries
TODO: Show the LLM chaining
- Upload your transcripts 📂
- Generate a summary of all the data or of a specific transcript 📊
- Based on a research question, generate a summary of the data along with codes and themes. ❓
- Update Qualitative Analysis Data parameters 🔄
- Generate a Qualitative Data Analysis report and download it 📄
- For now, the tool cannot perform a Qualitative Data Analysis on large datasets, because the LLM used is limited to 16,000 tokens. 🚫
- Neither the data nor the report is cached, so if you reload the page you will have to upload the data and regenerate the report. 🔄
- Upload voice recordings, convert them to text, and perform a Qualitative Data Analysis on them 🗣️
- Connect to Qualitative Data software 🤝
- (double check) Run intermediate checks on the results to avoid LLM bias 🤔
- Perform a map-refine summary of the data
- Handle large transcripts
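To make the last two items concrete, here is a minimal sketch of how a transcript that exceeds the 16,000-token limit could be split into chunks and summarized step by step, in the spirit of a "refine" summarization pattern. This is not implemented in the tool. The helper names, the 4-characters-per-token estimate, and the `fake_summarize` stand-in for an LLM call are all assumptions made for illustration.

```python
from typing import Callable, List

TOKEN_LIMIT = 16000   # limit mentioned above
CHARS_PER_TOKEN = 4   # assumption: rough heuristic for English text


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count from the character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def split_into_chunks(text: str, max_tokens: int) -> List[str]:
    """Group paragraphs into chunks whose estimated size stays under max_tokens.

    A single paragraph larger than max_tokens is kept whole in this sketch.
    """
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and estimate_tokens(candidate) > max_tokens:
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def refine_summary(
    text: str,
    summarize: Callable[[str, str], str],
    max_tokens: int = TOKEN_LIMIT // 2,
) -> str:
    """Feed chunks one by one, letting each call refine the previous summary."""
    summary = ""
    for chunk in split_into_chunks(text, max_tokens):
        summary = summarize(summary, chunk)
    return summary


def fake_summarize(previous: str, chunk: str) -> str:
    """Stand-in for an LLM call: keep the first sentence of each chunk."""
    first_sentence = chunk.split(".")[0].strip() + "."
    return f"{previous} {first_sentence}".strip()


transcript = "\n\n".join(
    f"Speaker {i} talks about topic {i}. More detail follows." for i in range(1, 4)
)
print(refine_summary(transcript, fake_summarize, max_tokens=15))
```

In a real integration, `summarize` would call the LLM with the previous summary and the new chunk. The default budget of half the limit leaves room for the prompt and the running summary, which is a design guess rather than a measured value.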
My name is Valentin Rudloff and I'm an engineer. I make stuff in various fields. 👨‍🔧 For her thesis (mémoire), my wife needed a tool to help her perform a Qualitative Data Analysis on transcripts she had collected. 📚 LLMs are really good at understanding human semantics, which makes them well suited to Qualitative Data Analysis. 🧠 This application gave her almost instant results, and I'm pretty sure it can help you as well. 👍
The application and the LLM prompt are greatly inspired by Dr. Philip Adu's video, Master Qualitative Data Analysis with ChatGPT: An 18-Minute Guide.
Made with ❤️ by Valentin Rudloff. If you want to help me create other stuff like this, you can buy me a ☕.