Background

The dataset in this repository comes from the "Diabetes Stories" project [1]. The original transcripts can be downloaded from the project website. This repository contains the same transcripts, converted from Word format to plain text, with non-ASCII characters replaced by their ASCII equivalents.
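
For readers who want to apply a similar cleanup to other transcripts, the ASCII replacement described above could be sketched roughly as follows. This is not the script used to prepare this repository; the function name to_ascii and the PUNCTUATION table are hypothetical, and the choice of replacements is an assumption.

```python
import unicodedata

# Hypothetical replacement table (an assumption, not the mapping used for
# this repository): punctuation that word processors commonly insert.
PUNCTUATION = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # non-breaking space
}


def to_ascii(text: str) -> str:
    """Return text with non-ASCII characters replaced by ASCII equivalents."""
    # Replace typographic punctuation first, using the table above.
    text = text.translate(str.maketrans(PUNCTUATION))
    # Split accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop whatever still has no ASCII form (e.g. the combining marks).
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

Characters with no obvious ASCII equivalent are silently dropped in this sketch, so it is worth comparing the output against the original for any transcript where that matters.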

In addition, the themes folder contains the coded segments of the transcripts in JSON format. These were painstakingly assembled for easy reference.
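
A minimal sketch for reading the coded segments with Python's standard library is shown here. It assumes the files in the themes folder use a .json extension and makes no assumption about their internal structure; the function name load_theme_files is hypothetical.

```python
import json
from pathlib import Path


def load_theme_files(folder):
    """Return a dict mapping each .json file name in folder to its parsed content."""
    themes = {}
    # Sort the paths so the files are always read in the same order.
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            themes[path.name] = json.load(handle)
    return themes
```

Calling load_theme_files("themes") from the repository root and printing the type of each value is a quick way to see how the segments are organised before writing any analysis code.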

Copyright in the data belongs to the original authors.

Motivation

This dataset was assembled to test tools that could be useful in qualitative research, possibly including LLMs.

[1] https://diabetesmemories.com/