Build a New Tagging App for Collecting Volunteer Readings
High-Level Characterization Warning:
This is a high-level characterization. If you plan to start working on this, please contact @yairl or @yanirmr for further details and coordination.
Is your feature request related to a problem? Please describe.
The current system focuses on transcribing existing audio. A new tagging app is needed to collect recordings of volunteers reading provided texts, which would broaden the variety of texts and speakers in the dataset.
Describe the solution you'd like
Develop a new tagging app, either based on a Telegram bot or a web interface, to collect recordings of volunteers reading texts. The app should handle the following requirements:
1. Text Curation:
- Curate texts with appropriate licenses for volunteer reading.
- Ensure the texts are diverse and suitable for linguistic analysis.
2. Volunteer Interaction:
- Provide an interface for volunteers to receive and read the curated texts.
- Allow volunteers to submit their recordings easily.
3. Data Storage:
- Store the collected recordings in a structured and secure manner.
- Ensure proper metadata tagging (e.g., text ID, volunteer ID, timestamp).
4. Verification and Quality Control:
- Implement a verification process to ensure the accuracy and quality of the recordings.
- Run automated checks, manual checks, or both for audio clarity, correctness of the read text, and proper metadata tagging.
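To make requirements 3 and 4 more concrete, here is a rough Python sketch of what a metadata record and two automated checks might look like. Nothing here is an agreed design: `RecordingMetadata`, `metadata_problems`, `text_match_ratio`, the `duration_seconds` field, and the duration thresholds are all hypothetical names and values chosen for illustration. The text check assumes a transcript of the recording is already available from some speech-to-text step, which is outside this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from difflib import SequenceMatcher


@dataclass(frozen=True)
class RecordingMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    text_id: str
    volunteer_id: str
    timestamp: datetime
    duration_seconds: float  # assumed extra field, not listed in the issue


def metadata_problems(meta, min_seconds=1.0, max_seconds=600.0):
    """Return a list of problems found; an empty list means the record passed.

    The duration bounds are placeholder values, not project requirements.
    """
    problems = []
    if not meta.text_id.strip():
        problems.append("missing text ID")
    if not meta.volunteer_id.strip():
        problems.append("missing volunteer ID")
    if meta.timestamp.tzinfo is None:
        problems.append("timestamp has no timezone")
    if not (min_seconds <= meta.duration_seconds <= max_seconds):
        problems.append("duration out of range")
    return problems


def text_match_ratio(expected_text, transcript):
    """Word-level similarity in [0, 1] between the prompt text and a transcript.

    Normalization here is only lowercasing and whitespace splitting, which is
    an assumption; a real check would likely need language-specific handling.
    """
    expected_words = expected_text.lower().split()
    heard_words = transcript.lower().split()
    return SequenceMatcher(None, expected_words, heard_words).ratio()
```

A possible usage pattern: run `metadata_problems` on every submission, and route recordings whose `text_match_ratio` falls below some agreed threshold to manual review rather than rejecting them outright.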