- Lark API is a speech assessment REST API built with Next.js in TypeScript.
- It provides accuracy scores, speech-to-text transcription, and a projected IELTS pronunciation band.
- It allows English learning apps and websites to assess users’ pronunciation and give real-time feedback on it.
- Lark uses the Wav2Vec2 model from Meta to analyze the speech sample.
- It converts the speech to its phonetic transcription (S2P) using zero-shot cross-lingual recognition.
- After recognizing the phonemes in the speech, Lark compares them with the ideal pronunciation of the transcribed text using the Jaro-Winkler string similarity algorithm.
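To make the comparison step concrete, here is a minimal sketch of Jaro-Winkler similarity in TypeScript. It is not Lark's actual implementation: the function names, the 0.1 prefix scaling factor, and the 4-character prefix cap are the textbook defaults and are assumptions here, as are the sample phoneme strings in the usage note.

```typescript
// Illustrative sketch only; names and defaults are assumptions, not Lark's code.

// Jaro similarity: counts characters that match within a sliding range and
// penalizes matched characters that appear in a different order.
function jaro(a: string, b: string): number {
  if (a === b) return 1;
  if (a.length === 0 || b.length === 0) return 0;

  // Characters only count as matching if they are at most `range` positions apart.
  const range = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1);
  const aMatched = new Array<boolean>(a.length).fill(false);
  const bMatched = new Array<boolean>(b.length).fill(false);

  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const lo = Math.max(0, i - range);
    const hi = Math.min(b.length - 1, i + range);
    for (let j = lo; j <= hi; j++) {
      if (!bMatched[j] && a[i] === b[j]) {
        aMatched[i] = true;
        bMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) return 0;

  // Walk both strings' matched characters in order; each mismatch is half a transposition.
  let halfTranspositions = 0;
  let k = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aMatched[i]) continue;
    while (!bMatched[k]) k++;
    if (a[i] !== b[k]) halfTranspositions++;
    k++;
  }
  const t = halfTranspositions / 2;

  return (matches / a.length + matches / b.length + (matches - t) / matches) / 3;
}

// Jaro-Winkler: boosts the Jaro score for strings that share a common prefix.
function jaroWinkler(a: string, b: string, scaling = 0.1): number {
  const j = jaro(a, b);
  const maxPrefix = Math.min(4, a.length, b.length);
  let prefix = 0;
  while (prefix < maxPrefix && a[prefix] === b[prefix]) prefix++;
  return j + prefix * scaling * (1 - j);
}
```

A call such as `jaroWinkler("θɪŋk", "sɪŋk")` would then score a hypothetical recognized phoneme string against the expected one, returning a value between 0 and 1. Note that this sketch compares UTF-16 code units, which is adequate for most IPA symbols but not for characters outside the Basic Multilingual Plane.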
- The API is written entirely in Next.js using the Pages Router.
- I used next-auth for user authentication via GitHub and for persisting sessions.
- I used Redis to rate-limit the API based on the caller’s IP address.
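As an illustration of IP-based rate limiting with Redis, here is a minimal fixed-window sketch. The source does not say which algorithm or limits Lark uses, so the fixed-window approach, the `checkRateLimit` name, the key format, and the default limit and window are all assumptions. The `CounterStore` interface mirrors the `INCR` and `EXPIRE` commands that Redis clients such as ioredis expose, and `InMemoryStore` is a hypothetical stand-in so the sketch runs without a Redis server.

```typescript
// Illustrative sketch only; algorithm, names, and limits are assumptions.

// The subset of Redis commands the limiter needs (INCR and EXPIRE).
interface CounterStore {
  incr(key: string): Promise<number>;
  expire(key: string, seconds: number): Promise<unknown>;
}

interface RateLimitResult {
  allowed: boolean;
  remaining: number;
}

// Fixed-window limiter: allow at most `limit` calls per IP in each window.
async function checkRateLimit(
  store: CounterStore,
  ip: string,
  limit = 10,
  windowSeconds = 60
): Promise<RateLimitResult> {
  const key = `ratelimit:${ip}`;
  const count = await store.incr(key);
  // The first call in a window creates the key, so start its expiry timer then.
  if (count === 1) {
    await store.expire(key, windowSeconds);
  }
  return { allowed: count <= limit, remaining: Math.max(0, limit - count) };
}

// Hypothetical in-memory stand-in for local experiments; it never expires keys.
class InMemoryStore implements CounterStore {
  private counts = new Map<string, number>();

  async incr(key: string): Promise<number> {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }

  async expire(_key: string, _seconds: number): Promise<unknown> {
    return 1;
  }
}
```

In an API route, a handler would call `checkRateLimit` with the caller's IP and respond with HTTP 429 when `allowed` is false. Because `INCR` and `EXPIRE` are issued separately here, a production version would likely combine them atomically, for example in a Lua script or transaction.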
- The frontend is written with Next.js in TypeScript.
- I opted for TailwindCSS as the CSS framework for this project.
- I used Material UI for the tables and icons.
- I used the Prisma ORM on top of a PlanetScale database, which is a serverless MySQL database.
- Here is the UML diagram for the database: