Transcriber is a web app that uses the Google Speech-to-Text API to transcribe audio files. Transcoding, transcription, and database storage are handled by Cloud Functions and Firebase, while React is used for the web frontend.
- Generate an upload URL:
curl -H "Content-Type: audio/*" \
- Upload an audio file:
curl -X PUT --data-binary @$FILE_PATH \
-H "Content-Type: audio/*" \
- Transcribe:
curl https://<region>-<project-id>
- Export transcriptions:
curl https://<region>-<project-id>
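The upload step can be wrapped in a small shell function. This is a minimal sketch, not part of the project: `upload_audio` is a hypothetical helper name, and the upload URL is assumed to be whatever the "Generate upload url" request returned, passed in unchanged. It reuses the `curl` options from the upload command and adds `--fail` so that HTTP errors produce a non-zero exit status.

```sh
# Hypothetical helper: upload a local audio file to a previously generated
# upload URL. Both arguments are supplied by the caller.
upload_audio() {
  file_path=$1
  upload_url=$2

  if [ ! -f "$file_path" ]; then
    echo "No such file: $file_path" >&2
    return 1
  fi

  # --fail turns HTTP 4xx/5xx responses into a non-zero exit status.
  curl --fail --silent --show-error -X PUT \
    --data-binary "@$file_path" \
    -H "Content-Type: audio/*" \
    "$upload_url"
}
```

For example, `upload_audio "$FILE_PATH" "$UPLOAD_URL"`, where `UPLOAD_URL` holds the URL returned by the previous step.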
- Create a Firebase project
- Turn on the Firestore database and Storage. TODO: how to do this? Prefer CLI or a .sh script. TODO: enable security for generating uploadUrl.
- Copy [.firebaserc_sample] to .firebaserc
- Edit .firebaserc with the name of your Firebase project.
- Install the Firebase CLI:
npm install -g firebase-tools
- Use the default bucket in Firebase Storage, or create a new one. TODO: how to define the default bucket in config.
Set up environment variables with the name of the bucket you just created, along with your Google Analytics account ID:
> firebase functions:config:set \"name-of-bucket" \
- Enable the Google Speech API.
cd functions
firebase functions:config:set
Create a .env file in the test folder with the following attributes:
FIREBASE_UPLOADS_BUCKET = name-of-uploads-bucket
FIREBASE_TRANSCODED_BUCKET = name-of-transcoded-bucket
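Before running the tests, it can help to confirm that the file defines both attributes. The following is a sketch only: `check_env_file` is a hypothetical helper, and the rule it applies (each key followed by `=` and a non-empty value, with optional spaces around `=`) is an assumption about what the tests need.

```sh
# Hypothetical helper: succeed only if the given .env file defines both
# bucket attributes with a non-empty value.
check_env_file() {
  env_file=$1
  missing=0

  for key in FIREBASE_UPLOADS_BUCKET FIREBASE_TRANSCODED_BUCKET; do
    # Allow optional whitespace around "=", as in the sample attributes.
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*[^[:space:]]" "$env_file"; then
      echo "Missing or empty: $key" >&2
      missing=1
    fi
  done

  return "$missing"
}
```

For example, `check_env_file test/.env` exits with status 0 when both attributes are present and 1 otherwise.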
> firebase functions:config:set \
analytics.account_id="UA-XXXXXX-XX" \"name-of-bucket" \
sendgrid.apikey="api key" \"" \"Your name" \
TODO From which directory?
cd functions
npm run deploy
firebase deploy
TODO eg curl https://<region>-<project-id>
TODO Use Swagger curl https://<region>-<project-id>
Exceptions are logged.
- cd1: Language codes
- cd2: Original MIME type
- cd3: Industry NAICS code of audio
- cd4: Interaction type
- cd5: Microphone distance
- cd6: Original media type
- cd7: Recording device name
- cd8: Recording device type
- cm1: Number of audio topic words
- cm2: Number of speech contexts phrases
- cm3: Audio duration
- cm4: Number of words
- cm5: Transcoding duration
- cm6: Transcribing duration
- cm7: Saving duration
- cm8: Process duration (transcoding + transcribing + saving)
- cm9: Confidence
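In Universal Analytics Measurement Protocol hits, custom dimension N and custom metric N are sent as the parameters `cdN` and `cmN`. As a rough sketch, assuming that is how these indexes are used here, a small guard can reject any index outside the ranges listed above before a hit is sent. `ga_check_custom_fields` is a hypothetical name, not a function from the project.

```sh
# Hypothetical guard: succeed only if every argument is a key=value pair
# whose key is a listed custom dimension (cd1-cd8) or metric (cm1-cm9).
ga_check_custom_fields() {
  for pair in "$@"; do
    case $pair in
      cd[1-8]=*|cm[1-9]=*) ;;
      *)
        echo "Not a known custom dimension or metric: $pair" >&2
        return 1
        ;;
    esac
  done
}
```

For example, `ga_check_custom_fields "cd2=audio/mpeg" "cm4=120"` succeeds, while `ga_check_custom_fields "cd9=x"` fails. The values shown are made up for illustration.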
| Category | Action | Label | Value |
| --- | --- | --- | --- |
| transcription | transcoded | transcript id | |
| transcription | transcribed | transcript id | |
| transcription | saved | transcript id | |
| transcription | done | transcript id | audio duration |
| api | authorization | idtoken | token |
| api | authorization | customtoken | token |
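The columns above correspond to the event fields of the Universal Analytics Measurement Protocol (`ec`, `ea`, `el`, `ev`), a protocol Google has since retired along with Universal Analytics. The sketch below shows what one such hit could look like when sent with `curl`. It is an illustration under assumptions, not the project's own code: `ga_send_event`, `GA_TRACKING_ID`, and `GA_CLIENT_ID` are hypothetical names.

```sh
# Hypothetical helper: send one event hit.
# GA_TRACKING_ID (e.g. UA-XXXXXX-XX) and GA_CLIENT_ID are assumed to be set
# in the environment; neither name comes from the project.
ga_send_event() {
  category=$1
  action=$2
  label=${3:-}
  value=${4:-}   # optional; the protocol expects a non-negative integer

  if [ -z "${GA_TRACKING_ID:-}" ] || [ -z "${GA_CLIENT_ID:-}" ]; then
    echo "GA_TRACKING_ID and GA_CLIENT_ID must be set" >&2
    return 1
  fi

  # Rebuild the positional parameters as curl arguments.
  set -- \
    --data-urlencode "v=1" \
    --data-urlencode "tid=$GA_TRACKING_ID" \
    --data-urlencode "cid=$GA_CLIENT_ID" \
    --data-urlencode "t=event" \
    --data-urlencode "ec=$category" \
    --data-urlencode "ea=$action" \
    --data-urlencode "el=$label"
  if [ -n "$value" ]; then
    set -- "$@" --data-urlencode "ev=$value"
  fi

  # --data-urlencode makes curl send the parameters as a POST body.
  curl --silent --fail "$@" "https://www.google-analytics.com/collect"
}
```

For the "done" row, a call might look like `ga_send_event transcription done "$TRANSCRIPT_ID" 42`, where the last argument stands in for the audio duration. Its unit is not stated here, so the number is a placeholder.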
- transcription → transcoding
- transcription → transcribing
- transcription → saving
| Category | Action | Label |
| --- | --- | --- |
| transcript | exported | [type] |
| transcript | deleted | step |
| transcription | done | transcript id |