CharlotteJackson/DC_Crash_Bot

Write SQL code to extract the audio transcriptions and match them to Pulsepoint dispatches

Closed this issue · 3 comments

What is the Task

Turn the information in the audio transcription json files into a relational table and match the records to Pulsepoint dispatches

Why do we want to do this

so we can classify 911 calls about car crashes as pedestrian or non-pedestrian without manually listening to the scanner audio

How can I get started?

How can we start this task?

Definition of Done

when there's a script running on EC2 every few minutes that loads the jsons and extracts them into a table on the source_data schema, and the Pulsepoint analysis table has a step where incidents are matched to dispatch transcriptions

extracting the audio into a table is done, now need to match it to pulsepoint data with a reasonable degree of accuracy

Notes

  • Need sample recordings and csvs to play with from the same time
  • Might need NLP? string matching, phonetics match?

DONE!!!!