SpeechGram: A C# repository from ndass6

A Microsoft Hololens application that aids users with hearing disabilities have conversations in everyday life. Using Microsoft Azure and the Hololens, our app performs three main functions:

Translates conversation speech into text, which is displayed to the disabled user
Allows the disabled user select one of many pre-defined responses, which is automatically spoken out loud
Identifies a single speaker (in a group of speakers) and tags each chat message with the speaker's name

This application was developed in a 3-week timeframe for the 2017's Microsoft Imagine Cup.

Build and Installation

We have packaged the app into a UWP appx package, but it requires a lot of work in order to sideload onto the HoloLens. We recommend that users just use the latest build found in the Build folder of the latest release and continue on with the instructions. Otherwise, users have two other options:

You can find the UWP appx package in the Speechgram_1.0.1.0_Test folder of the latest release found here. You will need to follow this guide to sideload the package onto the HoloLens
You can download the source code on this repo and follow the Build source code instructions below

Common instructions for building the source code or running the app from the latest build:

Install Visual Studio 2015 Update 3
Ensure that Tools and Windows 10 SDK is selected when installing Visual Studio
Install Unity 5.5.1f1
Insure that the HoloLens is connected to the Internet. This is not needed for building the app, but is needed during application use

Run the App from the Build

Download the latest build release of this app found here
Extract the zip to an empty folder
Open up the solution file (Speechgram.sln in the Build folder) in Microsoft Visual Studio
Connect the HoloLens to your computer. Make sure your laptop is authorized to access your device (you might be prompted to enter a pin on your computer, follow the instructions on the pop-up display)
Select Release, x86, Device for the three dropdowns in the top to deploy to the Microsoft Hololens

If the build fails, make sure to follow the messages in the console to fix the issues. Otherwise, just try building again :)

Build source code

These instructions are not needed if you use the latest build of the app (above).

Clone this repo to an empty folder
Create a folder named Build
Open up the project in Unity
Click on File -> Build & Run
Configure the following build settings:
- SDK = Windows 10
- Target Device = HoloLens
- UWP Build Type = D3D
- Build and Run on = Local Machine
- Check Copy References
- Check Unity C# Projects
Click Build and Run
Follow the instructions above in the Run the App from the Build File from step 3

![Alt text](Build Settings.png?raw=true "Title")

If the instructions above don't work:

Here are general instructions on how to install the tools.
Here are general instructions on how to build and run the project.

Usage Instructions

Before starting, make sure you understand what an Air-tap is.

As soon as it's launched, a chatroom opens up, and the application will start listening via microphone. It will wait for a speaker to start talking, and will record the entire time (max 30 seconds). As soon as the speaker is done talking, speech-to-text and Azure Voice Recognition is started. In about 5 seconds, both processes should finish. The message should appear in the chatroom (with the message tagged with the speaker's id).

A user can respond to messages by going to the quick-response menu. Air-tap anywhere on the chat room and the menu should open up. Some buttons will respond automatically, others are folders that contain similiar responses. The menu is color-coded like the following:

Red: App navigation buttons
Blue: Folders for other reply buttons
Green: Quick-response buttons that are said out loud using text-to-speech

To add a new speaker, have the speaker start talking. While talking, the user should go to the response menu and click on the Register New User button. Wait until the speaker is done talking, then registration should take place automatically.

For now, be wary of the following time limits:

A speaker must talk for at least 3 seconds to be identified
A speaker must talk for at least 10 seconds to be registered
Between speaker transitions, wait for at least 2 seconds

Technologies Used

Microsoft HoloLens: Platform for the application to run on
Unity Game Engine: Used to develop the HoloLens application
Microsoft Azure Cognitive Services Speaker Recognition API: Used to identify a single speaker for each message recorded
Microsoft Azure App Service: A broker-layer to connect to our database from the HoloLens
Microsoft Azure SQL Database: Stores persistent information about speakers, such as names and speaker Ids

Contact

Contact any one of the contributors for additional information!