This sample demonstrates how to transcribe/live-subtitle HTTP Live Streaming (HLS) media. The project is developed for .NET Framework 4.7.1, but should be compatible with versions >= 4.6. Audio extraction and format detection are implemented using the NAudio library. The transcription is done with the Bing Speech API.
- Open the HLSSample.sln solution file with Visual Studio
- Select the start-up project (`HSLConsoleTest.NETFramework` or `HSLWPFTest.NETFramework`): Right-click the project name in Solution Explorer and select Set as StartUp Project
- Insert the playlist URL (`.m3u8` file) as the value of `PlaylistUrl` string constant:
  - To Program.cs file in the console app
  - To MainWindow.xaml.cs file in the WPF app
- Insert the Bing Speech API subscription key to enable transcription (the key can be omitted, but then no transcription is produced)
- Run and enjoy
The solution consists of three projects:
HSLTools.NETFramework is a class library project containing the main functionality, while the other two (demo apps) serve as examples of how to use the library.
The HSLTools class library implements the following features:
- Loading and parsing HTTP Live Stream playlists (`.m3u8` files)
- Downloading `.ts` files into memory
- Extracting audio from `.ts` files utilizing NAudio library
- Saving binary files on local disk
- Transcribing audio (bytes) with Bing Speech API
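
To illustrate the first step in the list above, here is a minimal, language-neutral sketch in Python of how a media playlist could be parsed into segment URIs. It is not the code of the HSLTools library (which is C#); the names `Segment` and `parse_media_playlist` are hypothetical, and the sketch assumes a simple media playlist in which each `#EXTINF` tag is followed by one segment URI. All other tags are ignored.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urljoin


@dataclass
class Segment:
    uri: str         # absolute URI of the .ts file
    duration: float  # seconds, taken from the preceding #EXTINF tag


def parse_media_playlist(text: str, base_url: str) -> List[Segment]:
    """Extract the media segments listed in an HLS media playlist (hypothetical helper)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] != "#EXTM3U":
        raise ValueError("not an M3U8 playlist: missing #EXTM3U header")

    segments: List[Segment] = []
    pending_duration = None
    for line in lines[1:]:
        if line.startswith("#EXTINF:"):
            # Tag format is "#EXTINF:<duration>,[<title>]"
            pending_duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line.startswith("#"):
            continue  # other tags and comments are ignored in this sketch
        elif pending_duration is not None:
            # Segment URIs may be relative to the playlist URL
            segments.append(Segment(urljoin(base_url, line), pending_duration))
            pending_duration = None
    return segments
```

Resolving each URI against the playlist URL matters because playlists commonly list segments as relative paths. For a live stream the playlist would have to be re-fetched periodically and already-seen segments skipped, which this sketch leaves out.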
The main class of the library is HLSProcessor. See the two demo application projects (HLSConsoleTest and HLSWPFTest) to learn how to use the class library.
HLSConsoleTest processes the given playlist and displays the audio transcription in the console window. HLSWPFTest plays and displays the video files in the playlist with subtitles, which are produced by transcribing the audio in the media.
This is not a production-ready piece of code, but rather a proof-of-concept. Stuff missing/to consider:
- Universal Windows applications are not supported - please support/contribute to the awesome NAudio project in order to enable UWP compatibility.
- The media chunks are processed as they come (via the playlist). Thus, if/when the audio in the chunk is terminated in the middle of a word, the transcription is incomplete
  - "TODO" item here: Refactor/break the chunks in pieces based on the silent bits in the audio
- The `MediaElement` in WPF applications does not support HTTP Live Stream out-of-the-box - the quick and dirty approach taken here is to save the `.ts` files onto local disk and to feed the `MediaElement` the local URIs.
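
The "TODO" item above (breaking the audio at silent bits instead of at chunk boundaries) is not implemented in this project. One possible approach, sketched here in Python as a language-neutral illustration, is to scan the decoded PCM samples for sufficiently long quiet runs and cut in the middle of each. The function names, the amplitude threshold of 500 (assuming signed 16-bit samples) and the 300 ms minimum silence are all assumptions chosen for illustration and would need tuning against real audio.

```python
from typing import List, Sequence


def find_split_points(samples: Sequence[int], sample_rate: int,
                      threshold: int = 500,
                      min_silence_ms: int = 300) -> List[int]:
    """Return sample indices at the middle of each long-enough quiet run.

    Only quiet runs that have sound on both sides yield a split point,
    so leading and trailing silence never produce a cut.
    """
    min_run = sample_rate * min_silence_ms // 1000  # run length in samples
    split_points: List[int] = []
    run_start = None
    for i, sample in enumerate(samples):
        if abs(sample) < threshold:
            if run_start is None:
                run_start = i  # a quiet run begins here
        else:
            if run_start is not None and run_start > 0 and i - run_start >= min_run:
                split_points.append((run_start + i) // 2)
            run_start = None
    return split_points


def split_on_silence(samples: Sequence[int], sample_rate: int,
                     **kwargs) -> List[Sequence[int]]:
    """Cut the sample buffer into pieces at the detected split points."""
    cuts = find_split_points(samples, sample_rate, **kwargs)
    bounds = [0] + cuts + [len(samples)]
    return [samples[a:b] for a, b in zip(bounds, bounds[1:])]
```

To benefit from this, the audio of consecutive chunks would have to be buffered together, so that a word straddling a chunk boundary ends up inside a single piece before it is sent for transcription.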
This project was one of the outcomes of a short hackfest and was developed by the following fantastic team of people (in alphabetical order):