👂 An RxJS operator for real-time speech-to-text (STT/S2T) streaming using AWS Transcribe.
npm i @rxtk/stt-aws
yarn add @rxtk/stt-aws
~/.aws
.
Stream audio speech data to AWS Transcribe via WebSocket and get transcripts back:
import {map} from 'rxjs/operators';
import {toAWSTranscribe} from '@rxtk/stt-aws';
// The pipeline can take a stream of audio chunks encoded as
// LINEAR16 (PCM encoded as 16-bit integers) in the form of a Buffer
const stt$ = pcmChunkEncodedAs16BitIntegers$.pipe(
  map(chunk => Buffer.from(chunk, 'base64')),
  toAWSTranscribe()
);
stt$.subscribe(console.log); // log transcript output
stt$.error$.subscribe(console.error); // handle WebSocket errors
⚠️ Pay attention to the encoding of the audio data. The operator only accepts PCM data encoded as 16-bit integers (for example, LINEAR16).
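If your audio source produces floating-point samples (as many browser and DSP pipelines do), it needs to be converted before reaching the operator. The following is a minimal sketch of one way to do that, not part of `@rxtk/stt-aws`: `floatTo16BitPCM` is a hypothetical helper name, and it assumes samples are in the range [-1, 1] and that little-endian byte order is what the service expects.

```javascript
// Hypothetical helper (not provided by @rxtk/stt-aws).
// Converts an array of float samples in [-1, 1] into a Buffer of
// 16-bit signed little-endian PCM (an assumption about the expected byte order).
function floatTo16BitPCM(samples) {
  const buffer = Buffer.alloc(samples.length * 2); // 2 bytes per sample
  for (let i = 0; i < samples.length; i++) {
    // Clamp out-of-range values so they do not wrap around when converted.
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    // Scale negatives by 32768 and positives by 32767 to cover the int16 range.
    const int16 = clamped < 0
      ? Math.round(clamped * 0x8000)
      : Math.round(clamped * 0x7fff);
    buffer.writeInt16LE(int16, i * 2);
  }
  return buffer;
}
```

A helper like this could be applied with `map(floatTo16BitPCM)` ahead of `toAWSTranscribe()` in the pipeline, in place of the base64 decoding step, when the upstream observable emits float sample arrays.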