A cross-platform (Android/iOS/macOS) speech recognizer library for children's speech in Bahasa Indonesia, written in Flutter and built on the Kaldi framework. The library reads an audio buffer from a microphone and converts spoken words into text with near-instant inference and high accuracy. It can also be extended with your own custom speech recognition model.
!!! note
    Because the built-in default model was trained on children's speech, it may perform poorly on adult speech.
- Indonesian speech-to-text through a Kaldi-based automatic speech recognition (ASR) model, trained on children's speech.
- Train a custom machine learning model with the model extractor.
- Integrate the speech-to-text model with mobile and desktop applications.

- Install the Flutter SDK.
- Run the `git lfs pull` command.
- Install Visual Studio Code.
- Open the project in Visual Studio Code and navigate to `lib/main.dart`.
- Launch an Android emulator or iOS simulator. Optionally, you can connect a real device instead.
- Run the demo on Android/iOS/macOS: in the top navigation bar of VS Code, select Run, then Start Debugging.

!!! note
    The Kaldi libraries have been compiled from commit hash `9af2c5c16389e141f527ebde7ee432a0c1df9fb9` with OpenFST v1.7.3.

On Android, you need to allow the microphone permission in `AndroidManifest.xml`, like so:

```xml
<uses-feature android:name="android.hardware.microphone" android:required="false"/>
<uses-permission android:name="android.permission.RECORD_AUDIO"/>
```

Similarly, on iOS/macOS:

- Open Xcode.
- Navigate to `Info.plist`.
- Add the microphone permission key `NSMicrophoneUsageDescription`. You can follow this guide.

- After setting up, run the app, press the `Load model` button, and then press `Start listening`.
- Speak into the microphone. The recognized text is displayed in the text field.
- Press `Stop listening` to stop the app from listening.

```dart
import 'package:flutter/material.dart';
import 'package:speech_recognizer/speech_recognizer.dart';

// The MyHomePage widget and build() are omitted for brevity.
class _MyHomePageState extends State<MyHomePage> implements SpeechListener { // (1)
  final recognizer = SpeechController.shared;
  bool _isInitialized = false;

  Future<void> _load() async {
    // Ask for permission if the user has not been prompted yet.
    final permissions = await recognizer.permissions(); // (2)
    if (permissions == AudioSpeechPermission.undetermined) {
      await recognizer.authorize();
    }
    // Stop here unless recording is authorized.
    if (await recognizer.permissions() != AudioSpeechPermission.authorized) {
      return;
    }

    if (!_isInitialized) {
      await recognizer.initSpeech('id'); // (3)
      setState(() {
        _isInitialized = true;
      });
      recognizer.addListener(this); // (4)
    }
  }

  @override
  void onResult(Map result, bool wasEndpoint) { // (5)
    // A result carries either a 'partial' string or a list of 'alternatives'.
    List<List<String>> candidates = result.containsKey('partial') // (6)
        ? [result['partial'].trim().split(' ')]
        : result['alternatives']
            .map((x) => x['text'].trim().split(' ').cast<String>().toList())
            .toList()
            .cast<List<String>>();

    // Ignore results that contain no words.
    if (candidates.isEmpty ||
        !candidates.any((words) => words.any((word) => word.isNotEmpty))) {
      return;
    }
  }
}
```

1. Set up a listener by implementing `SpeechListener` in your class.
2. Ask for recording permission.
3. Initialize the Indonesian recognizer model.
4. Register this class as a listener.
5. Receive the output text while the user is speaking.
6. Normalize the result.
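
The normalization in step (6) can also be pulled out into a small pure function, which makes it easy to try without a device or the plugin. The following is a minimal sketch, not part of the library: `extractCandidates` is a hypothetical helper name, and the result shape (a `partial` string, or a list of `alternatives` each holding a `text` field) is assumed from the usage code above. Unlike that code, this variant also drops empty words and empty candidates.

```dart
// Hypothetical helper, not part of speech_recognizer: converts a result map
// into a list of candidate transcripts, each split into words.
// Assumed result shape: {'partial': String} or
// {'alternatives': [{'text': String}, ...]}.
List<List<String>> extractCandidates(Map<String, dynamic> result) {
  // Split a transcript on spaces and discard empty tokens.
  List<String> words(String text) =>
      text.trim().split(' ').where((w) => w.isNotEmpty).toList();

  if (result.containsKey('partial')) {
    final partial = words(result['partial'] as String);
    return partial.isEmpty ? [] : [partial];
  }

  final alternatives = (result['alternatives'] as List?) ?? const [];
  return alternatives
      .map((alt) => words((alt as Map)['text'] as String))
      .where((candidate) => candidate.isNotEmpty)
      .toList();
}

void main() {
  // prints [[selamat, pagi]]
  print(extractCandidates({'partial': ' selamat pagi '}));

  // The empty alternative is dropped; prints [[selamat, pagi]]
  print(extractCandidates({
    'alternatives': [
      {'text': 'selamat pagi'},
      {'text': ''},
    ],
  }));
}
```

Inside `onResult`, a helper like this would let the listener return early whenever the returned list is empty.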

Platform | Code | Function |
---|---|---|
Flutter | `speech_recognizer.dart` | Interface API for communicating with the native platform (Android/iOS/macOS). It exposes many speech recognizer methods; see `lib/main.dart` for how to use them. |
All Platforms | `model-id-id` | Speech model shared by all platforms. Replace `model-id-id/graph` to change the model dictionary. |
iOS/macOS | `SpeechController.swift` | Native platform channel for the speech recognizer on iOS/macOS. It uses Vosk with a custom model. |
Android | `SpeechController.kt` | Native platform channel for the speech recognizer on Android. It uses Vosk with a custom model. |

- Follow the Installation / Setup guide.
- Launch an Android emulator or iOS simulator.
- Run `flutter test integration_test/app_test.dart`.