A general-purpose voice user interface for the IntelliJ Platform, inspired by Tavis Rudd. Possible use cases include users with visual impairments or RSI. For background information, check out this presentation.
Idear is a work in progress. These are some of the features we are currently working on:
ASR is supported by Amazon Lex and CMU Sphinx.
Whether or not Lex manages to resolve and fulfill an intent, it still returns the recognised utterance as text (unless it did not hear anything at all).
`LexASR` and `CMUSphinxASR` provide a method `waitForUtterance()` which blocks until the speech-to-text service returns a string.
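The blocking behaviour described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: only the method name `waitForUtterance()` comes from the text, while the `AsrProvider` interface, the `QueueBackedAsr` class and its `onTranscription()` callback are hypothetical names, and a queue stands in for a real Lex or Sphinx client.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical interface; only the method name waitForUtterance() comes from the README. */
interface AsrProvider {
    /** Blocks until the speech-to-text service returns a recognised string. */
    String waitForUtterance();
}

/** Stand-in recogniser backed by a queue, used here instead of a real Lex or Sphinx client. */
final class QueueBackedAsr implements AsrProvider {
    private final BlockingQueue<String> results = new LinkedBlockingQueue<>();

    /** Hypothetical callback: the simulated speech service hands over a transcription. */
    void onTranscription(String text) {
        results.add(text);
    }

    @Override
    public String waitForUtterance() {
        try {
            return results.take(); // blocks while no transcription is available
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting for speech", e);
        }
    }
}

final class AsrDemo {
    public static void main(String[] args) throws InterruptedException {
        QueueBackedAsr asr = new QueueBackedAsr();
        // Simulate the service answering from another thread after a short delay.
        Thread service = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            asr.onTranscription("open settings");
        });
        service.start();
        System.out.println("Heard: " + asr.waitForUtterance());
        service.join();
    }
}
```

Because the caller blocks, a real plugin would presumably call such a method from a background thread rather than from the UI thread.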
If Lex does manage to resolve and fulfill an intent by invoking a Lambda function (to the point where it delegates to client-side fulfillment), then `LexRecognizer` notifies a `NlpResultListener` that the request has been fulfilled, has failed, and so on.
`NlpProvider` defines a method `processUtterance()` which takes a string utterance and context. `LexNlp` implements `NlpProvider` and notifies the `NlpResultListener`.
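A rough sketch of how a provider and a listener could fit together is shown below. The type names `NlpProvider` and `NlpResultListener` and the method name `processUtterance()` come from the text. Everything else is an assumption made for illustration: the parameter types, the `onFulfilled()` and `onFailure()` callbacks, and the `KeywordNlp` and `RecordingListener` classes, where `KeywordNlp` matches a fixed phrase instead of calling Amazon Lex.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Callback names are hypothetical; the README only says the listener hears about fulfilment or failure. */
interface NlpResultListener {
    void onFulfilled(String intent);
    void onFailure(String utterance);
}

/** Type and method names come from the README; the parameter types are a guess. */
interface NlpProvider {
    void processUtterance(String utterance, Map<String, String> context);
}

/** Toy provider that recognises one fixed phrase instead of calling Amazon Lex. */
final class KeywordNlp implements NlpProvider {
    private final NlpResultListener listener;

    KeywordNlp(NlpResultListener listener) {
        this.listener = listener;
    }

    @Override
    public void processUtterance(String utterance, Map<String, String> context) {
        // The context is ignored in this sketch; a real provider might use it to disambiguate.
        String text = utterance.trim().toLowerCase(Locale.ROOT);
        if (text.equals("open settings")) {
            listener.onFulfilled("OpenSettings"); // hypothetical intent name
        } else {
            listener.onFailure(utterance);
        }
    }
}

/** Listener that records what it was told, so the flow can be inspected. */
final class RecordingListener implements NlpResultListener {
    final List<String> events = new ArrayList<>();

    @Override
    public void onFulfilled(String intent) {
        events.add("fulfilled:" + intent);
    }

    @Override
    public void onFailure(String utterance) {
        events.add("failed:" + utterance);
    }
}

final class NlpDemo {
    public static void main(String[] args) {
        RecordingListener log = new RecordingListener();
        NlpProvider nlp = new KeywordNlp(log);
        nlp.processUtterance("Open settings", Collections.emptyMap());
        nlp.processUtterance("make coffee", Collections.emptyMap());
        System.out.println(log.events);
    }
}
```

Keeping the listener separate from the provider means the same callback code could in principle serve more than one NLP backend.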
TTS is supported by Amazon Polly and MaryTTS.
- User presses a button or activates voice control by saying something like “Okay __, help me.”
- “Hello , welcome to the handsfree audio development interface for IntelliJ IDEA.”
- “There are a number of commands you can use, for example ‘Open settings’, ‘Find action’, ‘Open file’...”
- Action reader. When the user enables a flag, any selected menu options or actions are read back to the user.
- Status updates. User says, “Run application”. Plugin responds, “building project”, “compiling application”, “running project”.
- Text selection. Plugin reads back selected region (rapidly).
- User says, "Where am I?". Plugin responds, "You are inside method X, on line Y".
- User says, “open Analyze”. Plugin responds, “Would you like to ‘Inspect Code’, ‘Code Cleanup’...”
- User says, “open tip of the day”. Plugin responds, “Did you know that... ”
- User says, “activate intentions”. Plugin responds, “Would you like to ‘Invert if condition’, ‘Remove braces’,...”
- Understand numbers (one, two, three, four, five, six…)
- Jump to text inside the editor window
- Go to line numbers
- Understand free form language
- Finding text in the editor
- Performing arbitrary actions
- Menus (open + file, edit, view, navigate, code, analyze, refactor, build, run, tools, version control)
- Navigation keys (“Page Up”, “Page down”, “line up”, “line down”, “go left”, “go right”)
- Fixed actions (“extract method”, “expand selection”, “shrink selection”, “focus project”)
- Code generation (generate for-loop, getter, setter…)
- Refactorings
  - Extract method
  - Extract parameter
- Show intention actions
- Auto-completion
- Speech typing
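
Two of the items above, understanding spoken numbers and going to a line number, could be combined roughly as in the sketch below. It is not taken from the plugin: the `SpokenNumbers` class, its method names and the command phrase "go to line" are all assumptions. The parser handles English number words from 0 to 999 by summing them, so it does not reject ungrammatical sequences such as "one two".

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.OptionalInt;

/** Hypothetical helper that turns spoken English number words into integers. */
final class SpokenNumbers {
    private static final Map<String, Integer> UNITS = new HashMap<>();
    private static final Map<String, Integer> TENS = new HashMap<>();

    static {
        String[] units = {"zero", "one", "two", "three", "four", "five", "six", "seven",
            "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
            "sixteen", "seventeen", "eighteen", "nineteen"};
        for (int i = 0; i < units.length; i++) {
            UNITS.put(units[i], i);
        }
        String[] tens = {"twenty", "thirty", "forty", "fifty", "sixty", "seventy",
            "eighty", "ninety"};
        for (int i = 0; i < tens.length; i++) {
            TENS.put(tens[i], (i + 2) * 10);
        }
    }

    /** Parses phrases such as "forty two" or "one hundred and five"; empty if a word is unknown. */
    static OptionalInt parse(String phrase) {
        String cleaned = phrase.trim().toLowerCase(Locale.ROOT);
        int total = 0;
        boolean sawNumberWord = false;
        for (String word : cleaned.split("[\\s-]+")) {
            if (word.equals("and")) {
                continue; // filler word, as in "one hundred and five"
            }
            if (UNITS.containsKey(word)) {
                total += UNITS.get(word);
            } else if (TENS.containsKey(word)) {
                total += TENS.get(word);
            } else if (word.equals("hundred")) {
                total = Math.max(total, 1) * 100; // a bare "hundred" counts as 100
            } else {
                return OptionalInt.empty();
            }
            sawNumberWord = true;
        }
        return sawNumberWord ? OptionalInt.of(total) : OptionalInt.empty();
    }

    /** Returns the line number for utterances like "go to line forty two", else empty. */
    static OptionalInt parseGoToLine(String utterance) {
        String prefix = "go to line "; // assumed command phrase
        String text = utterance.trim().toLowerCase(Locale.ROOT);
        if (!text.startsWith(prefix)) {
            return OptionalInt.empty();
        }
        return parse(text.substring(prefix.length()));
    }

    public static void main(String[] args) {
        System.out.println(parseGoToLine("Go to line forty two"));
    }
}
```

A recogniser with a restricted grammar would make the lenient summing less of a concern, since only well-formed number phrases would reach the parser.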
Run `git clone https://github.com/OpenASR/idear && cd idear && ./gradlew runIde`.
Recognition works with most popular microphones (preferably 16kHz, 16-bit). For best results, minimize background noise.
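
To check whether a machine can capture audio in the preferred format, something like the following sketch may help. It uses the standard `javax.sound.sampled` API. The `MicrophoneCheck` class is a hypothetical name, and the choice of mono, signed, little-endian PCM is an assumption, since the text only specifies 16 kHz and 16-bit.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;

/** Hypothetical helper for checking microphone support. */
final class MicrophoneCheck {
    /** 16 kHz, 16-bit PCM. Mono, signed and little-endian are assumptions of this sketch. */
    static AudioFormat preferredFormat() {
        return new AudioFormat(16000f, 16, 1, true, false);
    }

    /** True if some installed mixer offers a capture line in the preferred format. */
    static boolean isSupported() {
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, preferredFormat());
        return AudioSystem.isLineSupported(info);
    }

    public static void main(String[] args) {
        System.out.println("Preferred format: " + preferredFormat());
        System.out.println("Capture supported: " + isSupported());
    }
}
```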