Handsfree Audio Development Interface for IntelliJ IDEA.

Primary language: Java. License: Apache License 2.0.

idear

A general-purpose voice user interface for the IntelliJ Platform, inspired by Tavis Rudd. Possible use cases include users who are visually impaired or who have repetitive strain injury (RSI). For background information, check out this presentation.

Speech Recognition

ASR is supported by CMU Sphinx and Amazon Lex. All recognition is offline by default.

Speech-to-Text

Whether or not Lex manages to resolve and fulfill an intent, it still returns the recognised utterance as text (unless it did not hear anything at all). LexASR and CMUSphinxASR both provide a method, waitForUtterance(), which blocks until the speech-to-text service returns a string.
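The blocking behaviour described above can be sketched as follows. This is a minimal illustration, not the plugin's code: only the method name waitForUtterance() comes from this README, while AsrProvider, QueueBackedAsr, publish() and UtteranceDemo are hypothetical names, and a queue stands in for the real recogniser.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical interface: only the method name waitForUtterance() is from the README.
interface AsrProvider {
    /** Blocks until the speech-to-text service returns recognised text. */
    String waitForUtterance();
}

// Stand-in recogniser for illustration; a real one would wrap CMU Sphinx or Lex.
class QueueBackedAsr implements AsrProvider {
    private final BlockingQueue<String> results = new LinkedBlockingQueue<>();

    // Called by the (simulated) recognition thread when text is available.
    void publish(String text) {
        results.add(text);
    }

    @Override
    public String waitForUtterance() {
        try {
            return results.take(); // blocks until an utterance has been published
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting for speech", e);
        }
    }
}

class UtteranceDemo {
    public static void main(String[] args) throws InterruptedException {
        QueueBackedAsr asr = new QueueBackedAsr();
        // Simulate the recogniser producing text on another thread.
        Thread recogniser = new Thread(() -> asr.publish("open settings"));
        recogniser.start();
        System.out.println("Heard: " + asr.waitForUtterance());
        recogniser.join();
    }
}
```

Because the call blocks, a caller would normally run it on a background thread rather than on the IDE's UI thread.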

NLP - Text to Action

If Lex does manage to resolve an intent and fulfill it by invoking a Lambda function (to the point where it delegates to client-side fulfillment), then LexRecognizer notifies an NlpResultListener that the request has been fulfilled, has failed, and so on.

NlpProvider defines a method processUtterance() which takes a string utterance and context. LexNlp implements NlpProvider and notifies the NlpResultListener.
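A rough sketch of how a provider and its listener could fit together is shown here. The names NlpProvider, NlpResultListener and processUtterance() come from this README; the callback names (onFulfilled, onFailure), the Map-based context type, and the KeywordNlp and NlpDemo classes are assumptions made for illustration, and simple prefix matching stands in for Lex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Callback names are hypothetical; the README only says the listener is told
// that the request has been fulfilled or has failed.
interface NlpResultListener {
    void onFulfilled(String intentName, Map<String, String> slots);
    void onFailure(String utterance);
}

// Only the method name processUtterance() is from the README; the context type is assumed.
interface NlpProvider {
    void processUtterance(String utterance, Map<String, String> context);
}

// Toy provider that matches a fixed prefix instead of calling Lex.
class KeywordNlp implements NlpProvider {
    private final NlpResultListener listener;

    KeywordNlp(NlpResultListener listener) {
        this.listener = listener;
    }

    @Override
    public void processUtterance(String utterance, Map<String, String> context) {
        String text = utterance.trim().toLowerCase(Locale.ROOT);
        if (text.startsWith("open ")) {
            // Everything after "open " becomes the single slot value.
            listener.onFulfilled("OpenTarget", Map.of("target", text.substring(5)));
        } else {
            listener.onFailure(utterance);
        }
    }
}

class NlpDemo {
    // Runs each utterance through KeywordNlp and records what the listener was told.
    static List<String> run(String... utterances) {
        List<String> events = new ArrayList<>();
        NlpProvider nlp = new KeywordNlp(new NlpResultListener() {
            @Override public void onFulfilled(String intentName, Map<String, String> slots) {
                events.add("fulfilled " + intentName + " " + slots);
            }
            @Override public void onFailure(String utterance) {
                events.add("failed " + utterance);
            }
        });
        for (String u : utterances) {
            nlp.processUtterance(u, Map.of());
        }
        return events;
    }

    public static void main(String[] args) {
        run("Open settings", "make me a sandwich").forEach(System.out::println);
    }
}
```

Keeping the result delivery behind a listener lets the same calling code work whether the answer arrives immediately or later from a remote service.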

Text-to-Speech

TTS is supported by MaryTTS and Amazon Polly. Speech synthesis is offline by default.

Roadmap

Idear is currently a work in progress. These are some of the features we have implemented and are currently working on:

Activation

  • User presses a button or activates voice control by saying something like “Okay __, help me.”
  • “Hello , welcome to the handsfree audio development interface for IntelliJ IDEA.”
  • “There are a number of commands you can use, for example ‘Open settings’, ‘Find action’, ‘Open file’...”

Visually Impaired Mode

  • Action reader. When the user enables a flag, any selected menu options or actions are read back to the user.
  • Status updates. User says, “Run application”. Plugin responds, “building project”, “compiling application”, “running project”.
  • Text selection. Plugin reads back selected region (rapidly).
  • User says, "Where am I?". Plugin responds, "You are inside method X, on line Y".

Interactive Features

  • User says, “open Analyze”. Plugin responds, “Would you like to ‘Inspect Code’, ‘Code Cleanup’...”
  • User says, “open tip of the day”. Plugin responds, “Did you know that... ”
  • User says, “activate intentions”. Plugin responds, “Would you like to ‘Invert if condition’, ‘Remove braces’,...”

IDE Features

  • Understand numbers (one, two, three, four, five, six…)
    • Jump to text inside the editor window
    • Go to line numbers
  • Understand free form language
    • Finding text in the editor
    • Performing arbitrary actions
  • Menus (open + file, edit, view, navigate, code, analyze, refactor, build, run, tools, version control)
  • Navigation keys (“Page Up”, “Page down”, “line up”, “line down”, “go left”, “go right”)
  • Fixed actions (“extract method”, “expand selection”, “shrink selection”, “focus project”)
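To make the "understand numbers" and "go to line" items concrete, here is one way spoken number words could be turned into a line number. It is a sketch under stated assumptions rather than the plugin's implementation: SpokenNumbers, parse() and lineFromCommand() are hypothetical names, the "go to line" phrasing is assumed, and the parser covers 0 to 999,999 without validating word order.

```java
import java.util.Locale;
import java.util.Map;
import java.util.OptionalInt;

class SpokenNumbers {
    private static final Map<String, Integer> UNITS = Map.ofEntries(
        Map.entry("zero", 0), Map.entry("one", 1), Map.entry("two", 2),
        Map.entry("three", 3), Map.entry("four", 4), Map.entry("five", 5),
        Map.entry("six", 6), Map.entry("seven", 7), Map.entry("eight", 8),
        Map.entry("nine", 9), Map.entry("ten", 10), Map.entry("eleven", 11),
        Map.entry("twelve", 12), Map.entry("thirteen", 13), Map.entry("fourteen", 14),
        Map.entry("fifteen", 15), Map.entry("sixteen", 16), Map.entry("seventeen", 17),
        Map.entry("eighteen", 18), Map.entry("nineteen", 19));

    private static final Map<String, Integer> TENS = Map.of(
        "twenty", 20, "thirty", 30, "forty", 40, "fifty", 50,
        "sixty", 60, "seventy", 70, "eighty", 80, "ninety", 90);

    /** Parses words such as "one hundred twenty three"; empty if any word is unknown. */
    static OptionalInt parse(String phrase) {
        int total = 0;    // value of completed "thousand" groups
        int current = 0;  // value of the group currently being built
        boolean sawNumber = false;
        for (String word : phrase.trim().toLowerCase(Locale.ROOT).split("[\\s-]+")) {
            if (word.equals("and")) {
                continue; // filler word, as in "two thousand and five"
            }
            if (UNITS.containsKey(word)) {
                current += UNITS.get(word);
            } else if (TENS.containsKey(word)) {
                current += TENS.get(word);
            } else if (word.equals("hundred")) {
                current = Math.max(current, 1) * 100; // bare "hundred" means 100
            } else if (word.equals("thousand")) {
                total += Math.max(current, 1) * 1000;
                current = 0;
            } else {
                return OptionalInt.empty();
            }
            sawNumber = true;
        }
        return sawNumber ? OptionalInt.of(total + current) : OptionalInt.empty();
    }

    /** Extracts the line number from "go to line <number words>"; empty otherwise. */
    static OptionalInt lineFromCommand(String utterance) {
        String text = utterance.trim().toLowerCase(Locale.ROOT);
        String prefix = "go to line ";
        return text.startsWith(prefix)
            ? parse(text.substring(prefix.length()))
            : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(lineFromCommand("Go to line forty-two"));
    }
}
```

Returning an empty OptionalInt for unrecognised input lets the caller fall back to free-form handling instead of jumping to a wrong line.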

Code Features

  • Code generation (generate for-loop, getter, setter…)
  • Refactorings
    • Extract method
    • Extract parameter
  • Show intention actions
  • Auto-completion
  • Speech typing

Building

For Linux or Mac OS users:

git clone https://github.com/OpenASR/idear && cd idear && ./gradlew runIde

For Windows users:

git clone https://github.com/OpenASR/idear && cd idear && gradlew.bat runIde

Recognition works with most popular microphones (preferably 16 kHz, 16-bit). For best results, minimize background noise.
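To check whether a machine can capture audio in the preferred format, something along these lines may help. It uses the standard javax.sound.sampled API; the 16 kHz sample rate and 16-bit sample size come from the recommendation above, while mono, signed, little-endian samples and the MicrophoneCheck class name are assumptions.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;

class MicrophoneCheck {
    // 16 kHz, 16-bit as recommended; mono, signed, little-endian are assumptions.
    static AudioFormat preferredFormat() {
        return new AudioFormat(16_000f, 16, 1, true, false);
    }

    // True if some installed capture line can deliver the preferred format.
    static boolean isSupported() {
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, preferredFormat());
        return AudioSystem.isLineSupported(info);
    }

    public static void main(String[] args) {
        System.out.println("Format: " + preferredFormat());
        System.out.println("Capture supported: " + isSupported());
    }
}
```

If the check reports that capture is not supported, the microphone or its driver may only offer other sample rates, in which case resampling would be needed before recognition.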

Programming By Voice