Speechful

Voice editing made easy

A speech-based document editing tool intended for those who cannot use keyboards.

Project Aims
Overview
Functionality
Technologies & Frameworks
Contribute
File Structure
License

Project Aims

Writing and editing papers, documents, and emails is an essential task for any modern-day student. Yet the way we do so can be prohibitive for some. Keyboards and mice work well for most people, but for those who are missing limbs, hands, or digits, or who have conditions such as arthritis of the hand, Parkinson’s, carpal tunnel syndrome, or essential tremor, keyboards are extremely uncomfortable if not practically unusable. The number of Americans who belong to this group is estimated to be over 28 million.

This problem is amplified by remote and hybrid education. Prior to the pandemic, students had access to disability services, where they could take tests and write papers with the help of university transcribers. However, with the transition to remote learning, these students must now rely on imperfect hacks such as sending audio files, painfully using a keyboard, or avoiding typing altogether.

We wanted to take this opportunity to develop a tool that makes it easy for students with such conditions, and potentially employees in the workplace, to participate without access to disability services.

Overview

Speechful is a document editing tool that uses your voice as the primary interface between you and your computer. From start to finish, you can create, edit, format, reorder, and export your documents as you would in MS Word or Google Docs, without ever touching a keyboard or mouse.

Speechful is intended to be a desktop application that lets you write an essay or an email, or complete a written test, by converting your voice into context-aware instructions. Every paragraph and sentence is clearly indexed on screen, which makes it easy to give voice instructions. Once you open a document, you can simply say a command such as "start typing" or "delete this from paragraph 2", followed by what you would like to type or delete. The supported commands are described below. Once you finish typing, you can tell Speechful to add punctuation after a certain word, move your cursor to another paragraph, and, most importantly, change words that were misunderstood.
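To illustrate how a spoken phrase might become a context-aware instruction, here is a minimal sketch of a transcript parser. It is not taken from the Speechful source: the function names (parseCommand, parseIndex), the instruction shape, and the exact phrase grammar ("delete <text> from paragraph <n>") are assumptions made for illustration only.

```javascript
// Hypothetical sketch: the grammar and the instruction objects below are
// illustrative assumptions, not Speechful's actual command set.

// Speech recognizers may return "two" instead of "2", so accept both.
const NUMBER_WORDS = new Map([
  ["one", 1], ["two", 2], ["three", 3], ["four", 4], ["five", 5],
  ["six", 6], ["seven", 7], ["eight", 8], ["nine", 9], ["ten", 10],
]);

// Convert a spoken index ("2" or "two") to a number, or null if unrecognized.
function parseIndex(token) {
  const lowered = token.toLowerCase();
  if (/^\d+$/.test(lowered)) return Number(lowered);
  return NUMBER_WORDS.get(lowered) ?? null;
}

// Turn one final transcript into a structured instruction object.
function parseCommand(transcript) {
  // Drop surrounding whitespace and any trailing sentence punctuation.
  const text = transcript.trim().replace(/[.!?]+$/, "");

  // "start typing <text>": everything after the command is the dictation.
  let match = text.match(/^start typing(?:\s+(.*))?$/i);
  if (match) return { type: "startTyping", text: match[1] ?? "" };

  // "delete <text> from paragraph <n>"
  match = text.match(/^delete\s+(.+)\s+from paragraph\s+(\w+)$/i);
  if (match) {
    const paragraph = parseIndex(match[2]);
    if (paragraph !== null) {
      return { type: "deleteFromParagraph", text: match[1], paragraph };
    }
  }

  // Anything else is left for the caller to handle (for example, ask again).
  return { type: "unknown", text };
}
```

Returning an explicit "unknown" instruction, rather than throwing, lets the interface ask the user to repeat a phrase that was misheard.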

For HackThis, we made an MVP that runs in Chrome to serve as a proof of concept. Here are some screenshots of the MVP:

We also made a business pitch for HackThis, which can be found here: Slides & Transcript

Here is a demonstration of the current product: YouTube

Functionality

Currently supported voice functionality:

Planned functionality:

Change size - "Change size of paragraph (index)"
Change color - "Change color of paragraph (index)"
Make above paragraph functions into sentence functions
Move Speechful to a container such as Electron
Export a Speechful document into common file types
Create a tutorial for new users
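As a rough sketch of how the planned "Change size of paragraph (index)" and "Change color of paragraph (index)" commands could be applied, the function below updates one paragraph in a list. Everything here is hypothetical: the paragraph shape ({ text, style }), the function name applyFormatCommand, and the trailing "to <value>" part of the phrase are assumptions, since the planned commands above do not say how the new value is spoken.

```javascript
// Hypothetical sketch of a planned feature. The paragraph shape
// { text, style: { size, color } } and the "to <value>" suffix are
// assumptions, not part of the Speechful source.
const FORMAT_PATTERN = /^change (size|color) of paragraph (\d+) to (.+)$/i;

// Returns a new paragraph array with the requested style change applied.
// Returns the original array unchanged if the phrase or index is invalid.
function applyFormatCommand(paragraphs, transcript) {
  const match = transcript.trim().match(FORMAT_PATTERN);
  if (!match) return paragraphs;

  const property = match[1].toLowerCase();
  const index = Number(match[2]); // spoken indices are 1-based
  const rawValue = match[3].trim();
  if (index < 1 || index > paragraphs.length) return paragraphs;

  // Sizes must be positive numbers; colors are kept as lowercase names.
  const value = property === "size" ? Number(rawValue) : rawValue.toLowerCase();
  if (property === "size" && !(Number.isFinite(value) && value > 0)) {
    return paragraphs;
  }

  // Copy instead of mutating so React state updates stay predictable.
  return paragraphs.map((paragraph, i) =>
    i === index - 1
      ? { ...paragraph, style: { ...paragraph.style, [property]: value } }
      : paragraph
  );
}
```

Extending the same function from paragraphs to sentences would mostly be a matter of adding a second index to the pattern.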

Technologies & Frameworks

The front end of this application is built with React. For natural language processing, we use Google Speech. Design element dependencies include Material-UI and FontAwesome.
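For readers unfamiliar with speech input in the browser, here is a minimal sketch of receiving transcripts in Chrome through the Web Speech API (SpeechRecognition, exposed in Chrome as webkitSpeechRecognition). This is an assumption about the integration: the text above only says "Google Speech", so Speechful may use a different service or wrapper. The function name startListening and its optional second parameter are hypothetical.

```javascript
// Assumption: transcripts come from the browser's Web Speech API. The README
// only says "Google Speech", so the real integration may differ.
// RecognitionCtor is an optional, hypothetical hook for passing in a
// recognizer class (for example, a fake one in tests).
function startListening(onTranscript, RecognitionCtor) {
  const Recognition =
    RecognitionCtor ||
    globalThis.SpeechRecognition ||
    globalThis.webkitSpeechRecognition;
  if (!Recognition) {
    throw new Error("Speech recognition is not available in this environment");
  }

  const recognition = new Recognition();
  recognition.lang = "en-US";
  recognition.continuous = true; // keep listening across utterances
  recognition.interimResults = false; // only deliver final results

  recognition.onresult = (event) => {
    // Each result is a list of alternatives; index 0 is the most likely one.
    for (let i = event.resultIndex; i < event.results.length; i += 1) {
      const result = event.results[i];
      if (result.isFinal) onTranscript(result[0].transcript.trim());
    }
  };

  recognition.start();
  return recognition; // call recognition.stop() to end the session
}
```

In a React component, this would typically be started inside an effect and stopped in the effect's cleanup function.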

Contribute

In order to set up the project for contribution, run:

git clone https://github.com/virnarula/speechful.git to clone this repository
cd speechful to enter the /speechful directory.
npm install to install all the dependencies of the project
npm run start to launch the development server.
If it doesn't happen automatically, open localhost:3000 in your browser.
Voila!

This repo uses the Google JavaScript Style Guide.

File Structure

This project is set up like a traditional React project. This is a high-level overview of the file structure. Trivial files and directories are omitted for simplicity.

speechful
├──src
|    ├── components      # Contains screens and their components
|    ├── data            # Where documents are saved 
|    ├── IO              # Contains IO functionality
|    ├── model           # Document data model representations
|    ├── res             # Image resources
|    ├── speech          # Contains speech objects to decipher instructions
|    └── App.js          # Contains React Routes
└── public
     ├── index.html      # Bare-bones website 
     └── main.js         # Contains start-up code
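The model directory above is described as holding document data model representations, and the Overview mentions that every paragraph and sentence is indexed. As a minimal sketch of what such an indexed model could look like, the function below splits raw text into numbered paragraphs and sentences. The function name indexDocument and the returned object shape are hypothetical and are not taken from the files in src/model.

```javascript
// Hypothetical sketch: not the actual model code in src/model.
// Splits raw text into paragraphs (separated by blank lines) and sentences,
// numbering both from 1 so they match indices spoken in voice commands.
function indexDocument(rawText) {
  return rawText
    .split(/\n\s*\n/)
    .map((block) => block.trim())
    .filter((block) => block.length > 0)
    .map((block, paragraphIndex) => ({
      index: paragraphIndex + 1,
      // Naive sentence split on ., ! and ?; abbreviations such as "e.g."
      // would be split incorrectly and need a smarter tokenizer.
      sentences: (block.match(/[^.!?]+[.!?]*/g) || [])
        .map((sentence) => sentence.trim())
        .filter((sentence) => sentence.length > 0)
        .map((text, sentenceIndex) => ({ index: sentenceIndex + 1, text })),
    }));
}
```

With a structure like this, a command that names "paragraph 2" can be resolved by looking up the entry whose index is 2.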

License

This project is under the MIT License.
