A blog post about AWS AugmentAbility is now online! To learn more, check this link.
AWS AugmentAbility is a mobile web app that showcases five AWS AI services (Amazon Transcribe, Amazon Translate, Amazon Polly, Amazon Rekognition and Amazon Textract). It also provides features that may benefit people with a visual or communication impairment, such as difficulty reading written text (text recognition), hearing (live transcription), speaking (text-to-speech), or holding a conversation in a foreign language (voice-to-voice live translation).
- Live transcription & text-to-speech: the app transcribes conversations and speeches for you, in real time. Can't speak? Type what you want to say, and the app will say it for you. This feature is currently available in Chinese, English, French, German, Italian, Japanese, Korean, Portuguese and Spanish.
- Live transcription & text-to-speech with translation: the app transcribes and translates conversations and speeches for you, in real time. Can't speak? Type what you want to say, and the app will translate and say it for you. Translation is currently available in 75+ languages.
- Real-time Conversation Translation: select a target language, speak in your own language, and the app will translate what you say into the target language. This feature is currently available in Chinese, English, French, German, Italian, Japanese, Korean, Portuguese and Spanish.
- Object detection: take a picture with your smartphone, and the app will describe the objects around you.
- Text recognition for labels & signs: point your camera at any label, sign or small chunk of text, and the app will read it out loud for you. AugmentAbility can also translate the text into 75+ languages, or make it more readable for users with dyslexia by leveraging the OpenDyslexic font.
- Text extraction from documents: point your camera at any full-page document, and the app will read it out loud for you. AugmentAbility can also translate the text into 75+ languages, or make it more readable for users with dyslexia by leveraging the OpenDyslexic font.
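The Real-time Conversation Translation feature above chains three steps: speech-to-text, text translation, and text-to-speech. The following is a minimal sketch of that chain, not the app's actual code. The function name `translateConversationTurn` and the injected `transcribe`, `translate` and `synthesize` steps are hypothetical; in a real deployment they would presumably wrap calls to Amazon Transcribe, Amazon Translate and Amazon Polly.

```javascript
// Hypothetical sketch: chain transcription -> translation -> speech synthesis.
// The three step functions are injected, so they can wrap any client.
async function translateConversationTurn(audioChunk, sourceLang, targetLang, steps) {
  const { transcribe, translate, synthesize } = steps;

  // 1. Speech to text in the speaker's language.
  const transcript = await transcribe(audioChunk, sourceLang);
  if (!transcript || transcript.trim() === "") {
    // Nothing was said: skip translation and synthesis.
    return { transcript: "", translation: "", audio: null };
  }

  // 2. Text translation; skipped when both sides use the same language.
  const translation = sourceLang === targetLang
    ? transcript
    : await translate(transcript, sourceLang, targetLang);

  // 3. Text to speech in the target language.
  const audio = await synthesize(translation, targetLang);
  return { transcript, translation, audio };
}

// Stub steps for illustration only (no AWS calls are made here).
const stubSteps = {
  transcribe: async (audio, lang) => "hello",
  translate: async (text, from, to) => `[${to}] ${text}`,
  synthesize: async (text, lang) => ({ lang, text }),
};
```

Injecting the steps keeps the flow testable without credentials and makes it easy to swap a stub for a real service client.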
- Live transcription & text-to-speech and Real-time Conversation Translation features are currently available in Chinese, English, French, German, Italian, Japanese, Korean, Portuguese and Spanish.
- The Live transcription & text-to-speech with translation feature is currently available in the following 75 languages supported by Amazon Translate: Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Bengali, Bosnian, Bulgarian, Chinese (Simplified), Catalan, Chinese (Traditional), Croatian, Czech, Danish, Dari, Dutch, English, Estonian, Finnish, French, French Canadian, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Irish, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Malayalam, Maltese, Mongolian, Marathi, Norwegian, Farsi (Persian), Pashto, Polish, Portuguese, Portuguese Portugal, Punjabi, Romanian, Russian, Serbian, Sinhala, Slovak, Slovenian, Somali, Spanish, Spanish Mexican, Swahili, Swedish, Filipino Tagalog, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, and Welsh.
- Object detection and Text recognition for labels & signs features are currently available in Arabic, English, French, German, Italian, Portuguese, Russian and Spanish.
- The Text extraction from documents feature is currently available in English, French, German, Italian, Portuguese, Russian and Spanish.
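Because each feature supports a different set of languages, a client may want to check support before enabling a feature. The sketch below encodes the lists above as a lookup table. The feature keys and function names are hypothetical and are not taken from the app's source; the 75-language translation list is omitted for brevity.

```javascript
// Hypothetical lookup table built from the language lists above.
const SPEECH_LANGS = ["Chinese", "English", "French", "German", "Italian",
                      "Japanese", "Korean", "Portuguese", "Spanish"];
const IMAGE_LANGS = ["Arabic", "English", "French", "German", "Italian",
                     "Portuguese", "Russian", "Spanish"];

const SUPPORTED_LANGUAGES = {
  liveTranscription: SPEECH_LANGS,
  conversationTranslation: SPEECH_LANGS,
  objectDetection: IMAGE_LANGS,
  labelTextRecognition: IMAGE_LANGS,
  // Same as IMAGE_LANGS, minus Arabic.
  documentTextExtraction: IMAGE_LANGS.filter((l) => l !== "Arabic"),
};

// Returns true when the feature lists the language (case-insensitive).
function isLanguageSupported(feature, language) {
  if (!Object.prototype.hasOwnProperty.call(SUPPORTED_LANGUAGES, feature)) {
    throw new Error(`Unknown feature: ${feature}`);
  }
  const wanted = language.trim().toLowerCase();
  return SUPPORTED_LANGUAGES[feature].some((l) => l.toLowerCase() === wanted);
}

// Lists every feature key that supports the given language.
function featuresForLanguage(language) {
  return Object.keys(SUPPORTED_LANGUAGES)
    .filter((feature) => isLanguageSupported(feature, language));
}
```

For example, `featuresForLanguage("Japanese")` returns only the two speech features, since Japanese does not appear in the object detection or text recognition lists.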
Follow the steps described in this blog post
- Follow Steps 1 and 2 (Create the Amazon Cognito user pool and identity pool, and grant permissions for accessing AWS AI services; Clone the GitHub repository and edit the configuration file) from this blog post
- Run `npm install` (only the first time)
- Run `npm install --global local-web-server` (only the first time)
- Run `npm run-script build` (only the first time, or after changes to the JavaScript code)
- Run `ws`
- Before accessing the app for the first time, you must set a new password for the user that was automatically created during Step 1. You can find the link to the temporary login screen in the Outputs tab of the CloudFormation stack (field UserPoolLoginUrl). For this first sign-in, use the user name you set up and the temporary password you received via email. After you set your new password, you're ready to test the mobile web app by opening the index.html file in a browser.
amazon-archives/amazon-transcribe-websocket-static * (Apache 2.0 License), ziniman/amazon-transcribe-websocket-static * (Apache 2.0 License), aws-sdk (Apache 2.0 License), bensonruan/webcam-easy * (MIT License), department-stockholm/aws-signature-v4 * (MIT License), jquery/jquery (MIT License), browserify/browserify (MIT License), lwsjs/local-web-server (MIT License), microphone-stream/microphone-stream (MIT License), sindresorhus/query-string (MIT License), babel/babel (MIT License), babel/babelify (MIT License), Semantic-Org/Semantic-UI (MIT License), uikit/uikit (MIT License), shoelace-style/shoelace (MIT License), Font Awesome icons (CC BY 4.0 License), Twemoji icons (CC BY 4.0 License), Lordicon free icons (CC BY ND 4.0 License), terser/terser (BSD license).
* In accordance with its license, this package was subject to some modifications (edited files available in the "lib" and "style" directories)
This sample is provided for demonstration purposes only. Customers are responsible for making their own independent assessment of the information in this document and any use of AWS products or services, each of which is provided "as is" without warranty of any kind, whether express or implied.
AWS AugmentAbility is licensed under Apache License Version 2.0.