
Sawtpedia generates a QR code for a monument. When scanned with a mobile phone, the code lets the user listen to the Wikipedia article about that monument in the phone's language.

Primary language: Python. License: MIT.

Sawtpedia

This is a joint development project of Wikimedia Tunisia and the Data Engineering and Semantics Research Unit within the framework of Hack4OpenGLAM. Inspired by the logic of QRpedia, Yamen Bousrih first presented the idea at the Hack4OpenGLAM Showcase at the 2021 Creative Commons Global Summit. He later presented it at Wikimedia conferences such as WikidataCon 2021.

Deployed at https://sawtpedia.toolforge.org, Sawtpedia generates a QR code for a monument. When the code is scanned, Sawtpedia fetches the Wikidata item for that monument and opens the audio file of the Wikipedia article about the monument in the mobile device's language, if one is available in Wikimedia Commons. If no audio recording exists, Sawtpedia tries to generate audio from the lead of the Wikipedia article in the user's language using the gTTS text-to-speech library.
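The fallback order described above can be sketched as follows. This is a minimal illustration, not Sawtpedia's actual source: the names `choose_audio_source`, `synthesize_lead`, and `TTS_LANGUAGES` are hypothetical, and the language set is a small subset picked for the example. `synthesize_lead` uses the `gTTS(...).save(...)` call from the gTTS library and needs network access.

```python
from typing import Optional

# Hypothetical subset of the gTTS language codes, for illustration only.
TTS_LANGUAGES = {"ar", "de", "en", "fr", "it"}


def choose_audio_source(recorded_url: Optional[str], lang: str) -> str:
    """Return which step of the fallback chain applies.

    "recording": a spoken recording exists, so the user is redirected to it.
    "tts":       no recording, but the language can be synthesized.
    "error":     neither option is available.
    """
    if recorded_url:
        return "recording"
    if lang in TTS_LANGUAGES:
        return "tts"
    return "error"


def synthesize_lead(text: str, lang: str, path: str) -> None:
    """Save the article lead as an MP3 file using gTTS."""
    # Imported lazily so choose_audio_source works without gTTS installed.
    from gtts import gTTS

    gTTS(text=text, lang=lang).save(path)
```

Keeping the decision separate from the synthesis call makes the fallback logic easy to check without contacting any external service.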

Principles

The tool follows the same principle as QRpedia, with an approach updated to reflect recent advances in web development and in Wikimedia projects: it is implemented in Python with Flask instead of PHP, and it relies on the large-scale multilingual structured data available in Wikidata. The tool has two components:

  • An HTML page with JavaScript and CSS that generates a QR code for a given monument. The input is the Wikipedia page of the monument in any language. The Wikidata item of the monument is retrieved from the Wikipedia page using JavaScript and mw.config. A QR code is then generated through the QRpedia web interface. The code points to the web service described next, which leads to the audio recording of the Wikipedia article about the monument in the language of the mobile device's web browser.
  • A web service implemented in Python with Flask that redirects the user to the audio recording of the Wikipedia article about the monument in the language of the mobile device's web browser. The input is the Wikidata ID of the monument. The web service retrieves the language of the user's web browser, then finds the URL of the audio recording in that language using a SPARQL query on the spoken text audio statements of Wikidata. Here, Wikidata hub is used to return the Wikidata ID of the user's language from its IETF language tag. If the file exists, the user is redirected to the audio. If it does not exist, the tool can:
    • convert the lead of the Wikipedia article about the Wikidata item, in the user's language, to audio using gTTS. The languages currently supported by gTTS are: Afrikaans (af), Arabic (ar), Bulgarian (bg), Bengali (bn), Bosnian (bs), Catalan (ca), Czech (cs), Welsh (cy), Danish (da), German (de), Greek (el), English (en), Esperanto (eo), Spanish (es), Estonian (et), Finnish (fi), French (fr), Gujarati (gu), Hindi (hi), Croatian (hr), Hungarian (hu), Armenian (hy), Indonesian (id), Icelandic (is), Italian (it), Japanese (ja), Javanese (jw), Khmer (km), Kannada (kn), Korean (ko), Latin (la), Latvian (lv), Macedonian (mk), Malayalam (ml), Marathi (mr), Myanmar Burmese (my), Nepali (ne), Dutch (nl), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Sinhala (si), Slovak (sk), Albanian (sq), Serbian (sr), Sundanese (su), Swedish (sv), Swahili (sw), Tamil (ta), Telugu (te), Thai (th), Filipino (tl), Turkish (tr), Ukrainian (uk), Urdu (ur), Vietnamese (vi), and Chinese (zh).
    • generate an error message if gTTS does not support the user language.
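The first two steps of the web service (reading the browser language and querying Wikidata for a recording) can be sketched with two small helpers. This is an illustration under stated assumptions, not the tool's actual code: the function names are hypothetical, the header parsing is simplified, and the query resolves the language inside SPARQL through the IETF language tag property (P305) rather than through Wikidata hub as Sawtpedia does. The sketch also assumes that recordings are stored as spoken text audio (P989) statements qualified with language of work or name (P407).

```python
import re

WIKIDATA_ID = re.compile(r"Q[1-9][0-9]*")
LANGUAGE_TAG = re.compile(r"[A-Za-z0-9-]+")


def primary_language(accept_language: str, default: str = "en") -> str:
    """Pick the highest-weighted tag from an Accept-Language header value."""
    best_tag, best_q = default, 0.0
    for part in accept_language.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        tag = tag.strip().lower()
        q = 1.0  # a tag without an explicit weight counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                continue  # skip malformed weights
        if tag and tag != "*" and q > best_q:
            best_tag, best_q = tag, q
    return best_tag


def build_audio_query(item_id: str, language_tag: str) -> str:
    """Build a SPARQL query for a spoken recording of an item in one language."""
    # Validate both inputs so that nothing unexpected is spliced into the query.
    if not WIKIDATA_ID.fullmatch(item_id):
        raise ValueError(f"not a Wikidata item ID: {item_id!r}")
    if not LANGUAGE_TAG.fullmatch(language_tag):
        raise ValueError(f"not a language tag: {language_tag!r}")
    return (
        "SELECT ?audio WHERE {\n"
        f"  wd:{item_id} p:P989 ?statement .\n"   # assumed: spoken text audio
        "  ?statement ps:P989 ?audio ;\n"
        "             pq:P407 ?language .\n"      # assumed: language qualifier
        f'  ?language wdt:P305 "{language_tag}" .\n'  # assumed: IETF tag lookup
        "}\nLIMIT 1"
    )
```

For example, `primary_language("fr-FR,fr;q=0.9,en;q=0.8").split("-")[0]` yields `"fr"`, which can be passed to `build_audio_query` together with the item ID. An empty result from the query service would then trigger the gTTS fallback or the error message listed above.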

Requirements

Team

Acknowledgements

  • Capacity building on web development with Flask was provided by the Data Engineering and Semantics Research Unit, University of Sfax, Tunisia, as part of the Federated Research Project PRF-COV19-D1-P1.
  • We thank Terence Eden and Roger Bamkin for providing the source code of QRpedia. We were inspired by the QRpedia principles, and our source code reuses several excerpts from QRpedia as well as the QRpedia web service that generates a QR code from a URL. Because Sawtpedia is built on QRpedia, we use the MIT License for our source code and adopt the Website Privacy Policy of Wikimedia UK for our tool.
  • We thank Legoktm, Mutante, AntiComposite, Reedy, RhinosF1, and Bryan Davis for supporting the deployment of the tool on Toolforge over SSH.
  • We thank Habib M'henni from Wikimedia Tunisia for helping us test the tool.
  • We thank Abel Lifaeli Mbula for his contributions to the source code.