
Caption Reader

This React Native project features a video player and an auto-scrolling caption viewer that also displays English translations. I created the project to advance both my Japanese studies and my technical knowledge. This README introduces and showcases the app's functionality. The project is private, and I do not plan to release it or develop it for anyone else to contribute to or use.

An animated image showing the video player and list of video captions beneath


Attribution

The videos and caption data shown in the images and GIFs are from Comprehensible Japanese. I highly recommend them if you are just starting to learn Japanese and are looking for easy-to-understand listening practice.



App Features

Video Player

The video player itself is built on top of react-native-video. Instead of using the library's built-in controls, I added my own play/pause button, restart button, and progress bar beneath the player.
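
Below is a minimal sketch of that setup, assuming react-native-video's standard paused prop, onProgress callback, and seek() ref method; the component and styling details are illustrative, not the app's actual code.

import React, { useRef, useState } from 'react';
import { Pressable, Text, View } from 'react-native';
import Video from 'react-native-video';

export function PlayerWithControls({ uri }: { uri: string }) {
  const videoRef = useRef<Video>(null);
  const [paused, setPaused] = useState(false);
  const [progress, setProgress] = useState(0); // fraction of the video played, 0..1

  return (
    <View>
      <Video
        ref={videoRef}
        source={{ uri }}
        paused={paused}
        onProgress={({ currentTime, seekableDuration }) =>
          setProgress(seekableDuration > 0 ? currentTime / seekableDuration : 0)
        }
      />
      {/* Custom controls instead of the built-in ones */}
      <Pressable onPress={() => setPaused(p => !p)}>
        <Text>{paused ? 'Play' : 'Pause'}</Text>
      </Pressable>
      <Pressable onPress={() => videoRef.current?.seek(0)}>
        <Text>Restart</Text>
      </Pressable>
      {/* Progress bar: the inner view's width tracks playback */}
      <View style={{ height: 4, backgroundColor: '#ddd' }}>
        <View style={{ height: 4, width: `${progress * 100}%`, backgroundColor: '#555' }} />
      </View>
    </View>
  );
}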

To accompany the video player, there is an auto-scrolling caption feed. It may feel familiar if you have used the lyric viewer in Apple Music or Spotify. While the video plays, the current caption scrolls into view, and tapping a caption seeks the video backward or forward to the point where that caption begins.
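
A sketch of that feed follows. The caption shape ({ start, end, text } in seconds) and prop names are assumptions: currentTime would come from the player's onProgress callback, and onSeek would wrap the player's seek() method.

import React, { useEffect, useRef } from 'react';
import { FlatList, Pressable, Text } from 'react-native';

type Caption = { start: number; end: number; text: string };

export function CaptionFeed({
  captions,
  currentTime, // from the video player's onProgress callback
  onSeek,      // e.g. (t) => videoRef.current?.seek(t)
}: {
  captions: Caption[];
  currentTime: number;
  onSeek: (time: number) => void;
}) {
  const listRef = useRef<FlatList<Caption>>(null);
  const activeIndex = captions.findIndex(
    c => currentTime >= c.start && currentTime < c.end
  );

  // Keep the current caption scrolled into view as playback advances
  // (a getItemLayout would be needed for reliable scrollToIndex on long lists)
  useEffect(() => {
    if (activeIndex >= 0) {
      listRef.current?.scrollToIndex({ index: activeIndex, viewPosition: 0.5, animated: true });
    }
  }, [activeIndex]);

  return (
    <FlatList
      ref={listRef}
      data={captions}
      keyExtractor={(_, i) => String(i)}
      renderItem={({ item, index }) => (
        // Tapping a caption restarts playback from that caption's start time
        <Pressable onPress={() => onSeek(item.start)}>
          <Text style={{ fontWeight: index === activeIndex ? 'bold' : 'normal' }}>
            {item.text}
          </Text>
        </Pressable>
      )}
    />
  );
}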

An animated image showing the video player and list of video captions beneath

Translations

Pressing and holding a caption shows its English translation. Requests are made directly to the DeepL Translation API. Currently, only Japanese-to-English translation is supported. By adding language data to the caption file and a global language setting (or using the device's language), this could be extended to support content in any language.
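
As a hedged sketch of such a request using DeepL's free-tier REST endpoint (the app's actual key handling and error handling may differ):

const DEEPL_API_KEY = '...'; // supplied via config in practice, not hardcoded

export async function translateToEnglish(text: string): Promise<string> {
  const response = await fetch('https://api-free.deepl.com/v2/translate', {
    method: 'POST',
    headers: {
      Authorization: `DeepL-Auth-Key ${DEEPL_API_KEY}`,
      'Content-Type': 'application/x-www-form-urlencoded',
    },
    // DeepL accepts form-encoded text, source_lang, and target_lang parameters
    body: [
      `text=${encodeURIComponent(text)}`,
      'source_lang=JA',
      'target_lang=EN',
    ].join('&'),
  });
  const data = await response.json();
  return data.translations[0].text;
}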

An animated image showcasing an inline translation from Japanese to English

Theming

Having dark and light modes is a quality-of-life feature I appreciate in apps. At first I experimented with React Navigation's built-in theme support but didn't see a way to animate between colors. I kept its method of defining a theme as a grouping of background, card, text, and border colors, but moved the animated values for those colors into a React Context so that any component in the app can access the correct colors.
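
The sketch below shows the general shape of that approach, assuming a single Animated.Value drives the color interpolation; the identifiers and palette are illustrative, not the app's actual values.

import React, { createContext, useContext, useRef } from 'react';
import { Animated } from 'react-native';

const light = { background: '#ffffff', card: '#f2f2f2', text: '#111111', border: '#d8d8d8' };
const dark = { background: '#121212', card: '#1e1e1e', text: '#eeeeee', border: '#333333' };

type ThemeContextValue = {
  colors: Record<keyof typeof light, Animated.AnimatedInterpolation<string>>;
  setDark: (isDark: boolean) => void;
};

const ThemeContext = createContext<ThemeContextValue | null>(null);
export const useTheme = () => useContext(ThemeContext)!;

export function ThemeProvider({ children }: { children: React.ReactNode }) {
  // 0 = light, 1 = dark; animating this single value animates every derived color
  const progress = useRef(new Animated.Value(0)).current;

  const interpolateColor = (key: keyof typeof light) =>
    progress.interpolate({ inputRange: [0, 1], outputRange: [light[key], dark[key]] });

  const colors = {
    background: interpolateColor('background'),
    card: interpolateColor('card'),
    text: interpolateColor('text'),
    border: interpolateColor('border'),
  };

  const setDark = (isDark: boolean) =>
    Animated.timing(progress, { toValue: isDark ? 1 : 0, duration: 250, useNativeDriver: false }).start();

  return <ThemeContext.Provider value={{ colors, setDark }}>{children}</ThemeContext.Provider>;
}

Components then render Animated.View or Animated.Text and read useTheme().colors for their backgroundColor, color, and borderColor styles, so switching themes animates everywhere at once.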

An animated image that shows the settings modal opening and switching between dark and light app themes



Technical Details

This section outlines how I prepare videos for storage and how the application retrieves data about which videos are available.

Data Storage / Backend

All of the video and caption data is stored in S3 (another reason I won't be releasing the app). The app itself does not store a hardcoded list of available content. It requests a top-level manifest file (from https://[s3 bucket]/manifest.json) and receives a list of the available directories in the bucket. The identifiers found in the manifest file are made more human-readable and shown dynamically on the app's home screen. Below is an example of manifest.json:

[
  {"title": "complete-beginner/", "type": "folder"},
  {"title": "beginner/", "type": "folder"},
  {"title": "intermediate/", "type": "folder"}
]
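
As a sketch of how the home screen could consume this manifest and humanize the folder identifiers (the bucket URL is a placeholder, and the exact parsing in the app may differ):

const BUCKET_URL = 'https://[s3 bucket]'; // placeholder, as above

type ManifestEntry = { title: string; type: 'folder' | 'video' };

export async function loadTopLevelManifest(): Promise<{ id: string; label: string }[]> {
  const res = await fetch(`${BUCKET_URL}/manifest.json`);
  const entries: ManifestEntry[] = await res.json();

  return entries
    .filter(e => e.type === 'folder')
    .map(e => {
      const id = e.title.replace(/\/$/, ''); // "complete-beginner/" -> "complete-beginner"
      const label = id
        .split('-')
        .map(word => word[0].toUpperCase() + word.slice(1))
        .join(' '); // -> "Complete Beginner"
      return { id, label };
    });
}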

When navigating to one of those folders in the app, a manifest file is requested from that directory's root (i.e. https://[s3 bucket]/beginner/manifest.json). This manifest lists all the videos:

[
  {
    "title": "a-day-at-ohori-park",
    "type": "video"
  },
  {
    "title": "patreon-intro",
    "type": "video"
  },
  ...
]

The video file, caption data, and thumbnail are all stored in this directory with the same base file name and different extensions (.mp4, .json, .png).
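
In other words, given a folder and base name, the three asset URLs can be derived directly (a hypothetical helper, not the app's actual code):

function assetUrls(bucketUrl: string, folder: string, title: string) {
  const base = `${bucketUrl}/${folder}/${title}`;
  return {
    video: `${base}.mp4`,     // e.g. .../beginner/a-day-at-ohori-park.mp4
    captions: `${base}.json`, // parsed caption data used by the app
    thumbnail: `${base}.png`, // poster image shown in the video list
  };
}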



Video download and processing

Prerequisites:

  1. yt-dlp is used to download videos and captions from YouTube. youtube-dl worked for me at one point but stopped working after a few months; yt-dlp is a fork that does the same job. Install it with pip3 install yt-dlp

  2. ffmpeg is used to convert the thumbnails downloaded by yt-dlp from .webp to .jpg. It can be downloaded from the FFmpeg site



Note: I'm in the middle of changing some of the scripts in /content to be used for generic YouTube downloading. They were previously tied to some automation and web scraping of the Comprehensible Japanese website. I did not want that code to be public, so I'm making a CLI that adds single YouTube videos to the project's video-storage S3 bucket one at a time.

To add videos, run the CLI script with node content/manager. It presents options for editing the manifest files and videos stored in S3 (only adding videos is currently implemented).

The general flow of downloading videos is:

  1. The CLI asks which category the video belongs to. If no categories exist or a new one is needed, that is handled in this step, which creates or modifies manifest files in S3
  2. The CLI asks for the video title and its YouTube ID
  3. Using yt-dlp, the video, Japanese captions, and thumbnail are downloaded
  4. The caption file is parsed and converted into a JSON file for use by the app (a rough sketch of this step follows the list). During this step furigana is automatically added to the captions; user input may be required to confirm ambiguous kanji readings
  5. All files are uploaded to S3 and the manifest file of the selected category is updated
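
For step 4, here is a rough sketch of the caption conversion, assuming yt-dlp produced a WebVTT caption file; the furigana pass and any CLI prompts are omitted, and the Caption shape is an assumption.

type Caption = { start: number; end: number; text: string };

// "00:01:23.450" -> 83.45 (assumes an HH:MM:SS.mmm timestamp)
function timestampToSeconds(ts: string): number {
  const [h, m, s] = ts.split(':');
  return Number(h) * 3600 + Number(m) * 60 + Number(s);
}

export function parseVtt(vtt: string): Caption[] {
  const captions: Caption[] = [];
  // WebVTT cues are separated by blank lines
  for (const block of vtt.split(/\r?\n\r?\n/)) {
    const lines = block.split(/\r?\n/);
    const timing = lines.find(l => l.includes('-->'));
    if (!timing) continue; // skip the WEBVTT header and notes
    const [start, end] = timing.split('-->').map(t => t.trim().split(' ')[0]);
    const text = lines.slice(lines.indexOf(timing) + 1).join('\n').trim();
    if (text) {
      captions.push({ start: timestampToSeconds(start), end: timestampToSeconds(end), text });
    }
  }
  return captions;
}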

Check out the files if you want a complete run-through of how this actually works.



Future improvements

  • Keyboard support. Like tablet support, it's not super useful all the time. I'm more interested in learning how to implement it in the context of a mobile app.