Support screencasts
tomschr opened this issue · 6 comments
Feature description
For our doc evolution project we would like to support screencasts.
Expected behavior
After GeekoDoc and stylesheets are adapted to support screencasts, DAPS needs to find the file and copy/link it to the output folder.
However, we need to clarify some non-technical things:
- some naming convention?
- which file format do we want to support (MP3, Webm, ...?)
- where do we want to store our screencasts? Inside (`images/src/screencasts`?) or outside of our repository? Keep in mind, although a screencast should be quite short, the amount and our maintenance branches can make it big.
- decide if it's necessary to do some post-production or leave it as is? (Maybe something which is outside of daps' control.)
- limit the screencast to a maximum duration?
- how do we deal with screencasts for PDF?
- Embed it? Probably not a good idea as that makes the PDF quite big. It could be an idea IF the file is really short. However, it's not clear yet, if FOP supports that.
- Where to link to the screencast?
- Should we omit the problem and present an alternative presentation? The `<textobject>` element could be a way.
References
- GeekoDoc: openSUSE/geekodoc#95
- SUSE stylesheets: openSUSE/suse-xsl#440
> which file format do we want to support (MP3, Webm, ...?)
- We will probably need both (1) MOV+H.264+AAC (.mp4 or .m4v) and (2) WebM+VP8/VP9+Vorbis/Opus (.webm). Both H.264 and VP8/VP9 are supported in all major browsers now, but: H.264 is patent-encumbered and not playable on every Linux machine. And VP8/VP9 have support issues on older Safari versions--desktop Safari is barely relevant, but Safari on iOS is very important.
This means you probably need both formats available for every video, but WebM-only is becoming an option.
- The AV1 video format is also upcoming but not yet widely supported--I think it's using WebM as the container format as well, though.
- The successor of H.264 is called H.265, but it's basically dead on arrival, because AV1 is technically better, patent-free, and Apple has finally seen the light of free video formats.
- We should definitely not support GIF as a file format. GIFs have huge file sizes, are energy-intensive to play, are restricted to 256 colors, don't support audio and need to be embedded with the `<img>` tag rather than with the `<video>` tag.
- It's worth thinking about subtitles. Those can be supplied in WebVTT format (a relatively simple plain-text file format with a timing/subtitle/empty-line pattern).
  You can either use subtitles in addition to a human speaker or just supply subtitles and remove the speaker. E.g. GNOME uses no audio, just subtitles, in its explanatory videos. This makes their videos very easy to translate, allows changing the text on the fly, reduces file sizes and lets them off the hook with regard to needing professional speakers, but it's definitely not as engaging.
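To illustrate, a subtitles file following the timing/subtitle/empty-line pattern mentioned above might look like this (cue timings and text are invented examples):

```
WEBVTT

00:00:00.000 --> 00:00:04.000
Open YaST and select "Software Management".

00:00:04.500 --> 00:00:09.000
Search for the package you want to install.
```

Each cue is a start/end timestamp line followed by the subtitle text and a blank line, so these files are easy to write, diff and translate.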
> where do we want to store our screencasts?
> Inside (images/src/screencasts?) or outside of our repository? Keep in mind, although a screencast should be quite short, the amount and our maintenance branches can make it big.
They should probably be put somewhere outside of a Git repo. Or maybe in Git LFS (I never bothered to check that out much, though).
For output, it would be possible to use an outsourced video service, like YouTube (bah--it has ads), Vimeo (has paid plans without ads but not WebM-friendly, iirc) or DailyMotion (like YouTube, I think). The advantage of outsourcing is that the service will transcode the video into the necessary formats/sizes automatically and also create a preview image (which you really should have, because otherwise people will have to click on an unattractive black rectangle to start the video). The disadvantage is that you need to find a way to include the outsourced video widget in your page (meaning you need to add a `<script>` tag rather than a `<video>` tag) and need to find a reliable provider that does not serve ads. Given how important China is to SUSE, it's probably also important to find something that's not blocked in China (bah).
> limit the screencast to a maximum duration?
YouTube videos have sharp drop-off rates after ~3-5 minutes, iirc (the number is from a Mozilla study that they did to find out how long people will watch their screencasts). It's probably best to stay below that limit and just do more short videos. This also makes the videos more accessible and easier to skip around in.
> how do we deal with screencasts for PDF?
> - Embed it? Probably not a good idea as that makes the PDF quite big. It could be an idea IF the file is really short. However, it's not clear yet, if FOP supports that.
> - Where to link to the screencast?
> - Should we omit the problem and present an alternative presentation? The `<textobject>` could be a way.
FOP definitely does not support embedding videos in PDFs. I am also not sure if it's still possible to do that at all--it used to work via Flash, but Flash is dead now. So, using a preview image that links to somewhere else really is the best option.
I think the model should be as follows:
- Someone creates a screencast.
- Someone uploads the screencast to a transcoding server, to make sure we get H.264 + WebM versions of each video of a reasonable and consistent file size/quality and have preview images available. We could also consider creating m3u playlists to allow better streaming--this can be used to add a bit of copy-protection too, so it's at least harder to hammer the server with requests until it's dead.
- The transcoding server uploads to a video host server.
- Someone links the finished video into their document.
- When the finished document is viewed:
- HTML document downloads preview image + screencast from the video host
- PDF document shows a generic preview image, e.g. just "Play |>" or something like that, which links directly to the video host.
Steps 2+3 can be outsourced but don't have to be.
- Setting up transcoding is reasonably straightforward, especially at our small scale where we don't care about getting the maximum performance per currency unit. Transcoding may be doable via GitHub Actions but I am not sure that GitHub would be happy about it (it would be "free" though). docserv.suse.de should be able to do it too of course.
- The video host can just be a very basic Apache server, but it needs to be able to handle many large downloads. Infra may be unhappy about us using their AWS servers for this purpose (need to ask). It may be an option to upload to Image Relay, because at least that is made for large downloads, but it's a bit of a question whether we will be happy with linking to their random super-long file names and with that system auto-deleting content after a few years. I also don't know if Image Relay allows viewing anything without login, or uploading via an API. The proper option here is to set up a CDN, I guess.
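The transcoding step (2) above could be little more than a few ffmpeg invocations. As a sketch, assuming ffmpeg with the libx264/libvpx-vp9 encoders is available, a small wrapper could build the command lines like this (file names are placeholders, and real quality/bitrate settings would need tuning):

```python
import shlex


def ffmpeg_commands(src, stem):
    """Build (but do not run) the ffmpeg command lines for one screencast."""
    # MP4 container, H.264 video, AAC audio
    h264 = ["ffmpeg", "-i", src, "-c:v", "libx264", "-c:a", "aac", f"{stem}.mp4"]
    # WebM container, VP9 video, Opus audio
    webm = ["ffmpeg", "-i", src, "-c:v", "libvpx-vp9", "-c:a", "libopus", f"{stem}.webm"]
    # Grab a single frame a few seconds in as the preview image
    poster = ["ffmpeg", "-ss", "3", "-i", src, "-frames:v", "1", f"{stem}.jpg"]
    return [h264, webm, poster]


for cmd in ffmpeg_commands("screencast-input.mov", "screencast"):
    print(shlex.join(cmd))
```

The same three commands could just as well live in a shell script on the transcoding machine; the point is only that the whole "server" boils down to a few deterministic ffmpeg calls plus an upload step.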
Thanks Stefan for your well-put and thoughtful answer. I would like to comment on one aspect: the "transcoding server".
Maybe my impression of "server" is a bit more voluminous than you have anticipated. Not sure if this is really needed. Could we have a more "lightweight" solution? Something like this:
- Someone creates a screencast.
- Someone adds it to the XML source.
- Daps and our stylesheets do:
  a. create the `<video>` element with different `<source>` elements.
  b. transcode the input screencast to different output formats.
  c. create a preview image automatically.
  d. save everything in the `build` directory.
- When the finished document is viewed:
  - HTML document downloads preview image + screencast from the directory (or from our SUSE host). The browser selects the first format it understands.
  - PDF document shows a generic preview image, e.g. just "Play |>" or something like that, which links directly to our SUSE host.
So basically, daps could treat screencasts like images. Sure, some things are still unclear (where to store the screencasts?). However, I think it has some benefits and would solve some issues:
- We are in full control of our files (no issues with China).
- We don't need to pay nor adhere to third-party video sites.
- We don't have to deal with issues like sharp drop-off rates, ads etc.
- We delegate the selection of formats to the browser (see also the article "Creating a cross-browser video player" from Mozilla)
Some cons:
- We need to amend daps to support this
- We need to amend our stylesheets to support the necessary tags and output HTML5
- License issues due to codecs, transcoding etc.?
- Speed?
- Linking in PDF?
Maybe it's not such a great idea.
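For reference, the `<video>`/`<source>` markup that step (a) above would generate might look roughly like this (all file names and the poster image are placeholders):

```html
<!-- The browser walks the <source> children in order and plays the first
     format it supports; listing WebM first prefers the free format. -->
<video controls preload="metadata" poster="screencast-preview.jpg" width="640">
  <source src="screencast.webm" type="video/webm">
  <source src="screencast.mp4" type="video/mp4">
  <track kind="subtitles" src="screencast.en.vtt" srclang="en" label="English">
  Your browser does not support HTML5 video.
  <a href="screencast.mp4">Download the screencast</a> instead.
</video>
```

This is what "delegate the selection of formats to the browser" amounts to: the stylesheets only emit the element, and format negotiation happens entirely on the client.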
> - Daps and our stylesheets do:
> [...]
> b. transcodes the input screencast to different output formats.
> c. creates a preview image automatically.
Essentially, you're taking my idea of the transcoding server and decentralizing it by integrating it into DAPS. [The "server" I suggest does not have to be "voluminous" at all--essentially, we just need something that can take an input video and run a few ffmpeg commands on it (transcode into the necessary video format(s) and create preview image(s)), then move the output to a specified location; after that we can sync everything up as usual. It's basically the same thing you'd also do from within DAPS, except independently of DAPS.] (section added later)
In general, decentralizing and sticking to DAPS is not a bad idea. However, for various reasons, this is problematic here:
- Legal: It would mean adding a dependency on something like `ffmpeg` + an H.264 codec to DAPS. That may work as long as we fly under the radar, but we're definitely not going to be able to put a DAPS package with those dependencies into openSUSE or SLE.
- Usability: If anyone wants to be able to use the video stuff, they will have to add the Packman Essentials repo first. It's the same issue as with GeekoDoc, where people don't know they need a separate repo; except in this case, they need to go through an ad-infested third-party page first to find the right repo URL.
- PDF: To be able to create a link from the PDF to the video, you need a (stable) video URL first. When the video process is embedded into the DAPS build process, you are not going to have that stable URL.
- Time to build: Video transcoding tends to take a long time, adding that into DAPS builds may not be what people expect.
- Duplication of transcoding work/video files: Imagine building an HTML and then a single-HTML version of the same document. Depending on how the video pipeline works, you may not have to re-transcode the video (unless you use `--force`), but you'd definitely end up with multiple copies of potentially large video files.
- Storage: As I mentioned earlier, videos can be large and will long-term likely need to move to a different server, for bandwidth reasons. We really should try to make sure they can easily be rehosted elsewhere when the time comes.
- [Binding the video stuff to the DAPS release cycle: Like most of our tools, DAPS has a fairly unpredictable release cycle. The video stuff will need many releases early on and can easily be made functionally independent. So, I see little reason for coupling it to DAPS.] (bullet added later)
> We don't have to deal with issues like sharp drop-off rates, [...]
The drop-off rates are related to the length of videos, not to their hosting. The reason why I mentioned YouTube in that sentence is that YouTube provides analytics showing which parts of videos are watched most and at what point people leave the video.
Ah, so one more option that may be interesting to us, Asciinema: https://github.com/asciinema/asciinema-player
This uses a JSON-based text "video" format that could efficiently cover a lot of the cases where we (1) only need a command line and (2) don't need audio. Examples: https://asciinema.org/a/163778 or (more graphics!) https://asciinema.org/a/333393
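For context, an asciicast recording is a plain-text file of newline-delimited JSON: a header line followed by timestamped terminal-output events. A minimal sketch of the v2 format (field names and structure quoted from memory, so possibly incomplete):

```
{"version": 2, "width": 80, "height": 24}
[0.1, "o", "$ daps --help\r\n"]
[1.2, "o", "Usage: daps [options]\r\n"]
```

Because the "video" is just recorded terminal text, it stays tiny, can be diffed in Git and lets readers copy commands straight out of the player.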
Good point!
I guess for terminal sessions/screencasts, the Asciinema solution would probably be the best in terms of efficiency, bandwidth, support, copy-and-paste, license etc. (although I haven't checked the details yet).
For GUI-related screencasts, we could still use a usual video format.
That boils down to the question: do we want to support Asciinema in our docs?
Closing this, since delivering videos with DAPS does not seem to be a good idea. Videos should rather be embedded via a video hosting platform.
If you want Asciinema please create a separate issue.