spaam/svtplay-dl

Subtitles downloaded from TV4 Play will sometimes be merged when they should not be

svtplay-dl versions:

4.69

Operating system and Python version:

PRETTY_NAME="Raspbian GNU/Linux 10 (buster)"
NAME="Raspbian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=raspbian
ID_LIKE=debian

Python 2.7.16
Python3 3.7.3

What is the issue:

When downloading subtitles from TV4 Play, two consecutive subtitles with the same text are sometimes combined into one when they should remain separate.

Example: Episode 5 of Tinka och själens spegel
svtplay-dl -S "https://www.tv4play.se/video/791ac92deef1e71c220e/flora"

What it should be like:
44
00:05:32,480 --> 00:05:35,680
-Hallå?
-Det är jag.

45
00:05:57,920 --> 00:06:01,560
-Hallå?
-Det är jag.

What svtplay-dl produces: (Note that this subtitle stays on for 29 seconds!)
44
00:05:32,480 --> 00:06:01,560
-Hallå?
-Det är jag.

Yeah, I see the issue. 🤔 It is related to the split subtitle files: we look at the time code and then the text, and if the text is the same we just update the end time code (as you can see).

I think that the solution would be to look at the end time of the first subtitle, and the start time of the next. Only combine them when the difference is zero or very small.
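
A rough sketch of that idea, in case it helps. This is not svtplay-dl's actual code: the `Cue` type, `to_ms`, `merge_cues`, and the 50 ms default for `max_gap_ms` are all made up for illustration, and the right threshold would need testing against real TV4 Play subtitle segments.

```python
from typing import List, NamedTuple


class Cue(NamedTuple):
    """Hypothetical subtitle cue; times are in milliseconds."""
    start_ms: int
    end_ms: int
    text: str


def to_ms(ts: str) -> int:
    """Convert an SRT timestamp such as '00:05:32,480' to milliseconds."""
    hms, ms = ts.split(",")
    h, m, s = (int(part) for part in hms.split(":"))
    return ((h * 60 + m) * 60 + s) * 1000 + int(ms)


def merge_cues(cues: List[Cue], max_gap_ms: int = 50) -> List[Cue]:
    """Merge consecutive cues only if the text matches AND the gap is tiny.

    max_gap_ms is an assumed tolerance, not a value taken from svtplay-dl.
    """
    merged: List[Cue] = []
    for cue in cues:
        if merged:
            prev = merged[-1]
            gap = cue.start_ms - prev.end_ms
            if cue.text == prev.text and gap <= max_gap_ms:
                # Same cue split across segment files: extend the end time.
                merged[-1] = prev._replace(end_ms=max(prev.end_ms, cue.end_ms))
                continue
        merged.append(cue)
    return merged


text = "-Hallå?\n-Det är jag."

# The two cues from the example above are about 22 seconds apart,
# so they stay separate.
separate = merge_cues([
    Cue(to_ms("00:05:32,480"), to_ms("00:05:35,680"), text),
    Cue(to_ms("00:05:57,920"), to_ms("00:06:01,560"), text),
])

# A cue that really was split at a segment boundary (invented timings)
# is still joined into one.
joined = merge_cues([
    Cue(to_ms("00:05:32,480"), to_ms("00:05:34,000"), text),
    Cue(to_ms("00:05:34,000"), to_ms("00:05:35,680"), text),
])
```

With this check the example from episode 5 keeps cues 44 and 45 apart, while a cue that was genuinely cut in two by the segment split is still stitched back together.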