anidl/multi-downloader-nx

[Feedback]: Problems with vtt2ass on new Q style Hidive subtitles

Closed this issue · 12 comments

Type

Both

Suggestion

There are 3 main new issues with the new Hidive subtitle format that uses Q1, Q2, etc. I'll refer to Urusei Yatsura ep 34 for examples.

\N false negative

The code that adds "\N" checks whether lines start at the same time, but events can be rounded to different centisecond values, so their start times differ and they are not joined.

Q0
00:21:42.333 --> 00:21:44.041 line:70%
<c.Q0>Is that... tempura I smell...?</c>

Q28
00:21:42.334 --> 00:21:44.041 line:77%
<c.Q28>What? But selling tempura</c>

Q28
00:21:42.335 --> 00:21:44.041 line:84%
<c.Q28>door-to-door is unheard of!</c>

The two Q28 lines should be joined here but they are rounded to different start times.

Dialogue: 0,0:21:42.33,0:21:44.04,Q28,,0,0,0,,What? But selling tempura
Dialogue: 0,0:21:42.34,0:21:44.04,Q28,,0,0,0,,door-to-door is unheard of!
Dialogue: 0,0:21:44.04,0:21:45.46,Q0,,0,0,0,,Is that... tempura I smell...?
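To make the rounding problem concrete, here is a minimal sketch of a millisecond-to-centisecond conversion. `vttToAssTime` is a hypothetical helper, not the function vtt2ass actually uses, and rounding to the nearest centisecond is an assumption that happens to reproduce the Q28 start times above.

```typescript
// Hypothetical helper: convert a WebVTT timestamp (HH:MM:SS.mmm) into an
// ASS timestamp (H:MM:SS.cc). Round-to-nearest is an assumption and is
// not taken from the vtt2ass source.
function vttToAssTime(vtt: string): string {
  const m = /^(\d+):(\d{2}):(\d{2})\.(\d{3})$/.exec(vtt);
  if (!m) {
    throw new Error('Unrecognized timestamp: ' + vtt);
  }
  const totalMs =
    ((Number(m[1]) * 60 + Number(m[2])) * 60 + Number(m[3])) * 1000 + Number(m[4]);
  // ASS only stores centiseconds, so the last millisecond digit is lost here.
  const totalCs = Math.round(totalMs / 10);
  const pad = (n: number): string => (n < 10 ? '0' : '') + n;
  const cs = totalCs % 100;
  const s = Math.floor(totalCs / 100) % 60;
  const min = Math.floor(totalCs / 6000) % 60;
  const h = Math.floor(totalCs / 360000);
  return h + ':' + pad(min) + ':' + pad(s) + '.' + pad(cs);
}

// Cues that are 1 ms apart in the VTT can land on different centiseconds.
console.log(vttToAssTime('00:21:42.334')); // 0:21:42.33
console.log(vttToAssTime('00:21:42.335')); // 0:21:42.34
```

With this rounding, an exact-equality check on the start time treats the two Q28 cues as unrelated events.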

Reverse order on all simultaneous lines

The vtt converter adds lines from top to bottom, but ass renders the first line at the bottom and then stacks the next one on top, so every set of simultaneous line events shows up on screen in reverse order. For example, this and also the previous example:

Dialogue: 0,0:15:39.21,0:15:42.46,Q28,,0,0,0,,Lum! Lum! Lum! Lum!
Dialogue: 0,0:15:39.21,0:15:42.46,Q0,,0,0,0,,Aw, jeez. These things are so obnoxious!

Caption events can be joined by \N

When they are joined by \N, their position tags don't work, so they are not positioned properly. Example:

Dialogue: 0,0:22:45.63,0:22:50.33,Q10,,0,0,0,,{\pos(435.2,165.6)}{\an7}L♥vely Darling in Danger!!\N{\pos(460.8,259.2)}{\an7}Foxes in the Moonlight\N{\pos(460.8,345.6)}{\an7}The Home Visit Blues:\N{\pos(422.4,396)}{\an7}Mark Onsen Goes to Space

Thanks for this detailed feedback! I was quite worried about potential problems with this new logic, but wasn't sure in what form they would be. I'll look into implementing the below fixes:

\N false negative

This can probably be fixed by checking for +- 1-3 centiseconds.
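As a rough sketch of that tolerance check (the helper names and the default of 3 centiseconds are assumptions for illustration, not the code that ended up in vtt2ass):

```typescript
// Hypothetical helper: parse an ASS timestamp (H:MM:SS.cc) into centiseconds.
function assTimeToCs(t: string): number {
  const m = /^(\d+):(\d{2}):(\d{2})\.(\d{2})$/.exec(t);
  if (!m) {
    throw new Error('Unrecognized ASS timestamp: ' + t);
  }
  return ((Number(m[1]) * 60 + Number(m[2])) * 60 + Number(m[3])) * 100 + Number(m[4]);
}

// Treat two events as simultaneous when their start times differ by at most
// toleranceCs centiseconds, instead of requiring exact equality.
function startsTogether(a: string, b: string, toleranceCs: number = 3): boolean {
  return Math.abs(assTimeToCs(a) - assTimeToCs(b)) <= toleranceCs;
}

console.log(startsTogether('0:21:42.33', '0:21:42.34')); // true
console.log(startsTogether('0:21:42.33', '0:21:44.04')); // false
```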

Reverse order on all simultaneous lines

I noticed this issue while I was adding the new Hidive API, and it was the cause for adding the multiline subtitle merging logic in the first place (It seemed like the easiest fix).

Caption events can be joined by \N

This can probably be fixed by checking for the \pos tag and ignoring the subtitle if present
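A minimal sketch of such a guard, assuming the merge step has access to the already-converted event text. `hasPosTag` and `canMergeWithNewline` are hypothetical names, not functions from the repository.

```typescript
// True when the event text carries an ASS \pos(x,y) override tag.
function hasPosTag(text: string): boolean {
  return /\\pos\s*\(/.test(text);
}

// Hypothetical merge guard: only join two events with \N when neither one
// is explicitly positioned, because \pos applies to the whole event.
function canMergeWithNewline(prevText: string, nextText: string): boolean {
  return !hasPosTag(prevText) && !hasPosTag(nextText);
}

console.log(canMergeWithNewline('What? But selling tempura', 'door-to-door is unheard of!')); // true
console.log(canMergeWithNewline('{\\pos(460.8,259.2)}{\\an7}Foxes in the Moonlight', 'plain text')); // false
```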

Alright, so I started looking into the false negative, and ran into a weird issue. For me, the time only has two decimal places, not three. What OS and version are you using (prebuilt binary? code? node version?)?

I did fix the issue with subtitles with positional data being merged though.

For the vtt? I just checked it through the website directly, not sure if it's different for api.

Alright, next release should have the fixes. If you are able, could you compile the latest commit and give it a test?

Unfortunately, I don't have a CDM to test Hidive, but looking at the code, it looks like problems 1 and 3 should be solved. There's still the issue in problem 2 that I don't think is addressed, though, unless I'm missing something. When they are different styles, the lines won't be joined and they will display in the wrong order.

Either you could reverse the order the lines appear in the .ass (then you have to be careful about the differently rounded start times) or you could try to add the \N\r style thing back to preserve the order.

I think fixing 1 should effectively fix 2, but correct me if I'm wrong. Also, is there a source on .ass rendering things backwards? The closest I could find was this: http://www.tcax.org/docs/ass-specs.htm under general information number 4, which says that theoretically the order of events shouldn't matter? And when I check .ass subtitles on, say, Crunchy, they are in chronological order rather than reverse order.

Not so much backwards, it's more like it stacks from bottom up intentionally. Order doesn't matter except for when events start at the same time.

Dialogue: 0,0:15:39.21,0:15:42.46,Q28,,0,0,0,,Lum! Lum! Lum! Lum!
Dialogue: 0,0:15:39.21,0:15:42.46,Q0,,0,0,0,,Aw, jeez. These things are so obnoxious!

is like this on playback:
[image: Converted]

But on hidive it displays like this:
[image: Hidive]

Looked into it a little more, and this is likely because we're currently not parsing the line % for non-captions. At least, that was the logic for the old API, when we could tell what was a caption and what wasn't.
If line 337 is changed to this:

let type = txt.style.match(/Caption/i) || txt.style.match(/Q/i) ? 'caption' : (txt.style.match(/SongCap/i) ? 'song_cap' : 'subtitle');

It parses the line %. But for whatever reason, hidive has their line %'s unusually high, and I think some offsetting is going on in their player. So the subtitles end up super high like this:
[image]

Those two generated lines with that enabled:

Dialogue: 0,0:15:39.21,0:15:42.46,Q28,,0,0,0,,{\pos(640,554.4)}Lum! Lum! Lum! Lum!
Dialogue: 0,0:15:39.21,0:15:42.46,Q0,,0,0,0,,{\pos(640,604.8)}Aw, jeez. These things are so obnoxious!
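The \pos values in those two lines are consistent with mapping the cue's line percentage onto a 1280x720 script resolution and centering horizontally (77% of 720 is 554.4, 84% is 604.8). The sketch below reproduces that arithmetic; the resolution and the function name are inferred from the numbers, not taken from the vtt2ass source.

```typescript
// Hypothetical helper: turn a WebVTT "line:NN%" setting into an ASS \pos tag.
// The 1280x720 defaults are an assumption inferred from the output above.
function linePercentToPos(
  linePercent: number,
  playResX: number = 1280,
  playResY: number = 720
): string {
  const x = playResX / 2;                   // horizontally centered
  const y = (linePercent * playResY) / 100; // percentage of the frame height
  return '{\\pos(' + x + ',' + y + ')}';
}

console.log(linePercentToPos(77)); // {\pos(640,554.4)}
console.log(linePercentToPos(84)); // {\pos(640,604.8)}
```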

I'm not sure inverting the subtitle list is the right call, but I could give it a shot. It would theoretically be as easy as changing line 190 to this:

const lines = vttStr.replace(/\r?\n/g, '\n').split('\n').reverse();

and then inverting the subtraction of currentStart and previousStart

I also don't think \pos on every line is a great idea. You just need to reverse lines with the same start time. So maybe just make sure that start times within +- a few centiseconds get the same start time in .ass. Then do a second pass to reorder lines with the same start time. Optionally, you could also exclude reordering lines with a pos tag, but that doesn't matter too much.
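A minimal sketch of that two-step idea, assuming events arrive in VTT order with start times already in centiseconds. The `AssEvent` shape, the function name, and the 3 cs default are all hypothetical, and the optional \pos exclusion is left out.

```typescript
// Hypothetical event shape; not the structure vtt2ass uses internally.
interface AssEvent {
  startCs: number;
  endCs: number;
  style: string;
  text: string;
}

// Pass 1: snap start times that fall within toleranceCs of the first event
// in the current group onto that event's start time.
// Pass 2 (done per group when it closes): emit the group in reverse order so
// the last VTT cue comes first in the .ass file.
function snapAndReorder(events: AssEvent[], toleranceCs: number = 3): AssEvent[] {
  const out: AssEvent[] = [];
  let group: AssEvent[] = [];

  const flush = (): void => {
    for (let i = group.length - 1; i >= 0; i--) {
      out.push(group[i]);
    }
    group = [];
  };

  for (const ev of events) {
    if (group.length > 0 && Math.abs(ev.startCs - group[0].startCs) <= toleranceCs) {
      group.push({ ...ev, startCs: group[0].startCs });
    } else {
      flush();
      group.push({ ...ev });
    }
  }
  flush();
  return out;
}
```

Because a whole group is reversed at once, this also handles three or more simultaneous lines, which a swap of adjacent pairs would not.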

Alright I think (hope) that the above works. Lol

I think we're almost there. You're gonna hate me but I have an edge case here where that might not work.
https://www.hidive.com/video/586601?seasonId=20687

Q0
00:00:24.833 --> 00:00:26.291 line:70%
<c.Q0>Tsk...</c>

Q2
00:00:24.834 --> 00:00:26.291 line:77%
<c.Q2>Whoa, look at it go!</c>

Q0
00:00:24.835 --> 00:00:26.291 line:84%
<c.Q0>It's gettin' everywhere!</c>

Basically, it's not gonna work for 3+ lines with the same start. It'll be

line 1 -> line 2
line 2 -> line 3
line 3 -> line 1

instead of

line 1 -> line 3
line 2 -> line 2
line 3 -> line 1

Also, if they get rounded to different start times, that might mess up the order too, which is what I meant about aligning them to the same start time if they're similar enough. Sorry to be so specific about this issue. I work with hidive subtitles for some of my projects, so I just want this to be correct. Thanks for being so responsive.

After a lot of thinking, I couldn't come up with a better method that could actually be implemented without massively overcomplicating things. So I have now inverted the order of all the subtitles, which will hopefully be more accurate (thanks hidive for inverting subtitles that have the same start time). I guess let me know if that actually works as intended, but initial tests look like it might.