Missing hex and decoded data in `.ccd` output for old broadcast recording
micolous opened this issue · 1 comments
I tried to run `caption-inspector` on an old US TV broadcast with CEA-608 captions, taken from the hls.js demo page: https://playertest.longtailvideo.com/adaptive/captions/playlist.m3u8
I downloaded the recording using `youtube-dl`, and have attached it to this issue (in a ZIP file so GitHub doesn't try to transcode it): cnn-live.mp4.zip
I was able to play back the downloaded file, with captions, in VLC without problems.
I ran `caption-inspector` on Ubuntu 20.04 at commit 476326f, and patched the `Makefile` to build with gcc 9.3 rather than `clang` (Issue #13).
I then tried to extract the CEA-608 tracks with:
mkdir /tmp/cnn
./caption-inspector -o /tmp/cnn cnn-live.mp4
cd /tmp/cnn
zip -9 cnn.zip cnn-live*
All of the outputs I got are attached: cnn.zip
I got a correct-looking `cnn-live-C1.608` with captions from the program:
00:00:00,755 - {RCL} {ENM} {ENM} {R1:C4} {R1:C4} {TO2} {TO2} "BUT HE HAD PURINA CAT CHOW" {R2:C16} {R2:C16} {TO3} {TO3} "INDOOR."
00:00:01,930 - {EOC}
However, `cnn-live.ccd` has timestamps and the fully-decoded text, but is missing the "hex data" and "decoded data" columns:
00:00:01,049
TEXT: Ch1 - "BU"
00:00:01,091
TEXT: Ch1 - "T "
00:00:01,133
TEXT: Ch1 - "HE"
00:00:01,175
TEXT: Ch1 - " H"
00:00:01,217
TEXT: Ch1 - "AD"
By contrast, I got proper "hex data" and "decoded data" when I ran `caption-inspector` against two other kinds of input: a somewhat more modern US broadcast capture (720p59.94 with CEA-608 and 708 captions), and files created with libcaption's flv+srt tool (which produces possibly-not-quite-valid CEA-608 captions):
00:00:01,936 F1:5468 PS:4322 PD:5468 PD:0000 XD:0000 Ch1: "Th" <-Srvc:01 G0:T|G0:h ?00?|?00? _________ Chan-1: "T" "h" <--Seq:1 P006-B02 G0Svc:01|G0Svc:01 ???-0x00|???-0x00 _________________
XD:0000 XD:0000 XD:0000 XD:0000 XD:0000 _________ _________ _________ _________ _________ _________________ _________________ _________________ _________________ _________________
TEXT: Ch1 - "Th" Svc1 - "Th"
00:00:01,952 F2:8080 XD:0000 XD:0000 XD:0000 XD:0000 F2 - NULL _________ _________ _________ _________ 608: Field 2 NULL _________________ _________________ _________________ _________________
XD:0000 XD:0000 XD:0000 XD:0000 XD:0000 _________ _________ _________ _________ _________ _________________ _________________ _________________ _________________ _________________
00:00:01,969 F1:E5F2 PS:8322 PD:6572 PD:0000 XD:0000 Ch1: "er" <-Srvc:01 G0:e|G0:r ?00?|?00? _________ Chan-1: "e" "r" <--Seq:2 P006-B02 G0Svc:01|G0Svc:01 ???-0x00|???-0x00 _________________
XD:0000 XD:0000 XD:0000 XD:0000 XD:0000 _________ _________ _________ _________ _________ _________________ _________________ _________________ _________________ _________________
TEXT: Ch1 - "er" Svc1 - "er"
00:28:17,000 F1:94AE F1:9420 F1:9140 F1:C7F2 F1:E561 Ch1 {ENM} Ch1 {RCL} Ch1 - PAC Ch1: "Gr" Ch1: "ea" Erase NonDisp Mem ResumeCaptLoading _Row:01 - White_ Chan-1: "G" "r" Chan-1: "e" "a"
F1:F420 F1:F7EF F1:F26B F1:AE80 F1:91E0 Ch1: "t " Ch1: "wo" Ch1: "rk" Ch1 - "." Ch1 - PAC Chan-1: "t" " " Chan-1: "w" "o" Chan-1: "r" "k" Channel - 1: "." _Row:02 - White_
F1:5B4C F1:6175 F1:6768 F1:F4E5 F1:F25D Ch1: "[L" Ch1: "au" Ch1: "gh" Ch1: "te" Ch1: "r]" Chan-1: "[" "L" Chan-1: "a" "u" Chan-1: "g" "h" Chan-1: "t" "e" Chan-1: "r" "]"
TEXT: Ch1 - "Great work.[Laughter]."
I'm pretty sure that the issue is triggered by the source file having `cc_count < 5`. Caption Inspector only tries to print anything if there are at least 5 blocks:
caption-inspector/src/sink/cc_data_output.c, lines 257 to 274 in 476326f
This then trips an assert later on:
caption-inspector/src/sink/cc_data_output.c, line 305 in 476326f
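
To make the suspected failure mode concrete, here is a minimal, self-contained sketch in C. It is not the code from `cc_data_output.c`: the names `ROW_WIDTH`, `format_strict` and `format_padded` are hypothetical, the output layout is simplified, and the input values are purely illustrative. It only shows how a formatter that insists on a full row of 5 entries produces no hex output for a short packet, while a variant that pads the row still prints what it has.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ROW_WIDTH 5 /* hypothetical number of columns per printed row */

/* Append text to out, tracking the used length; returns -1 if it won't fit. */
static int append(char *out, size_t out_size, size_t *used, const char *text)
{
    size_t len = strlen(text);
    if (*used + len >= out_size) {
        return -1;
    }
    memcpy(out + *used, text, len + 1);
    *used += len;
    return 0;
}

/* Tolerant variant: prints every entry, padding a short row with blanks.
 * Returns the number of rows written, or -1 if out is too small. */
static int format_padded(const unsigned *pairs, int count,
                         char *out, size_t out_size)
{
    size_t used = 0;
    int rows = (count + ROW_WIDTH - 1) / ROW_WIDTH;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < ROW_WIDTH; c++) {
            int idx = r * ROW_WIDTH + c;
            char cell[8] = "____"; /* blank placeholder for a missing entry */

            if (idx < count) {
                snprintf(cell, sizeof cell, "%04X", pairs[idx] & 0xFFFFu);
            }
            if (append(out, out_size, &used, cell) != 0 ||
                append(out, out_size, &used,
                       c == ROW_WIDTH - 1 ? "\n" : " ") != 0) {
                return -1;
            }
        }
    }
    return rows;
}

/* Strict variant: mimics the suspected behaviour by emitting nothing
 * unless at least one full row of ROW_WIDTH entries is available. */
static int format_strict(const unsigned *pairs, int count,
                         char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    if (count < ROW_WIDTH) {
        return 0; /* short packet: no hex output at all */
    }
    return format_padded(pairs, count, out, out_size);
}
```

With two entries, `format_strict` leaves the buffer empty, which would match a `.ccd` file that has timestamps and text but no hex columns, whereas `format_padded` writes a single row with three blank cells. Whether padding is the right fix for Caption Inspector itself is a separate question for the maintainers.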