hang-qi/CDI

Combine half-hour transcripts

hang-qi opened this issue · 0 comments

Some transcripts for one-hour shows are split into two half-hour files. We need to combine them before we can perform matching. For example:

-rw-r--r-- 1 csa csa 20355 May 31 09:07 2013-05-07_0900_US_CNN_Early_Start_With_John_Berman_And_Zoraida_Sambolin.rawtxt
-rw-r--r-- 1 csa csa 23078 May 31 09:07 2013-05-07_0930_US_CNN_Early_Start_With_John_Berman_And_Zoraida_Sambolin.rawtxt
-rw-r--r-- 1 csa csa 23930 May 31 09:07 2013-05-07_1000_US_CNN_Early_Start_With_John_Berman_And_Zoraida_Sambolin.rawtxt
-rw-r--r-- 1 csa csa 18198 May 31 09:07 2013-05-07_1030_US_CNN_Early_Start_With_John_Berman_And_Zoraida_Sambolin.rawtxt
-rw-r--r-- 1 csa csa 26353 May 31 09:07 2013-05-07_1100_US_CNN_Starting_Point_With_Soledad_OBrien.rawtxt
-rw-r--r-- 1 csa csa 25080 May 31 09:07 2013-05-07_1130_US_CNN_Starting_Point_With_Soledad_OBrien.rawtxt
-rw-r--r-- 1 csa csa 48570 May 31 09:07 2013-05-07_1200_US_CNN_News_Stream.rawtxt
-rw-r--r-- 1 csa csa 25878 May 31 09:07 2013-05-07_1200_US_CNN_Starting_Point_With_Soledad_OBrien.rawtxt
-rw-r--r-- 1 csa csa 19794 May 31 09:07 2013-05-07_1230_US_CNN_Starting_Point_With_Soledad_OBrien.rawtxt
-rw-r--r-- 1 csa csa 33623 May 31 09:05 2013-05-07_1300_US_CNN_Newsroom.rawtxt
-rw-r--r-- 1 csa csa 28194 May 31 09:05 2013-05-07_1330_US_CNN_Newsroom.rawtxt
-rw-r--r-- 1 csa csa 29825 May 31 09:05 2013-05-07_1400_US_CNN_Newsroom.rawtxt
-rw-r--r-- 1 csa csa 20310 May 31 09:06 2013-05-07_1430_US_CNN_Newsroom.rawtxt
-rw-r--r-- 1 csa csa 26273 May 31 09:06 2013-05-07_1500_US_CNN_Newsroom.rawtxt
-rw-r--r-- 1 csa csa 20868 May 31 09:06 2013-05-07_1530_US_CNN_Newsroom.rawtxt
-rw-r--r-- 1 csa csa 24766 May 31 09:05 2013-05-07_1600_US_CNN_Around_World.rawtxt
-rw-r--r-- 1 csa csa 21451 May 31 09:05 2013-05-07_1630_US_CNN_Around_World.rawtxt
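
One possible way to do the combining, as a rough sketch rather than a settled design: parse the date, start time, and show name out of each filename, then pair a file that starts on the hour with the file of the same show that starts 30 minutes later. The filename pattern, the "on the hour" pairing rule, the UTF-8 encoding, and the names `pair_half_hours` and `combine` are all assumptions for illustration. Files with no matching second half (such as the 1200 `News_Stream` file in the listing) are passed through unchanged.

```python
import re
from datetime import datetime, timedelta
from pathlib import Path

# Assumed filename layout: YYYY-MM-DD_HHMM_<country>_<network>_<show>.rawtxt
NAME_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})_(\d{4})_(.+)\.rawtxt$")


def pair_half_hours(filenames):
    """Return (first_half, second_half_or_None) tuples in start-time order."""
    by_key = {}
    for name in filenames:
        m = NAME_RE.match(name)
        if not m:
            continue  # skip names that do not follow the assumed pattern
        date, hhmm, show = m.groups()
        start = datetime.strptime(f"{date} {hhmm}", "%Y-%m-%d %H%M")
        by_key[(show, start)] = name

    pairs, used = [], set()
    for show, start in sorted(by_key, key=lambda k: (k[1], k[0])):
        name = by_key[(show, start)]
        if name in used:
            continue  # already consumed as the second half of a pair
        follow = by_key.get((show, start + timedelta(minutes=30)))
        # Assumption: a one-hour show starts on the hour.
        if start.minute == 0 and follow is not None:
            pairs.append((name, follow))
            used.update({name, follow})
        else:
            pairs.append((name, None))
            used.add(name)
    return pairs


def combine(pairs, src_dir, dst_dir):
    """Write each pair as one file named after its first half."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    for first, second in pairs:
        text = (src_dir / first).read_text(encoding="utf-8")
        if second is not None:
            text += "\n" + (src_dir / second).read_text(encoding="utf-8")
        (dst_dir / first).write_text(text, encoding="utf-8")
```

With this rule, the four `Early_Start` files become two one-hour transcripts (0900+0930 and 1000+1030) and the six `Newsroom` files become three. Whether multi-hour runs of the same show should instead be merged into a single file depends on how the matching step expects its input, which this sketch does not decide.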