declare-lab/MELD

Wrong video files

ATriantafyllopoulos opened this issue · 1 comments

I used this link to download the audio files for this data set.

However, at least a few of the video files and/or their transcriptions have problems:

  • dia309_utt0.mp4: the transcription contains a scene description that needs to be removed ("She doesn't hear him and keeps running, Chandler starts chasing her as the theme to")

  • test_splits_wav/dia220_utt0.mp4: file is wrongly cut (the video is 4 min long and the transcription is way off: the clip shows Ross and Julie meeting Rachel at the airport, not Phoebe talking to Joey)

  • test_splits_wav/dia38_utt4.mp4: file is wrongly cut (video is 5min long)

  • train_splits_wav/dia309_utt0.mp4: file is wrongly cut
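
For anyone who wants to look for other wrongly cut clips, here is a minimal sketch of how the overly long files might be flagged automatically. It assumes `ffprobe` (from FFmpeg) is installed and on the PATH. The helper names (`probe_duration`, `flag_long_clips`, `scan_split`) and the 60-second threshold are my own assumptions for illustration, not part of the MELD tooling.

```python
import subprocess
from pathlib import Path


def probe_duration(path):
    """Return the clip duration in seconds as reported by ffprobe."""
    out = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1",
         str(path)],
        capture_output=True, text=True, check=True,
    ).stdout
    return float(out.strip())


def flag_long_clips(durations, max_seconds=60.0):
    """Return the sorted file names whose duration exceeds max_seconds.

    max_seconds is an assumed cutoff; single utterances should be much
    shorter than multi-minute scenes, but the right value is a guess.
    """
    return sorted(name for name, secs in durations.items() if secs > max_seconds)


def scan_split(split_dir, max_seconds=60.0):
    """Probe every .mp4 in split_dir and list the suspiciously long ones."""
    durations = {p.name: probe_duration(p) for p in Path(split_dir).glob("*.mp4")}
    return flag_long_clips(durations, max_seconds)
```

Calling `scan_split("test_splits_wav")` should then list files such as the 4-minute and 5-minute clips mentioned above. It only catches clips that are too long. A clip of plausible length that is cut at the wrong position would still need a manual check against its transcription.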

In addition, I was able to verify that some of the old problems reported here still persist (e.g. dia793_utt0.mp4).

Have they been solved? Have I perhaps downloaded an old version of the data set?