cvlab-columbia/hyperfuture

Question about the FineGym dataset when testing the model


Hi,

In the script file for running the test on the FineGym dataset there is a parameter "--path_dataset /path/to/datasets/FineGym".

I am confused about what the contents of that path should be and where to get them (I only see annotations of FineGym in the given link). If I download data_info.tar.gz, unzip it to the root folder, and pass the parameter to main.py as
"--path_dataset dataset_info/finegym",
I get an error like this:
[screenshot of the error]

Thanks in advance for your time and help!

The dataset has to be downloaded from the official FineGym channel (the given link you mentioned). Once the data is downloaded, you have to point --path_dataset to it.

The given link contains files with the names of the YouTube videos. You have to download those YouTube videos using any YouTube downloader, and then split them by "events", "actions" (within events), and "stages" (within actions) following the start and end times also provided in those files.

The specific structure we use is:
FineGym:

  • annotations: directory with all the files under the v1.1 section in the given link.
  • categories: directory with all the files under the "Categories" section in the given link.
  • event_videos: videos already split by event, with ID of the form {YouTubeID}_E_{start_event:06d}_{end_event:06d}.mp4
  • action_videos: event videos split by action, with ID of the form {YouTubeID}_E_{start_event:06d}_{end_event:06d}_A_{start_action:06d}_{end_action:06d}.mp4
  • stage_videos: action videos split by stage of the action, with ID of the form {YouTubeID}_E_{start_event:06d}_{end_event:06d}_A_{start_action:06d}_{end_action:06d}_{stage_ID}.mp4

The information about the division into events, actions, and stages is all in the files under v1.1 in the given link.
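As a rough illustration (this helper is not part of our codebase), and assuming the event and segment keys in the annotation JSON already carry the E_{start:06d}_{end:06d} and A_{start:06d}_{end:06d} prefixes, you can enumerate the expected clip names like this:

import json

# Illustrative sketch only: list the event- and action-level clip names
# described above, reading them from the v1.1 annotation file.
with open('annotations/finegym_annotation_info_v1.1.json', 'r') as f:
    annotations = json.load(f)

for youtube_id, events in annotations.items():
    for event_id, event_data in events.items():
        print(f'event_videos/{youtube_id}_{event_id}.mp4')
        if event_data['segments'] is not None:
            for segment_id in event_data['segments']:
                print(f'action_videos/{youtube_id}_{event_id}_{segment_id}.mp4')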

Hi Suris,

Got it, your detailed explanation is very helpful!

Do you have any recommendations on a YouTube downloader or approach? I only have some experience with you-get commands, and it seems very complicated to download that many videos in a specific format with it, so I wonder whether it would be possible to share your Python script (or one in another language) for downloading and splitting the YouTube videos in this repo?

Again, thank you in advance!

Hi,

This is the script that we used. You have to install youtube-dl beforehand:

import json
import os

output_path = './videos'
json_path = 'annotations/finegym_annotation_info_v1.1.json'

if not os.path.exists(output_path):
    os.mkdir(output_path)

data = json.load(open(json_path, 'r'))
youtube_ids = list(data.keys())

# Download each video into its own subdirectory under ./videos.
for youtube_id in youtube_ids:
    vid_loc = output_path + '/' + str(youtube_id)
    url = 'http://www.youtube.com/watch?v=%s' % youtube_id
    if not os.path.exists(vid_loc):
        os.mkdir(vid_loc)
    os.system('youtube-dl -o ' + vid_loc + '/' + youtube_id + '.mp4' + ' -f best ' + url)
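Note that -f best asks youtube-dl for the best format that already contains both video and audio in a single file, so no merge step is needed. The script also creates one subdirectory per video (videos/{youtube_id}/), which is the layout our processing code expects.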

Awesome! We will give it a try, thank you so much : )

Hi Suris : )

I wonder what speed you get when using youtube-dl to download YouTube videos; ours is super low, around 60 kB/s, and it's the same situation when using you-get...
[screenshot of the download speed]

By the way, would you please share the Python script for splitting and annotating the downloaded videos as well? It would save us lots of time 👍

Thank you for your help!!!

I do not know how to solve the speed issue. Sometimes Google limits download speeds, or it may be a problem on your (client) side. You can try downloading lower-quality videos (youtube-dl has that option; I believe you can specify a quality).
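For example, youtube-dl's format filters let you cap the resolution. A minimal sketch (just replace the os.system line in the download script above; the 480p cap is an arbitrary choice):

# 'best[height<=480]' selects the best pre-merged format no taller than
# 480 pixels, using youtube-dl's format-filter syntax.
os.system("youtube-dl -o %s/%s.mp4 -f 'best[height<=480]' %s" % (vid_loc, youtube_id, url))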

The following is the script we used to process the FineGym videos. At the top there is (commented-out) code to reduce the size of the videos, in case you downloaded the high-quality ones.

"""
Cut FineGym dataset videos into clips and subclips.
Each clip contains the whole exercise (event).
Each subclip contains a move/step (segment, action).
"""

"""
Command to reduce video size before executing this code:
-n: Do not overwrite
-y: Do overwrite

for file in /proj/vondrick/datasets/FineGym/videos/*/*.mkv; do
    ffmpeg -y -i "$file" -c:v copy -c:a aac "${file%.*}.mp4"
done

for file in /proj/vondrick/datasets/FineGym/videos/*/*.mp4
do
    if ! grep -q _reduced <<< "$file"; then
        ffmpeg -n -i "$file" -vf "scale=max(256\,2*round((256*iw/ih)/2)):-2" -c:a copy "${file%.*}_reduced.mp4"
    fi
done

# To check video size
for file in /proj/vondrick/datasets/FineGym/videos/*/*.mp4
do
    if grep -q _reduced <<< "$file"; then
        aux=$(ffprobe -v error -select_streams v:0 -show_entries stream=width,height -of csv=s=x:p=0 $file)
        IFS='x' read -r size_width string <<< "$aux"
        if [ $((size_width%2)) -eq "1" ]; then
            echo $file
        fi
    fi
done
"""

import os
import json
import subprocess
from multiprocessing import Pool
from pathlib import Path

folder_dataset = '/path/to/FineGym'
# For 'segment', the 'events' have to be extracted already.
# For 'stage', the 'segments' have to be extracted already (the events alone could be
# enough if there is no case of multi-stage segments in videos with more than one segment).
to_extract = 'segment'  # ['event', 'segment', 'stage']


def main():
    with open(os.path.join(folder_dataset, 'annotations/finegym_annotation_info_v1.1.json'), 'r') as f:
        annotations = json.load(f)

    pool = Pool(processes=30)
    pool.map(process_video, annotations.items())
    # for item in annotations.items():
    #     process_video(item)


def process_video(inputs):
    video_id, events = inputs
    timestamps = []
    paths_new = []
    paths_original = []
    path_original_video = os.path.join(folder_dataset, 'videos', video_id, f'{video_id}_reduced.mp4')
    for event_id, event_data in events.items():
        event_label = event_data['event']
        event_timestamp = event_data['timestamps']
        name_clip = video_id + '_' + event_id
        path_clip = os.path.join(folder_dataset, 'event_videos', f'{name_clip}.mp4')
        if to_extract == 'event':
            paths_original.append(path_original_video)
            paths_new.append(path_clip)
            timestamps.append(event_timestamp[0])
        elif to_extract == 'segment' and event_data['segments'] is not None:
            for segment_id, segment_data in event_data['segments'].items():
                name_subclip = video_id + '_' + event_id + '_' + segment_id
                path_subclip = os.path.join(folder_dataset, 'action_videos', f'{name_subclip}.mp4')
                # this is only to extract the clips with more than 1 stage that were extracted incorrectly before
                # if len(segment_data['timestamps']) > 1:
                paths_original.append(path_clip)
                paths_new.append(path_subclip)
                ts = segment_data['timestamps']
                # Span from the start of the first stage to the end of the last one.
                timestamps.append([ts[0][0], ts[-1][1]])
        elif to_extract == 'stage' and event_data['segments'] is not None:
            num_segments = len(event_data['segments'])
            for segment_id, stages in event_data['segments'].items():
                if stages['stages'] > 1:
                    # Check if there is any case where #segments > 1 and there is a segment with more than one stage.
                    # I think it never happens.
                    if num_segments > 1:
                        print(f'This case happens in event {event_id}, segment {segment_id}')
                    for k in range(stages['stages']):
                        name_clip = video_id + '_' + event_id
                        path_clip = os.path.join(folder_dataset, 'event_videos', f'{name_clip}.mp4')
                        name_subsubclip = video_id + '_' + event_id + '_' + segment_id + '_' + str(k)
                        path_subsubclip = os.path.join(folder_dataset, 'stage_videos', f'{name_subsubclip}.mp4')
                        paths_original.append(path_clip)
                        paths_new.append(path_subsubclip)
                        timestamps.append(stages['timestamps'][k])

    extract_video(paths_original, paths_new, timestamps)


def extract_video(paths_original, paths_new, timestamps):
    for path_original, path_new, timestamp in zip(paths_original, paths_new, timestamps):
        # Skip outputs that already exist and are larger than ~1 KB (already extracted).
        if os.path.isfile(path_original) and not (os.path.isfile(path_new) and Path(path_new).stat().st_size > 1000):
            # -y overwrites
            instruction = f'ffmpeg -y -i {path_original} -ss {timestamp[0]} -to {timestamp[1]} -c:v libx264 -c:a copy {path_new}'
            subprocess.call(instruction, shell=True)


if __name__ == '__main__':
    main()
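Note that the script is meant to be run once per level, changing to_extract each time from coarse to fine ('event' first, then 'segment', then 'stage'), since each level is cut from clips produced by an earlier pass (see the comments next to to_extract).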

Hi Suris,

Thank you so much. The code for splitting the FineGym dataset was effective and helpful, and we ended up using the "yt-dlp" tool to download the YouTube videos; the speed was quite satisfying, around 10 MB/s!

Besides FineGym, we are also interested in Hollywood2. Would you please share your code to split and annotate the Hollywood2 dataset as well 🥺

Many thanks!

I believe Hollywood2 comes already split into clips. You will need to convert the files from .avi to .mp4 if you want to use the video loader in our code. That is pretty straightforward to do, for example with ffmpeg:

import os

input_folder = '/path/to/datasets/Hollywood2/AVIClips'
output_folder = '/path/to/datasets/Hollywood2/AVIClips_mp4'

os.makedirs(output_folder, exist_ok=True)

# Convert every .avi clip in the input folder to .mp4.
for file in os.listdir(input_folder):
    full_path = os.path.join(input_folder, file)
    if os.path.isfile(full_path) and file.endswith('.avi'):
        file_output = os.path.join(output_folder, file.replace('.avi', '.mp4'))
        os.system(f'ffmpeg -i {full_path} {file_output}')
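Without explicit codec flags, ffmpeg re-encodes to its default codecs for .mp4 output (typically H.264 video and AAC audio). If you need to re-run the loop, you can add the -n flag to skip clips that were already converted.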