NUS-HPC-AI-Lab/VideoSys

A question about preprocessing `ucf-101` 🤗

AoqunJin opened this issue · 1 comment

I have the ucf-101 dataset, and its format does not seem to match what preprocess.py expects.

My copy of ucf-101 (from https://www.crcv.ucf.edu/data/UCF101.php) has two folders:

The UCF-101 folder:

$ tree -L 1 UCF-101/
UCF-101/
├── ApplyEyeMakeup
├── ApplyLipstick
├── Archery
...
├── WritingOnBoard
└── YoYo

And the ucfTrainTestlist folder:

$ tree -L 1 ucfTrainTestlist/
ucfTrainTestlist/
├── classInd.txt
├── testlist01.txt
├── testlist02.txt
├── testlist03.txt
├── trainlist01.txt
├── trainlist02.txt
└── trainlist03.txt

I could convert them with a script myself, but

How to deal with that? 🤗❤

This works.

import csv

def split_by_capital(name):
    # BoxingPunchingBag -> Boxing Punching Bag
    new_name = ""
    for i in range(len(name)):
        if name[i].isupper() and i != 0:
            new_name += " "
        new_name += name[i]
    return new_name

# Map class index (kept as a string, e.g. "1") to class name (e.g. "ApplyEyeMakeup").
class_d = {}
with open("./ucfTrainTestlist/classInd.txt", "r") as f:
    for kv in f:
        if not kv.strip():
            continue  # skip blank lines
        k, v = kv.split()
        class_d[k] = v

# Collect the entries of all three training lists, ignoring blank lines.
data_l = []
for split in ("01", "02", "03"):
    with open(f"./ucfTrainTestlist/trainlist{split}.txt", "r") as f:
        data_l.extend(line for line in f if line.strip())

# Each line is "<Class>/<video>.avi <class index>"; turn it into (video path, caption).
for i in range(len(data_l)):
    k, v = data_l[i].split()
    data_l[i] = "./videos/UCF-101/" + k, split_by_capital(class_d[v])

# newline="" stops the csv module from writing extra blank rows on Windows.
with open("./ucfTrainTestlist/data_index.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(data_l)

print("Finish!")
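
One thing to watch: trainlist01 to trainlist03 are three alternative train/test splits of the same videos, so concatenating them lists most videos more than once. Below is a minimal sketch of a variant that keeps each video only once, assuming the same file layout as the script above. The helper names `load_class_names` and `build_rows` are made up for illustration, and the (path, caption) row layout simply mirrors the script above; it is not confirmed as what preprocess.py requires.

```python
import re
from pathlib import Path


def split_by_capital(name):
    # BoxingPunchingBag -> Boxing Punching Bag
    # Insert a space before every capital letter except the first character.
    return re.sub(r"(?<!^)(?=[A-Z])", " ", name)


def load_class_names(class_file):
    # Hypothetical helper: classInd.txt lines look like "1 ApplyEyeMakeup".
    names = {}
    for line in Path(class_file).read_text().splitlines():
        if line.strip():
            index, name = line.split()
            names[index] = name
    return names


def build_rows(list_files, class_names, video_root="./videos/UCF-101/"):
    # Hypothetical helper: train list lines look like
    # "ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1".
    seen = set()
    rows = []
    for list_file in list_files:
        for line in Path(list_file).read_text().splitlines():
            if not line.strip():
                continue  # skip blank lines
            rel_path, index = line.split()
            if rel_path in seen:
                continue  # this video was already listed by another split
            seen.add(rel_path)
            caption = split_by_capital(class_names[index])
            rows.append((video_root + rel_path, caption))
    return rows
```

The returned rows can be written out the same way as in the script above, with `csv.writer(f).writerows(rows)` on a file opened with `newline=""`.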