sign-language-processing/datasets

DGS Types

Closed this issue · 6 comments

In DGS Types, multiple glosses correspond to a single pose file. Is that possible?

For example :

glosses=[ gloss1,gloss2,gloss3]

corresponding to one pose file according to the id?

Should I take gloss1?

I am trying to create a dictionary like:

dict = {gloss: corresponding pose,
....}

AmitMY commented

It is possible, and you should probably take all of them.
This happens when the sign has the same form but multiple meanings.

Looking at types you can see it:
image

In this screenshot, you can see that OBERFLÄCHE1 and BEREICH1A^ both have the same form: https://www.sign-lang.uni-hamburg.de/meinedgs/types/type13215_de.html
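To illustrate "take all of them", here is a minimal sketch of a many-to-one mapping in plain Python. The `records` list, the id `13215` (borrowed from the URL above), and the `_frontal.pose` file naming are assumptions for illustration only; they are not taken from the dataset's actual schema.

```python
# Hypothetical records standing in for dataset entries:
# one id (one sign form) can carry several glosses.
records = [
    {"id": "13215", "glosses": ["OBERFLÄCHE1", "BEREICH1A^"]},
    {"id": "00001", "glosses": ["EXAMPLE1"]},
]


def build_gloss_to_pose(records, suffix="_frontal.pose"):
    """Map every gloss of every record to that record's pose file name."""
    gloss_to_pose = {}
    for record in records:
        # The file naming pattern is an assumption, not a documented convention.
        pose_name = f"{record['id']}{suffix}"
        for gloss in record["glosses"]:
            gloss_to_pose[gloss] = pose_name
    return gloss_to_pose


gloss_to_pose = build_gloss_to_pose(records)
print(gloss_to_pose)
```

Both glosses of the same type end up pointing to the same pose file, which is the expected outcome when one form has several meanings.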

Also, I am getting this error:

config = SignDatasetConfig(name="dgs_types_poses", version="1.0.0", include_video=False, process_video=False, include_pose="holistic")
dgs_types = tfds.load('dgs_types', builder_kwargs=dict(config=config), data_dir='/content/drive/MyDrive/sing_language_datasets')

TypeError: Error while serializing feature views/pose/data: TensorInfo(shape=(None, None, 1, 576, 3), dtype=float32): 'NoneType' object cannot be interpreted as an integer

> It is possible, and you should probably take all of them. This happens when the sign has the same form but multiple meanings.
>
> Looking at types you can see it: image
>
> In this screenshot, you see OBERFLÄCHE1 and BEREICH1A^ are both the same form https://www.sign-lang.uni-hamburg.de/meinedgs/types/type13215_de.html

But in some samples, the glosses are quite different from each other. Should I accept that these share the same pose?

I am doing it like this:


import os

import tensorflow_datasets as tfds
from tqdm import tqdm

import sign_language_datasets.datasets  # registers the datasets with tfds
from sign_language_datasets.datasets.config import SignDatasetConfig

config = SignDatasetConfig(name="dgs_types_video", version="1.0.0", include_video=False, process_video=False, include_pose=None)
dgs_types = tfds.load('dgs_types', builder_kwargs=dict(config=config), data_dir='/content/drive/MyDrive/SignLanguage/dataset_slp')
decode_str = lambda s: s.numpy().decode('utf-8')
c_dict = {}  # gloss -> pose file name
for datum in tqdm(dgs_types["train"]):
    _id = decode_str(datum['id'])
    # NOTE: pose_download_path (the download URL) and correct_gloss are not defined in this snippet as posted
    !wget -q {pose_download_path} -P /content/drive/MyDrive/SignLanguage/dgs_poses
    pose_file_path = f"/content/drive/MyDrive/SignLanguage/dgs_poses/{_id}_frontal.pose"
    if os.path.exists(pose_file_path):
        # every gloss of this type points to the same pose file
        for gloss in datum["glosses"]:
            gloss = correct_gloss(gloss)
            c_dict[gloss] = f"{_id}_frontal.pose"

AmitMY commented

> TypeError: Error while serializing feature views/pose/data: TensorInfo(shape=(None, None, 1, 576, 3), dtype=float32): 'NoneType' object cannot be interpreted as an integer

Could you please open a different issue for this?

> But in some samples, the glosses are quite different from each other. Should I accept that these share the same pose?

This is according to the Hamburg people. I think you should accept the same pose despite the different words.
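If you want to inspect which glosses end up sharing a pose, a rough sketch is to invert the gloss-to-pose dictionary. The sample `gloss_to_pose` entries below are hypothetical placeholders, not real dataset output.

```python
from collections import defaultdict


def group_glosses_by_pose(gloss_to_pose):
    """Invert a gloss -> pose mapping into pose -> sorted list of glosses."""
    pose_to_glosses = defaultdict(list)
    for gloss, pose_name in gloss_to_pose.items():
        pose_to_glosses[pose_name].append(gloss)
    return {pose: sorted(glosses) for pose, glosses in pose_to_glosses.items()}


# Hypothetical mapping; in practice this would be the dictionary built from the dataset.
gloss_to_pose = {
    "OBERFLÄCHE1": "13215_frontal.pose",
    "BEREICH1A^": "13215_frontal.pose",
    "EXAMPLE1": "00001_frontal.pose",
}

# Keep only the poses that are shared by more than one gloss.
shared = {
    pose: glosses
    for pose, glosses in group_glosses_by_pose(gloss_to_pose).items()
    if len(glosses) > 1
}
print(shared)
```

Listing the shared poses this way makes it easy to spot-check the cases where the glosses look unrelated.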

> I am doing it like this

Your code looks good to me

Thanks, AmitMY. Your work is amazing :)