A text-to-gloss-to-pose-to-video pipeline for spoken to signed language translation.
Demos available for:
- 🇩🇪 Swiss German Sign Language 🇨🇭
- 🇫🇷 French Sign Language of Switzerland 🇨🇭
- 🇮🇹 Italian Sign Language of Switzerland 🇨🇭
The paper is available on arXiv and was presented at AT4SSL 2023.

To install the package, run:
pip install git+https://github.com/ZurichNLP/spoken-to-signed-translation.git
Then, to download a lexicon, run:
download_lexicon \
--name <signsuisse> \
--directory <path_to_directory>
For language codes, we use the IANA Language Subtag Registry.

Our pipeline provides multiple scripts. To quickly demo the pipeline using a dummy lexicon, run:
git clone https://github.com/ZurichNLP/spoken-to-signed-translation
cd spoken-to-signed-translation
text_to_gloss_to_pose \
--text "Kleine Kinder essen Pizza." \
--glosser "simple" \
--lexicon "assets/dummy_lexicon" \
--spoken-language "de" \
--signed-language "sgg" \
--pose "quick_test.pose"
This script translates input text into gloss notation.
text_to_gloss \
--text <input_text> \
--glosser <simple|spacylemma|rules|nmt> \
--spoken-language <de|fr|it> \
--signed-language <sgg|ssr|slf>
This script converts a pose file into a video file.
pose_to_video \
--pose <pose_file_path>.pose \
--video <output_video_file_path>.mp4
This script translates input text into gloss notation, then converts the glosses into a pose file.
text_to_gloss_to_pose \
--text <input_text> \
--glosser <simple|spacylemma|rules|nmt> \
--lexicon <path_to_directory> \
--spoken-language <de|fr|it> \
--signed-language <sgg|ssr|slf> \
--pose <output_pose_file_path>.pose
This script translates input text into gloss notation, converts the glosses into a pose file, and then transforms the pose file into a video.
text_to_gloss_to_pose_to_video \
--text <input_text> \
--glosser <simple|spacylemma|rules|nmt> \
--lexicon <path_to_directory> \
--spoken-language <de|fr|it> \
--signed-language <sgg|ssr|slf> \
--video <output_video_file_path>.mp4
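When the scripts are driven from Python rather than from a shell, it can help to assemble the command programmatically. The sketch below is a minimal, hypothetical wrapper: the names `build_pipeline_command` and `run_pipeline` are assumptions for illustration and are not part of the package. It uses only the flags documented above and assumes the `text_to_gloss_to_pose_to_video` entry point is on `PATH` after installation.

```python
import shlex
import shutil
import subprocess

# Glosser names as listed in the usage strings above.
GLOSSERS = ("simple", "spacylemma", "rules", "nmt")


def build_pipeline_command(text, glosser, lexicon, spoken_language,
                           signed_language, video):
    """Assemble the argument list for text_to_gloss_to_pose_to_video.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if glosser not in GLOSSERS:
        raise ValueError(f"unknown glosser: {glosser!r}")
    return [
        "text_to_gloss_to_pose_to_video",
        "--text", text,
        "--glosser", glosser,
        "--lexicon", lexicon,
        "--spoken-language", spoken_language,
        "--signed-language", signed_language,
        "--video", video,
    ]


def run_pipeline(command):
    """Run the command, failing early if the script is not installed."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} not found on PATH")
    subprocess.run(command, check=True)


if __name__ == "__main__":
    cmd = build_pipeline_command(
        text="Kleine Kinder essen Pizza.",
        glosser="simple",
        lexicon="assets/dummy_lexicon",
        spoken_language="de",
        signed_language="sgg",
        video="quick_test.mp4",
    )
    # Print the shell-quoted command; call run_pipeline(cmd) to execute it.
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when the input text contains spaces or punctuation.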
The pipeline consists of three main components:
- Text-to-Gloss Translation: Transforms the input (spoken language) text into a sequence of glosses, using one of the following glossers:
  - Simple lemmatizer
  - spaCy lemmatizer: more accurate but slower lemmatization, covering fewer languages than the simple lemmatizer
  - Rule-based word reordering and dropping component
  - Neural machine translation system
- Gloss-to-Pose Conversion:
  - Lookup: Uses a lexicon of signed languages to convert the sequence of glosses into a sequence of poses.
  - Pose Concatenation: The poses are then cropped, concatenated, and smoothed, creating a pose representation for the input sentence.
- Pose-to-Video Generation: Transforms the processed pose sequence back into a synthesized video using an image translation model.
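To make the first two components concrete, here is a toy sketch of the gloss lookup and concatenation idea. It is not the package's implementation: the function names, the in-memory lexicon, the uppercase gloss convention, and the use of linear interpolation as a stand-in for smoothing are all assumptions made for illustration. Poses are represented as plain lists of frames, where each frame is a flat list of coordinates.

```python
from typing import Dict, List

Frame = List[float]        # one pose frame: flattened keypoint coordinates
PoseSeq = List[Frame]      # a sign: a sequence of frames


def toy_gloss(text: str) -> List[str]:
    """Toy stand-in for a glosser: strip punctuation and uppercase each word."""
    words = (w.strip(".,;:!?\"'") for w in text.split())
    return [w.upper() for w in words if w]


def lookup(glosses: List[str], lexicon: Dict[str, PoseSeq]) -> List[PoseSeq]:
    """Map each gloss to its pose sequence, skipping glosses not in the lexicon."""
    return [lexicon[g] for g in glosses if g in lexicon]


def concatenate(poses: List[PoseSeq], transition_frames: int = 2) -> PoseSeq:
    """Join pose sequences, inserting linearly interpolated transition frames."""
    result: PoseSeq = []
    for pose in poses:
        if not pose:
            continue
        if result:
            start, end = result[-1], pose[0]
            for i in range(1, transition_frames + 1):
                t = i / (transition_frames + 1)
                result.append([a + t * (b - a) for a, b in zip(start, end)])
        result.extend(pose)
    return result


if __name__ == "__main__":
    # Hypothetical two-entry lexicon with 2-dimensional frames.
    lexicon = {
        "KINDER": [[0.0, 0.0], [1.0, 1.0]],
        "PIZZA": [[4.0, 4.0]],
    }
    glosses = toy_gloss("Kleine Kinder essen Pizza.")
    sentence = concatenate(lookup(glosses, lexicon))
    print(glosses)
    print(len(sentence), "frames")
```

In this toy version, words without a lexicon entry are silently dropped; a real system has to decide how to handle such gaps.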
Language | IANA Code | Glossers Supported | Lexicon Data Source |
---|---|---|---|
Swiss German Sign Language | sgg | simple, spacylemma, rules, nmt | SignSuisse (de) |
Swiss French Sign Language | ssr | simple, spacylemma | SignSuisse (fr) |
Swiss Italian Sign Language | slf | simple, spacylemma | SignSuisse (it) |
German Sign Language | gsg | simple, spacylemma, nmt | WordNet (Coming Soon) |
British Sign Language | bfi | simple, spacylemma, nmt | WordNet (Coming Soon) |
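Because not every glosser is available for every signed language, it can be useful to check a combination before launching a run. The following sketch transcribes the table above into a dictionary; the names `SUPPORTED_GLOSSERS` and `is_supported` are hypothetical and not part of the package.

```python
# Support matrix transcribed from the table above
# (signed-language code -> glossers listed for it).
SUPPORTED_GLOSSERS = {
    "sgg": {"simple", "spacylemma", "rules", "nmt"},
    "ssr": {"simple", "spacylemma"},
    "slf": {"simple", "spacylemma"},
    "gsg": {"simple", "spacylemma", "nmt"},
    "bfi": {"simple", "spacylemma", "nmt"},
}


def is_supported(signed_language: str, glosser: str) -> bool:
    """Return True if the table lists this glosser for this signed language."""
    return glosser in SUPPORTED_GLOSSERS.get(signed_language, set())


if __name__ == "__main__":
    for code, glosser in [("sgg", "rules"), ("ssr", "nmt")]:
        print(code, glosser, is_supported(code, glosser))
```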
If you find this work useful, please cite our paper:
@inproceedings{moryossef2023baseline,
title={An Open-Source Gloss-Based Baseline for Spoken to Signed Language Translation},
author={Moryossef, Amit and M{\"u}ller, Mathias and G{\"o}hring, Anne and Jiang, Zifan and Goldberg, Yoav and Ebling, Sarah},
booktitle={2nd International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL)},
year={2023},
month={June},
url={https://github.com/ZurichNLP/spoken-to-signed-translation},
note={Available at: \url{https://arxiv.org/abs/2305.17714}}
}