Sign-lanuage-datasets

These datasets are used for machine-learning research.

Todo

WLASL: A large-scale dataset for Word-Level American Sign Language

Sign language datasets

Co: Country
Class: Classes
Subj: Subjects
LL: Language level (W-Word, S-Sentence, H-Handshape)
Type: Type (V-Video, VR-Video (RGB), VD-Video (depth))
An: Annotations
Av: Availability (CA-Contact Author, PA-Publicly Available, Un-Unknown, Non-Not available)
T: Stored on our hard drive? (Y-Yes, N-No)

id	Dataset name	Co	Class	Subj	Samples	Data	LL	Type	An	Av	T
1	DGS Kinect 40	Ger	40	15	3000		W	V,[9]		PA	Y
2	RWTH-PHOENIX-Weather	Ger	1200	9	45760	52gb	S	V	[18]	PA	Y
3	SIGNUM	Ger	450	25	33210	920gb	S	V		PA,[5]	N
4	GSL 20	Gre	20	6	~840		W			CA	Y
5	Boston ASL LVD	USA	3300+	6	9800		W	V,[9]	[19,20]	PA	N
6	PSL Kinect 30	Pol	30	1	30×10=300	~1.2gb	W	V,[10]		PA	Y
7	PSL ToF 84	Pol	84	1	84×20=1680	~33gb	W	V,[11]		PA	N
8	PSL 101	Pol	?	?	?	?	?	?		CA	N
9	LSA64	Arg	64	10	3200	20gb	W	VR	[21]	PA	Y
10	BosphorusSign	Tur								Non	N
11	MSR Gesture 3D	USA	12	10	336	28mb	W	VD		PA	N
12	DEVISIGN-G	Chi	36[1]	8	432	?	W	VR		CA	N
13	DEVISIGN-D	Chi	500	8	6000	?	W	VR		CA	N
14	DEVISIGN-L	Chi	2000	8	24000	?	W	VR		CA	N
15	IIITA-ROBITA	Ind	23	?		284mb	W	VR,[15]		CA	N
16	Purdue ASL	USA	?	14[3]	?	?	W/S	V,[14]		[6]	N
17	CUNY ASL	USA	?	8	~33000[4]	?	S	VR,[16]	[7]	U	N
18	SignsWorld Atlas	Ara	[2]	10	?	?	W,S,H	V,[17,14]	?	U	N

[1] - letters/numbers; [2] - multiple types; [3] - only 5 available; [4] - glosses; [5] - 1TB, contact author to obtain hard drive; [6] - Request DVDs/HD; [7] - Signstream; [8] - ?; [9] - multiple angles; [10] - depth from Kinect camera; [11] - ToF camera; [12] - ?; [13] - ?; [14] - RGB; [15] - 320x240; [16] - mocap data; [17] - Images; [18] - Face, hand, end/start (unfinished); [19] - Hand; [20] - end/start; [21] - Hands and Head position; [22] - only ASL fingerspelling sequences.
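The legend codes and footnote markers share cells in the table above (for example `PA,[5]` or `V,[17,14]`), so reading the table programmatically takes a small amount of cleanup. The following is a minimal sketch, assuming the table is stored as tab-separated text with the column order of the header row. The helper names (`split_cell`, `parse_row`, `availability`) are hypothetical and not part of this repository, and treating the bare `U` in rows 17 and 18 as `Un` is an assumption.

```python
import re

# Column names, in order, from the header row of the table above.
HEADER = ["id", "Dataset name", "Co", "Class", "Subj", "Samples",
          "Data", "LL", "Type", "An", "Av", "T"]

# Availability codes from the legend.
AVAILABILITY = {
    "CA": "contact author",
    "PA": "publicly available",
    "Un": "unknown",
    "U": "unknown",  # assumption: the bare "U" in rows 17-18 means "Un"
    "Non": "not available",
}

# Matches footnote markers such as [9] or [17,14].
FOOTNOTE = re.compile(r"\[(\d+(?:,\d+)*)\]")


def split_cell(cell):
    """Return (codes, footnote_numbers) for one table cell."""
    notes = [int(n) for group in FOOTNOTE.findall(cell) for n in group.split(",")]
    codes = [c.strip() for c in FOOTNOTE.sub("", cell).split(",") if c.strip()]
    return codes, notes


def parse_row(line):
    """Map one tab-separated table row onto the header names."""
    return dict(zip(HEADER, line.split("\t")))


def availability(row):
    """Spell out the Av codes of a parsed row; unknown codes pass through."""
    codes, _ = split_cell(row["Av"])
    return [AVAILABILITY.get(code, code) for code in codes]


# Row 3 of the table, written with explicit tab separators.
row = parse_row("3\tSIGNUM\tGer\t450\t25\t33210\t920gb\tS\tV\t\tPA,[5]\tN")
print(row["Dataset name"], availability(row), split_cell(row["Av"])[1])
```

Keeping the footnote numbers separate from the codes makes it possible to filter on availability first and look up the caveats (such as footnote [5] for SIGNUM) afterwards.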

Dataset information and related papers

DGS Kinect 40 - German Sign Language
1. Sign Language Recognition using Sub-Units, 2012, Cooper et al.
2. Sign Language Recognition using Sequential Pattern Trees 2012, Ong et al.
3. Sign Spotting using Hierarchical Sequential Patterns with Temporal Intervals 2014, Ong et al.
RWTH-PHOENIX v1, RWTH-PHOENIX v2 - German Sign Language
1. Dataset paper 2012, Forster et al.
2. Dataset extensions paper 2014, Forster et al.
3. Continuous sign language recognition: Towards large vocabulary statistical recognition systems handling multiple signers 2015, Koller et al.
4. May the force be with you: Force-aligned signwriting for automatic subunit annotation of corpora 2013, Koller et al.
5. Deep Sign: Hybrid CNN-HMM for Continuous Sign Language Recognition
SIGNUM - German Sign Language
1. Rapid Signer Adaptation for Continuous Sign Language Recognition Using a Combined Approach of Eigenvoices, MLLR, and MAP 2008, U. von Agris, C. Blömer, K.-F. Kraiss.
2. The Significance of Facial Features for Automatic Sign Language Recognition 2008, U. von Agris, M. Knorr, K.-F. Kraiss.
3. Towards a Video Corpus for Signer-Independent Continuous Sign Language Recognition 2007, U. von Agris, K.-F. Kraiss
4. Rapid Signer Adaptation for Isolated Sign Language Recognition 2006, U. von Agris, D. Schneider, J. Zieren, K.-F. Kraiss.
5. Advanced Man-Machine Interaction. Fundamentals and Implementation K.-F. Kraiss, ed.
6. Recent Developments in Visual Sign Language Recognition 2008, U. von Agris, J. Zieren, U. Canzler, B. Bauer, K.-F. Kraiss.
7. Continuous sign language recognition: Towards large vocabulary statistical recognition systems handling multiple signers 2015, Koller et al.
Greek Sign Language (no website)
1. Sign Language Recognition using Sub-Units, 2012, Cooper et al.
2. Sign Language Recognition using Sequential Pattern Trees 2012, Ong et al.
3. Sign Spotting using Hierarchical Sequential Patterns with Temporal Intervals 2014, Ong et al.
Boston ASLLVD - American Sign Language
1. Exploiting Phonological Constraints for Handshape Inference in ASL Video 2011, Thangali et al.
2. A New Framework for Sign Language Recognition based on 3D Handshape Identification and Linguistic Modeling 2014 - Dilsizian - 84% accuracy
PSL Kinect 30 - Polish Sign Language
1. Polish sign language words recognition with Kinect 2013, Oszust et al.
2. Some Approaches to Recognition of Sign Language Dynamic Expressions with Kinect 2014, Oszust et al.
3. Recognition of Hand Gestures Observed by Depth Cameras 2015, Kapuscinski et al.
PSL ToF 84 - Polish Sign Language
1. Polish sign language words recognition with Kinect 2013, Oszust et al.
2. Recognition of Hand Gestures Observed by Depth Cameras 2015, Kapuscinski et al.
PSL 101 - Polish Sign Language (no website)
1. Modelling and Recognition of Signed Expressions Using Subunits Obtained by Data–Driven Approach 2012, Oszust et al.
LSA64 Argentinian Sign Language
1. LSA64: an Argentinian Sign Language Dataset
2. Sign Languague Recognition Without Frame-Sequencing Constraints: A Proof of Concept on the Argentinian Sign Language
3. Dynamic Gesture Recognition and its Application to Sign Language 2017, Ronchetti
4. Sign Language Recognition Based on Hand and Body Skeletal Data 2017, Konstantinidis et al.
5. Real-Time Sign Language Gesture (Word) Recognition from Video Sequences Using CNN and RNN 2018, Masood et al.
Turkish sign language dataset
MSR Gesture 3D - ASL Download site
1. Action Recognition from Depth Sequences Using Weighted Fusion of 2D and 3D Auto-Correlation of Gradients Features 2016, Chen et al.
DEVISIGN G
DEVISIGN D
DEVISIGN L
IIITA-ROBITA Indian Sign Language Gesture Database
1. Recognizing & Interpreting Indian Sign Language Gesture for Human Robot Interaction 2010, Nandy et al.
2. Recognition of Isolated Indian Sign Language gesture in Real Time 2010, Nandy et al.
Purdue ASL Dataset
CUNY ASL Dataset for Animation
1. Collecting and evaluating the CUNY ASL corpus for research on American Sign Language animation
SignsWorld Atlas; a benchmark Arabic Sign Language database

Handshape feature datasets (handshape/hand posture datasets; not all of them are for sign language)

id	Name	Co	Class	Subj	Samples	Data	Type	Availability
1	ASL Fingerspelling A	USA	24	5	131000	2.1gb	images (depth+rgb)	Free download
2	ASL Fingerspelling B	USA	24	9		317mb	images (depth)	Free download
3	LSA16 handshapes	Arg	16	10	800	7mb	images (rgb)	Free download
4	PSL Fingerspelling ToF	Pol	16	3	960	~290mb	3D point cloud	Free download
5	ISL	Iri	[23]	6	[24]	170mb	segmented images	Free download
6	RWTH-PHOENIX-Weather Handshapes	Ger	60		[25]	+ 17gb	Hand Images (rgb)	Free download
7	Japanese Fingerspelling Dataset	Jap	41	10	8055	4.5mb	[26]	Free download
8	NUS hand posture dataset I	Sin	10	?	240	3mb	images(rgb),160x120	Free download
9	NUS hand posture dataset II	Sin	10	40	2000	73mb	images(rgb)160x120	Free download
10	CIARP	-	10	?	6000	11mb	images(rgb)38x38	Free download
11	RWTH Fingerspelling dataset	Ger
12	Indian Kinect	Ind	40	18	5041	2gb	[27]	Free download
13	[ArASL]	Ara	32	?	54,049	64mb	images(rgb)	Free download
14	ChicagoFSWild	USA	[2]	160	?		images(rgb)	Free download
15	ChicagoFSWild+	USA

[ArASL] - Arabic Alphabets Sign Language Dataset; [2] - multiple types; [23] - 23 static + 3 dynamic; [24] - 58114 frames/468 videos; [25] - 3359 labelled + 17gb unlabeled; [26] - segmented images (rgb), 32x32; [27] - images (rgb+depth) 640x480
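The Data column mixes units and approximate markers (`2.1gb`, `317mb`, `~290mb`, `+ 17gb`, `?`), which makes it awkward to estimate how much disk space a set of downloads needs. A minimal sketch of a converter follows. The function name `parse_size_mb` is hypothetical, and treating 1 gb as 1024 mb is an assumption, since the tables do not say whether the sizes are binary or decimal.

```python
import re

# Megabytes per unit; binary multiples are an assumption, the tables do not say.
UNIT_MB = {"mb": 1.0, "gb": 1024.0, "tb": 1024.0 * 1024.0}

# A number followed by a unit, ignoring prefixes such as "~" or "+ ".
SIZE = re.compile(r"(\d+(?:\.\d+)?)\s*(mb|gb|tb)", re.IGNORECASE)


def parse_size_mb(cell):
    """Return the size in a Data cell as megabytes, or None if absent."""
    match = SIZE.search(cell)
    if match is None:
        return None
    value, unit = match.groups()
    return float(value) * UNIT_MB[unit.lower()]


# A few Data cells from the table above, including unknown and empty ones.
cells = ["2.1gb", "317mb", "7mb", "~290mb", "170mb", "+ 17gb", "?", ""]
known = [mb for mb in map(parse_size_mb, cells) if mb is not None]
print(f"{len(known)} of {len(cells)} cells parsed, {sum(known) / 1024:.1f} GB in total")
```

Cells without a recognizable size return `None` instead of zero, so unknown sizes are not silently counted as empty downloads.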

Dataset information and related papers

ASL Fingerspelling
1. Spelling It Out: Real-Time ASL Fingerspelling Recognition. 2011, Pugeault et al.
2. Recognition of Hand Gestures Observed by Depth Cameras. 2015, Kapuscinski et al.
PSL Fingerspelling ToF
1. Recognition of Hand Gestures Observed by Depth Cameras. 2015, Kapuscinski et al.
LSA16 handshapes
1. Handshape recognition for Argentinian Sign Language using ProbSom. 2016, Ronchetti et al.
2. A Study of Convolutional Architectures for Handshape Recognition applied to Sign Language 2017, Quiroga et al.
ISL Irish Sign Language Letters.
1. A Dataset for Irish Sign Language Recognition 2017, Oliveira et al.
2. A comparison between end-to-end approaches and feature extraction based approaches for Sign Language recognition 2017, Oliveira et al.
RWTH-PHOENIX-Weather 2014 MS Handshapes
1. Deep Hand: How to Train a CNN on 1 Million Hand Images When Your Data Is Continuous and Weakly Labelled 2017, Koller et al.
Japanese Sign Language Dataset
1. Recognition of JSL Finger Spelling Using Convolutional Neural Networks 2017, Hosoe, Sako and Kwolek
2. Learning Siamese Features for Finger Spelling Recognition 2017, Sako and Kwolek
NUS hand posture dataset I
1. Hand posture and face recognition using a Fuzzy-Rough Approach 2010, Pramod Kumar P, Prahlad Vadakkepat, and Loh Ai Poh
2. Hand Posture Recognition Using Convolutional Neural Network
NUS hand posture dataset II
1. Attention Based Detection and Recognition of Hand Postures Against Complex Backgrounds 2013, Pisharady et al.
CIARP 2017
1. Hand Posture Recognition Using Convolutional Neural Network
RWTH Fingerspelling dataset
1. Modeling Image Variability in Appearance-Based Gesture Recognition. In ECCV Workshop on Statistical Methods in Multi-Image and Video Processing
Indian Kinect github
1. Nearest neighbour classification of Indian sign language gestures using Kinect camera 2016, Ansari and Harit
Arabic Alphabets Sign Language Dataset (ArASL)
1. Arabic Alphabet and Numbers Sign Language Recognition
ChicagoFSWild
1. American Sign Language fingerspelling recognition in the wild
2. Fingerspelling recognition in the wild with iterative visual attention
ChicagoFSWild+
1. Fingerspelling recognition in the wild with iterative visual attention

Continuous hand pose

NYU Hand pose dataset
1. Real-Time Continuous Pose Recovery of Human Hands Using Convolutional Networks

Datasets of facial features

Datasets of lip reading features

The table from the paper - LipNet: End-to-End Sentence-level Lipreading

Method	Dataset	Size	Output	Accuracy
Fu et al. (2008)	AVICAR	851	Digits	37.9%
Hu et al. (2016)	AVLetter	78	Alphabet	64.6%
Papandreou et al. (2009)	CUAVE	1800	Digits	83.0%
Chung & Zisserman (2016a)	OuluVS1	200	Phrases	91.4%
Chung & Zisserman (2016b)	OuluVS2	520	Phrases	94.1%
Chung & Zisserman (2016a)	BBC TV	> 400000	Words	65.4%
Gergen et al. (2016)	GRID	29700	Words	86.4%
LipNet	GRID	28775	Sentences	95.2%

Datasets of emotion reading features

Pretrained EmoPy emotion recognition models have been made publicly available (Neurohive)
F2ED: a dataset for facial emotion recognition https://neurohive.io/ru/novosti/f2ed-dataset-dlya-raspoznavaniya-emocij-na-lice/?fbclid=IwAR3krXoMfAJySGuZAQsVkDwPoNIfex44EgLvDJCK5-24kX9hhVYzV_7WS4E

Other info

Kevin Murphy maintains a similar list for Action Recognition Datasets. Other similar websites with sign language dataset compilations are:

Papers that cite datasets that are unavailable:

480 signs, Indian Sign Language
- Segment, Track, Extract, Recognize and Convert Sign Language Videos to Voice/Text 2012, Kishore and Kumar
- Selfie video based continuous Indian sign language recognition system 2017, Rao and Kishore
10 signs, Indian Sign Language
- Recognizing & interpreting Indian Sign Language gesture for Human Robot Interaction 2010, Nandy et al.
24 static handshapes, Indian Sign Language
- Recognition of Indian Sign Language in Live Video 2013, Singha and Das

Hand movement datasets (movement only):

LIBRAS hand movement
1. Hand Movement Recognition for Brazilian Sign Language: A Study Using Distance-Based Neural Networks.

References

TODO

https://github.com/Nikhilkohli1/Real-Time-Interaction-Using-Sign-Language
