Pinned Repositories
ISSAI_SAIDA_Kazakh_ASR
The first industrial-scale open-source Kazakh speech corpus. The KSC2 corpus subsumes two previously introduced corpora, KSC and KazakhTTS2, and supplements them with additional data from other sources. KSC2 contains around 1.2k hours of high-quality transcribed data comprising over 600k utterances.
kaz-image-captioning
ExpansionNet v2 model trained on the COCO dataset with captions translated into Kazakh
Kazakh_TTS
An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis corpus. In KazakhTTS2, the overall size has increased from 93 hours to 271 hours, the number of speakers has risen from two to five (three females and two males), and the topic coverage has been diversified.
KazEmoTTS
An open-source Kazakh Emotional Text-to-Speech Dataset
KazNERD
An open-source Kazakh named entity recognition dataset (KazNERD), annotation guidelines, and baseline NER models.
SpeakingFaces
A large-scale publicly available visual-thermal-audio dataset designed to encourage research in the general areas of user authentication, facial recognition, speech recognition, and human-computer interaction.
TFW
TFW: Annotated Thermal Faces in the Wild Dataset
thermal-facial-landmarks-detection
SF-TL54: Thermal Facial Landmark Dataset with Visual Pairs.
TurkicASR
A multilingual ASR model that can recognize ten Turkic languages: Azerbaijani, Bashkir, Chuvash, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Uyghur, and Uzbek.
TurkicTTS
A multilingual text-to-speech synthesis system for ten lower-resourced Turkic languages: Azerbaijani, Bashkir, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Turkmen, Uyghur, and Uzbek.
ISSAI's Repositories
IS2AI/Kazakh_TTS
An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis corpus. In KazakhTTS2, the overall size has increased from 93 hours to 271 hours, the number of speakers has risen from two to five (three females and two males), and the topic coverage has been diversified.
IS2AI/SpeakingFaces
A large-scale publicly available visual-thermal-audio dataset designed to encourage research in the general areas of user authentication, facial recognition, speech recognition, and human-computer interaction.
IS2AI/TurkicASR
A multilingual ASR model that can recognize ten Turkic languages: Azerbaijani, Bashkir, Chuvash, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Uyghur, and Uzbek.
IS2AI/KazEmoTTS
An open-source Kazakh Emotional Text-to-Speech Dataset
IS2AI/KazNERD
An open-source Kazakh named entity recognition dataset (KazNERD), annotation guidelines, and baseline NER models.
IS2AI/OpenThermalPose
An Open-Source Annotated Thermal Human Pose Dataset
IS2AI/Central-Asian-Food-Dataset
42 food classes from Kazakh National and Central Asian cuisine
IS2AI/faces-in-event-streams
This repo contains code and instructions for the detection of faces in event streams
IS2AI/Kazakh_ASR
IS2AI/IMUWiFine
IS2AI/Soyle
IS2AI/Kazakh-Speech-Commands-Dataset
Kazakh Speech Commands Dataset
IS2AI/KazQAD
An open-source Kazakh Question Answering Dataset
IS2AI/KazLLM_Benchmark
IS2AI/AnyFacePP
IS2AI/visual_assistant
A visual assistant system for blind people.
IS2AI/city-identification
This repo contains a dataset and models for city classification
IS2AI/city-sustainability-indexes
This repo contains code and models for detecting city sustainability indexes
IS2AI/Common-Objects-in-Hemispherical-Images-Dataset
39 classes of objects sampled from the MS COCO dataset captured with a hemispherical/fisheye camera
IS2AI/TatarTTS
TatarTTS: An Open-Source Text-to-Speech Synthesis Dataset for the Tatar Language
IS2AI/Central_Asian_Food_Scenes_Dataset
This is the repository for the Central Asian Food Scenes Dataset
IS2AI/talk-llm
Talk with ChatGPT
IS2AI/construction-sites-detection
This repo contains code and a dataset for training and testing an ML model that performs instance segmentation of construction sites
IS2AI/Global-Gastronomic-Culinary-Dataset
IS2AI/Keyword-MLP-LangID
IS2AI/MMHA-28
MMHA-28: Human Action Recognition Across RGB, Depth, Thermal, and Event Modalities
IS2AI/Multilingual-Speech-Command-Recognition
IS2AI/multispectral-motion-analysis
IS2AI/oylan_car_demo
IS2AI/TatarSCR
An Open-Source Speech Commands Dataset for the Tatar Language