Pinned Repositories
Coco-Nut
Coco-Nut (Corpus of connecting NIHONGO utterance and text) corpus
jsut-label
Context labels and pronunciation data for the JSUT corpus
jtubespeech
lightweight_spkr_anon
Lightweight speaker anonymization [IEEE SLT2021]
multi-speaker-dgp
Official implementation of DGP-based multi-speaker speech synthesis with PyTorch
tdmelodic_openjtalk
tdmelodic for open-jtalk
UTMOS22
UT-Sarulab MOS prediction system using SSL models
UTMOSv2
UTokyo-SaruLab MOS Prediction System
whisper-asr-finetune
xvector_jtubespeech
xvector model on jtubespeech
sarulab-speech's Repositories
sarulab-speech/jtubespeech
sarulab-speech/UTMOS22
UT-Sarulab MOS prediction system using SSL models
sarulab-speech/UTMOSv2
UTokyo-SaruLab MOS Prediction System
sarulab-speech/jsut-label
Context labels and pronunciation data for the JSUT corpus
sarulab-speech/xvector_jtubespeech
xvector model on jtubespeech
sarulab-speech/whisper-asr-finetune
sarulab-speech/lightweight_spkr_anon
Lightweight speaker anonymization [IEEE SLT2021]
sarulab-speech/multi-speaker-dgp
Official implementation of DGP-based multi-speaker speech synthesis with PyTorch
sarulab-speech/tdmelodic_openjtalk
tdmelodic for open-jtalk
sarulab-speech/Coco-Nut
Coco-Nut (Corpus of connecting NIHONGO utterance and text) corpus
sarulab-speech/spatial_voice_conversion
Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals
sarulab-speech/ml-audiocaps
Multi-lingual AudioCaps
sarulab-speech/VMC2024-sarulab-data
sarulab-speech/SaSLaW
Dialogue Speech Corpus with Audio-visual Egocentric Information, "So, what are you Speaking, Listening, and Watching?"
sarulab-speech/visual-onoma-to-wave
Official implementation of Visual onoma-to-wave
sarulab-speech/Mid-Attribute-Speaker-Generation
sarulab-speech/pseudo_speech_decryption
sarulab-speech/demo_CALLS_corpus
CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center (INTERSPEECH2023)
sarulab-speech/demo_ChatGPT_EDSS
ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings (INTERSPEECH2023)
sarulab-speech/bert-japanese
BERT models for Japanese text.
sarulab-speech/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
sarulab-speech/yodas-transcription
Modified transcriptions of YODAS dataset