ga642381
:fist_raised: Ph.D. student @ NTU :fist_raised: Research Scientist Intern @ Meta
National Taiwan University (NTU), Taipei, Taiwan
Pinned Repositories
AudioCodec-Hub
AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models
FastSpeech2
Multi-Speaker PyTorch FastSpeech2: Fast and High-Quality End-to-End Text to Speech :fist:
ML2021-Spring
**Official** 李宏毅 (Hung-yi Lee) Machine Learning 2021 Spring
RobustVC
**ICASSP 2022** 《Toward Degradation-Robust Voice Conversion》 Using speech enhancement and end-to-end denoising training to improve the degradation and adversarial robustness of VC models.
Speech-Prompts-Adapters
This repository surveys papers on prompting and adapters for speech processing.
speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
SpeechGen
《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》
SpeechPrompt
**Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》 Speech processing with the prompting paradigm
SpeechPrompt-v2
《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》 Speech processing with the prompting paradigm
Taiwanese-Whisper
Fine-tune the Whisper model for Taiwanese speech recognition
ga642381's Repositories
ga642381/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
ga642381/ML2021-Spring
**Official** 李宏毅 (Hung-yi Lee) Machine Learning 2021 Spring
ga642381/Speech-Prompts-Adapters
This repository surveys papers on prompting and adapters for speech processing.
ga642381/SpeechPrompt
**Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》 Speech processing with the prompting paradigm
ga642381/FastSpeech2
Multi-Speaker PyTorch FastSpeech2: Fast and High-Quality End-to-End Text to Speech :fist:
ga642381/SpeechPrompt-v2
《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》 Speech processing with the prompting paradigm
ga642381/SpeechGen
《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》
ga642381/Taiwanese-Whisper
Fine-tune the Whisper model for Taiwanese speech recognition
ga642381/RobustVC
**ICASSP 2022** 《Toward Degradation-Robust Voice Conversion》 Using speech enhancement and end-to-end denoising training to improve the degradation and adversarial robustness of VC models.
ga642381/AudioCodec-Hub
AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models
ga642381/Taiwanese-Speech-Synthesis
Taiwanese Speech Synthesis with Tacotron2
ga642381/Taiwanese-Translation
Taiwanese translation with a BERT-based model and an RNN. Includes a collection of Taiwanese text corpora.
ga642381/FlappyBird
:fire: Super Flappy Bird in p5.js
ga642381/moth
虫我研所 Moth Institute, Young Designers' Exhibition (新一代設計展) https://ga642381.github.io/moth
ga642381/TaiwaneseTTS
ga642381/Kai-Wei-Chang-Talks
A repository sharing slides from talks I have given
ga642381/FinanceWeb
ga642381/CA2021-Final
ga642381/neurips2021-sas-react
ga642381/S2VC
ga642381/seamless_communication_emo
Foundational Models for State-of-the-Art Speech and Text Translation
ga642381/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit.
ga642381/speech-language-model
A collection of papers related to speech language models
ga642381/speech_quality
ga642381/TensorFlowTTS
:stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, Korean, and Chinese, and is easy to adapt to other languages)
ga642381/AudioDec
An Open-source Streaming High-fidelity Neural Audio Codec
ga642381/awesome-llm-role-playing-with-persona
Awesome-llm-role-playing-with-persona: a curated list of resources for large language models for role-playing with assigned personas
ga642381/Codec-SUPERB
Audio Codec Speech processing Universal PERformance Benchmark
ga642381/Linguistics-111
ga642381/vision
Datasets, Transforms and Models specific to Computer Vision