Pinned Repositories
2D_marker_detection_using_convolutional_layers
2D marker detection using convolutional and pooling layers.
AM_with_GAN_for_melspectrogram
This repository introduces the application of Activation Maximization to audio-domain data.
gonken-lesson
Lessons provided in the Gonsalves Laboratory
GonKen-Lesson_Sho
Lessons I created for the Gonsalves AI Laboratory.
Latent_Conditional_GAN
This repository introduces our research, LCGAN.
MacST-project-page
research_blog
Source code introduced on the blog 声フェチ野郎の音声生成録 (https://shinshoji01.hatenablog.com/)
Style-Restricted_GAN
This repository introduces our model, Style-Restricted GAN.
Text-Hierarchical-ED
This is an official implementation of our paper published in ICASSP 2024.
text2speech-website
This repository contains the implementation of a website with speech synthesis.
shinshoji01's Repositories
shinshoji01/Text-Hierarchical-ED
This is an official implementation of our paper published in ICASSP 2024.
shinshoji01/Latent_Conditional_GAN
This repository introduces our research, LCGAN.
shinshoji01/MacST-project-page
shinshoji01/alpaca-lora
Instruct-tune LLaMA on consumer hardware
shinshoji01/dbViz
The official PyTorch implementation - Can Neural Nets Learn the Same Model Twice? Investigating Reproducibility and Double Descent from the Decision Boundary Perspective (CVPR'22).
shinshoji01/Docker
shinshoji01/AN-SSDT-Demo
shinshoji01/beaqlejs
BeaqleJS provides a framework to create browser-based listening tests and is purely based on open web standards like HTML5 and JavaScript.
shinshoji01/emotion2vec
Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
shinshoji01/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
shinshoji01/GST-Tacotron
A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
shinshoji01/Hierarchical-ED-Demo
shinshoji01/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
shinshoji01/MacST-Demo
shinshoji01/Matcha-TTS
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
shinshoji01/nonparaSeq2seqVC_code
Implementation code of non-parallel sequence-to-sequence VC
shinshoji01/SECap
shinshoji01/seq2seq-EVC
This is the implementation of our Interspeech 2021 paper: Limited data emotional voice conversion leveraging text-to-speech: two-stage sequence-to-sequence training.
shinshoji01/seq2seq-vc
A sequence-to-sequence voice conversion toolkit.
shinshoji01/sho_util
shinshoji01/Speech-Backbones
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
shinshoji01/SpeechGPT
SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities.
shinshoji01/Tacotron-pytorch
Tacotron series TTS model implemented with PyTorch
shinshoji01/tacotron2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
shinshoji01/Text-Sequential-ED-Demo
shinshoji01/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
shinshoji01/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
shinshoji01/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
shinshoji01/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
shinshoji01/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning large models (InternLM, Llama, Baichuan, Qwen, ChatGLM)