ddlBoJack
Ph.D. student @X-LANCE | Research Intern @BytedanceSpeech | Prev: @megvii-research @microsoft @alibaba-damo-academy
Shanghai Jiao Tong University
Pinned Repositories
Awesome-Speech-Generation
Paper, Code and Statistics for Speech Generation.
Awesome-Speech-Language-Model
Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.
Awesome-Speech-Pretraining
Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech.
emotion2vec
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
HDRR
Official implementation for Hierarchical Deep Residual Reasoning for Temporal Moment Localization
MT4SSL
[INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets
Multimodal_Visualization_Framework
A visualization framework for temporal language grounding.
Speech-Resources
Speech-related labs, companies, resources, internships, and more. Recommendations and self-nominations are welcome.
EmoBox
[INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark
SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
ddlBoJack's Repositories
ddlBoJack/emotion2vec
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
ddlBoJack/Speech-Resources
Speech-related labs, companies, resources, internships, and more. Recommendations and self-nominations are welcome.
ddlBoJack/Awesome-Speech-Pretraining
Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech.
ddlBoJack/Awesome-Speech-Language-Model
Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.
ddlBoJack/MT4SSL
[INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets
ddlBoJack/Awesome-Speech-Generation
Paper, Code and Statistics for Speech Generation.
ddlBoJack/pre-train-dockerfile
An introduction to setting up your speech Docker environment and debugging with VSCode
ddlBoJack/CS-BAOYAN-2022
Discussion group for computer science exam-exempt postgraduate admission (baoyan) (QQ group number: 605176069)
ddlBoJack/ddlBoJack.github.io
ddlBoJack/alpaca-lora
Instruct-tune LLaMA on consumer hardware
ddlBoJack/amlt
A repo for amlt examples.
ddlBoJack/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
ddlBoJack/Awesome-Video-Grounding
A reading list of papers about Visual Grounding.
ddlBoJack/CSLabInfo2022
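A collection of 2022 recruitment ads from labs and advisors for CS exam-exempt postgraduate admission (baoyan). Anyone who wants to post an ad is welcome to open a PR. How about supporting the spirit of the internet?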
ddlBoJack/CSSummerCamp2022
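A collection of 2022 summer camp announcements for CS exam-exempt postgraduate admission (baoyan). Everyone is welcome to share summer camp information. How about supporting the spirit of the internet?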
ddlBoJack/ddlBoJack
ddlBoJack/dynamic-superb
The official repository of Dynamic-SUPERB.
ddlBoJack/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
ddlBoJack/FastHuBERT
ddlBoJack/FunASR
A Fundamental End-to-End Speech Recognition Toolkit
ddlBoJack/Large-Audio-Models
Keep track of big models in audio domain, including speech, singing, music etc.
ddlBoJack/llama
Inference code for LLaMA models
ddlBoJack/Llama-X
Open Academic Research on Improving LLaMA to SOTA LLM
ddlBoJack/MovieChat
🎬💭 chat with over 10K frames of video!
ddlBoJack/NExT-GPT
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
ddlBoJack/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
ddlBoJack/T2A
Project page for "Improving Few-shot Learning for Talking Face System with TTS Data Augmentation" for ICASSP2023
ddlBoJack/team-learning-program
Mainly stores materials for the "Programming, Data Structures and Algorithms" track of Datawhale team learning.
ddlBoJack/UniSpeech
UniSpeech - Large Scale Self-Supervised Learning for Speech
ddlBoJack/whisper
Robust Speech Recognition via Large-Scale Weak Supervision