IamQisir
An idiot with a plan can beat a genius without a plan.
The University of Tokyo, Kashiwa, Chiba, Japan
IamQisir's Stars
dair-ai/Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
typst/typst
A new markup-based typesetting system that is powerful and easy to learn.
google-research/tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
d2l-ai/d2l-en
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
neonbjb/tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
OpenTalker/SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
zauberzeug/nicegui
Create web-based user interfaces with Python. The nice way.
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
holoviz/panel
Panel: The powerful data exploration & web app framework for Python
antgroup/echomimic
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
serp-ai/bark-with-voice-clone
🔊 Text-prompted Generative Audio Model - With the ability to clone voices
anyoptimization/pymoo
NSGA2, NSGA3, R-NSGA3, MOEAD, Genetic Algorithms (GA), Differential Evolution (DE), CMAES, PSO
antgroup/echomimic_v2
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
molvqingtai/WebChat
💬 Chat with anyone on any website.
JohnSnowLabs/nlu
1 line for thousands of state-of-the-art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.
okld/streamlit-elements
Create a draggable and resizable dashboard in Streamlit, featuring Material UI widgets, Monaco editor (Visual Studio Code), Nivo charts, and more!
idiap/coqui-ai-TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
JarodMica/ai-voice-cloning
initialneil/SplattingAvatar
[CVPR2024] Official implementation of SplattingAvatar.
reservoirpy/reservoirpy
A simple and flexible code for Reservoir Computing architectures like Echo State Networks
okld/streamlit-player
A streamlit component to embed video and music players from various websites.
bouzidanas/streamlit-float
A simple module for fixing the vertical position of Streamlit containers relative to viewport instead of page or content
daswer123/deepspeed-windows-wheels
A collection of compiled wheels for deepspeed built for python 3.10 and 3.11 with support for cuda 11.8 and 12.1 for Windows
gerazov/PySFC
Python implementation of the SFC intonation model.
Bomingmiao/NoiseDiffusion
Noise Diffusion for Enhancing Faithfulness in Text-to-Image Synthesis
Nexdata-AI/207-Hours-Japanese-Speaking-English-Speech-Data-by-Mobile-Phone
Japanese Speaking English Speech Dataset