Mikezz1's Stars
svc-develop-team/so-vits-svc
SoftVC VITS Singing Voice Conversion
yandexdataschool/Practical_RL
A course in reinforcement learning in the wild
lucidrains/x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
neuralmagic/deepsparse
Sparsity-aware deep learning inference runtime for CPUs
Doubiiu/DynamiCrafter
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
asteroid-team/asteroid
The PyTorch-based audio source separation toolkit for researchers
lucidrains/lion-pytorch
🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in PyTorch
homebrewltd/ichigo
Local realtime voice AI
QwenLM/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
adefossez/demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
bytedance/SALMONN
SALMONN: Speech Audio Language Music Open Neural Network
lhotse-speech/lhotse
Tools for handling speech data in machine learning projects.
KanatnikovMax/znanie-drevnix
lucidrains/mixture-of-experts
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
lucidrains/rotary-embedding-torch
Implementation of Rotary Embeddings, from the RoFormer paper, in PyTorch
justinjohn0306/so-vits-svc-4.0-v2
SoftVC VITS Singing Voice Conversion
lucidrains/BS-RoFormer
Implementation of Band Split RoFormer, a SOTA attention network for music source separation out of ByteDance AI Labs
IvanDrokin/torch-conv-kan
This project is dedicated to the implementation and research of Kolmogorov-Arnold convolutional networks. The repository includes implementations of 1D, 2D, and 3D convolutions with different kernels, ResNet-like and DenseNet-like models, training code based on accelerate/PyTorch, as well as scripts for experiments with CIFAR-10 and Tiny ImageNet.
DmitryRyumin/ICASSP-2023-24-Papers
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
JorisCos/LibriMix
An open source dataset for source separation
sleepymalc/VSCode-LaTeX-Inkscape
✍️ A way to integrate LaTeX, VS Code, and Inkscape in macOS
ruizhecao96/CMGAN
Conformer-based Metric GAN for speech enhancement
apple/ml-sigma-reparam
Srijith-rkr/Whispering-LLaMA
EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction
Hypotheses-Paradise/Hypo2Trans
Single-blind supplementary materials for NeurIPS 2023 submission
alibabasglab/MossFormer
This repo provides the processed samples of the manuscript "MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-head Transformer with Convolution-augmented Joint Self-Attentions", which was submitted to ICASSP 2023.
georgygospodinov/speech_course
Deep Learning for Speech
ischurov/scientific-computing-2024
Bridging the gap between mathematical courses and ML
fattorib/fusedswiglu
Fused SwiGLU Triton kernels