kojunseo's Stars
RVC-Boss/GPT-SoVITS
As little as 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning).
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
unslothai/unsloth
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
wandb/openui
OpenUI lets you describe UI using your imagination, then see it rendered live.
state-spaces/mamba
Mamba SSM architecture
Lightning-AI/litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
dair-ai/ML-Papers-of-the-Week
🔥Highlighting the top ML papers every week.
axolotl-ai-cloud/axolotl
Go ahead and axolotl questions
jasonppy/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
vikhyat/moondream
tiny vision language model
myshell-ai/MeloTTS
High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
OpenSignLabs/OpenSign
🔥 The free & Open Source DocuSign alternative
enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
evilsocket/cake
Distributed LLM and StableDiffusion inference for mobile, desktop and server.
lifeiteng/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
hila-chefer/Transformer-Explainability
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.
Vchitect/LaVie
[IJCV 2024] LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
Parskatt/RoMa
[CVPR 2024] RoMa: Robust Dense Feature Matching; RoMa is the robust dense feature matcher capable of estimating pixel-dense warps and reliable certainties for almost any image pair.
facebookresearch/audioseal
Localized watermarking for AI-generated speech audio, with state-of-the-art robustness and a very fast detector
praeclarum/transformers-js
Browser-compatible JS library for running language models
openNAMU/openNAMU
Wiki engine with multiple functions
hayeong0/DDDM-VC
Official PyTorch implementation of "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)
Benny-Nottonson/voodoo
A working machine learning framework in pure Mojo 🔥
NTT123/light-speed
A modified VITS that uses ground-truth phoneme durations for better robustness
gau-nernst/learn-cuda
Learn CUDA with PyTorch
kojunseo/mojo-wav
Native WAV file loading for Mojo 🔥
yujingaya/hangul
Utilities to manipulate hangul syllables
kojunseo/Max-Engine-Test
A repository for simple tests of PyTorch and the MAX (Modular) Engine
raondata/AudioPlaza
AudioPlaza: Audio Preprocessing Engine