Pinned Repositories
mini-omni
open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
trading-strategy
unsounded
personal master degree thesis project
LiveTalking
Real time interactive streaming digital human
ComfyUI-Manager
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, this extension provides a hub feature and convenience functions to access a wide range of information within ComfyUI.
onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
ComfyUI-3D-Pack
An extensive node suite that enables ComfyUI to process 3D inputs (Mesh & UV Texture, etc) using cutting edge algorithms (3DGS, NeRF, etc.)
MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
OpenVoice
Instant voice cloning by MIT and MyShell.
irrikrlla's Repositories
irrikrlla/trading-strategy
irrikrlla/unsounded
personal master degree thesis project