Pinned Repositories
realtime_object_detection
fairmotion
Tools to load, process and visualize motion capture data
ai-hub-models
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
Android-Camera-Example
A sample Android camera app
android-Camera2Basic
Migrated:
Audio2BodyDynamics
Audio To Body Dynamics, CVPR 2018
audio2photoreal
Code and dataset for photorealistic Codec Avatars driven from audio
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
AudioLDM-training-finetuning
AudioLDM training, evaluation, and finetuning.
AudioLDM2
Text-to-Audio/Music Generation
hcynomo's Repositories
hcynomo/PantoMatrix
[Speech to motion] PantoMatrix: Co-Speech Talking Head and Gestures Generation
hcynomo/ai-hub-models
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
hcynomo/audio2photoreal
Code and dataset for photorealistic Codec Avatars driven from audio
hcynomo/AudioLDM-training-finetuning
AudioLDM training, evaluation, and finetuning.
hcynomo/AudioLDM2
Text-to-Audio/Music Generation
hcynomo/whisper-vits-svc
Core Engine of Singing Voice Conversion & Singing Voice Clone
hcynomo/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
hcynomo/so-vits-svc
SoftVC VITS Singing Voice Conversion
hcynomo/voice-changer
Realtime Voice Changer
hcynomo/NExT-GPT
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
hcynomo/recognize
👁 👂 Smart media tagging for Nextcloud: recognizes faces, objects, landscapes, music genres
hcynomo/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
hcynomo/VALL-E-X
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io
hcynomo/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
hcynomo/llama-recipes
Examples and recipes for the Llama 2 model
hcynomo/Taiwan-LLaMa
Traditional Mandarin LLMs for Taiwan
hcynomo/lp-music-caps
LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]
hcynomo/MetaGPT
🌟 The Multi-Agent Framework: given a one-line requirement, return PRD, Design, Tasks, Repo
hcynomo/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
hcynomo/llama
Inference code for LLaMA models
hcynomo/MERT
Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training".
hcynomo/CLMR
Official PyTorch implementation of Contrastive Learning of Musical Representations
hcynomo/gpt-ai-assistant
OpenAI + LINE + Vercel = GPT AI Assistant
hcynomo/PoseCameraAPI
Tools to work with the Pose Camera app
hcynomo/realtime_object_detection
hcynomo/mediapipe
hcynomo/EDGE
Official PyTorch Implementation of EDGE (CVPR 2023)
hcynomo/mlkit
A collection of sample apps to demonstrate how to use Google's ML Kit APIs on Android and iOS
hcynomo/ChatGPT-Line-Bot
This is a repository that allows you to integrate ChatGPT into Line.
hcynomo/Speech-Emotion-Analyzer
A neural network model that detects five different emotions in male and female speech audio. (Deep Learning, NLP, Python)