vicident

Speech Technology Center, Kuznech Ltd, LG Electronics, VK.COM, ID R&D

vicident's Stars

RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Language: Python 43.2k 236 1.6k 4.8k
slatedocs/slate
Beautiful static documentation for your API
Language: SCSS 36.1k 503 609252
RVC-Project/Retrieval-based-Voice-Conversion-WebUI
Easily train a good VC model with voice data <= 10 mins!
Language: Python 28.2k 191 1.8k 4k
mozilla/DeepSpeech
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Language: C++ 26.1k 676 2.1k 4k
PaddlePaddle/Paddle
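PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)
Language: C++ 22.6k 715 18.7k 5.7k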
w-okada/voice-changer
Realtime Voice Changer
Language: Python 17.6k 134 1.1k 1.9k
NVIDIA/nvidia-docker
Build and run Docker containers leveraging NVIDIA GPUs
17.3k 453 1.6k 2k
ExistentialAudio/BlackHole
BlackHole is a modern macOS audio loopback driver that allows applications to pass audio to other applications with zero additional latency.
Language: C 16.1k 126 405620
khangich/machine-learning-interview
Machine Learning Interviews from FAANG, Snapchat, LinkedIn. I have offers from Snapchat, Coupang, Stitchfix etc. Blog: mlengineer.io.
10.5k 226 41.7k
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
Language: Python 7.4k 57 842566
apache/zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Language: Java 6.5k 305 02.8k
sigoden/aichat
All-in-one LLM CLI tool featuring Shell Assistant, Chat-REPL, RAG, AI Tools & Agents, with access to OpenAI, Claude, Gemini, Ollama, Groq, and more.
Language: Rust 6.2k 47 481401
plaidml/plaidml
PlaidML is a framework for making deep learning work everywhere.
Language: C++ 4.6k 155 602397
mindee/doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
Language: Python 4.5k 42 401489
andabi/deep-voice-conversion
Deep neural networks for voice conversion (voice style transfer) in Tensorflow
Language: Python 3.9k 160 128844
DeviceFarmer/stf
Control and manage Android devices from your browser.
Language: JavaScript 3.7k 76 287511
Hitachi-Automotive-And-Industry-Lab/semantic-segmentation-editor
Web labeling tool for bitmap images and point clouds
Language: JavaScript 1.9k 60 166440
Yuliang-Liu/Monkey
【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
Language: Python 1.7k 22 155121
stypr/clubhouse-py
Clubhouse API written in Python. Standalone client included. For reference and education purposes only.
Language: Python 1.7k 87 18289
Calamari-OCR/calamari
Line based ATR Engine based on OCRopy
Language: Python 1.1k 53 275212
toy/blueutil
CLI for bluetooth on OSX: power, discoverable state, list, inquire devices, connect, info, …
Language: Objective-C 1.1k 14 9154
YuvalNirkin/fsgan
FSGAN - Official PyTorch Implementation
Language: Jupyter Notebook 773 27 172149
gabrielmittag/NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
Language: Python 753 25 51128
hyperledger-archives/indy-sdk
indy-sdk
Language: Rust 671 69 272736
markovka17/dla
Deep learning for audio processing
Language: Jupyter Notebook 629 25 4111
r9y9/gantts
PyTorch implementation of GAN-based text-to-speech synthesis and voice conversion (VC)
Language: Jupyter Notebook 516 38 44115
andimarafioti/florence2-finetuning
Quick exploration into fine tuning florence 2
Language: Jupyter Notebook 305 4 2428
hujinsen/StarGAN-Voice-Conversion
full tensorflow implementation of the paper: StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks https://arxiv.org/abs/1806.02169
Language: Python 272 15 3054
HamadYA/GhostFaceNets
This repository contains the official implementation of GhostFaceNets, State-Of-The-Art lightweight face recognition models.
Language: Python 226 4 5939
webrtcHacks/WebRTC-Camera-Resolution
WebRTC Camera Resolution Finder
Language: JavaScript 130 8 240
