Pinned Repositories
3dv_tutorial
An Invitation to 3D Vision: A Tutorial for Everyone
Algorithm
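Implementations of some commonly used algorithms (covering common data structures, machine learning, and algorithms commonly used in speech recognition)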
Applio
A simple, high-quality voice conversion tool focused on ease of use and performance.
learning-dl
learning and understanding deep learning
mpt-demo
Cross-Camera Multiple Person Tracking Demo for CVPR-2017
mxnet
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
sharing
This is a personal repository for sharing some experiences about machine learning and data mining.
xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Flink and DataFlow
i-MaTh's Repositories
i-MaTh/Algorithm
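Implementations of some commonly used algorithms (covering common data structures, machine learning, and algorithms commonly used in speech recognition)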
i-MaTh/Applio
A simple, high-quality voice conversion tool focused on ease of use and performance.
i-MaTh/audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch
i-MaTh/city_json
** city JSON, plus JSON for Hong Kong, Macau, Taiwan, and world cities
i-MaTh/cleanrl
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
i-MaTh/cs-self-learning
A self-learning guide to computer science
i-MaTh/CVQ-VAE
[ICCV 2023] Online Clustered Codebook
i-MaTh/dclm
DataComp for Language Models
i-MaTh/flux
Official inference repo for FLUX.1 models
i-MaTh/friendly-stable-audio-tools
Refactored / updated version of `stable-audio-tools`, the open-source code for audio/music generative models originally by Stability AI.
i-MaTh/HiFTNet
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform
i-MaTh/lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
i-MaTh/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
i-MaTh/mean-opinion-score
Python library for calculating the mean opinion score and 95% confidence interval of the standard deviation of text-to-speech ratings according to Ribeiro et al. (2011).
i-MaTh/mini-omni
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
i-MaTh/Montreal-Forced-Aligner
Command line utility for forced alignment using Kaldi
i-MaTh/NCE
Yingshi New Concept English
i-MaTh/openai-python
The official Python library for the OpenAI API
i-MaTh/OpenDiloco
OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training
i-MaTh/Parrot-TTS
Official Code for ParrotTTS
i-MaTh/PerceptiveAgent
Code for "Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction" (ACL24)
i-MaTh/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
i-MaTh/RAVE
Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder
i-MaTh/shell_tools
i-MaTh/speech-resynthesis
An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.
i-MaTh/spiritlm
Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".
i-MaTh/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
i-MaTh/voice-chat-pdf
Use OpenAI's Realtime API for chatting with your documents
i-MaTh/Wav2Lip
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020.
i-MaTh/WavChat
A Survey of Spoken Dialogue Models (60 pages)