mayank-git-hub

I am currently studying Electrical Engineering at IIT Bombay. I am interested in machine learning, specifically in combining audio and video.

Sony Research and Development Japan
Mumbai, India

mayank-git-hub's Stars

facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Language:Python30.3k 426 4.2k6.4k
pytube/pytube
A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.
Language:Python12.2k 202 1.4k2.5k
Rudrabha/Wav2Lip
This repository contains the codes of "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For HD commercial model, please try out Sync Labs
Language:Python10.6k 170 6612.3k
abraunegg/onedrive
OneDrive Client for Linux
Language:D10k 111 1.2k859
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Language:Python4.2k 49 236411
mosaicml/llm-foundry
LLM training code for Databricks foundation models
Language:Python4k 47 382524
facebookresearch/LASER
Language-Agnostic SEntence Representations
Language:Jupyter Notebook3.6k 89 211461
Stability-AI/stable-audio-tools
Generative models for conditional audio generation
Language:Python2.6k 42 96251
jik876/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Language:Python1.9k 31 162506
graykode/gpt-2-Pytorch
Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation
Language:Python969 27 18226
Kyubyong/g2p
g2p: English Grapheme To Phoneme Conversion
Language:Python804 19 25129
samc621/SneakerBot
All-in-one bot, with auto captcha-solving and proxy management, using Node.js and Puppeteer.
Language:JavaScript747 54 60193
gabrielmittag/NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
Language:Python668 25 46117
DmitryRyumin/INTERSPEECH-2023-24-Papers
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
635 88 442
swesterfeld/audiowmark
Audio Watermarking
Language:C++387 14 5077
sony/ctm
Language:Python227 18 712
AI4Bharat/IndicTrans2
Translation models for 22 scheduled languages of India
Language:Python224 10 8861
wavmark/wavmark
AI-based Audio Watermarking Tool
Language:Python219 9 1529
aliutkus/torchinterp1d
1D interpolation for pytorch
Language:Python166 4 1919
saiteja-talluri/Speech2Face
Implementation of the CVPR 2019 Paper - Speech2Face: Learning the Face Behind a Voice by MIT CSAIL
Language:Python166 12 535
TemugeB/python_stereo_camera_calibrate
Stereo camera calibration with python and openCV
Language:Python160 3 1737
CHerSun/NoSleep
Lightweight Windows utility to prevent screen locking
Language:C#142 5 723
HSU-ANT/gstpeaq
GstPEAQ - A GStreamer plugin for Perceptual Evaluation of Audio Quality (PEAQ)
Language:C64 8 1924
AI4Bharat/IndicNLP-Transliteration
Codebase for Indic-Transliteration using Seq2Seq RNN. For latest repo with Transformer-based models, check: https://github.com/AI4Bharat/IndicXlit
Language:Python58 4 514
KimythAnly/qqdm
A lightweight, fast and pretty progress bar for Python
Language:Python40 3 12
mayank-git-hub/Text-Recognition
Text Recognition and Detection based on Pixel-Link paper implemented in pytorch
Language:Python28 8 58
dbigham/ARC
Abstraction and Reasoning Corpus
Language:Mathematica14 4 00
onedrivejs/onedrive
Cross-platform OneDrive client written in JavaScript for node.js
Language:JavaScript7 2 63
abdelmaged/anime-dl
CLI to download anime episodes from anime websites like GoGoAnime
Language:Python20
IshwaryaAnant/codec-perceptual-loss
Code accompanying our submission to ACM MM on a codec-inspired perceptual loss function
Language:HTML1 2 0
