wav2vec2

There are 114 repositories under the wav2vec2 topic.

PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Language: Python 11.2k 184 1.9k1.9k
s3prl/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Language: Python 2.3k 46 398486
audeering/w2v2-how-to
How to use our public wav2vec2 dimensional emotion model
Language: Jupyter Notebook 459 9 1649
oliverguhr/wav2vec2-live
Live speech recognition using Facebook's wav2vec 2.0 model.
Language: Python 328 7 1556
pszemraj/vid2cleantxt
Python API & command-line tool to easily transcribe speech-based video files into clean text
Language: Jupyter Notebook 188 4 1128
habla-liaa/ser-with-w2v2
Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'
Language: Jupyter Notebook 126 7 1823
khanld/ASR-Wav2vec-Finetune
⚡ Finetune Wav2vec 2.0 For Speech Recognition
Language: Python 115 2 924
inboxpraveen/LLM-Minutes-of-Meeting
🎤📄 An innovative tool that transforms audio or video files into text transcripts and generates concise meeting minutes. Stay organized and efficient in your meetings, and get ready for Phase 2 where we'll be open for contributions to enable real-time meeting transcription! 🚀
Language: Python 107 1 411
vietai/ASR
End-to-End Vietnamese Speech Recognition using wav2vec 2.0
93 3 29
thevasudevgupta/gsoc-wav2vec2
GSoC'2021 | TensorFlow implementation of Wav2Vec2
Language: Jupyter Notebook 89 4 2229
tuanio/noisy-student-training-asr
PyTorch implementation of Noisy Student Training for Automatic Speech Recognition and Automatic Pronunciation Error Detection problems
Language: Python 86 2 515
Telegram-Zalo/zac2022-lyric-alignment
Solution for Zalo AI Challenge 2022 - Lyrics Alignment
Language: Python 67 2 118
lstrgar/self-supervised-phone-segmentation
Phoneme segmentation using pre-trained speech models
Language: Python 54 5 510
vectominist/MiniASR
A mini, simple, and fast end-to-end automatic speech recognition toolkit.
Language: Jupyter Notebook 47 3 16
mmakiuchi/multimodal_emotion_recognition
Scripts used in the research described in the paper "Multimodal Emotion Recognition with High-level Speech and Text Features" accepted in the ASRU 2021 conference.
Language: Python 46 2 28
pooya-mohammadi/audio-classification-pytorch
This project provides several approaches for training or fine-tuning an audio gender recognition model. The code can be reused for any other audio classification task by changing the number of classes and the input dataset.
Language: Jupyter Notebook 38 1 34
Hamtech-ai/wav2vec2-fa
Fine-tune Wav2vec2, an ASR model released by Facebook
Language: Jupyter Notebook 37 2 15
mt-upc/SHAS
SHAS: Approaching optimal Segmentation for End-to-End Speech Translation
Language: Python 37 6 54
ECNU-Cross-Innovation-Lab/ShiftSER
[ICASSP 2023] Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-trained Representations
Language: Python 35 2 22
khanld/Wav2vec2-Pretraining
Wav2vec 2.0 Self-Supervised Pretraining
Language: Python 35 2 03
HarunoriKawano/Wav2vec2.0
Implementation of the paper "wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations" in PyTorch.
Language: Python 34 1 12
ttop32/wav2vec2-live-japanese-translator
Real-time Japanese speech recognition translator using wav2vec2
Language: Jupyter Notebook 33 1 23
lucasgris/wav2vec4bp
Wav2vec resources and models for Brazilian Portuguese
Language: Jupyter Notebook 32 3 22
mikezzb/lyrics-sync
A deep learning lyrics-to-audio alignment system, generating synchronized lyrics from a song and its lyrics
Language: Jupyter Notebook 28 3 11
egorsmkv/asr-corpus-creator
This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.
Language: Python 27 2 283
daanzu/wav2vec2_stt_python
Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recognition
Language: Python 24 5 13
hammaad2002/ASRAdversarialAttacks
An ASR (Automatic Speech Recognition) adversarial attack repository.
Language: Jupyter Notebook 24 1 31
AmirAbaskohi/Automatic-Speech-recognition-for-Speech-Assessment-of-Persian-Preschool-Children
Preschool evaluation is crucial because it gives teachers and parents valuable knowledge about children's growth and development. The COVID-19 pandemic has highlighted the necessity of online assessment for preschool children. One of the areas that should be tested is their ability to speak. Off-the-shelf Automatic Speech Recognition (ASR) systems do not help here, since they are pre-trained on voices that differ from children's in frequency and amplitude. Because most of these systems are pre-trained on data within a specific amplitude range, their training objectives do not prepare them for voices at different amplitudes. To overcome this issue, we added a new objective, called Random Frequency Pitch (RFP), to the masking objective of the Wav2Vec 2.0 model. In addition, we used our newly introduced dataset to fine-tune our model for the Meaningless Words (MW) and Rapid Automatic Naming (RAN) tests. Using masking together with RFP outperforms the masking objective of Wav2Vec 2.0, reaching a Word Error Rate (WER) of 1.35. Our new approach reaches a WER of 6.45 on the Persian section of the CommonVoice dataset. Furthermore, our methodology produces positive outcomes in zero- and few-shot scenarios.
Language: Jupyter Notebook 21 1 11
mpoyraz/wav2vec2-turkish
Turkish Speech Recognition using Facebook's Wav2vec 2.0 models
Language: Python 19 3 12
yamahigashi/Wav2Vec2FBX
Recognize speech from an audio file and convert it into animation FBX
Language: Python 19 3 03
ECNU-Cross-Innovation-Lab/ENT
[ICASSP 2024] Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition
Language: Python 18 2 01
kingabzpro/WOLOF-ASR-Wav2Vec2
Audio Preprocessing and finetuning of wav2vec2-large-xlsr model on AI4D Baamtu Datamation - Automatic Speech Recognition in WOLOF Data.
Language: Jupyter Notebook 17 1 07
scottykwok/cantonese-selfish-project
Cantonese Selfish Project 廣東話自肥企劃 at PYCON HK 2021
Language: Jupyter Notebook 15 1 11
skit-ai/Map-Mix
The official implementation of the method discussed in the paper "Improving Spoken Language Identification with Map-Mix" (work accepted at ICASSP 2023)
15 3 31
FernandoLpz/SpeechRecognition
This repository contains the implementation of an Automatic Speech Recognition system in Python, using a client-server architecture with WebSockets.
Language: Python 14 2 12
techiaith/docker-huggingface-stt-cy
Adnabod lleferydd Cymraeg i'r Gymraeg gyda HuggingFace // Speech Recognition for Welsh with HuggingFace
Language: Python 14 6 04
