ivangtorre
AI Researcher. Working on Automatic Speech Recognition, NLP, DNNs, Linguistic Laws, Complex Systems and Nonlinear Dynamics
Language and Speech Laboratory-EHU, UPM, Spain
ivangtorre's Stars
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
NVIDIA/DeepLearningExamples
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
zalandoresearch/fashion-mnist
A MNIST-like fashion product database and benchmark.
kimiyoung/transformer-xl
fastai/course-nlp
A Code-First Introduction to NLP course
BenjiKCF/Neural-Net-with-Financial-Time-Series-Data
This solution presents an accessible, non-trivial example of machine learning (Deep learning) with financial time series using TensorFlow
archinetai/surgeon-pytorch
A library to inspect and extract intermediate layers of PyTorch models.
kensho-technologies/pyctcdecode
A fast and lightweight python-based CTC beam search decoder for speech recognition.
mightydeveloper/Deep-Compression-PyTorch
PyTorch implementation of 'Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding' by Song Han, Huizi Mao, William J. Dally
mphilli/English-to-IPA
Converts English text to IPA notation
Kozea/Pyphen
Hy-phen-ation made easy
Vicomtech/hate-speech-dataset
Hate speech dataset from Stormfront forum manually labelled at sentence level.
qute012/Wav2Keyword
Wav2Keyword is a keyword spotting (KWS) model based on Wav2Vec 2.0. It achieves state-of-the-art results on the Speech Commands dataset V1 and V2.
nokpil/AgentNet
PyTorch implementation of AgentNet, which is designed to reveal hidden interactions and predict the future dynamics of an unknown complex system.
ivangtorre/multifrac
This is a plugin for ImageJ2 for multifractal analysis of 2D and 3D images. Cite: MULTIFRAC: An ImageJ plugin for multiscale characterization of 2D and 3D stack images. I. G. Torre, R. J. Heck and A. M. Tarquis. SoftwareX, 12, 100574.
maxidl/wav2vec2
Vicomtech/itzuli-api-lib
Itzuli® Machine Translation Engine API libraries
ivangtorre/compression-principle-and-Zipf-s-law-of-brevity-in-infochemical-communication
This repository implements all the scripts used for processing data, computing and figure generation for the scientific paper: https://royalsocietypublishing.org/doi/10.1098/rsbl.2022.0162 "Compression principle and Zipf’s Law of brevity in infochemical communication". Please cite: Antoni Hernández-Fernández and Ivan G. Torre. Compression principle and Zipf’s Law of brevity in infochemical communication.
ivangtorre/physical-origin-of-lw
This package implements all the scripts used for computing and representing the results of the scientific paper: "On the physical origin of linguistic laws and lognormality in speech". If you use it, please cite: Torre, I. G., Luque, B., Lacasa, L., Kello, C. T., & Hernández-Fernández, A. (2019). On the physical origin of linguistic laws and lognormality in speech. Royal Society Open Science, 6(8), 191023.
ivangtorre/pythreshold
This package implements the threshold algorithm for decimation and collapsing of time series. If you use it, please cite: "Torre, I. G., Luque, B., Lacasa, L., Luque, J., & Hernández-Fernández, A. (2017). Emergence of linguistic laws in human voice. Scientific Reports, 7, 43862."
ivangtorre/Speech-pause-distribution-as-an-early-marker-for-Alzheimers-disease
The speech pause duration corpus and the scripts that ensure reproducibility of all results presented in the research paper: P. Pastoriza, I.G. Torre, F. Dieguez, I. Gomez, S. Gelado, J. Bello, A. Avila, J. Matias, V. Pytell, A. Hernandez-Fernandez (2022). Speech pause distribution as an early marker for Alzheimer’s disease. Speech Communication, 136, 107-117.
Vicomtech/ASVspoophone
The ASVspoophone corpus is the telephonic version of the ASV Spoof 2019 corpus found at https://www.asvspoof.org. It contains telephonic versions of the audio files used for the countermeasure (CM) task of the ASV Spoof 2019 challenge, created by transferring each file through real land-land, mobile-land and land-mobile telephonic channels. The results are the corresponding 8 kHz, 8-bit A-law versions of the original audio files, which can be used to train anti-spoofing systems for real telephonic scenarios such as call and contact centres.