Pinned Repositories
audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
dl-tutorial
A quick tutorial on deep learning
DUAL-textless-SQA
Textless (ASR-transcript-free) Spoken Question Answering. The official release of the NMSQA dataset and the implementation of the "DUAL: Textless Spoken Question Answering with Speech Discrete Unit Adaptive Learning" paper.
fromage
🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".
GenRepASD
PyTorch implementation of Deep Generic Representations for Domain-Generalized Anomalous Sound Detection: https://arxiv.org/abs/2409.05035
OCR_RaspberryPi_edgetpu
speech-tutorial
spolacq
TVLT
PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022 Oral)
word-discovery
Word Discovery in Visually Grounded, Self-Supervised Speech Models
Phuriches's Repositories
Phuriches/OCR_RaspberryPi_edgetpu
Phuriches/GenRepASD
PyTorch implementation of Deep Generic Representations for Domain-Generalized Anomalous Sound Detection: https://arxiv.org/abs/2409.05035
Phuriches/audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
Phuriches/dl-tutorial
A quick tutorial on deep learning
Phuriches/DUAL-textless-SQA
Textless (ASR-transcript-free) Spoken Question Answering. The official release of the NMSQA dataset and the implementation of the "DUAL: Textless Spoken Question Answering with Speech Discrete Unit Adaptive Learning" paper.
Phuriches/fromage
🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".
Phuriches/speech-tutorial
Phuriches/spolacq
Phuriches/TVLT
PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022 Oral)
Phuriches/word-discovery
Word Discovery in Visually Grounded, Self-Supervised Speech Models