shammur
Interested in analyzing and understanding human conversation. Main focus: speech overlaps, turn-taking, speech discourse, code-switching, and explainability.
QCRI
Pinned Repositories
ABiViRNet
Attention Bidirectional Video Recurrent Net
Agglomerative_Clustering
Notebook from my article on breaking down the agglomerative clustering process. https://towardsdatascience.com/breaking-down-the-agglomerative-clustering-process-1c367f74c7c2
airline_social_media_post_categorization
Arabic-Offensive-Multi-Platform-SocialMedia-Comment-Dataset
Arabic dialectal offensive language dataset of social media comments on news posts from Facebook, Twitter, and YouTube
arabic_news_post_categorization
Categorize social media news posts (short text) into multiple categories, including politics, health, environment, and sports, among others
arabic_offensive_language_detection
Arabic offensive language detection model for social media comments and posts
kaldi
This is the official location of the Kaldi project.
pyVAD
A simple VAD pipeline based on pyAudioAnalysis
SemEval2022Task3
The PreTENS shared task, hosted at SemEval 2022, focuses on semantic competence, with specific attention to evaluating how well language models recognize appropriate taxonomic relations between two nominal arguments (i.e., cases where one is a supercategory of the other or, in extensional terms, one denotes a superset of the other).
shammur's Repositories
shammur/airline_social_media_post_categorization
shammur/arabic_news_post_categorization
Categorize social media news posts (short text) into multiple categories, including politics, health, environment, and sports, among others
shammur/AI4Voice
This repo contains the code for "Voice Disorder Analysis: A Transformer-based Approach"
shammur/alt_public
ALT research group publications
shammur/AnyGPT
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
shammur/asr-v1-offline-fanar2024
shammur/Auto-GPT
An experimental open-source attempt to make GPT-4 fully autonomous.
shammur/charsiu
Charsiu: A neural phonetic aligner.
shammur/chatgpt-wrapper
API for interacting with ChatGPT and GPT4 using Python and from Shell.
shammur/dataset_automated_medical_transcription
Dataset for training machine learning models to automatically generate psychiatric case notes from doctor-patient conversations.
shammur/dynamic-superb
The official repository of Dynamic-SUPERB.
shammur/iwslt22-dialect
IWSLT 2022 Dialectal Speech Translation Shared Task
shammur/LLM-Codec
The open source code for LLM-Codec
shammur/Machine-Learning-Interviews
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
shammur/minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
shammur/MOEhackathon
shammur/Multilingual-PR
Phoneme recognition using the pre-trained models Wav2vec2, HuBERT, and WavLM. In this project, we compare three self-supervised models, Wav2vec (2019, 2020), HuBERT (2021), and WavLM (2022), each pretrained on a corpus of English speech. We use them in various ways to perform phoneme recognition for different languages, with a network trained using the Connectionist Temporal Classification (CTC) algorithm.
shammur/multiview_learning
shammur/NeMo
NeMo: a toolkit for conversational AI
shammur/news_categorization_english
shammur/primock57
Dataset of 57 mock medical primary care consultations: audio, consultation notes, human utterance-level transcripts.
shammur/Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks, and resources for prompt engineering
shammur/shammur.github.io
shammur/SpeechTokenizer
This is the code for the SpeechTokenizer presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
shammur/spoken-adi-demo
shammur/SpokenStoryCloze
A spoken version of the textual story cloze benchmark
shammur/tmux
tmux source code
shammur/tts-scores
Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models
shammur/wave-to-syntax
shammur/xlstm
Official repository of the xLSTM.