Pinned Repositories
barista
Barista is an open-source framework for concurrent speech processing.
fed-multimodal
[KDD 2023] FedMultimodal
fed-ser-semi
Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling
gard-adversarial-speaker-id
Adversarial attack and defense strategies for deep speaker recognition systems
mica-deep-mcca
Deep Multiset Canonical Correlation Analysis - An extension of CCA to multiple datasets
mica-MovieCLIP
This repository contains the codebase for MovieCLIP: Visual Scene Recognition in Movies
mica-race-from-face
Predicting race from faces for movie data
mica-speech-activity-detection
Robust Speech Activity Detection (SAD) in movie audio
mica-subtitle-aligned-movie-sounds
A dataset for Audio-Visual Sound Event Detection in Movies
peft-ser
PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models (Accepted to 2023 ACII)
USC SAIL's Repositories
usc-sail/fed-multimodal
[KDD 2023] FedMultimodal
usc-sail/peft-ser
PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models (Accepted to 2023 ACII)
usc-sail/mica-MovieCLIP
This repository contains the codebase for MovieCLIP: Visual Scene Recognition in Movies
usc-sail/mica-subtitle-aligned-movie-sounds
A dataset for Audio-Visual Sound Event Detection in Movies
usc-sail/fed-ser-semi
Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling
usc-sail/trust-ser
Trustworthy Speech Emotion Recognition
usc-sail/fed-ser-leakage
usc-sail/mica-screenplay-parser
Movie Screenplay Parser
usc-sail/tiles-dataset-release
Code accompanying the TILES dataset paper and data release. Data at https://tiles-data.isi.edu.
usc-sail/egocentric-fg-speech-detection
Egocentric Foreground Speech Detection
usc-sail/SynthAudio
Can Synthetic Audio From Generative Foundation Models Assist Audio Recognition and Speech Modeling?
usc-sail/ggs_driving
Segmentation Algorithms for Physiological Time Series
usc-sail/M3BERT
A music transformer that extracts representations of audio using several hundred thousand music clips. Fine-tuning is done with diverse end-tasks to enrich the pre-trained representations.
usc-sail/speech-emotion-privacy-trust
usc-sail/mica-character-attribute-extraction
Character tropes, Forensic Interviews, and Character Attributes
usc-sail/mica-character-attribution
Learning descriptions of literary and fictional characters
usc-sail/mica-character-coref
Coreference in Movie Scripts
usc-sail/mica-context-emotion-recognition
Repository for context based emotion recognition
usc-sail/SCMIA-unsupervised-ASD
Repo for SCMIA and GSCMIA
usc-sail/tiles-audio-arousal
TILES speaking pattern (speaking frequency, duration, arousal rating) using audio data
usc-sail/tiles-day-night
usc-sail/tiles-time-series
This repo contains time-series implementations for SAIL-TILES projects: preprocessing, segmentation, filtering, clustering, imputation, visualization...
usc-sail/mica-muse-india-tv
Data and code for analysis of the India TV Show Study
usc-sail/mica-muse-loreal
Codebase for analyzing RP metrics for MUSE LoREAL study
usc-sail/CLAP
Contrastive Language-Audio Pretraining
usc-sail/HICA-active-speaker-detection
Repository for the HICA implementation associated with TMM submission
usc-sail/llama
Inference code for LLaMA models
usc-sail/mica-scriptsonscreen-scripts
Contains code to scrape the scriptsonscreen scripts website, along with the scraped data
usc-sail/SAIL-CCMI
Outline of the webpage for CCMI subgroup
usc-sail/whisper
Robust Speech Recognition via Large-Scale Weak Supervision