How to use our public age and gender model

An introduction to our model for age and gender prediction based on wav2vec 2.0. The model is available from doi:10.5281/zenodo.7761387 and released under CC BY-NC-SA 4.0. It was created by fine-tuning the pre-trained wav2vec2-large-robust model on aGender, Mozilla Common Voice, TIMIT and VoxCeleb2. We provide two variants of the model: one with all 24 transformer layers and a stripped-down version with six transformer layers. Both variants were exported to ONNX. The original Torch models are hosted at Hugging Face: 6 layers and 24 layers. Further details are given in the associated paper and notebook.

License

The model can be used for non-commercial purposes (see CC BY-NC-SA 4.0). For commercial usage, a license for devAIce must be obtained. The source code in this GitHub repository is released under the following license.

Quick start

Create and activate a Python virtual environment, then install audonnx.

$ pip install audonnx

Load the model with six layers and test it on a random signal.

import audeer
import audonnx
import numpy as np


url = 'https://zenodo.org/record/7761387/files/w2v2-L-robust-6-age-gender.25c844af-1.1.1.zip'
cache_root = audeer.mkdir('cache')
model_root = audeer.mkdir('model')

archive_path = audeer.download_url(url, cache_root, verbose=True)
audeer.extract_archive(archive_path, model_root)
model = audonnx.load(model_root)

sampling_rate = 16000
signal = np.random.normal(size=sampling_rate).astype(np.float32)
model(signal, sampling_rate)

{'hidden_states': array([[ 0.02783544,  0.01402022,  0.03839185, ...,  0.00786646,
         -0.09332313,  0.0915948 ]], dtype=float32),
 'logits_age': array([[0.3961048]], dtype=float32),
 'logits_gender': array([[ 0.32810774, -0.56528044,  0.0317882 ]], dtype=float32)}

The 'hidden_states' are the pooled states of the last transformer layer. 'logits_age' provides an age score in a range of approximately 0 to 1, where 1 corresponds to 100 years. 'logits_gender' expresses the confidence for the speaker being female, male or child, in that order.
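
The raw outputs can be turned into more readable values with a few lines of Python. The following is a minimal sketch, not part of the released code: the helper name `interpret` is hypothetical, the age is scaled by 100 as suggested by the description above, and applying a softmax to the gender logits is an assumption about how to obtain probabilities.

```python
import math


# Label order as stated in the text above: female, male, child.
GENDER_LABELS = ('female', 'male', 'child')


def interpret(outputs):
    """Turn raw model outputs into an age in years and gender probabilities.

    Assumptions (not confirmed by the model documentation):
    * the age score is mapped to years by multiplying with 100,
    * a softmax over the gender logits gives usable probabilities.
    """
    # Works for NumPy arrays of shape (1, n) as well as nested lists.
    age_years = float(outputs['logits_age'][0][0]) * 100.0

    logits = [float(x) for x in outputs['logits_gender'][0]]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probabilities = {
        label: value / total for label, value in zip(GENDER_LABELS, exps)
    }
    return age_years, probabilities


# Values copied from the example output shown above.
example = {
    'logits_age': [[0.3961048]],
    'logits_gender': [[0.32810774, -0.56528044, 0.0317882]],
}
age_years, probabilities = interpret(example)
predicted = max(probabilities, key=probabilities.get)
print(f'age: {age_years:.1f} years, gender: {predicted}')
```

For the example values this gives an age of about 39.6 years and 'female' as the most likely class. The dictionary returned by `model(signal, sampling_rate)` can be passed to the helper directly, since it only indexes into the arrays.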

Tutorial

For a detailed introduction, please check out the notebook.

$ pip install -r requirements.txt
$ jupyter notebook notebook.ipynb

Citation

If you use our model in your own work, please cite the following paper:

@inproceedings{,
   author = {Felix Burkhardt and Johannes Wagner and Hagen Wierstorf and Florian Eyben and Björn Schuller},
   editor = {Peter Jax and Sebastian Möller},
   booktitle = {15th ITG conference on Speech Communication},
   title = {Speech-based Age and Gender Prediction with Transformers},
   year = {2023},
}
