Lieon-ai

Real-time Voice Phishing (Lie) Classifier using Echo State Networks

Figure: architecture

Requirements

All code was written for Python 3.7 or later.

To install the libraries used in this project, run the following command (the leading "!" runs it from a Jupyter notebook cell):

!pip install -r requirement.txt

Data

1. Labeling

For speaker diarization, we used a pretrained model provided by the Pyannote library.

  • The voices of the scam callers (voice phishing scammers) were labeled as 1.
  • The voices of the recipients were labeled as 0.
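
The labeling rule above can be sketched as follows. This is a minimal illustration, not the project's code: it assumes the diarization result has already been collected into (start, end, speaker) triples and that the scammer's speaker ID is known. The function name, the segment values, and the speaker IDs are hypothetical.

```python
from typing import List, Tuple

# (start_sec, end_sec, speaker_id) triples collected from a diarization result
Segment = Tuple[float, float, str]


def label_segments(segments: List[Segment], scammer_id: str) -> List[Tuple[float, float, int]]:
    """Label the scammer's segments as 1 and every other segment as 0."""
    return [(start, end, 1 if speaker == scammer_id else 0)
            for start, end, speaker in segments]


# Hypothetical diarization output for a two-speaker call
segments = [(0.0, 2.5, "SPEAKER_00"), (2.5, 4.0, "SPEAKER_01"), (4.0, 7.2, "SPEAKER_00")]
labeled = label_segments(segments, scammer_id="SPEAKER_00")
print(labeled)
```

How the scammer's speaker ID is determined for each call is not covered here and would have to come from manual review or call metadata.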

2. Augmentation

We tried data augmentation to increase the amount of data.
Time stretching, pitch shifting, and noise addition were used for augmentation.
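
As an illustration of one of these methods, the sketch below adds white Gaussian noise at a chosen signal-to-noise ratio using NumPy. The function name, the SNR value, the sample rate, and the synthetic tone standing in for speech are all assumptions made for this example; time stretching and pitch shifting usually rely on an audio library and are not shown.

```python
import numpy as np


def add_noise(signal: np.ndarray, snr_db: float, rng: np.random.Generator) -> np.ndarray:
    """Add white Gaussian noise so the result has roughly the requested SNR (in dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


rng = np.random.default_rng(0)
sr = 16000  # assumed sample rate in Hz
t = np.arange(sr) / sr
clean = 0.5 * np.sin(2 * np.pi * 220.0 * t)  # 1 s synthetic tone standing in for speech
noisy = add_noise(clean, snr_db=20.0, rng=rng)

# Check the SNR that was actually achieved
measured_snr = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
print(f"measured SNR: {measured_snr:.1f} dB")
```

Lower SNR values give noisier copies; sampling the SNR from a range per clip is one way to vary the augmented data.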


3. Generation

Because the data remained insufficient even after augmentation, we used generative AI to produce audio data with biological features similar to those of the original data. We ran data generation experiments with the two models below:

AAGAN : Audio-to-Audio Generative Adversarial Networks (made by Do-Hyeon Lim)

MVGAN : Audio-to-Audio GAN using Mel-spectrogram Generator and HiFiGAN Vocoder (made by Do-Hyeon Lim)

Features

  • MFCC (20 feature vectors in total)
  • Pitch
  • F0 (Fundamental Frequency)
  • Spectral Flux
  • Spectral Frequency
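
To make one of these features concrete, here is a minimal NumPy sketch of spectral flux, using one common definition: the L2 distance between the magnitude spectra of consecutive frames. The frame length, hop size, window, and test signal are assumptions for illustration, and the project's actual extraction code may define or normalize the feature differently.

```python
import numpy as np


def spectral_flux(signal: np.ndarray, frame_len: int = 512, hop: int = 256) -> np.ndarray:
    """L2 distance between the magnitude spectra of consecutive frames."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))  # shape: (n_frames, frame_len // 2 + 1)
    return np.sqrt(np.sum(np.diff(magnitudes, axis=0) ** 2, axis=1))


sr = 16000  # assumed sample rate in Hz
t = np.arange(sr) / sr
# Synthetic signal: a 220 Hz tone that switches to 880 Hz at 0.5 s
signal = np.where(t < 0.5, np.sin(2 * np.pi * 220 * t), np.sin(2 * np.pi * 880 * t))

flux = spectral_flux(signal)
peak_frame = int(np.argmax(flux))
print(f"flux peaks near {peak_frame * hop_seconds:.2f} s" if (hop_seconds := 256 / sr) else "")
```

The flux stays close to zero while the tone is steady and peaks at the frames that straddle the frequency change, which is why it is useful for detecting abrupt spectral changes in speech.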

Model (ongoing)

Classifier : Echo State Network

  • A type of recurrent neural network (RNN), based on reservoir computing, designed to handle sequential data efficiently.
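
The reservoir computing idea can be sketched in a few lines of NumPy: the input and recurrent weights are random and fixed, and only a linear readout is trained. This is a toy illustration under assumed settings (reservoir size, spectral radius, ridge strength, synthetic two-class data), not the project's model, which is still in progress; the class and function names are hypothetical.

```python
import numpy as np


class EchoStateNetwork:
    """Toy ESN classifier: fixed random reservoir, ridge-regression readout."""

    def __init__(self, n_inputs, n_reservoir=50, spectral_radius=0.9, ridge=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Input and recurrent weights are random and never trained.
        self.W_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_inputs))
        W = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
        # Rescale so the largest eigenvalue magnitude equals spectral_radius.
        self.W = W * (spectral_radius / np.max(np.abs(np.linalg.eigvals(W))))
        self.ridge = ridge
        self.W_out = None

    def _final_state(self, sequence):
        x = np.zeros(self.W.shape[0])
        for u in sequence:  # u: feature vector of one frame
            x = np.tanh(self.W_in @ u + self.W @ x)
        return np.append(x, 1.0)  # constant 1 acts as a bias term

    def _states(self, sequences):
        return np.stack([self._final_state(s) for s in sequences])

    def fit(self, sequences, labels):
        X = self._states(sequences)
        y = np.asarray(labels, dtype=float)
        # Only the linear readout is trained (closed-form ridge regression).
        A = X.T @ X + self.ridge * np.eye(X.shape[1])
        self.W_out = np.linalg.solve(A, X.T @ y)
        return self

    def predict(self, sequences):
        return (self._states(sequences) @ self.W_out > 0.5).astype(int)


def make_toy_data(n_per_class, rng, length=30, n_features=2):
    """Synthetic stand-in for feature sequences: class 1 centred at +0.5, class 0 at -0.5."""
    seqs, labels = [], []
    for label, mean in ((0, -0.5), (1, 0.5)):
        for _ in range(n_per_class):
            seqs.append(mean + 0.1 * rng.standard_normal((length, n_features)))
            labels.append(label)
    return seqs, np.array(labels)


rng = np.random.default_rng(1)
train_seqs, train_labels = make_toy_data(50, rng)
test_seqs, test_labels = make_toy_data(20, rng)

esn = EchoStateNetwork(n_inputs=2).fit(train_seqs, train_labels)
test_accuracy = np.mean(esn.predict(test_seqs) == test_labels)
print(f"toy test accuracy: {test_accuracy:.2f}")
```

Because training reduces to solving one linear system for the readout, fitting is cheap compared with backpropagation through time, which is the efficiency the description above refers to. In this sketch each sequence is summarized by its final reservoir state; averaging states over time is another common choice.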

Evaluation (ongoing)

TBA (optimizing)

  • Accuracy
  • F1 Score
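
For reference, the two metrics can be computed as in the sketch below. The labels and predictions are made-up values for illustration only and are not results from this project; the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive (scammer) class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 1 = scammer, 0 = recipient
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(acc, f1)
```

F1 is reported alongside accuracy because it stays informative when the two classes are imbalanced.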


Reference

[1] https://doi.org/10.48550/arXiv.1712.04323 (GitHub: https://github.com/stefanonardo/pytorch-esn)
[2] https://doi.org/10.48550/arXiv.2010.05646 (GitHub: https://github.com/jik876/hifi-gan)