vsnupoudel/Audio-Classification--transfer-learning-and-fine-tuning--Android-application

YAMNET Training for Bird Dataset

Maanikya opened this issue · 5 comments

Dear Sir,

First of all, thank you very much for your email and for replying so quickly.

My problems are as follows:

  1. I'm a beginner in the ML and DL field.
  2. I'm having trouble replacing the dataset you used with the dataset I want to use, without using a CSV file as metadata.
  3. I'm finding it hard to wrap my head around the functions that you used to create the "train_ds" variable that is fed to the model for training.

My goal:
To develop a YAMNET model for bird audio classification and to implement it as an Android App.

I'm doing this as a team for our major project.

Link for the Dataset: https://www.kaggle.com/datasets/maanikya/yamnet-dataset-v2

Sir, can you please help us in tweaking the code to fit our requirements?

I'm not able to look into all the details right now, but here are some recommendations.

  1. TensorFlow I/O (tfio) does not seem to work for resampling. Please use scipy instead. Notebook example:
    https://www.tensorflow.org/hub/tutorials/yamnet
  2. Original notebook (where tfio is used):
    https://www.tensorflow.org/tutorials/audio/transfer_learning_audio
  3. Follow the dataset conventions in the notebook. Use the same conventions in your CSV file (which holds the paths to the audio files).
  4. For training, I believe the embeddings are extracted first, and these are then used as features to fine-tune the classifier.
    I can revisit the notebook later and update you on point 4.
    Hope this was useful.
    Also, please paste any specific errors you get while running the notebooks.
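
For recommendation 3, the metadata table does not have to be written by hand. Below is a minimal sketch that builds it from the folder structure, assuming the clips are stored in one sub-folder per bird species (this layout is an assumption; the Kaggle dataset may be organised differently). The column names `filename`, `fold`, `target` and `category` follow the ESC-50 CSV used in the transfer learning notebook; the helper names `build_metadata` and `write_metadata_csv`, and the round-robin fold assignment, are hypothetical choices for illustration.

```python
import csv
from pathlib import Path


def build_metadata(dataset_root, num_folds=5, extensions=(".wav",)):
    """Return one row per audio file, using the sub-folder name as the class.

    Assumes a layout like dataset_root/<species_name>/<clip>.wav.
    """
    root = Path(dataset_root)
    # Sort so that the integer class ids are stable between runs.
    class_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    rows = []
    for target, class_dir in enumerate(class_dirs):
        files = sorted(
            f for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in extensions
        )
        for i, audio_file in enumerate(files):
            rows.append({
                "filename": str(audio_file),
                # Round-robin folds (1..num_folds) so every class appears in each fold.
                "fold": i % num_folds + 1,
                "target": target,
                "category": class_dir.name,
            })
    return rows


def write_metadata_csv(rows, csv_path):
    """Write the rows to a CSV file with the notebook's column names."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=["filename", "fold", "target", "category"]
        )
        writer.writeheader()
        writer.writerows(rows)
```

The rows can also be used directly, without writing a CSV at all: the `filename`, `target` and `fold` values are the three lists the notebook passes on when it builds its dataset. The notebook selects its train, validation and test splits by fold number, so the fold assignment above may need adjusting to match the split you want.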

Also, tfio apparently works only with a specific TensorFlow version.
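
Resampling with scipy, as suggested in recommendation 1, avoids the tfio version dependency altogether. The following is a rough sketch of that idea, in the spirit of the TF Hub YAMNet notebook: it converts a clip to the 16 kHz mono float waveform that YAMNet expects. The function name `to_yamnet_waveform` is hypothetical, and the integer scaling assumes signed PCM samples such as int16.

```python
import numpy as np
from scipy import signal

YAMNET_SAMPLE_RATE = 16000  # YAMNet expects 16 kHz mono audio


def to_yamnet_waveform(samples, original_rate, target_rate=YAMNET_SAMPLE_RATE):
    """Convert raw audio samples to a mono float32 waveform at target_rate."""
    samples = np.asarray(samples)

    # Scale signed integer PCM (e.g. int16) to roughly [-1.0, 1.0].
    if np.issubdtype(samples.dtype, np.signedinteger):
        waveform = samples.astype(np.float32) / np.iinfo(samples.dtype).max
    else:
        waveform = samples.astype(np.float32)

    # Average the channels if the clip is stereo: shape (num_samples, channels).
    if waveform.ndim == 2:
        waveform = waveform.mean(axis=1)

    # Resample only when the rates differ.
    if original_rate != target_rate:
        new_length = int(round(len(waveform) * target_rate / original_rate))
        waveform = signal.resample(waveform, new_length)

    return waveform.astype(np.float32)
```

A WAV file can be read with `scipy.io.wavfile.read(path)`, which returns the sample rate and the sample array; passing those two values to the function above gives a waveform that can then be fed to YAMNet.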