BreezeWhite/oemer

Training dataset

Closed this issue · 6 comments

I recently had the opportunity to explore your system, and I was impressed by its capabilities. However, I encountered an issue while trying to access the dataset used for training the first model. It appears that the dataset is currently unavailable on the provided page. If you happen to have the dataset, would it be possible for you to provide it? Thanks :)

Hi @pingpeng1017 , thanks for your feedback ^^ I think it's better for you to send an inquiry email to them and also let them know the dataset download page is down, since I'm not sure it's appropriate for me to share the data with you directly.

Thanks for your reply. I have discovered that the time signature is not being extracted, and I'm curious whether it's not recognised at all, or whether it's actually recognised but the code to convert it to XML has not been implemented. I would greatly appreciate it if you could let me know whether this functionality could be added, as I would like to have a go :)

The time signature does appear in the predictions of the two UNet models, but only in raw pixel format. I did not manage to recognize the numbers of the time signature, since that would take more effort while not affecting the listening experience much. Still, you could try to recognize the symbol yourself if it is important in your case.

Does that mean it's necessary to retrain the UNet models, or, since the time signature already appears in the predicted images, would it be feasible to work with these raw-pixel images and adapt the SVM to recognise the time signature symbols? I'm currently working on a program that converts music notation into Braille, and being able to recognise time signature symbols would be a game-changer for my project. Any help or guidance you can provide would be incredibly valuable.

Yes, it's already in the predicted image. You only need to figure out a way to extract which pixels belong to the time signature and what numbers they represent. For the number recognition, you could train another SVM model as well.
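For anyone trying this later: a minimal sketch of training an SVM digit classifier with scikit-learn. It uses the library's built-in 8×8 digit images as stand-in training data; in practice you would crop the time-signature numerals out of the UNet prediction and resize them to a fixed shape instead. The `gamma` value is just a reasonable default for these small images, not a tuned setting from this project.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

# Stand-in data: 8x8 grayscale digit images flattened to 64 features.
digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

# RBF-kernel SVM; each cropped symbol image becomes one feature vector.
clf = svm.SVC(gamma=0.001)
clf.fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
```

Cropped time-signature digits would need the same preprocessing (fixed size, flattened, consistent scaling) before being fed to `predict`.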

Hi, this is me again! Sorry for so many questions.
I've trained an SVM model to recognise numbers in the image and I remembered that you said I only need to find a way to extract pixels belonging to the time signature. It seems like you have implemented a method to extract pixels for three specific symbols below:

import numpy as np

stems_rests = np.where(sep == 1, 1, 0)  # class 1: stems and rests
notehead = np.where(sep == 2, 1, 0)     # class 2: noteheads
clefs_keys = np.where(sep == 3, 1, 0)   # class 3: clefs and key signatures

I'm curious about how you managed to obtain these values.
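My current understanding, which may well be wrong: `sep` looks like a per-pixel class-ID map, presumably obtained by taking the argmax over the UNet's output channels, so each integer corresponds to one symbol class. A tiny hypothetical example of that interpretation (the 2×2 `logits` array and the class assignments are made up for illustration):

```python
import numpy as np

# Hypothetical per-pixel scores for 4 classes (0 = background) on a 2x2 image.
logits = np.array([
    [[0.9, 0.0, 0.1, 0.0],   # this pixel -> class 0
     [0.1, 0.8, 0.1, 0.0]],  # this pixel -> class 1
    [[0.0, 0.1, 0.9, 0.0],   # this pixel -> class 2
     [0.0, 0.0, 0.1, 0.9]],  # this pixel -> class 3
])

# Collapse the channel axis to a class-ID map, like `sep` above.
sep = np.argmax(logits, axis=-1)

# Binary masks, one per symbol class, as in the quoted snippet.
stems_rests = np.where(sep == 1, 1, 0)
notehead = np.where(sep == 2, 1, 0)
clefs_keys = np.where(sep == 3, 1, 0)

print(sep)  # [[0 1] [2 3]]
```

Is that roughly how the class IDs 1–3 come about, i.e. from the order of the model's output channels?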