BreezeWhite/oemer

the second model's training detail

Closed this issue · 3 comments

1.test
I first downloaded your model and ran ete.py to test; everything worked fine.
I found the second model's output name and shape are ['conv2d_25'] and (None, 288, 288, 4).

2.train (only second model)
I have some questions about training the second model.
1). train.py
from constant import CHANNEL_NUM
It should be from constant_min import CHANNEL_NUM, right?
2). constant_min.py
How should CLASS_CHANNEL_LIST and CHANNEL_NUM be set?

thank you @BreezeWhite

Hi @lvpchen

  1. I don't quite understand your question here.
  2. Yes, it should be imported from constant_min. I guess I was doing some other experiments and didn't push the updated code to git.
  3. It depends on your needs: decide which set of symbols you want the model to predict, and which output channel each group's results should go into.
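To illustrate that last point, here is a minimal sketch of how a CLASS_CHANNEL_LIST could be flattened into a class-to-channel map, reserving channel 0 for background. The group contents below are shortened placeholder IDs, not the actual oemer label IDs, and the map-building logic is my assumption about how the repo derives CLASS_CHANNEL_MAP from this constant.

```python
# Hypothetical sketch: flatten symbol-class groups into a class -> channel map.
# Channel 0 is assumed to be background; each group gets one output channel.
CLASS_CHANNEL_LIST = [
    [52, 97, 96],        # e.g. stems, rests (shortened placeholder IDs)
    [35, 37, 39],        # e.g. noteheads
    [80, 78, 10, 13],    # e.g. accidentals and clefs
]

CLASS_CHANNEL_MAP = {
    cls_id: ch_idx + 1  # +1 keeps channel 0 free for background
    for ch_idx, group in enumerate(CLASS_CHANNEL_LIST)
    for cls_id in group
}

CHANNEL_NUM = len(CLASS_CHANNEL_LIST) + 1  # one channel per group, plus background

print(CLASS_CHANNEL_MAP[52], CLASS_CHANNEL_MAP[13], CHANNEL_NUM)
```

Under this convention, every class ID in one inner list collapses into the same output channel, which is why regrouping the lists changes what each channel of the model's prediction means.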

I first downloaded the model and ran ete.py; everything worked. I found the second model's output name and shape are ['conv2d_25'] and (None, 288, 288, 4). The first channel is the symbol image, the second channel is the stem/rest image, the third channel is the notehead image, and the last channel is the clefs/key image.
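As a sketch of how that 4-channel output could be decoded downstream, assuming each pixel is assigned the channel with the highest score (the argmax convention is my assumption, not taken from ete.py), with the channel meanings described above:

```python
import numpy as np

# Dummy prediction with the reported shape: (batch, height, width, channels).
pred = np.random.rand(1, 288, 288, 4)

# Assumed decoding: each pixel is assigned its highest-scoring channel.
labels = np.argmax(pred[0], axis=-1)  # (288, 288), values in 0..3

# Channel meanings as described in the thread (channel 0 = symbol image).
stem_rest_mask = labels == 1
notehead_mask = labels == 2
clef_key_mask = labels == 3

print(labels.shape, stem_rest_mask.dtype)
```

The three boolean masks can then be inspected or visualized separately to see what each output channel actually learned.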

Now I want to replicate the training process of Model 2. There are some issues during training.

Some details of my modifications:
First: I modified parts of the UNet and built a network similar to yours.
Second: I modified CLASS_CHANNEL_LIST in constant_min.py.
```python
CLASS_CHANNEL_LIST = [
    [52, 97, 100, 99, 98, 101, 102, 103, 104, 96, 163],  # stem, rests
    [35, 37, 38, 39, 41, 42, 43, 45, 46, 47, 49],  # notehead
    [
        80, 78, 79, 74, 70, 72, 76,  # sharp, flat, natural
        10, 13, 12, 19, 11, 20,  # clefs
    ],
]
```
Third: changed CHANNEL_NUM = len(CLASS_CHANNEL_LIST) + 2 to CHANNEL_NUM = len(CLASS_CHANNEL_LIST) + 1.
Fourth: changed total_chs = len(set(CLASS_CHANNEL_MAP.values())) + 2 to total_chs = len(set(CLASS_CHANNEL_MAP.values())) + 1.
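One quick sanity check on the third and fourth changes: the two channel counts must agree with each other and with the label tensors fed to the loss, otherwise the ground truth and the model's final conv layer have mismatched channel counts. The channel arithmetic below is taken from the snippets above; the one-hot label construction is my assumed training-label format, not code from the repo.

```python
import numpy as np

# Shortened placeholder groups standing in for the real CLASS_CHANNEL_LIST.
CLASS_CHANNEL_LIST = [[52, 97], [35, 37], [80, 10]]
CLASS_CHANNEL_MAP = {c: i + 1 for i, g in enumerate(CLASS_CHANNEL_LIST) for c in g}

# With the "+1" variant, channels are: background + one per group.
CHANNEL_NUM = len(CLASS_CHANNEL_LIST) + 1
total_chs = len(set(CLASS_CHANNEL_MAP.values())) + 1

# The two computations must agree, or the label tensors and the model's
# final layer will disagree on the number of output channels.
assert CHANNEL_NUM == total_chs

# A tiny one-hot label built under this convention has CHANNEL_NUM channels.
label = np.zeros((4, 4), dtype=int)
label[0, 0] = CLASS_CHANNEL_MAP[35]  # one notehead pixel
onehot = np.eye(CHANNEL_NUM)[label]  # shape: (4, 4, CHANNEL_NUM)
print(onehot.shape)
```

If the original "+2" was intentional (for instance, background plus a catch-all channel for unlisted symbols; that interpretation is my guess), then dropping it to "+1" without also changing how labels are generated would silently shift the channel semantics.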

After training, I found the model's output name and shape are ['conv2d_25'] and (None, 288, 288, 4). But the results are very bad.
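When the output shape looks right but the results are bad, a per-channel overlap score against the ground truth can narrow down which symbol groups are failing (all of them, or only one). This is a generic debugging sketch; the IoU metric and the integer-label-map layout are my choices, not code from the repo:

```python
import numpy as np

def per_channel_iou(pred, target, num_ch):
    """Intersection-over-union per channel for integer label maps of equal shape."""
    scores = []
    for ch in range(num_ch):
        p, t = pred == ch, target == ch
        union = np.logical_or(p, t).sum()
        inter = np.logical_and(p, t).sum()
        # An empty channel in both maps counts as a perfect match.
        scores.append(1.0 if union == 0 else inter / union)
    return scores

# Dummy label maps standing in for an argmax-decoded prediction and ground truth.
target = np.zeros((288, 288), dtype=int)
target[100:150, 100:150] = 2  # a notehead-channel region
pred = np.zeros((288, 288), dtype=int)
pred[100:150, 120:170] = 2    # the same region, shifted sideways

print(per_channel_iou(pred, target, 4))
```

A uniformly near-zero score across all channels usually points at a label/channel mismatch (as in the CHANNEL_NUM changes above), while one bad channel points at that group's data or class list.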

A lot of places could go wrong and lead to bad performance, even a very tiny offset in the data. It takes time to fine-tune the parameters, or you may be using the wrong method, model architecture, etc. I'm afraid I cannot help much with this, and I would suggest finding other people to help you debug the code.