YatingMusic/remi

Checkpoints saved with a NAN value

raphkhan opened this issue · 2 comments

Hi, thanks for your amazing work.

I'm trying to fine-tune your model on a smaller dataset (292 MIDI files), using the REMI-tempo-chord-checkpoint as the base model. However, my checkpoints are saved with a NaN value from the very first epoch (e.g., model-000-nan.data).

I also get these two warning messages during training:

/miniconda3/envs/tfEnv/lib/python3.9/site-packages/numpy/core/fromnumeric.py:3440: RuntimeWarning: Mean of empty slice.
/miniconda3/envs/tfEnv/lib/python3.9/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars

I first checked whether something went wrong during event extraction. The extraction seems to have worked: len(all_events) = 292, which is the size of my dataset. The same holds for all_words.

However, segments has a length of only 3. That means training_data has 3 entries and num_batches = 0.

So I guess something goes wrong in this part, but I don't know how to fix it:
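A minimal sketch of how num_batches = 0 could lead to both warnings and the "nan" in the file name. It assumes the epoch loss is the NumPy mean of a list of per-batch losses and that the batch size is larger than 3; the names batch_losses and batch_size = 5 are illustrative assumptions, not taken from the repository:

```python
import warnings
import numpy as np

num_segments = 3   # value reported above
batch_size = 5     # assumed; any value greater than 3 behaves the same
num_batches = num_segments // batch_size   # integer division gives 0

batch_losses = []  # hypothetical list of per-batch losses
for _ in range(num_batches):
    batch_losses.append(0.0)  # never executed, because num_batches is 0

# Capture the warnings NumPy emits when averaging an empty list.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    epoch_loss = np.mean(batch_losses)

# Formatting a NaN loss produces a name like the one in the report.
checkpoint_name = "model-{:03d}-{:.3f}".format(0, epoch_loss)
print(num_batches, epoch_loss, checkpoint_name)
for w in caught:
    print(w.category.__name__, w.message)
```

If this matches what the training loop does, the NaN is only a symptom: no batch is ever trained, so the real problem is the small number of segments.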

# to training data
self.group_size = 5
segments = []
for words in all_words:
    pairs = []
    for i in range(0, len(words)-self.x_len-1, self.x_len):
        x = words[i:i+self.x_len]
        y = words[i+1:i+self.x_len+1]
        pairs.append([x, y])
    pairs = np.array(pairs)

    # abandon the last
    for i in np.arange(0, len(pairs)-self.group_size, self.group_size*2):
        data = pairs[i:i+self.group_size]
        if len(data) == self.group_size:
            segments.append(data)
segments = np.array(segments)
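
To see which files contribute segments, the two loops above can be mirrored using only lengths. This is a rough diagnostic sketch: count_segments and min_words_for_one_segment are hypothetical helper names, and x_len = 512 is an assumption (group_size = 5 comes from the snippet):

```python
def count_segments(n_words, x_len=512, group_size=5):
    """Number of segments the loops above would produce for one song."""
    # First loop: one (x, y) pair per start index i.
    n_pairs = len(range(0, n_words - x_len - 1, x_len))
    # Second loop: each start index i satisfies i + group_size < n_pairs,
    # so every slice is a full group and is appended.
    return len(range(0, n_pairs - group_size, group_size * 2))


def min_words_for_one_segment(x_len=512, group_size=5):
    """Smallest song length (in words) that yields at least one segment."""
    # At least group_size + 1 pairs are needed, which requires
    # n_words - x_len - 1 > group_size * x_len.
    return (group_size + 1) * x_len + 2


# Example with made-up song lengths (not from the dataset in this issue).
lengths = [1000, 3074, 9000]
per_song = [count_segments(n) for n in lengths]
print(min_words_for_one_segment(), per_song, sum(per_song))
```

Under these assumptions, any song shorter than min_words_for_one_segment() words is silently dropped, which would explain how 292 files can shrink to 3 segments.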

Does anyone have an idea how to fix it? Thanks a lot.

gkoyu commented

Yeah, I think that part is where it goes wrong. When I tried the model on my dataset, I got a negative value for len(pairs)-self.group_size; my pairs array has shape (1, 2, 512).

Did you by any chance solve it? If so, how?
Sorry, I know it's an old issue.
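
A small sketch reproducing that case with a dummy array of the reported shape (the zeros are placeholder data, and group_size = 5 is taken from the snippet in the original post). With a negative stop value, np.arange returns an empty array, so the loop body never runs and no error is raised:

```python
import numpy as np

x_len, group_size = 512, 5
pairs = np.zeros((1, 2, x_len), dtype=np.int64)  # dummy data, shape (1, 2, 512)

stop = len(pairs) - group_size                   # 1 - 5 = -4
starts = np.arange(0, stop, group_size * 2)      # empty, because stop < 0

segments = []
for i in starts:                                 # never entered
    data = pairs[i:i + group_size]
    if len(data) == group_size:
        segments.append(data)

print(stop, starts.size, len(segments))
```

So a song that yields a single pair contributes no segments at all, without any visible failure.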