Checkpoints saved with a NAN value
raphkhan opened this issue · 2 comments
Hi, thanks for your amazing work.
I'm trying to fine-tune your model on a smaller dataset (292 MIDI files), using the REMI-tempo-chord-checkpoint as the base model. However, my checkpoints are saved with a NaN value from the first epoch (e.g., model-000-nan.data).
I also get these two warning messages during training:
/miniconda3/envs/tfEnv/lib/python3.9/site-packages/numpy/core/fromnumeric.py:3440: RuntimeWarning: Mean of empty slice.
/miniconda3/envs/tfEnv/lib/python3.9/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
I first checked whether something went wrong during event extraction. It seems to have worked: len(all_events) = 292, which is the size of my dataset. The same holds for all_words.
However, the length of segments is only 3. That means len(training_data) = 3 and num_batches = 0.
So I guess something went wrong in that part, but I don't know how to fix it:
# to training data
self.group_size = 5
segments = []
for words in all_words:
    pairs = []
    for i in range(0, len(words)-self.x_len-1, self.x_len):
        x = words[i:i+self.x_len]
        y = words[i+1:i+self.x_len+1]
        pairs.append([x, y])
    pairs = np.array(pairs)
    # abandon the last
    for i in np.arange(0, len(pairs)-self.group_size, self.group_size*2):
        data = pairs[i:i+self.group_size]
        if len(data) == self.group_size:
            segments.append(data)
segments = np.array(segments)
Does anyone have an idea how to fix it? Thanks a lot
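A rough sketch of how num_batches = 0 could lead to both warnings and the nan in the checkpoint name. It assumes the training loop averages the per-batch losses with np.mean and formats that average into the file name; that is a guess based on the warnings, not a quote from the repository. The function epoch_summary, the batch size of 4, and the placeholder loss are made up for illustration.

```python
import warnings

import numpy as np


def epoch_summary(num_segments, batch_size, epoch=0):
    """Mimic one epoch of a loop that averages per-batch losses (assumed design)."""
    num_batches = num_segments // batch_size  # 3 // 4 == 0 in the reported case
    batch_losses = []
    for _ in range(num_batches):
        batch_losses.append(1.0)  # placeholder loss, never reached when num_batches == 0

    # Record the warnings NumPy emits when averaging an empty list.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        mean_loss = np.mean(batch_losses)

    # Hypothetical checkpoint naming scheme, chosen to match model-000-nan.
    name = "model-{:03d}-{:.3f}".format(epoch, mean_loss)
    return num_batches, name, [str(w.message) for w in caught]


num_batches, name, messages = epoch_summary(num_segments=3, batch_size=4)
print(num_batches)  # 0
print(name)         # model-000-nan
print(messages)     # includes "Mean of empty slice."
```

If this matches what the training loop does, the nan is not a numerical problem in the model: no batch is ever run, so there is no loss to average.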
Yeah, I think that part goes wrong. When I try the model on my dataset, len(pairs)-self.group_size is negative: my pairs have shape (1, 2, 512).
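To check which files are too short, the two loops from the snippet in the question can be reduced to counting. This is only a sketch: x_len = 512 is inferred from the (1, 2, 512) shape above, and the helper names are invented here, not part of the repository.

```python
def count_pairs(num_words, x_len=512):
    """Number of (x, y) pairs the first loop builds for one piece."""
    return len(range(0, num_words - x_len - 1, x_len))


def count_segments(num_pairs, group_size=5):
    """Number of segments the second loop keeps for one piece."""
    # A negative stop (num_pairs < group_size) gives an empty range, so 0 segments.
    return len(range(0, num_pairs - group_size, group_size * 2))


def segments_for_piece(num_words, x_len=512, group_size=5):
    return count_segments(count_pairs(num_words, x_len), group_size)


# A piece with a single pair, as in pairs.shape == (1, 2, 512), yields nothing.
print(count_segments(1))         # 0
# With x_len = 512 and group_size = 5, a piece needs at least 6 pairs,
# which means at least 3074 words.
print(segments_for_piece(3073))  # 0
print(segments_for_piece(3074))  # 1
```

Under these assumptions, any piece shorter than 3074 words contributes no training segments at all, which would explain getting only 3 segments from 292 files. Running segments_for_piece(len(words)) over all_words shows how many pieces are affected. Lowering group_size or x_len would let shorter pieces through, but whether that is compatible with the pretrained checkpoint is not something this sketch can tell.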
Did you by any chance solve it? And if so, how?
Sorry I know it's an old issue.