Ghadjeres/DeepBach

Fixing soprano part at generation time

Closed this issue · 2 comments

Hi again! Since I couldn't find the script for fixing the soprano part at generation time described in the paper, I wrote up the following code, where I pass the tensor_chorales from test_dataloader into the generation() method and set voice_index_range to [1, 3].

train_dataloader, val_dataloader, test_dataloader = bach_chorales_dataset.data_loaders(
    batch_size=128,
    split=(0.85, 0.10),  # train/val fractions; the remainder is used for test
)

for tensor_chorale_batch, tensor_metadata_batch in test_dataloader:
    # generation() works on CPU tensors, so keep the batches there
    tensor_chorale_batch = tensor_chorale_batch.long()
    tensor_metadata_batch = tensor_metadata_batch.long()
    for i in range(tensor_chorale_batch.size(0)):
        tensor_chorale = tensor_chorale_batch[i]
        tensor_metadata = tensor_metadata_batch[i]

        # keep the soprano (voice 0) fixed and resample the other voices
        score, tensor_chorale, tensor_metadata = deepbach.generation(
            num_iterations=num_iterations,
            sequence_length_ticks=sequence_length_ticks,
            tensor_chorale=tensor_chorale,
            tensor_metadata=tensor_metadata,
            voice_index_range=[1, 3],
        )

However, there are a couple of problems with this that make me suspect I'm on a different track from the paper.

  1. All of the examples in the tensor dataset are 8-beat segments that can start at any 16th-note offset within an original chorale. As a result, the generated 8-beat segments are shifted so that note onsets don't fall on actual beats.
  2. The generation() method doesn't seem to randomly initialize the voices that the pseudo-Gibbs algorithm resamples, although it does do this for the timestep range.
  3. It seems like the generations in the paper were longer than 8 beats, since it was possible to extract two 12-second segments from them.

We probably shouldn't be using test_dataloader, but I'm not sure where else the test data would come from. Thanks again!

Hi,
Yes, you're right. I did not use the test_dataloader to impose soprano parts for reharmonization; instead, I chose entire chorales from the test set, extracted their melodies, and used these as constraints. That way you are assured the constraints start on beats, and you can impose the size of the generated chorale.
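A minimal sketch of that procedure, for reference. The music21 calls are standard; chorale_to_tensor is a hypothetical stand-in for whatever score-to-tensor conversion the dataset class provides, and a (num_voices, num_ticks) tensor layout is assumed:

from music21 import corpus

# pick a held-out chorale from the Bach corpus
chorale = corpus.parse('bach/bwv86.6')  # assume this chorale is in the test split

# hypothetical conversion of the full score to the dataset's tensor encoding;
# voice 0 of the encoded chorale is the soprano melody, so fixing it via
# voice_index_range imposes the extracted melody as the constraint
tensor_chorale, tensor_metadata = chorale_to_tensor(chorale)

# the constraint now starts on a beat and spans the whole chorale, so the
# generation length can be set to the chorale's actual length in ticks
score, tensor_chorale, tensor_metadata = deepbach.generation(
    num_iterations=num_iterations,
    sequence_length_ticks=tensor_chorale.size(1),
    tensor_chorale=tensor_chorale,
    tensor_metadata=tensor_metadata,
    voice_index_range=[1, 3],
)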
Also, the initialization scheme in the generation might be incomplete and might not take voice_index_range into account. I may not have reimplemented everything when rewriting the model in PyTorch.
Sorry about that, but it is quite easy to fix.

Hi! I see, thanks for describing your generation-time procedure for fixing the soprano part. I will try to reimplement it.

I also noticed that the chorale tensor was not re-initialized to random values for the voices in voice_index_range before sampling. (generation() does, however, do this for time_index_range_ticks!) This was indeed a simple fix, along the lines of the sketch below.
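For anyone else who runs into this, a minimal sketch of the fix, assuming the (num_voices, num_ticks) tensor layout, a half-open voice_index_range as used with range(), and a hypothetical num_notes_per_voice list giving each voice's vocabulary size:

import torch

def random_init_voices(tensor_chorale, voice_index_range,
                       time_index_range_ticks, num_notes_per_voice):
    # overwrite the region to be resampled with uniformly random note
    # indices, mirroring what generation() already does for
    # time_index_range_ticks
    t0, t1 = time_index_range_ticks
    for voice_index in range(*voice_index_range):
        tensor_chorale[voice_index, t0:t1] = torch.randint(
            num_notes_per_voice[voice_index], (t1 - t0,)
        )
    return tensor_chorale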