yangdongchao/SimpleSpeech

Implementation incomplete and information missing.

roebel opened this issue · 0 comments

Hello

I have read your paper with interest and wanted to look into the code to see some of the details that are not mentioned in the paper. After looking into your code I have a few questions:

  • Am I right that the model reflecting the state of the paper is
    ldm.models.scalar16k.ScalarAE? This is the model you refer to in your config file.

  • Unfortunately, the code for this model is incomplete and broken: it starts right away with an empty function definition,

def get_padding(kernel_size, dilation=1): 

class ...

and therefore this module cannot even be imported, let alone run.

  • Concerning the coherence of the repo and the paper: in the paper, you describe the scalar vector quantization module as a sequence of Conv1D convolutions. In the implementation, however, you use quite a few blocks such as PreProcess, ResEncoder and others. Unfortunately, the implementation of these blocks is not provided. Additionally, the config file appears to live under
    scalar_config: /home/jupyter/data/checkpoints/codec/16k/config.yaml
    and is not available either. So it is very difficult to get anything out of this.
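
Regarding the empty get_padding mentioned above: in HiFi-GAN-style code, a helper with this name and signature usually returns the "same" padding for a dilated Conv1D with stride 1. The sketch below is only my guess at what was intended, not code from this repository, and conv1d_out_len is a hypothetical helper I added to check the arithmetic.

```python
def get_padding(kernel_size: int, dilation: int = 1) -> int:
    # Assumed body (HiFi-GAN convention), NOT taken from this repository:
    # padding that keeps the sequence length unchanged for an odd kernel
    # size and stride 1.
    return (kernel_size * dilation - dilation) // 2


def conv1d_out_len(length: int, kernel_size: int, dilation: int = 1) -> int:
    # Hypothetical helper: standard Conv1D output-length formula for
    # stride 1, using the padding computed above.
    padding = get_padding(kernel_size, dilation)
    return length + 2 * padding - dilation * (kernel_size - 1)


if __name__ == "__main__":
    # With odd kernel sizes the output length should equal the input length.
    for k, d in [(3, 1), (7, 1), (3, 5)]:
        print(k, d, get_padding(k, d), conv1d_out_len(100, k, d))
```

If the intended body differs from this, it would be helpful to have the original in the repository.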

Do you plan to update this repository with the information that is missing?

Thanks