For speech recognition, how should the validation_inputs_data option in the yml file be filled in?

Question

For speech recognition, how should the validation_inputs_data option in the yml file be filled in?

fivenick opened this issue 4 years ago · 1 comments

fivenick commented 4 years ago

Before you open an issue, please make sure you have tried the following steps:

Make sure your environment is the same with (https://mace.readthedocs.io/en/latest/installation/env_requirement.html).
Have you ever read the document for your usage?
Check if your issue appears in HOW-TO-DEBUG or FAQ.
The form below must be filled.

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
NDK version(e.g., 15c):
GCC version(if compiling for host, e.g., 5.4.0):
MACE version (Use the command: git describe --long --tags):
Python version(2.7):
Bazel version (e.g., 0.13.0):

Model deploy file (*.yml)

......

Describe the problem

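The MACE examples use images, and each file specified in validation_inputs_data holds the data for a single image. I want to test multiple utterances at once, and the utterances differ in length. Should I put the data for all utterances into one file, and will the framework then split it into batches automatically? Or should I split the utterances into batches myself and put each batch in its own file?
For quantization, does each file in the folder specified by the input_dir option hold the data for one utterance, or for one batch?
The examples I have seen all use images, and every image is the same size, so each one can be fed straight into the network. Speech is different because utterances vary in length. Will the framework split the batches for me, or do I need to split the data first and then put each piece in its own file?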

Answer 1 · 2021-04-22T00:46:11.000Z

@fivenick

validation_inputs_data validates only one input per run. To test multiple inputs, use the following command:
python tools/converter.py run --config=${CONF_FILE} --input_dir /data/local/tmp/input_dir --output_dir /data/local/tmp/output_dir
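Here input_dir holds the multiple input files. After the run, the outputs are written as files to output_dir.
input_dir: you need to split the data yourself and put each piece in its own file.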
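
As a rough illustration of "split the data yourself", the sketch below cuts each variable-length utterance into fixed-size chunks and writes one raw float32 file per chunk. It rests on several assumptions that are not confirmed in this thread: that the model takes a fixed input shape of chunk_frames x feat_dim, that zero-padding the last chunk is acceptable for your model, and that the file name prefix "input" is only a placeholder (check the MACE documentation for the naming your MACE version expects). The function names are hypothetical helpers, not part of MACE.

```python
import os
import tempfile
from array import array


def split_into_chunks(frames, chunk_frames, feat_dim, pad_value=0.0):
    """Split a [num_frames][feat_dim] utterance into fixed-size chunks.

    The last chunk is padded with pad_value rows (an assumption; your
    model may need a different padding strategy).
    """
    chunks = []
    for start in range(0, len(frames), chunk_frames):
        chunk = [list(f) for f in frames[start:start + chunk_frames]]
        while len(chunk) < chunk_frames:
            chunk.append([pad_value] * feat_dim)
        chunks.append(chunk)
    return chunks


def write_chunk(path, chunk):
    """Write one chunk as raw native-endian 32-bit floats, row-major."""
    flat = array("f", (v for frame in chunk for v in frame))
    with open(path, "wb") as fh:
        flat.tofile(fh)


def prepare_input_dir(utterances, out_dir, chunk_frames, feat_dim,
                      prefix="input"):
    """Write every chunk of every utterance to its own file in out_dir.

    `prefix` is a hypothetical placeholder for the file name stem.
    Returns the list of written paths.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for utt_id, frames in utterances.items():
        chunks = split_into_chunks(frames, chunk_frames, feat_dim)
        for i, chunk in enumerate(chunks):
            path = os.path.join(out_dir, f"{prefix}_{utt_id}_{i}")
            write_chunk(path, chunk)
            paths.append(path)
    return paths
```

For example, calling prepare_input_dir(utterances, "input_dir", chunk_frames=100, feat_dim=40) with a dict mapping utterance IDs to lists of feature frames would produce one file per 100-frame chunk. The resulting folder would then be copied to the location passed as --input_dir in the command above. The chunk size and feature dimension here are made-up values and must match the input shape declared in your yml file.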