A utility plugin for Gradle, allowing lightweight management of speech data in FLAC format and corresponding metadata in YAML format (see below).
Java (8 or newer) and SoX must be installed. FLAC must be installed for testing.
See https://plugins.gradle.org/plugin/org.m2ci.msp.flaml
This plugin adds a flaml extension, which is configured with the (relative) paths to a FLAC and YAML file like this:
flaml {
    flacFile = 'foobar.flac'
    yamlFile = 'foobar.yaml'
}
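For context, a minimal build.gradle sketch showing where the extension block sits relative to the plugin declaration. The plugin ID is taken from the Plugin Portal URL above; the version string is a placeholder and must be replaced with a published release.

```groovy
plugins {
    // Plugin ID as listed on the Gradle Plugin Portal.
    // 'x.y.z' is a placeholder, not a real version number.
    id 'org.m2ci.msp.flaml' version 'x.y.z'
}

flaml {
    // Paths are relative, as described above.
    flacFile = 'foobar.flac'
    yamlFile = 'foobar.yaml'
}
```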
Applying this plugin to a project adds several tasks, which are configured as follows.
- yamlFile, default: flaml.yamlFile
- destDir, default: layout.buildDirectory.dir('lab')

- yamlFile, default: flaml.yamlFile
- destDir, default: layout.buildDirectory.dir('text')

- flacFile, default: flaml.flacFile
- yamlFile, default: flaml.yamlFile
- textGridFile, default: layout.buildDirectory.file("${project.name}.TextGrid")

- flacFile, default: flaml.flacFile
- yamlFile, default: flaml.yamlFile
- destDir, default: layout.buildDirectory.dir('wav')

- properties, default: (empty)
- destFile, default: layout.buildDirectory.file('comments.properties')

- srcFiles, default: layout.buildDirectory.dir('wav')
- commentsFile, default: generateComments.destFile
- flacFile, default: layout.buildDirectory.file("${project.name}.flac")

- srcFiles, default: generateFlac.srcFiles
- commentsFile, default: generateComments.destFile
- yamlFile, default: layout.buildDirectory.file("${project.name}.yaml")

- srcFile, default: extractTextGrid.textGridFile
- yamlFile, default: layout.buildDirectory.file("${project.name}.yaml")

- labDir, default: extractLabFiles.destDir
- yamlSrcFile, default: flaml.yamlFile
- yamlDestFile, default: layout.buildDirectory.file("${project.name}.yaml")

- textDir, default: extractTextFiles.destDir
- yamlSrcFile, default: flaml.yamlFile
- yamlDestFile, default: layout.buildDirectory.file("${project.name}.yaml")
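Any of these defaults can presumably be overridden in the build script. The sketch below makes two assumptions: that a task named extractLabFiles exists (implied by the default value extractLabFiles.destDir above), and that destDir is a Gradle directory property that accepts a provider, as its listed default suggests. The directory name 'labels' is purely illustrative.

```groovy
// Assumption: the task name is inferred from the 'extractLabFiles.destDir' default.
tasks.named('extractLabFiles') {
    // Use build/labels instead of the default build/lab as the destination directory.
    destDir = layout.buildDirectory.dir('labels')
}
```

Because other tasks take their defaults from this property (for example, labDir defaults to extractLabFiles.destDir), a change like this should carry over to them without further configuration, provided those properties are wired lazily.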
The FLAML convention assumes a FLAC file and corresponding YAML file describing its contents.
The YAML file is expected to contain a list of utterances, each of which is a map that must have
- a prompt key, where the prompt is used as the basename for individual utterance files
- a text key, containing the orthographic text contents of the utterance
- a start key, providing the start time (in seconds) of the utterance in the FLAC file
- an end key, providing the end time (in seconds) of the utterance in the FLAC file
- optionally a segments key, providing a list of segments in the utterance, where each segment is a map that must have
  - a lab key, providing the label of the segment
  - a dur key, providing the duration (in seconds) of the segment
Below is an example:
- prompt: foo
  text: Foo.
  start: 0.1
  end: 0.4
  segments:
  - lab: f
    dur: 0.1
  - lab: u
    dur: 0.2
- prompt: bar
  text: Bar.
  start: 0.6
  end: 0.9
  segments:
  - lab: b
    dur: 0.1
  - lab: a
    dur: 0.1
  - lab: r
    dur: 0.1