A server that acts as a facade for speech synthesis. It supports Filibuster as its synthesis engine.
This is a guidebook to the implementation.
The official MTM documentation is kept in Confluence.
- Context
- Functional Overview
- Quality Attributes
- Constraints
- Principles
- Software Architecture
- External Interfaces
- Code
- Data
- Infrastructure Architecture
- Deployment
- Operation and Support
- Decision Log
MTM needs to be able to parallelize speech synthesis and synthesise more than one book at a time.
The solution is to implement a web application that hides the concrete speech synthesisers. The web application accepts requests from more than one client and is therefore able to parallelize the synthesis by dispatching jobs to many synthesiser instances.
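The dispatching idea can be sketched as a small pool that hands each incoming sentence to an idle synthesiser. This is a minimal sketch under stated assumptions, not the actual implementation: the `Synthesizer` interface and the `SynthesizerPool` class are hypothetical names introduced here for illustration.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for a concrete synthesiser such as a Filibuster instance.
interface Synthesizer {
    byte[] synthesize(String sentence);
}

// Hands each request to an idle synthesiser and returns it to the pool afterwards.
class SynthesizerPool {
    private final BlockingQueue<Synthesizer> idle;

    // The list must contain at least one synthesiser.
    SynthesizerPool(List<Synthesizer> synthesizers) {
        this.idle = new ArrayBlockingQueue<>(synthesizers.size(), false, synthesizers);
    }

    byte[] synthesize(String sentence) {
        Synthesizer synthesizer;
        try {
            synthesizer = idle.take(); // blocks until an instance is free
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a synthesiser", e);
        }
        try {
            return synthesizer.synthesize(sentence);
        } finally {
            idle.add(synthesizer); // make the instance available to the next caller
        }
    }

    int idleCount() {
        return idle.size();
    }
}
```

Because `take()` blocks when every instance is busy, concurrent callers wait until a synthesiser is free, so the number of sentences synthesised in parallel never exceeds the number of instances.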
The primary client is PipeOnline, which the production team uses when creating books.
A sentence that should be synthesised is sent to the system and a synthesised sound is returned.
The synthesisers, Filibuster instances, have a limited lifetime and are recreated once it has passed. The available synthesisers are listed on the front page of the web application.
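The recreate-after-a-lifetime behaviour could look roughly like the holder below. It is a sketch only: the text does not say whether the lifetime is measured in time or in number of uses, so a time-based limit is assumed here, and `ExpiringSynthesizer` is a hypothetical name. The current time is passed in explicitly to keep the example easy to test.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Holds one synthesiser and replaces it once it has outlived its time to live.
class ExpiringSynthesizer<T> {
    private final Supplier<T> factory;
    private final Duration timeToLive;
    private T current;
    private Instant createdAt;

    ExpiringSynthesizer(Supplier<T> factory, Duration timeToLive, Instant now) {
        this.factory = factory;
        this.timeToLive = timeToLive;
        this.current = factory.get();
        this.createdAt = now;
    }

    // Returns the live instance, creating a fresh one if the old one has expired.
    synchronized T get(Instant now) {
        if (Duration.between(createdAt, now).compareTo(timeToLive) >= 0) {
            current = factory.get();
            createdAt = now;
        }
        return current;
    }
}
```

A real holder would also have to shut down the expired instance before dropping it. That step is left out because it depends on how a Filibuster process is started and stopped.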
It is possible to verify through the user interface that the system is able to synthesise any sentence. Click on 'Test Synthesize' in the menu bar and enter the text that should be synthesised in the text area. The main purpose is to test the system and verify that the components can connect to each other and that sound is generated.
The logs generated by the system are also available from the web application. They are offered as a convenience for the maintainers of the system. They are also available on the executing host, where reading them requires ssh access.
The configuration of the system can be reviewed using 'Configuration' in the menu bar. It is the same configuration that the system was started with.
The release history is available from the 'About' menu.
There are no apparent quality attributes that we must honor. Synthesising books takes time, and the bottleneck isn't in the dispatching server.
Speech synthesisers consume a lot of memory. It is possible to configure the maximum number of synthesisers and the minimum memory that should be available. We have started with 6 filibusters and they require 4 GB of memory. These numbers should be revised when we know more about how the system behaves.
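How the two limits might interact can be sketched as a single check made before another synthesiser is started. This is an illustration under assumptions: `SynthesizerLimits` and its parameter names are hypothetical, and how the available memory is measured is left to the caller because the text does not specify it.

```java
// Decides whether one more synthesiser may be started under the configured limits.
class SynthesizerLimits {
    private final int maxSynthesizers;
    private final long minAvailableMemoryBytes;

    SynthesizerLimits(int maxSynthesizers, long minAvailableMemoryBytes) {
        this.maxSynthesizers = maxSynthesizers;
        this.minAvailableMemoryBytes = minAvailableMemoryBytes;
    }

    // Both conditions must hold: room under the instance cap and enough free memory.
    boolean mayStartAnother(int running, long availableMemoryBytes) {
        return running < maxSynthesizers
                && availableMemoryBytes >= minAvailableMemoryBytes;
    }
}
```

With the starting values mentioned above, and assuming the 4 GB figure is used as the memory threshold, `new SynthesizerLimits(6, 4L * 1024 * 1024 * 1024)` would refuse a seventh instance regardless of how much memory is free.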
MTM is mainly a Java shop, so the implementation is done in Java.
The implementation is done with a focus on the interaction between the components. Things that are related to each other live in the same package.
The main entry point to each package is a resource that is reachable from a web browser or a REST client.
The resources are wired through the Main class.
The external interface to the system, apart from the web-based user interface, is a single REST resource.
It is available at /synthesize and takes the sentence that should be synthesised as a query parameter.
The returned value is a JSON document with a byte array that should be interpreted as a wav file.
A Java client is available that removes the need for low-level knowledge of the communication with the server.
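For a caller that talks to the resource without the provided client, the two low-level details are encoding the sentence into the query string and turning the returned byte array back into sound. The sketch below shows both with the standard library only. The query parameter name `sentence` is a guess, not taken from the implementation, and the Base64 step assumes the server relies on Jackson's default handling of `byte[]`, which writes the array as a Base64 string.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class SynthesizeRequest {
    // "sentence" is a hypothetical parameter name; check the resource for the real one.
    static URI uriFor(String baseUrl, String sentence) {
        String encoded = URLEncoder.encode(sentence, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/synthesize?sentence=" + encoded);
    }

    // Assumes the byte array arrives as a Base64 string (Jackson's default for byte[]).
    static byte[] decodeSound(String base64Sound) {
        return Base64.getDecoder().decode(base64Sound);
    }
}
```

The decoded bytes can be written unchanged to a file with a `.wav` extension. The provided Java client remains the simpler choice, since it hides these details.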
The server application is built using Dropwizard.
The web interface is built using Mustache templates.
The styling is done using Bootstrap.
Configuration is done with a YAML file and follows the standard Dropwizard format.
The client is implemented using Jersey and Jackson.
The system is built using the Gradle wrapper:
./gradlew clean shadowJar
Creating an RPM for testing is done using
./gradlew clean buildRpm
Running the server locally is done using
java -jar server/build/libs/server-1.0.15-SNAPSHOT-all.jar server configuration-fake.yaml
The version number above is only an example; check your build directory for the correct one.
It is possible to install on a local CentOS machine. The RPM can be installed using
rpm -ivh rpm-file
There is no specific data associated with this system. It is a facade to other systems and doesn't keep a state of its own.
The system is deployed and executed on MTM's standard RHEL virtual server.
It is deployed on RHEL as an RPM. The latest version is available in MTM's local YUM repository.
Each environment at MTM knows which repository to use. It is therefore easy to use Puppet for installation.
Deploying a new version is done as follows:
- Promote an RPM package to the repository that the specific environment fetches its packages from
- Trigger Puppet or wait for it to execute. The currently installed version will be updated to the latest one found in the repository
- Repeat the steps above when promoting to test or production
A user interface and an admin interface are available at
Environment | User interface | Admin interface |
---|---|---|
Test | http://pipeutv1.mtm.se:9090 | http://pipeutv1.mtm.se:9091 |
Acceptance test | http://pipetest1.mtm.se:9090 | http://pipetest1.mtm.se:9091 |
Production | http://pipeonline.mtm.se:9090 | http://pipeonline.mtm.se:9091 |
The system is executed as a RHEL service.
Checking that the server is running is done as root with the command
service speech-synthesis-server status
Starting is done with
service speech-synthesis-server start
Restarting is done with
service speech-synthesis-server restart
The logs are available at the location that the configuration property logHome is set to. This is probably /var/log/mtm/speech-synthesis-server
The configuration is stored in /etc/opt/speech-synthesis-server/configuration.yaml
There is an admin interface available. The port is defined in the configuration. It is probably 9091, the admin port listed for the environments above.
- The decision to use a REST API is based on the need for loose coupling. Other systems should be able to connect without too much knowledge about the inner workings of the implementation. MTM has a record of compile-time dependencies between systems, and this has turned out to be unnecessarily complicated.