Running the Joshua Decoder: --------------------------- If you wish to run the complete machine translation pipeline, Joshua includes a black-box implementation that enables the entire pipeline to be run by typing a single restartable command. See the documentation for a walkthrough and more information about the many options available to the pipeline. - web: http://joshua-decoder.org/5.0/pipeline.html - local mirror: ./joshua-decoder.org/5.0/pipeline.html Manually Running the Joshua Decoder: ------------------------------------ To run the decoder, first set these environment variables: export JAVA_HOME=/path/to/java # maybe /usr/java/home export JOSHUA=/path/to/joshua You might also find it helpful to set these: export LC_ALL=en_US.UTF-8 export LANG=en_US.UTF-8 Then, compile Joshua by typing: cd $JOSHUA ant all The basic method for invoking the decoder looks like this: cat SOURCE | JOSHUA -c CONFIG > OUTPUT You can test this using the sample configuration files and inputs can be found in the example/ directory. For example, type: cat examples/example/test.in | $JOSHUA/bin/decoder -c examples/example/joshua.config The decoder output will load the language model and translation models defined in the configuration file, and will then decode the five sentences in the example file. There are a variety of command line options that you can feed to Joshua. For example, you can enable multithreaded decoding with the -threads N flag: cat examples/example/test.in | $JOSHUA/bin/decoder -c examples/example/joshua.config -threads 5 The configuration file defines many additional parameters, all of which can be overridden on the command line by using the format -PARAMETER value. For example, to output the top 10 hypotheses instead of just the top 1 specified in the configuration file, use -top-n N: cat examples/example/test.in | $JOSHUA/bin/decoder -c examples/example/joshua.config -top-n 10 Parameters, whether in the configuration file or on the command line, are converted to a canonical internal representation that ignores hyphens, underscores, and case. So, for example, the following parameters are all equivalent: {top-n, topN, top_n, TOP_N, t-o-p-N} {poplimit, pop-limit, pop-limit, popLimit} and so on. For an example of parameters, see the Joshua configuration file template in $JOSHUA/scripts/training/templates/tune/joshua.config or the online documentation at joshua-decoder.org/4.0/decoder.html. There is a wealth of information in the online documentation. After you have successfully run the decoding example above, we recommend that you take a look at the Joshua pipeline script, which allows you to do full end-to-end training of a translation model. It is stored in $JOSHUA/examples