SimpleTokenizer

Natural language processing research

Requirement: Java 1.7

Tokenizer.jar is a runnable program that reads a string from standard input and outputs its tokens.

input: "Hello World. 中文斷詞"
output: hello world 中 文 斷 詞
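
The example input and output suggest three rules: Latin text is lowercased, punctuation is dropped, and each Chinese character becomes its own token. The following is a minimal sketch of a tokenizer with that behavior, not the actual source of Tokenizer.jar. The class name SimpleTokenizerSketch and the helpers tokenize, isHan and flush are hypothetical, and the rules are inferred from the single example only. It uses only APIs available in Java 1.7.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real Tokenizer.jar may work differently.
class SimpleTokenizerSketch {

    // True for Han ideographs, which are emitted one token per character.
    static boolean isHan(int codePoint) {
        return Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.HAN;
    }

    // Moves the pending word (if any) into the token list.
    static void flush(StringBuilder word, List<String> tokens) {
        if (word.length() > 0) {
            tokens.add(word.toString());
            word.setLength(0);
        }
    }

    // Assumed rules: lowercase letters and digits form words, Han characters
    // are split individually, everything else acts as a separator.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isHan(cp)) {
                flush(word, tokens);
                tokens.add(new String(Character.toChars(cp)));
            } else if (Character.isLetterOrDigit(cp)) {
                word.appendCodePoint(Character.toLowerCase(cp));
            } else {
                flush(word, tokens);
            }
        }
        flush(word, tokens);
        return tokens;
    }

    // Reads lines from standard input and prints space-separated tokens.
    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in, "UTF-8"));
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        String line;
        while ((line = in.readLine()) != null) {
            StringBuilder sb = new StringBuilder();
            for (String token : tokenize(line)) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(token);
            }
            out.println(sb);
        }
    }
}
```

Checking the Unicode script rather than a fixed code point range keeps the Han test short, but it does not cover kana or Hangul; how the real program treats those scripts is not stated here.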