

Primary language: Python


This project performs Chinese tokenization (word segmentation).


  • Usage of segment_sentences:

      python segment_sentences.py [options] [arg]
  • Options:

      -h, --help            show this help message and exit
      -d, --debug           print debug information about the segmentation
                          (off by default)
      -f FILE, --file=FILE  segment sentences from the specified file
      -i, --interactive     go into interactive mode
      -o OUT, --out=OUT     write the segmentation result to the specified file
      -s SEPARATOR, --separator=SEPARATOR
                          specify the separator used in the segmentation result
      -t TRAIN, --train=TRAIN
                          use the training set to train the algorithm
      -v, --version         output version info and exit
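
The option list above has the layout that Python's standard optparse module produces for its help text. As an illustration only, here is a minimal sketch of a parser that accepts the same flags. It is not taken from segment_sentences.py: the function name build_parser, the version string, and the choice to leave every value option without a default are assumptions made for this example.

```python
from optparse import OptionParser


def build_parser():
    """Hypothetical parser mirroring the documented flags (not the project's code)."""
    parser = OptionParser(usage="python segment_sentences.py [options] [arg]")
    # Placeholder version text (assumption); the real version is not documented here.
    parser.version = "%prog (version unknown)"

    # -h/--help is added automatically by OptionParser.
    parser.add_option("-d", "--debug", action="store_true", dest="debug",
                      default=False,
                      help="print debug information about the segmentation")
    parser.add_option("-f", "--file", dest="file",
                      help="segment sentences from the specified file")
    parser.add_option("-i", "--interactive", action="store_true",
                      dest="interactive", default=False,
                      help="go into interactive mode")
    parser.add_option("-o", "--out", dest="out",
                      help="write the segmentation result to the specified file")
    parser.add_option("-s", "--separator", dest="separator",
                      help="specify the separator used in the segmentation result")
    parser.add_option("-t", "--train", dest="train",
                      help="use the training set to train the algorithm")
    parser.add_option("-v", "--version", action="version",
                      help="output version info and exit")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is
# easy to try out; "input.txt" is a made-up file name.
options, args = build_parser().parse_args(["-f", "input.txt", "-s", "/"])
print(options.file, options.separator, args)
```

With a parser like this, an invocation such as `python segment_sentences.py -f input.txt -o result.txt -s /` would populate options.file, options.out, and options.separator; how the real script acts on those values is not described in this README.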