Word segmentation using neural networks, based on the LibN3L package (https://github.com/SUTDNLP/LibN3L)

NNSegmentation

NNSegmentation is a package for word segmentation using neural networks, built on the LibN3L package. It includes different combinations of neural network architectures (TNN, RNN, GatedNN, LSTM and GRNN) with objective functions (Softmax, CRF Max-Margin, CRF Maximum Likelihood). It can also combine sparse features with any of the above models. In addition, the package can easily support user-defined neural network structures.
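As an illustration of how word segmentation can be treated as a labeling problem, the sketch below converts a segmented sentence into one tag per character. It assumes the widely used B/M/E/S scheme (begin, middle, end, single); the label set actually used by NNSegmentation may differ, and `wordsToTags` is a hypothetical helper, not part of this package or LibN3L.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of NNSegmentation or LibN3L).
// Input: a segmented sentence, one std::u32string per word, so that each
// element of a word is one character (code point).
// Output: one tag per character under the assumed B/M/E/S scheme.
std::vector<char> wordsToTags(const std::vector<std::u32string>& words) {
    std::vector<char> tags;
    for (const std::u32string& word : words) {
        assert(!word.empty() && "each word must contain at least one character");
        if (word.size() == 1) {
            tags.push_back('S');  // single-character word
            continue;
        }
        tags.push_back('B');                            // first character
        tags.insert(tags.end(), word.size() - 2, 'M');  // middle characters, if any
        tags.push_back('E');                            // last character
    }
    return tags;
}
```

Under this scheme a labeler predicts one tag per character, and the word boundaries follow directly from the tags: a word ends after every `E` or `S`.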

Performance

See Table 4 of the paper "LibN3L: A Lightweight Package for Neural NLP".

Compile

cmake .
make

Example

This example shows how to train three Chinese word segmentation models on the PKU corpus of the SIGHAN Bakeoff 2005 dataset.
The models are:

  • SparseCRFMMLabeler, which only considers sparse features and works like a CRF model.
  • LSTMCRFMMLabeler, which only uses neural embeddings as input and employs CRF Maximum Likelihood as the training objective.
  • SparseLSTMCRFMMLabeler, which supports both neural embeddings and sparse features and also employs CRF Maximum Likelihood as the training objective.
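
All three models rely on a CRF-style objective, which scores a whole label sequence instead of each label independently. Prediction with such a model is typically done with Viterbi search over per-position label scores and label-to-label transition scores. The following is a minimal sketch of that standard algorithm, not the implementation used by this package or by LibN3L; the function name `viterbi` and the score layout (`emit`, `trans`) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the LibN3L API).
// emit[t][y]  : score of label y at position t (e.g. produced by an LSTM layer)
// trans[p][y] : score of moving from label p to label y
// Returns the label sequence with the highest total score.
std::vector<int> viterbi(const std::vector<std::vector<double>>& emit,
                         const std::vector<std::vector<double>>& trans) {
    const std::size_t n = emit.size();
    if (n == 0) return {};
    const std::size_t k = emit[0].size();
    assert(k > 0 && trans.size() == k);

    // best[t][y]: best score of any path ending in label y at position t.
    // back[t][y]: previous label on that best path.
    std::vector<std::vector<double>> best(n, std::vector<double>(k, 0.0));
    std::vector<std::vector<int>> back(n, std::vector<int>(k, -1));
    best[0] = emit[0];

    for (std::size_t t = 1; t < n; ++t) {
        assert(emit[t].size() == k);
        for (std::size_t y = 0; y < k; ++y) {
            std::size_t arg = 0;
            double top = best[t - 1][0] + trans[0][y];
            for (std::size_t p = 1; p < k; ++p) {
                const double s = best[t - 1][p] + trans[p][y];
                if (s > top) { top = s; arg = p; }
            }
            best[t][y] = top + emit[t][y];
            back[t][y] = static_cast<int>(arg);
        }
    }

    // Pick the best final label, then follow the back-pointers.
    std::size_t last = 0;
    for (std::size_t y = 1; y < k; ++y) {
        if (best[n - 1][y] > best[n - 1][last]) last = y;
    }
    std::vector<int> path(n);
    path[n - 1] = static_cast<int>(last);
    for (std::size_t t = n - 1; t > 0; --t) {
        path[t - 1] = back[t][path[t]];
    }
    return path;
}
```

The same search can serve both CRF objectives at prediction time: the max-margin and maximum-likelihood variants differ in how the scores are trained, not in how the best sequence is found once the scores are fixed.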

This example data contains

For more details about the example, please read the example "readme.md".