lstf_crf model for the address standardization task, version V4, trained on 6,074 samples.

recognizer.py is the main program and runs under Python 2.7. Invoke it as follows:

    python2 recognizer.py --inputs='太阳宫中路8号冠捷大厦3层302Boss直聘' --predict_mode=0

--predict_mode takes 0 (default) or 1. If --inputs is omitted, the script runs in test mode and prints the analysis of its test input. If --predict_mode is omitted, the crf_frozen_ckpt.pb model is used. All other parameters can also be specified; see recognizer.py for details.

The returned parsed string looks like this:

    [STREET 太/B-STREET 阳/I-STREET 宫/I-STREET 中/I-STREET 路/E-STREET] [STREETNUM 8/B-STREETNUM 号/E-STREETNUM] [LANDMARK 冠/B-LANDMARK 捷/I-LANDMARK 大/I-LANDMARK 厦/E-LANDMARK] [FLOOR 3/B-FLOOR 层/E-FLOOR] [TABLET 3/B-TABLET 0/I-TABLET 2/I-TABLET B/I-TABLET o/I-TABLET s/I-TABLET s/I-TABLET 直/I-TABLET 聘/E-TABLET]

Recognized entities are enclosed in []. The first string inside the brackets is the entity's category; each following string is one character of the entity paired with its tag (character/tag). Anything left outside the brackets was not recognized as an entity, which means one of two things:

1. the characters were tagged incorrectly, or
2. the characters were correctly tagged with the 'O' label.

File structure:

    root
    ├── recognizer.py                  // main script, containing all configurations in the head
    ├── common                         // site-packages
    │   ├── bilstm.py                  // BiLSTM implementation
    │   ├── crf_frozen_graph.py        // prediction script using crf_frozen_ckpt.pb
    │   ├── generate_prediction.py     // maps segmented characters to features
    │   ├── modify_conditions.py       // queries the prior conditions of labels
    │   ├── sentence.py                // helps map segmented characters to features
    │   ├── tag_merge.py               // decodes prediction output into the standard output
    │   └── viterbi_frozen_graph.py    // prediction script using viterbi_frozen_ckpt.pb
    ├── lib                            // auxiliary tools
    │   ├── prior_conditions.txt       // dictionary of prior conditions of labels
    │   └── vec.txt                    // word embedding model, trained with word2vec on samples from People's Daily, Feb.-Apr. 2014
    ├── model                          // trained models
    │   ├── crf_frozen_ckpt.pb         // model with rewritten crf decode function; set 'predict_mode' = '0' to use it (default)
    │   └── viterbi_frozen_ckpt.pb     // model with rewritten viterbi decode function; set 'predict_mode' = '1' to use it
    └── utils                          // rewritten TensorFlow source code (keyword: '_with_conditions')
        ├── crf_rewrite_crf.py         // contains the rewritten crf decode; replace `tensorflow/contrib/crf/python/ops/crf.py` and rename
        ├── crf_rewrite_viterbi.py     // contains the rewritten viterbi decode; replace `tensorflow/contrib/crf/python/ops/crf.py` and rename
        ├── __init__.py                // contains the new import statement; replace `tensorflow/contrib/crf/__init__.py`
        └── rnn_rewrite_crf.py         // contains the rewritten dynamic rnn and rnn loop; replace `tensorflow/python/ops/rnn.py` and rename
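
The bracketed result string described above can be turned into (category, text) pairs with a few lines of post-processing. The following is a minimal sketch, not part of this repository: `parse_entities` and `ENTITY_RE` are hypothetical names, the code assumes Python 3, and it assumes that no character inside an entity is a literal `]`. Text outside the brackets is ignored.

```python
import re

# Hypothetical helper, not shipped with recognizer.py.
# Matches one bracketed entity: "[CATEGORY char/TAG char/TAG ...]".
# Assumption: no character inside an entity is a literal "]".
ENTITY_RE = re.compile(r"\[(\S+)\s+([^\]]*)\]")


def parse_entities(result):
    """Return a list of (category, text) pairs, one per bracketed entity."""
    entities = []
    for match in ENTITY_RE.finditer(result):
        category = match.group(1)
        pairs = match.group(2).split()
        # Split on the LAST "/" so that a "/" character in the address survives.
        chars = [pair.rsplit("/", 1)[0] for pair in pairs]
        entities.append((category, "".join(chars)))
    return entities


# The sample output shown in this README.
SAMPLE = (
    "[STREET 太/B-STREET 阳/I-STREET 宫/I-STREET 中/I-STREET 路/E-STREET] "
    "[STREETNUM 8/B-STREETNUM 号/E-STREETNUM] "
    "[LANDMARK 冠/B-LANDMARK 捷/I-LANDMARK 大/I-LANDMARK 厦/E-LANDMARK] "
    "[FLOOR 3/B-FLOOR 层/E-FLOOR] "
    "[TABLET 3/B-TABLET 0/I-TABLET 2/I-TABLET B/I-TABLET o/I-TABLET "
    "s/I-TABLET s/I-TABLET 直/I-TABLET 聘/E-TABLET]"
)

for category, text in parse_entities(SAMPLE):
    print(category, text)
```

For the sample above this yields STREET 太阳宫中路, STREETNUM 8号, LANDMARK 冠捷大厦, FLOOR 3层 and TABLET 302Boss直聘. Splitting each pair on its last `/` rather than its first is a deliberate choice, since the tag never contains a slash but the address character might be one.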