Source code for SIGIR 2019 paper "Hierarchical Matching Network for Crime Classification"
- tqdm==4.31.1
- numpy==1.16.3
- scikit-learn==0.20.3
- jieba==0.39
- torch==0.4.1
- torchtext==0.3.1
We conduct our empirical experiments on two real-world legal datasets:
- CAIL 2018: contains criminal cases published by the Supreme People’s Court. Each case consists of two parts: a fact description and the corresponding judgment result (including laws, articles, and charges).
- DPAM: comprises 40,256 criminal cases, crawled from China Judgment Online and spanning January 2016 to June 2016.
{
"text_len": 16,
"laws": [234],
"textIds": [2935,10,3,330,16,406,2935,1802,2,272,4328,1064,877,818,272,5455],
"parent_class": ["侵犯公民人身"]
}
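Each instance contains four fields:
- text_len: the length of the fact description, i.e. the number of ids in textIds
- parent_class: the parent class, e.g. "侵犯公民人身" (infringement of citizens' personal rights)
- laws: the sub class
- textIds: the fact description, converted from text to ids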
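The field layout above can be checked with a short script. The following is a minimal sketch, not part of the released code: `parse_instance` and `pad_batch` are hypothetical helpers, the padding id `0` is an assumption, and the second instance is made up purely to illustrate padding.

```python
import json

# The sample instance shown above, as a JSON string.
SAMPLE = """{
  "text_len": 16,
  "laws": [234],
  "textIds": [2935, 10, 3, 330, 16, 406, 2935, 1802, 2, 272, 4328, 1064, 877, 818, 272, 5455],
  "parent_class": ["侵犯公民人身"]
}"""

REQUIRED_KEYS = ("text_len", "laws", "textIds", "parent_class")


def parse_instance(raw):
    """Parse one instance and check that its four fields are consistent."""
    inst = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if key not in inst]
    if missing:
        raise ValueError("missing fields: %s" % ", ".join(missing))
    if inst["text_len"] != len(inst["textIds"]):
        raise ValueError("text_len does not match the number of ids in textIds")
    return inst


def pad_batch(instances, pad_id=0):
    """Right-pad every id sequence to the longest one in the batch.

    pad_id=0 is an assumption; use whatever id the vocabulary reserves for padding.
    """
    max_len = max(inst["text_len"] for inst in instances)
    return [
        inst["textIds"] + [pad_id] * (max_len - inst["text_len"])
        for inst in instances
    ]


instance = parse_instance(SAMPLE)

# A made-up short instance, used only to show the padding behaviour.
short = {"text_len": 3, "laws": [234], "textIds": [5, 6, 7],
         "parent_class": ["侵犯公民人身"]}

batch = pad_batch([instance, short])
print(len(batch[0]), len(batch[1]))
```

Padding in plain Python keeps the sketch free of dependencies; in practice the padded lists would be converted to tensors before being fed to a model.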
To reproduce the results reported in our paper, run the code as follows:
python run.py
For more information, please refer to our paper. If you find our work helpful, please cite it as:
@inproceedings{DBLP:conf/sigir/WangFNYZG19,
author = {Pengfei Wang and
Yu Fan and
Shuzi Niu and
Ze Yang and
Yongfeng Zhang and
Jiafeng Guo},
editor = {Benjamin Piwowarski and
Max Chevalier and
{\'{E}}ric Gaussier and
Yoelle Maarek and
Jian{-}Yun Nie and
Falk Scholer},
title = {Hierarchical Matching Network for Crime Classification},
booktitle = {Proceedings of the 42nd International {ACM} {SIGIR} Conference on
Research and Development in Information Retrieval, {SIGIR} 2019, Paris,
France, July 21-25, 2019},
pages = {325--334},
publisher = {{ACM}},
year = {2019},
url = {https://doi.org/10.1145/3331184.3331223},
doi = {10.1145/3331184.3331223},
timestamp = {Sun, 21 Jul 2019 17:52:47 +0200},
biburl = {https://dblp.org/rec/conf/sigir/WangFNYZG19.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}