
The work in pris

Primary Language: Python

Relation Extraction

This project performs relation extraction with a Convolutional Neural Network (CNN).
It is implemented with TensorFlow.

This project includes:

  1. Python pre-processing scripts for ACE data, SemEval data, and Chinese data
  2. Python scripts for relation extraction with a CNN architecture

File introduction:


preceprocess_ace.py -- main function that extracts structured data from ACE and saves it in embedding form
c_xml_parse.py -- class for parsing XML
c_ace_process.py -- classes for processing ACE data, including the data structure definitions and the parse function
c_relation.py -- class for turning a relation mention into an embedding
c_open_tool.py -- class for other utilities, such as loading the position matrix and the word matrix


data2vector.py -- loads the trained word2vec embeddings
load_data.py -- performs word segmentation and computes PF values, converting the raw data to [word_id, pf_id, relation_id] form
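
The [word_id, pf_id, relation_id] form can be illustrated with a minimal sketch. It assumes that PF denotes a position feature, i.e. each token's relative distance to an entity, clipped and shifted to a non-negative id; load_data.py itself may compute it differently. The function names, the `max_distance` value, and the toy vocabulary are hypothetical and used only for illustration.

```python
def relative_position_ids(num_tokens, entity_index, max_distance=30):
    """Map each token's offset from the entity to a non-negative id.

    Assumption: offsets are clipped to [-max_distance, max_distance]
    and shifted by max_distance so they can index an embedding table.
    """
    ids = []
    for i in range(num_tokens):
        offset = i - entity_index
        clipped = max(-max_distance, min(max_distance, offset))
        ids.append(clipped + max_distance)
    return ids


def encode_example(tokens, entity_index, relation, word_to_id, relation_to_id, unk_id=0):
    """Turn one tokenized sentence into [word_ids, pf_ids, relation_id]."""
    word_ids = [word_to_id.get(tok, unk_id) for tok in tokens]
    pf_ids = relative_position_ids(len(tokens), entity_index)
    return [word_ids, pf_ids, relation_to_id[relation]]


# Toy vocabulary and label map (hypothetical, not taken from vocab.txt).
word_to_id = {"<unk>": 0, "Ivan": 1, "in": 2, "Wisconsin": 3}
relation_to_id = {"PHYS": 0}

example = encode_example(["Ivan", "in", "Wisconsin"], 0, "PHYS", word_to_id, relation_to_id)
print(example)
```

In this sketch the entity is the first token, so the position ids simply count upward from the shifted zero offset.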


cws.model is the word segmentation model provided by LTP
vocab.txt is the dictionary that can be used for segmentation
vectors.txt contains the Chinese word embeddings trained with the word2vec tool


The relation mentions extracted from the raw data have the following form:
Ivan in Wisconsin PHYS Located
The tagged entities come first, followed by the relation tag.
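
A line in this form could be split as in the sketch below. It assumes the line is whitespace-separated and that the last two fields are the relation type and subtype (here PHYS and Located), with everything before them being the mention text. The function name `parse_relation_mention` is hypothetical and does not come from the project code.

```python
def parse_relation_mention(line):
    """Split a relation mention line into mention tokens and relation tags.

    Assumption: the final two whitespace-separated fields are the
    relation type and subtype; the rest is the mention text.
    """
    parts = line.split()
    if len(parts) < 3:
        raise ValueError("expected mention text followed by two relation fields")
    *tokens, relation_type, relation_subtype = parts
    return {
        "tokens": tokens,
        "type": relation_type,
        "subtype": relation_subtype,
    }


mention = parse_relation_mention("Ivan in Wisconsin PHYS Located")
print(mention["tokens"], mention["type"], mention["subtype"])
```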