IEPY is an open source tool for Information Extraction focused on Relation Extraction.
To give an example of Relation Extraction, if we are trying to find a birth date in:
"John von Neumann (December 28, 1903 – February 8, 1957) was a Hungarian and American pure and applied mathematician, physicist, inventor and polymath."
then IEPY's task is to identify "John von Neumann" and "December 28, 1903" as the subject and object entities of the "was born in" relation.
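To make the example concrete, here is a minimal sketch of what a relation extractor produces for that sentence. It does not use IEPY's API: the `Relation` tuple, `BIRTH_PATTERN`, and `extract_birth` are hypothetical names chosen for illustration, and the regular expression is an assumption that only covers sentences shaped like the one above.

```python
import re
from typing import NamedTuple, Optional


class Relation(NamedTuple):
    """Hypothetical container for one extracted relation (not an IEPY class)."""
    subject: str
    relation: str
    obj: str


# Assumed sentence shape: "<name> (<Month day, year> – ...".
# The character class accepts either an en dash or a plain hyphen.
BIRTH_PATTERN = re.compile(
    r"(?P<name>[^()]+?) \((?P<born>[A-Z][a-z]+ \d{1,2}, \d{4}) [–-] "
)


def extract_birth(sentence: str) -> Optional[Relation]:
    """Return the "was born in" relation if the sentence matches, else None."""
    match = BIRTH_PATTERN.match(sentence)
    if match is None:
        return None
    return Relation(match.group("name"), "was born in", match.group("born"))


sentence = (
    "John von Neumann (December 28, 1903 – February 8, 1957) was a Hungarian "
    "and American pure and applied mathematician, physicist, inventor and polymath."
)
print(extract_birth(sentence))
```

A hand-written pattern like this is brittle. It is shown only to illustrate the subject, relation, and object triple that relation extraction aims to recover.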
IEPY is aimed at:
- users needing to perform Information Extraction on a large dataset.
- scientists wanting to experiment with new IE algorithms.
Its features include:
- A corpus annotation tool with a web-based UI.
- An active learning relation extraction tool pre-configured with convenient defaults.
- A rule-based relation extraction tool for cases where the documents are semi-structured or high precision is required.
- A web-based user interface that:
  - Allows non-expert users to control some aspects of IEPY.
  - Allows decentralization of human input.
- A shallow entity ontology with coreference resolution via Stanford CoreNLP.
- An easily hackable active learning core, ideal for scientists wanting to experiment with new algorithms.
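For readers unfamiliar with active learning, the sketch below illustrates one common selection strategy, uncertainty sampling: the examples the current model is least sure about are sent to a human for labeling, and the model is then retrained on the new labels. This is a generic illustration under stated assumptions, not IEPY's actual core. The function `most_uncertain`, the candidate strings, and the scores in `toy_scores` are all hypothetical.

```python
from typing import Callable, List, Sequence


def most_uncertain(
    candidates: Sequence[str],
    score: Callable[[str], float],
    batch_size: int = 2,
) -> List[str]:
    """Return the candidates whose score is closest to 0.5 (least certain)."""
    ranked = sorted(candidates, key=lambda c: abs(score(c) - 0.5))
    return ranked[:batch_size]


# Hypothetical classifier confidences that each candidate pair
# expresses the "was born in" relation.
toy_scores = {
    "John von Neumann / December 28, 1903": 0.95,
    "John von Neumann / February 8, 1957": 0.55,
    "John von Neumann / Hungarian": 0.10,
    "John von Neumann / polymath": 0.40,
}

# These are the candidates a human annotator would be asked about next.
to_label = most_uncertain(list(toy_scores), lambda c: toy_scores[c], batch_size=2)
print(to_label)
```

Swapping in a different `score` function or ranking key is the kind of experiment a hackable active learning core is meant to make easy.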
Install the required packages:
sudo apt-get install build-essential python3-dev liblapack-dev libatlas-dev gfortran openjdk-7-jre
Then simply install with pip:
pip install iepy
Full installation details are available on the Read the Docs page.
The full documentation is available on Read the Docs.
IEPY is © 2014 Machinalis in collaboration with the NLP Group at UNC-FaMAF. Its primary authors are:
- Rafael Carrascosa <rcarrascosa@machinalis.com> (rafacarrascosa at github)
- Javier Mansilla <jmansilla@machinalis.com> (jmansilla at github)
- Gonzalo García Berrotarán <ggarcia@machinalis.com> (j0hn at github)
- Franco M. Luque <francolq@famaf.unc.edu.ar> (francolq at github)
- Daniel Moisset <dmoisset@machinalis.com> (dmoisset at github)
You can follow the development of this project and report issues at http://github.com/machinalis/iepy