This project provides free (even for commercial use) state-of-the-art information extraction tools. This first release includes a named entity recognizer. Subsequent versions will add tools for part of speech tagging, relationship extraction, interfaces for training your own custom extractors, and various other tools.
The first release of MITIE includes only a C API. However, the next releases will add easy-to-use bindings to other languages, beginning with Python and Java.
The MITIE C API is documented in the mitie.h header file. There is also an example NER program that shows how to use it in the examples folder.
If you obtained MITIE by cloning the main repository, you must first fetch the submodules (dlib). Do this by running fetch_submodules.sh.
Then, to compile the examples, type the following command:
make examples
You can download example models trained on English texts by running:
make MITIE-models
These models are required for the examples below:
./ner_example MITIE-models/ner_model.dat sample_text.txt
This command runs ner_example using the ner_model.dat sample model. A summary of the entities found in sample_text.txt will be printed to STDOUT.
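Because ner_example depends on the downloaded model file, it can help to check for that file before running. The following is a minimal sketch, not part of MITIE: the function name run_ner and its error messages are hypothetical, and it assumes ner_example sits in the current directory as in the command above.

```shell
# Hypothetical helper (not part of MITIE): run ner_example only when
# both the model file and the input text file exist.
# Usage: run_ner [MODEL] [TEXT]
run_ner() {
    model="${1:-MITIE-models/ner_model.dat}"
    text="${2:-sample_text.txt}"
    if [ ! -f "$model" ]; then
        echo "model not found: $model (try: make MITIE-models)" >&2
        return 1
    fi
    if [ ! -f "$text" ]; then
        echo "input not found: $text" >&2
        return 1
    fi
    # Same invocation as the command shown above.
    ./ner_example "$model" "$text"
}
```

With no arguments, run_ner falls back to the model and sample text paths used in the command above.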
Alternatively, you can tell MITIE to process each line of a text file independently and output marked up text with the command:
cat sample_text.txt | ./ner_stream MITIE-models/ner_model.dat
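Since ner_stream reads text on standard input and writes marked-up text to standard output, it can be applied to several files in a loop. This is a sketch under assumptions: the function name tag_files and the .tagged output suffix are hypothetical choices for illustration, not MITIE conventions.

```shell
# Hypothetical helper (not part of MITIE): tag each input file with
# ner_stream and write the result next to it with a .tagged suffix.
# Usage: tag_files MODEL FILE...
tag_files() {
    model="$1"
    shift
    for f in "$@"; do
        # Redirecting the file to stdin is equivalent to the cat pipeline above.
        ./ner_stream "$model" < "$f" > "$f.tagged" || return 1
    done
}
```

For example, tag_files MITIE-models/ner_model.dat sample_text.txt would write the marked-up text to sample_text.txt.tagged.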
You can also run a simple regression test to validate your build. Run the following command:
make test
make test builds the example programs and downloads the required example models. If you require a non-standard C++ compiler, change CC in examples/C/makefile and in tools/ner_stream/makefile.
The above works on most Unix-like systems. For Windows and other platforms we have provided CMake build scripts. To compile using CMake you would use this alternative set of commands:
cd examples/C
mkdir build
cd build
cmake ..
cmake --build . --config Release
Finally, you can create a MITIE shared library by executing:
cd mitielib
make
Again, this will work on most Unix systems, but if you are on a platform where it doesn't, you can use the provided CMake files in the mitielib folder. Type the following to compile MITIE as a shared library using CMake:
cd mitielib
mkdir build
cd build
cmake ..
cmake --build . --config Release
We have built binaries packaged with sample models for macOS and Linux (64 and 32-bit). These packages are downloadable here:
MITIE is licensed under the Boost Software License - Version 1.0 - August 17th, 2003.
Permission is hereby granted, free of charge, to any person or organization obtaining a copy of the software and accompanying documentation covered by this license (the "Software") to use, reproduce, display, distribute, execute, and transmit the Software, and to prepare derivative works of the Software, and to permit third-parties to whom the Software is furnished to do so, all subject to the following:
The copyright notices in the Software and this entire statement, including the above license grant, this restriction and the following disclaimer, must be included in all copies of the Software, in whole or in part, and all derivative works of the Software, unless such copies or derivative works are solely in the form of machine-executable object code generated by a source language processor.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.