
WITYPI uses Wikipedia and machine learning for unsupervised terminology extraction.

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

WITYPI

WIkipedia TerminologY PIcker



ABOUT WITYPI

WITYPI is a Python project that aims to build a terminology automatically from Wikipedia's database (a form of unsupervised learning).

On Wikipedia, categories are linked to one another, and pages belong to these categories.

By building a network graph of the categories and applying TF-IDF to the vocabulary of all pages in every category, we can extract the important vocabulary for each category.
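The idea can be illustrated with a minimal sketch. It is not WITYPI's actual code: the toy data, the names `CATEGORY_GRAPH`, `CATEGORY_PAGES`, `collect_tokens` and `top_terms` are hypothetical, and the choice to treat each category (together with its subcategories) as a single TF-IDF document is an assumption made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data (not taken from Wikipedia):
# category -> subcategories, and category -> texts of its pages.
CATEGORY_GRAPH = {
    "Science": ["Physics", "Biology"],
    "Physics": [],
    "Biology": [],
}
CATEGORY_PAGES = {
    "Science": ["science studies nature"],
    "Physics": ["quantum energy particle", "energy particle field"],
    "Biology": ["cell gene protein", "gene cell organism"],
}


def collect_tokens(category, graph, pages, seen=None):
    """Gather tokens from a category and every category reachable from it."""
    seen = set() if seen is None else seen
    if category in seen:  # category graphs may contain cycles
        return []
    seen.add(category)
    tokens = [w for text in pages.get(category, []) for w in text.lower().split()]
    for child in graph.get(category, []):
        tokens.extend(collect_tokens(child, graph, pages, seen))
    return tokens


def top_terms(graph, pages, k=3):
    """Rank terms per category by TF-IDF (assumption: one document per category)."""
    docs = {c: Counter(collect_tokens(c, graph, pages)) for c in graph}
    n_docs = len(docs)
    # Document frequency: in how many categories does each term occur?
    doc_freq = Counter(term for counts in docs.values() for term in counts)
    result = {}
    for cat, counts in docs.items():
        total = sum(counts.values())
        scores = {
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in counts.items()
        }
        # Sort by descending score, then alphabetically to break ties.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        result[cat] = [term for term, _ in ranked[:k]]
    return result


if __name__ == "__main__":
    for category, terms in top_terms(CATEGORY_GRAPH, CATEGORY_PAGES).items():
        print(category, terms)
```

Terms that appear in many categories get a low inverse document frequency, so the terms that remain at the top of each list are the ones specific to that category.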

INSTALLATION on UNIX systems

First, create a virtual environment and activate it.

virtualenv -p python3 WITYPI
source WITYPI/bin/activate

Then, with the virtualenv activated, install the dependencies with pip3:

pip3 install -r requierement.txt

CONFIGURATION

LipSuM

LAUNCH

Finally, launch the program:

python3 __main__.py