EU project 732328: "Fashion Brain".
D2.1: "Named entity recognition and linking methods".
git clone https://github.com/eXascaleInfolab/fashionNLP.git
cd fashionNLP/
The "fashionnlp" package contains the following files:
- updateFBT.py: This script performs the following tasks:
  - Takes a concept from WikiKB as input.
  - Finds mentions of this concept and of similar concepts in Instagram posts, using string matching and tree search.
  - Checks whether the WikiKB concept is present in the FashionBrain (FB) taxonomy and, if it is not, updates the taxonomy.
- wikitaxonomy.py: Generates the Wikipedia taxonomy from the Wikipedia categorisation at https://en.wikipedia.org/wiki/Category:Clothing_by_type
- input folder: Contains the following input files:
  - FBTaxonomy.csv: The initial FashionBrain taxonomy in CSV format.
  - FBTaxonomy.json: The initial FashionBrain taxonomy in JSON format.
  - ner_posts.csv: The output of applying SENNA to the Instagram posts.
  - wikipediaKB.json: The Wikipedia knowledge base of fashion items in JSON format.
- result folder: Contains the updated taxonomy.
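
The tasks listed for updateFBT.py can be sketched roughly as follows. This is a minimal illustration and not the repository's code: the function names (`levenshtein`, `find_mentions`, `update_taxonomy`), the distance threshold, the sample posts, and the representation of the taxonomy as a plain set of labels are all assumptions. A pure-Python edit distance stands in for the python-levenshtein package so the snippet runs without extra installs, the tree search step is omitted, and requiring at least one mention before updating the taxonomy is also an assumption.

```python
def levenshtein(a, b):
    """Edit distance between two strings (pure-Python stand-in)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            # deletion, insertion, substitution
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def find_mentions(concept, posts, max_distance=1):
    """Return (post_index, token) pairs whose token is within
    max_distance edits of the concept label."""
    concept = concept.lower()
    mentions = []
    for idx, post in enumerate(posts):
        for token in post.lower().split():
            token = token.strip("#.,!?")  # drop hashtags and punctuation
            if token and levenshtein(concept, token) <= max_distance:
                mentions.append((idx, token))
    return mentions


def update_taxonomy(concept, taxonomy, mentions):
    """Add the concept to the taxonomy (a set of labels) if it is
    mentioned in the posts and not yet present. Returns True if added."""
    label = concept.lower()
    if mentions and label not in taxonomy:
        taxonomy.add(label)
        return True
    return False


# Hypothetical sample data, not taken from the input folder.
posts = ["Loving my new #cardigan today!", "Two cardigans and a scarf."]
taxonomy = {"scarf", "dress"}

mentions = find_mentions("cardigan", posts)
added = update_taxonomy("cardigan", taxonomy, mentions)
```

With the sample data, both the exact hashtag and the plural form count as mentions, and the concept is added because it is missing from the taxonomy set.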
Download Stanford CoreNLP and place it in this directory:
https://stanfordnlp.github.io/CoreNLP/history.html
To run an experiment, install the Python dependencies and then run updateFBT.py:
pip install stanfordcorenlp
pip install python-levenshtein
python updateFBT.py
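
Before running the script, it may help to confirm that the prerequisites described above are in place. The check below is only a sketch: the helper name `missing_prerequisites` is hypothetical, the `input/` layout is inferred from the file list, and it assumes the unpacked CoreNLP folder keeps a name starting with `stanford-corenlp`.

```python
from pathlib import Path

# Input file names taken from the file list; the layout is an assumption.
REQUIRED_INPUTS = ("FBTaxonomy.csv", "ner_posts.csv", "wikipediaKB.json")


def missing_prerequisites(repo_dir="."):
    """Return a list of problems; an empty list means nothing is missing."""
    repo = Path(repo_dir)
    problems = []
    for name in REQUIRED_INPUTS:
        if not (repo / "input" / name).is_file():
            problems.append(f"missing input file: input/{name}")
    # Assumption: the CoreNLP download unpacks to a "stanford-corenlp*" folder.
    if not any(p.is_dir() for p in repo.glob("stanford-corenlp*")):
        problems.append("no stanford-corenlp* directory found")
    return problems
```

Calling `missing_prerequisites()` from the fashionNLP directory and printing the returned list shows what still needs to be downloaded or copied.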