This package contains the code of the SMAPH system developed by Marco Cornolti, Massimiliano Ciaramita, Paolo Ferragina, Stefan Rued and Hinrich Schuetze.
The SMAPH system links a query to the entities it mentions, disambiguating the mentions where needed. Entities are Wikipedia pages. This problem is known as "entity recognition and disambiguation in queries". For example, the query "armstrong moon landing" should point to Neil Armstrong and Moon Landing, while the query "armstrong trumpet" should point to Louis Armstrong and Trumpet.
This system won the Entity Recognition and Disambiguation Challenge (short-text track).
SMAPH is built on top of Bing. For this reason, you will need a key to access Bing's API.
The system is deployed as a web service but can also be queried directly from your code.
- Install dependencies
git clone https://github.com/diegoceccarelli/hpc-utils
cd hpc-utils
mvn install -DskipTests
- Download the code
git clone http://github.com/marcocor/smaph-erd
cd smaph-erd
- Set the Bing API key
- obtain a key for the Bing Search API
cp smaph-config.xml.template smaph-config.xml
- edit smaph-config.xml replacing BING_KEY with your Primary Account Key
- Run smaph:
mvn -Djetty.port=9090 jetty:run
where 9090 is the TCP port the server will listen on
- Use smaph! You can either:
- access the JSON API at http://localhost:9090/smaph/rest/default?Text=armstrong%20moon
- access the debug interface, which guides you through the steps of the algorithm, at http://localhost:9090/smaph/debug.html
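If you prefer to call the JSON API from a program rather than a browser, the request can be sketched as follows. This is a minimal example, not part of the SMAPH code base: the class name SmaphJsonClient is hypothetical, it assumes Java 11 or later and a server running locally on port 9090, and it returns the response body as a raw string because the JSON schema is not described here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of SMAPH.
final class SmaphJsonClient {
    // Assumption: the server was started locally with -Djetty.port=9090.
    static final String ENDPOINT = "http://localhost:9090/smaph/rest/default";

    /** Builds the request URI, percent-encoding the query for the Text parameter. */
    static URI buildUri(String query) {
        // URLEncoder writes spaces as '+'; use %20 to match the URL form shown above.
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8).replace("+", "%20");
        return URI.create(ENDPOINT + "?Text=" + encoded);
    }

    /** Sends a GET request and returns the raw response body. */
    static String annotate(String query) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(buildUri(query)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Requires a running SMAPH server; prints the response as-is.
        System.out.println(annotate("armstrong moon"));
    }
}
```

Change the port in ENDPOINT if you started the server with a different jetty.port value.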
You can also access the SMAPH system directly by calling its Java methods. Take a look at the annotateGetFull method in RestService.java to see how it's done.
SMAPH also provides a standard interface as defined by the ERD Challenge 2014. This interface is accessible at:
http://localhost:9090/smaph/rest/shortTrack
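A request to this endpoint might look like the sketch below. The request format is not specified in this README, so everything about it is an assumption: the example supposes an HTTP POST encoded as multipart/form-data with fields named runID, TextID and Text, and a plain-text response. Check the ERD Challenge specification or RestService.java before relying on it. The class name SmaphShortTrackClient and the boundary string are hypothetical, and Java 11 or later is assumed.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of SMAPH.
final class SmaphShortTrackClient {
    // Assumption: the server was started locally with -Djetty.port=9090.
    static final String ENDPOINT = "http://localhost:9090/smaph/rest/shortTrack";
    // Arbitrary separator; it must not occur inside any field value.
    static final String BOUNDARY = "smaph-example-boundary";

    /** Encodes simple text fields as a multipart/form-data body. */
    static String multipartBody(Map<String, String> fields, String boundary) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.append("--").append(boundary).append("\r\n")
                .append("Content-Disposition: form-data; name=\"")
                .append(field.getKey()).append("\"\r\n\r\n")
                .append(field.getValue()).append("\r\n");
        }
        // The closing delimiter carries two trailing dashes.
        body.append("--").append(boundary).append("--\r\n");
        return body.toString();
    }

    /** Posts one query and returns the raw response body. */
    static String annotate(String runId, String textId, String text)
            throws IOException, InterruptedException {
        // Field names are assumptions, see the note above the code.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("runID", runId);
        fields.put("TextID", textId);
        fields.put("Text", text);

        HttpRequest request = HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "multipart/form-data; boundary=" + BOUNDARY)
                .POST(HttpRequest.BodyPublishers.ofString(
                        multipartBody(fields, BOUNDARY), StandardCharsets.UTF_8))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Requires a running SMAPH server; identifiers are placeholder values.
        System.out.println(annotate("test-run", "query-1", "armstrong trumpet"));
    }
}
```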
If you encounter a bug, please open a bug report on GitHub.
For any enquiry, send an email to x at di.unipi.it (replace x with 'cornolti').
Enjoy, The SMAPH team.