This is the source code for the little app we created that allows people to browse Google Summer of Code (GSoC) projects.
If you are curious about how we implemented this app, feel free to check our source code.
- Type-ahead suggestion is done via CORS-enabled Ajax queries to DBpedia Lookup. This API takes a phrase and searches the DBpedia knowledge base for possible meanings of that phrase. Once you pick one of those meanings, we store its unique identifier (URI) from DBpedia. The client-side JavaScript uses the AutoSuggest jQuery plugin by Drew Wilson.
- Suggestion of related concepts is done via DBpedia's wikiPageLinks and DBpedia Spotlight's notion of resource relatedness. For each of the URIs you selected in the previous step, we find all concepts linked to it via DBpedia properties. We add to that any other concepts that are "topically similar" according to DBpedia Spotlight.
- Retrieval of projects is done via a SPARQL query over annotated projects stored in our SPARQL endpoint. Projects were annotated with DBpedia Spotlight's Web Service. The resulting data was loaded into a Virtuoso triple store, alongside DBpedia's wikiPageLinks dataset.
- Results are displayed by the DataTables jQuery plugin.
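The concept expansion in the second step boils down to a set union over two sources. Here is a minimal sketch in Python, assuming both sources are already available as in-memory dictionaries (in the app they come from the wikiPageLinks dataset and from DBpedia Spotlight). The function name `expand_concepts`, the dictionaries, and the toy URIs are hypothetical and only serve as illustration. Keeping the selected URIs in the result is also an assumption.

```python
def expand_concepts(selected, page_links, related):
    """Return the selected URIs plus every concept linked or related to them."""
    expanded = set(selected)
    for uri in selected:
        # Concepts reachable through wikiPageLinks-style links.
        expanded.update(page_links.get(uri, ()))
        # Concepts reported as topically similar.
        expanded.update(related.get(uri, ()))
    return expanded


# Toy data, not taken from DBpedia.
page_links = {"dbpedia:Berlin": ["dbpedia:Germany"]}
related = {"dbpedia:Berlin": ["dbpedia:Hamburg"]}
concepts = expand_concepts(["dbpedia:Berlin"], page_links, related)
```

The expanded set is what a retrieval query would then match against the project annotations.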
DBpedia Spotlight has been selected as an organization for GSoC 2012. If you have project ideas involving DBpedia Spotlight, please let us know. Chat with us on Freenode's #dbpedia-spotlight, or through our discussion list at SourceForge.net.
This demo relies on three Web services.
DBpedia Lookup returns tags in the DBpedia knowledge base that match a given string. For example, the query below searches for places (QueryClass=place) matching the string "berlin":
curl "http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryClass=place&QueryString=berlin"
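If you want to build such a request from a script, the sketch below assembles the same URL in Python. It only constructs the URL and does not contact the service. The helper name `lookup_url` is hypothetical. The endpoint and the two parameter names are copied from the cURL command above.

```python
from urllib.parse import urlencode

LOOKUP_ENDPOINT = "http://lookup.dbpedia.org/api/search.asmx/KeywordSearch"


def lookup_url(phrase, query_class=""):
    """Build a KeywordSearch URL; urlencode escapes spaces and special characters."""
    params = {"QueryClass": query_class, "QueryString": phrase}
    return LOOKUP_ENDPOINT + "?" + urlencode(params)


url = lookup_url("berlin", "place")
```

Going through `urlencode` matters once the phrase contains spaces or ampersands, which would otherwise break the query string.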
DBpedia Spotlight models DBpedia "tags" based on their distributional similarity. We can therefore use its service to give us related tags.
Testing the deployed demo
curl -H "Accept: application/json" "http://spotlight.dbpedia.org/related/?uri=Berlin"
Getting the code
https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Installation
Starting the server
mvn scala:run -DmainClass="org.dbpedia.spotlight.web.rest.RelatedResources"
Using the server
curl -H "Accept: application/json" "http://localhost:2222/related/?uri=Berlin"
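The deployed demo and a local server differ only in the base URL, so one helper can cover both. The sketch below prepares the request in Python without sending it. It assumes the service is asked for JSON through an `Accept` header. The helper name `related_request` is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def related_request(uri, base="http://spotlight.dbpedia.org"):
    """Prepare (but do not send) a request for tags related to `uri`."""
    url = base + "/related/?" + urlencode({"uri": uri})
    # Assumption: JSON is requested via the Accept header.
    return Request(url, headers={"Accept": "application/json"})


remote = related_request("Berlin")
local = related_request("Berlin", base="http://localhost:2222")
```

Passing either object to `urllib.request.urlopen` would perform the actual call.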
We also use a SPARQL endpoint to query data about GSoC projects. The command below uses cURL to execute a SPARQL query that retrieves all GSoC projects tagged with the string "css".
curl http://spotlight.dbpedia.org/sparql/ -d "query=select * where { ?s <http://spotlight.dbpedia.org/gsoc/vocab#taggedString> \"css\"@en } limit 5"
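To search for other tags, the query string can be generated instead of typed by hand. Below is a minimal sketch in Python that builds the same query and form-encodes it as a POST body. The helper name `tagged_projects_query` is hypothetical. The vocabulary URI and the `@en` language tag are taken from the command above.

```python
from urllib.parse import parse_qs, urlencode

VOCAB = "http://spotlight.dbpedia.org/gsoc/vocab#"


def tagged_projects_query(tag, limit=5):
    """Build the SPARQL query for projects tagged with the string `tag`."""
    # Escape backslashes and double quotes so the tag stays inside the literal.
    literal = tag.replace("\\", "\\\\").replace('"', '\\"')
    return 'select * where {{ ?s <{}taggedString> "{}"@en }} limit {}'.format(
        VOCAB, literal, int(limit)
    )


query = tagged_projects_query("css")
# Form-encoded body, as sent by an HTTP POST with a "query" field.
body = urlencode({"query": query}).encode("ascii")
# Decoding the body gives back the original query text.
decoded = parse_qs(body.decode("ascii"))["query"][0]
```

Encoding the body keeps characters such as `#`, `<` and spaces from being misread by the server.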
Please see below how to set up your own SPARQL Server. We will use Apache Jena's Fuseki as an example:
http://jena.apache.org/documentation/serving_data/index.html#download-fuseki
Download data:
wget https://raw.github.com/pablomendes/dbpedia-spotlight-gsoc/master/data/gsoc-projects-2011.nt
wget https://raw.github.com/pablomendes/dbpedia-spotlight-gsoc/master/data/gsoc-projects-2012.nt
Start Fuseki:
./fuseki-server --update --mem /gsoc
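Load the data you just downloaded into the server. The second file is added with s-post, because s-put replaces the contents of the target graph and would discard the 2011 data:
./s-put http://localhost:3030/gsoc/data default gsoc-projects-2011.nt
./s-post http://localhost:3030/gsoc/data default gsoc-projects-2012.nt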
Now check that your deployment is working:
curl http://localhost:3030/gsoc/query -d "query=select * where { ?s <http://spotlight.dbpedia.org/gsoc/vocab#taggedString> \"css\"@en } limit 5"
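If you would rather check the response from a script, the sketch below pulls the project URIs out of a result document. It assumes the endpoint returns the W3C SPARQL JSON results format, which you may need to request explicitly (for example with an `Accept: application/sparql-results+json` header). The helper name `subject_uris` and the sample response are hypothetical.

```python
import json


def subject_uris(results_json, var="s"):
    """Extract the values bound to `var` from a SPARQL JSON results document."""
    doc = json.loads(results_json)
    return [
        binding[var]["value"]
        for binding in doc["results"]["bindings"]
        if var in binding
    ]


# Made-up response body; the project URI is for illustration only.
sample = """
{"head": {"vars": ["s"]},
 "results": {"bindings": [
   {"s": {"type": "uri", "value": "http://example.org/project/1"}}
 ]}}
"""
uris = subject_uris(sample)
```

An empty list for a tag you expect to exist would suggest that the data was not loaded.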