farbenmeer/dbpedia

Escape chars

Closed this issue · 2 comments

Hello! I'm having a small problem using the dbpedia gem. A snippet like this:

@bio = Dbpedia.search(@author.full_name).collect(&:description)

should download the biography of an author. It does, but the result looks like this:

["\n Adam Bernard Mickiewicz) was a Polish poet, publicist and political writer. A prime representative of the Polish Romantic period, he is one of that country's Three Bards and the greatest poet in all Polish literature. He is also considered one of the greatest Slavic and European poets. He has been described as a "Slavic bard". He was a leading Romantic dramatist and has been compared in Poland and in western Europe to Byron and Goethe.\n ", "", "", "\n The Adam Mickiewicz Institute (Polish: Instytut Adama Mickiewicza) is a government-sponsored organization funded by the Ministry of Culture and National Heritage of Poland. Its goal is to promote Polish language and Polish culture abroad. It is based in Warsaw. The Institute operates a bilingual Polish-English portal called "culture. pl" created in 2001.\n ", "\n The Mickiewicz Battalion was a volunteer battalion of the International Brigades during the Spanish Civil War. It formed part of the XIII International Brigade from 27 October 1937 until 23 September 1938, when the International Brigades were disbanded. It was named after Adam Mickiewicz (1798–1855), a Polish poet and patriot.\n "]

Many escape chars got in the way, and I don't know how to get rid of them. Also, is there any way to get localized data with this gem? pl.dbpedia is down, and the international one has biographies in Polish, but I don't know how to get to them. Any help would be appreciated.

Also kudos on this great little gem!

pex commented

Apparently this is still a work in progress and a find method for known dbpedia resources still needs to be implemented (see README#milestones).

For now you might want to work around it with the following line:

@bio = Dbpedia.search(@author.full_name).first.description.strip

In fact, every result description should be stripped right away by the gem.
If the first result is not the right one, you need to think about an efficient matcher. For example, if you know the nationality of each author, you could use that to find the right result:

@bio = Dbpedia.search('Adam Mickiewicz').find { |a| a.description =~ /polish/i }.description.strip
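To clean every result at once rather than only the first, something like the following might work. It is a minimal sketch that assumes each description is a plain Ruby String; the array here stands in for the output of collect(&:description), and clean_descriptions and pick_description are hypothetical helper names, not part of the gem. Note that the "\n" sequences in the printed array are ordinary newline characters as shown by inspect, so strip removes them. Unlike a chained find { ... }.description, this version returns nil instead of raising when nothing matches.

```ruby
# Stand-in for Dbpedia.search(name).collect(&:description); assumed to be Strings.
raw_descriptions = [
  "\n Adam Bernard Mickiewicz was a Polish poet.\n ",
  "",
  "\n The Mickiewicz Battalion was a volunteer battalion.\n "
]

# Hypothetical helper: trim surrounding whitespace and drop empty entries.
def clean_descriptions(descriptions)
  descriptions.map { |d| d.to_s.strip }.reject(&:empty?)
end

# Hypothetical helper: first cleaned description matching the pattern, or nil.
def pick_description(descriptions, pattern)
  clean_descriptions(descriptions).find { |d| d =~ pattern }
end

cleaned = clean_descriptions(raw_descriptions)
bio = pick_description(raw_descriptions, /polish/i)
```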

And thanks for your interest in the gem.

Thank you for the tips. I had some success following them, but that still wasn't what I needed. The find method with a proper matcher worked almost perfectly, but I still had no way to get the results in Polish. I did some research, and it seems to be a limitation of keyword search (example: http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryClass=person&QueryString=Juliusz_S%C5%82owacki returns content in English).
Ultimately I achieved my goal using a more robust SPARQL query. I'm leaving the code below so that someone still learning can see an example of how to use SPARQL queries with the dbpedia gem:

results = Dbpedia.sparql.query("SELECT ?person ?abstract ?name WHERE {
  ?person rdf:type <http://dbpedia.org/ontology/Person> .
  ?person foaf:name ?name .
  ?person dbpedia-owl:abstract ?abstract .
  FILTER (LANG(?abstract) = 'pl' && ?name = \"#{@author.full_name}\"@en)
}")
# Keep the abstract only if the query returned at least one row
if results.first
  @bio = results.first[:abstract]
end
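
One caveat with building the query by string concatenation: a name containing a double quote or a backslash would break the string literal. A possible way around it is to build the query text in a small helper that escapes those characters first. This is only a sketch; abstract_query is a hypothetical name, the lang argument is assumed to be a trusted language code, and the rdf:, foaf: and dbpedia-owl: prefixes are assumed to be predefined by the endpoint, as in the query above. The helper only builds text and does not contact DBpedia.

```ruby
# Hypothetical helper: returns the SPARQL query text for a person's abstract.
def abstract_query(name, lang = 'pl')
  # Escape backslashes and double quotes so the name stays a valid string literal.
  escaped = name.gsub(/["\\]/) { |c| "\\#{c}" }
  <<~SPARQL
    SELECT ?person ?abstract ?name WHERE {
      ?person rdf:type <http://dbpedia.org/ontology/Person> .
      ?person foaf:name ?name .
      ?person dbpedia-owl:abstract ?abstract .
      FILTER (LANG(?abstract) = '#{lang}' && ?name = "#{escaped}"@en)
    }
  SPARQL
end

query = abstract_query('Adam Mickiewicz')
```

The result would then be passed along as before, for example Dbpedia.sparql.query(abstract_query(@author.full_name)).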

Thank you for taking the time to respond, Pex, and for your work on this gem. Cheers!