dbpedia/GSoC

Extend the Extraction Framework for your language


mgns commented

Effort

1-2 days

Skills

Basic Maven, Scala

Description

The DBpedia extraction framework has a default configuration that is language-agnostic. However, language-specific configuration can boost the coverage and precision of the extracted data for that particular language. We keep all language-specific configurations here. Browse through the code and see how you can improve existing languages or provide configuration for a new one.
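
As a rough illustration of why language-specific settings matter, here is a minimal sketch of locale-aware number parsing. It is not the framework's code: `LocaleNumberSketch`, `localeFor`, and `parseNumber` are hypothetical names, and only the standard `java.text.NumberFormat` and `java.util.Locale` classes are used. It assumes the language is identified by a two-letter code.

```scala
import java.text.{NumberFormat, ParsePosition}
import java.util.Locale

object LocaleNumberSketch {
  // Hypothetical per-language table; the real framework keeps its own configuration.
  val localeFor: Map[String, Locale] = Map(
    "en" -> Locale.ENGLISH,
    "de" -> Locale.GERMAN
  )

  // Parses `text` using the separators of the given language.
  // Falls back to English for unknown language codes.
  // Returns None unless the whole trimmed input is consumed as a number.
  def parseNumber(text: String, lang: String): Option[Double] = {
    val trimmed = text.trim
    val format = NumberFormat.getNumberInstance(localeFor.getOrElse(lang, Locale.ENGLISH))
    val position = new ParsePosition(0)
    val parsed = format.parse(trimmed, position)
    if (parsed == null || position.getIndex != trimmed.length) None
    else Some(parsed.doubleValue)
  }
}
```

With this sketch, `LocaleNumberSketch.parseNumber("1.234,56", "de")` and `LocaleNumberSketch.parseNumber("1,234.56", "en")` both yield the value 1234.56, because German uses `.` for grouping and `,` for decimals while English does the reverse. A language-agnostic default would get one of the two wrong.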

Impact

Improvements in the data quality & quantity for a particular language

I created a pull request in reference to this warm-up task.

See also, e.g., dbpedia/extraction-framework@f60edd4, which is not yet merged to master but shows some language-specific configuration for the number parser.

OK. Should I check whether the language-specific configurations made in dbpedia/extraction-framework@f60edd4 are correct?

No, it was just an additional note, so that other people who use this as a warm-up task do not duplicate work already done ;-). Luckily, that did not happen in your case.