itkach/slob

Use OmegaWiki dictionary data

Closed this issue · 3 comments

OmegaWiki is a collaborative project to produce a free, multilingual dictionary in every language, with lexicological, terminological and thesaurus information.

The software is open source and the data is free.

The key idea of OmegaWiki is to be based around concepts. This is what makes it truly multilingual.
So, by building French-English and German-English dictionaries, we are also building a German-French dictionary. If we add an Italian contributor, we build 3 more bilingual dictionaries; the number of language pairs grows quadratically with the number of languages.
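The pair arithmetic can be checked with a small sketch: when every language is attached to the same concept, n languages yield n(n-1)/2 unordered bilingual pairs. The function name `bilingual_pairs` is made up for illustration and is not part of OmegaWiki or slob.

```python
from itertools import combinations


def bilingual_pairs(languages):
    """Return every unordered language pair linked through shared concepts."""
    return list(combinations(sorted(languages), 2))


three = bilingual_pairs(["French", "English", "German"])
four = bilingual_pairs(["French", "English", "German", "Italian"])

# Adding a fourth language adds one new pair per existing language.
print(len(three), len(four), len(four) - len(three))
```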

http://www.omegawiki.org/

Not sure where this wish belongs, so I also posted it to itkach/aard2-android#24

I'm going to reply here since I don't see this request posted to the mailing list (http://aarddict.org/forum). I tried running mwscrape against the OmegaWiki site and it kind of works, but the site appears to be painfully slow and, more importantly, the list of titles reported by its MediaWiki API is weird: for example, Apertium English wordlist/A through Apertium English wordlist/Z, International Beer Parlour/Archive1 through International Beer Parlour/Archive9, and Machine Translation preparation A through Machine Translation preparation Z. After 557 documents mwscrape stopped, because it seems that's all there is. It looks like MediaWiki is used in some unusual way that prevents the simple "get list of titles, then get rendered HTML for each title" API usage that makes it possible to generate slob dictionaries. For this data to be usable for offline dictionaries, OmegaWiki maintainers need to provide some reasonable way to access it.
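For reference, the "list titles, then fetch rendered HTML" pattern maps onto two standard MediaWiki API actions, `list=allpages` and `action=parse`. The sketch below only builds the request URLs and sends nothing over the network. The `api.php` location is an assumption (the thread does not give it), the helper names are hypothetical, and this is not mwscrape's actual code.

```python
from urllib.parse import urlencode

# Assumption: the endpoint path is a guess; the thread does not state it.
API_URL = "http://www.omegawiki.org/api.php"


def allpages_url(continue_from=None, limit=500):
    """Build a URL that lists page titles, optionally resuming from a title."""
    params = {
        "action": "query",
        "list": "allpages",
        "aplimit": limit,
        "format": "json",
    }
    if continue_from is not None:
        params["apcontinue"] = continue_from
    return API_URL + "?" + urlencode(params)


def parse_url(title):
    """Build a URL that requests the rendered HTML of a single page."""
    params = {"action": "parse", "page": title, "prop": "text", "format": "json"}
    return API_URL + "?" + urlencode(params)


print(allpages_url())
print(parse_url("Apertium English wordlist/A"))
```

If the dictionary entries are not exposed as ordinary pages, `allpages` would return only the regular wiki pages, which would be consistent with the short title list described above.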

Sorry, I did not sign up for the mailing list.

As described on http://www.omegawiki.org/Help:Downloading_the_data, if you are interested in the lexical data and want to retrieve it directly with SQL queries (instead of through the web interface), you need the lexical dump: http://www.omegawiki.org/downloads/omegawiki-lexical.sql.gz (compressed: ~35 MB; uncompressed: ~175 MB). You can also download it from http://www.omegawiki.org/Special:Ow_downloads
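A first look at such a dump could start by listing the tables it defines, without loading it into a database. This is a minimal sketch that assumes the file is a mysqldump-style text file with one `CREATE TABLE` statement per table. The schema is not described in this thread, so no table names are assumed, and `list_tables` is a hypothetical helper.

```python
import gzip
import re

# Matches lines such as: CREATE TABLE `some_table` (
CREATE_TABLE = re.compile(
    r"^CREATE TABLE\s+(?:IF NOT EXISTS\s+)?`?(\w+)`?", re.IGNORECASE
)


def list_tables(dump_path):
    """Stream a gzipped SQL dump and return the table names it creates."""
    tables = []
    with gzip.open(dump_path, "rt", encoding="utf-8", errors="replace") as dump:
        for line in dump:
            match = CREATE_TABLE.match(line)
            if match:
                tables.append(match.group(1))
    return tables
```

Calling `list_tables("omegawiki-lexical.sql.gz")` on the downloaded file would show which tables are available before deciding how to turn them into dictionary entries.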