itkach/slob

[Bugs / Feature suggestion] Outdated and additional links to existing slob files

Closed this issue · 10 comments

Olf0 commented

Hello @itkach, @MHBraun and @francwalter,

many of the slob-files linked to on the GitHub-webpage https://github.com/itkach/slob/wiki/Dictionaries
have newer versions available at http://ftp.halifax.rwth-aachen.de/aarddict/.
See the list below for some examples; note that this list is not exhaustive.

Furthermore there are slob-files available at ftp.halifax/aarddict, which are not listed at all on your dictionaries list for Aard2, e.g. the alswiki.

Then there are a few outdated slob-files at ftp.halifax/aarddict, which have newer versions available on the dictionaries list. Unfortunately I am not able to contact Markus Braun (@MarkusHBraun) via his preferred communication channel (aarddict Googlegroup) in order to notify him, as I do not use Googlegroups. Maybe it makes sense for you all to communicate in order to keep both locations in sync and up-to-date.

Also note that some of the links to slob-files hosted on https://mega.co.nz/ (or https://mega.nz/) lead to unavailable files on Mega, e.g.
en-m-wikivoyage-org-20141125.lzma2.slob https://mega.co.nz/#!rdAUXJxZ!-HqEzprdigPpSplR-9AWjxrdkVKe6_OoRgRJ7PdZ0_0,
de-m-wikivoyage-org-20141125.lzma2.slob https://mega.nz/#!KN5g0AxL!U8UitqlxFGV9h09W_lSTgNHU4rSeIxK21QZYIJbK9pY and
pt-m-wikivoyage-org-20141124.lzma2.slob https://mega.nz/#!bAQ3DLTC!291ojqdjuADmbnbZvjQMeUjM2W2HSmeNc0jynxO60EM.

And a few links use a URL shortener (bit.ly or goo.gl), which is an unnecessary indirection obscuring the original URL: Please expand them to their full URLs.
Examples are all <??wiktionary-20160526.lzma2.slob> and all <??-m-wiktionary-org-2015012?.lzma2.slob> files.
Special cases with extra caveats are:

As it is crucial for the Aard2 "ecosystem" to make as many recent slob-files as possible easily accessible for Aard2 users, keeping https://github.com/itkach/slob/wiki/Dictionaries (which fulfills this task well) up-to-date is IMO important.

Kudos for your excellent work on Aard2 (and its slobs), the best dictionary software I am aware of.

P.S.: Some examples of slob-files that have an older version listed on https://github.com/itkach/slob/wiki/Dictionaries and a newer one available at http://ftp.halifax.rwth-aachen.de/aarddict/ (older -> newer):
dewikibooks-20160704.slob -> dewikibooks-20161118.slob
dewikivoyage-20160212.slob -> dewikivoyage-20160705.slob
dewiktionary-20160708.slob -> dewiktionary-20161224.slob
enwikiquote-20150214.slob -> enwikiquote-20160504.slob
enwikivoyage-20150822.slob -> enwikivoyage-20160215.slob
simplewiki-20150303.slob -> simplewiki-20170118.slob
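
A comparison like the one in this list could be automated. The following is only a minimal sketch: it assumes file names of the form `<wiki>-<YYYYMMDD>.slob`, and the helper names `newest_per_wiki` and `outdated` are hypothetical. The two input lists are typed in by hand from the examples here; fetching the wiki page or the mirror directory is deliberately left out.

```python
import re

# Assumption: names look like "<wiki>-<YYYYMMDD>.slob" (e.g. dewikibooks-20160704.slob).
# Other naming schemes, such as "en-m-wikivoyage-org-20141125.lzma2.slob", are skipped.
NAME_RE = re.compile(r"^(?P<wiki>[a-z]+)-(?P<date>\d{8})\.slob$")


def newest_per_wiki(filenames):
    """Return {wiki: newest date string} for all file names that parse."""
    newest = {}
    for name in filenames:
        match = NAME_RE.match(name)
        if match is None:
            continue  # unknown naming scheme, ignore
        wiki, date = match["wiki"], match["date"]
        # YYYYMMDD strings sort chronologically when compared as text.
        if date > newest.get(wiki, ""):
            newest[wiki] = date
    return newest


def outdated(listed, available):
    """Return (wiki, listed_date, available_date) where the mirror is newer."""
    listed_newest = newest_per_wiki(listed)
    available_newest = newest_per_wiki(available)
    result = []
    for wiki, date in sorted(listed_newest.items()):
        newer = available_newest.get(wiki)
        if newer is not None and newer > date:
            result.append((wiki, date, newer))
    return result


# Hand-typed sample data taken from the examples in this post.
listed = ["dewikibooks-20160704.slob", "simplewiki-20150303.slob"]
available = ["dewikibooks-20161118.slob", "simplewiki-20170118.slob"]
report = outdated(listed, available)
for wiki, old, new in report:
    print(f"{wiki}: listed {old}, available {new}")
```

Comparing the date part as a plain string is sufficient here because the dates are fixed-width `YYYYMMDD`.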

itkach commented

> Unfortunately I am not able to contact Markus Braun (@MHBraun) via his preferred communication channel (aarddict Googlegroup) in order to notify him, as I do not use Googlegroups.

That is your choice, of course, but you are also excluding other users who may have contributed dictionary links in the past and may find your posts relevant.

> And a few links use a URL shortener (bit.ly or goo.gl), which is an unnecessary indirection obscuring the original URL: Please expand them to their full URLs.

The dictionary list is, by design, a collaborative effort, with links contributed by individual users, and so contributors get to decide where they want to host content they created and how they want to refer to it. URL shorteners make it possible to collect some access stats, are easier to share and in some cases may serve as permanent links to changing content. While I myself don't think shortened URLs are all that useful, I don't think they should be banned either.

As for the broken short links for English and German Wikipedia pointing to Mega at http://aarddict.org/1 - I will remove those; the content is available elsewhere. Thank you for pointing that out.

> Also note that some of the links to slob-files hosted on https://mega.co.nz/ (or https://mega.nz/) lead to unavailable files on Mega, e.g.

Perhaps opening a separate, specific issue and making sure to mention the GitHub user (e.g. @MHBraun) who added the no-longer-working link (this should be possible to determine from the page edit history) is a good way to notify the author. If the author is unable to correct the link or is not responding, it should be removed. Any GitHub user may edit the dictionary list. There's a problem, however: this is the issue tracker for a source code repository, and I don't really want linking issues to be posted here. Maybe it's time for a reorganization and for moving the dictionary list to a separate repo. I'll think about that.

> many of the slob-files linked to on the GitHub-webpage https://github.com/itkach/slob/wiki/Dictionaries
> have newer versions available at http://ftp.halifax.rwth-aachen.de/aarddict/.
> Furthermore there are slob-files available at ftp.halifax/aarddict, which are not listed at all on your dictionaries list for Aard2, e.g. the alswiki.

Perhaps a generic link to http://ftp.halifax.rwth-aachen.de/aarddict/ encouraging users to browse the directory is a good enough solution?

francwalter commented

@Olf0 I don't think that concerns me, because the links to frwiki and frwiktionary go to my server, from where mhbraun syncs them to halifax. Besides frwiki and frwiktionary I don't link to wikis.

Olf0 commented

@itkach: Yes, a "generic link to http://ftp.halifax.rwth-aachen.de/aarddict/" would be a good first measure.

@francwalter: Oh, I see. I just addressed you also, because you are one of the regular editors of this page. Sorry for that.

BTW, the decryption of (encrypted) files on Mega.nz / Mega.co.nz fails on non-x86 (e.g. mobile) devices in e.g. Firefox (for Android devices one may use the "Mega app", but I and presumably many others refrain from installing it).
Download locations, which are accessible for all devices (e.g. on Google drive), would be very helpful for mobile users.
Affected are e.g.:

Special thanks for the numerous, newly added freedict.org dictionaries. They make Aard2 very useful!

MHBraun commented

Thank you for your intensive investigation, Olf0.

It is correct that https://github.com/itkach/slob/wiki/Dictionaries is not always as up to date as http://ftp.halifax.rwth-aachen.de/aarddict/.

I had a lot of fun and spent an awful lot of time creating a semi-automatic system to create the data files for the enwikipedia and for the dewikipedia. The http://ftp.halifax.rwth-aachen.de/aarddict/ folder is a mirror of my hard disk, which I am sharing with everyone using Aard and Aard2. Some processes are automatic, others are not. The ones that are not are the creation of the BitTorrent link and the update of the dictionary list on GitHub. And sometimes I just forget to update the dictionary list while the new data is already populating the rwth server. It should not be this way, but it does occur, I do admit.
The alswiki is the Alemannisch wiki, whose content most people would not even understand, I guess. This is my test wiki for changes in the scripts and in the testing area... And kind of fun for myself.
I never thought it might be of use to anybody else, therefore I did not update the dictionary list on GitHub with the information.

I am not aware of outdated slob files on the rwth server apart from the "archive" section. If you can point me in the right direction, I may be able to save some space on the hard disk.

As you do not read the common and agreed information area on Googlegroups, you are of course not aware that I closed the storage area on Mega in favour of the rwth server. Hence you are correct with your observations, but this has been communicated already.

It is your decision, but I do want to encourage you to read the googlegroup. This is where we usually post and discuss items. The area here on GitHub is more dedicated to slob file system questions. Unfortunately your valuable contributions will not be seen by other users of the googlegroup.

I will continue to create the dictionaries and will make them available. Just have a look at the Googlegroup. Every update I create is posted there automatically by AardFeed.

Olf0 commented

Dear @itkach and @MarkusHBraun,

sorry, if there was a misunderstanding:
a. I sure read http://aarddict.org/forum/ (https://groups.google.com/forum/?#!forum/aarddict), just did not intend to create a Google account to post there.
b. I addressed you in my original message, which was primarily a bug report for broken and problematic download links, because I assumed you are the maintainers of this page / dictionary list. Mentioning the perceived lack of a maintenance process in place was rather a side note, due to the importance of having a broad range of recent dictionaries available for Aard2's success.

Now that you (@itkach) clearly state, that this page / list is supposed to be "community maintained", a process cannot be employed (due to the lack of a specific maintainer role as an authority). But at least some more guidance about the use of this page (for creators of slob-files) is still needed, IMO.
So I will try to create a few more bullet points for the introduction of this page ("When adding new links:") in the next couple of days, if this is O.K. for you.

And WRT "outsourcing" this page: I believe it shall stay in close proximity to Aard2 and its dictionary tools, so leaving it under the umbrella of https://github.com/itkach is best. Maybe a new repository with just this Wiki page provides enough separation for you.

And in the end it seems that this was also the right place for discussing this, as the issue was not about dictionaries, but about the page / dictionary list as such.

Cheers!

Olf0 commented

Dear @MarkusHBraun,

as you asked: The French Wikipedia @francwalter linked at https://github.com/itkach/slob/wiki/Dictionaries#french is slightly newer than the ones in http://ftp.halifax.rwth-aachen.de/aarddict/frwiki/. Never mind, now that I understand that this is not a coordinated effort.
If you intend to save space on Halifax, you may also delete <dewikibooks-20160704.*> in http://ftp.halifax.rwth-aachen.de/aarddict/dewiki/, as you already have <dewikibooks-20161118.*> there.

And please do not underestimate the importance of your work:
Without an appropriate (for each potential user) set of dictionaries a dictionary software is simply useless!
And it is basically you three creating these dictionaries.

BTW, I did have fun browsing the alswiki for a while, but as I primarily do not use Aard2 as a local Wikipedia browser, but for dictionaries (Wiktionaries) and word translators (this is why I am so happy about @itkach's recent freedict additions), I ended up deleting it.

Cheers!

Olf0 commented

Was living in Palatinate for a while, had friends in Freiburg back then and visited the Alsace now and then. But even https://pfl.wikipedia.org and https://pdc.wikipedia.org are just fun to browse for me once in a while.
In contrast to that, offline dictionaries (here: Wiktionaries and Freedict word translations) are absolutely essential on a mobile device, as there may be no network available, when urgently needed.

Olf0 commented

Finally found the time & words to expand the "When adding new download links:" section of the wiki page.
Hoping that the added content is O.K. for you, thus closing this issue.