hbz/digitalisiertedrucke

Repair URLs for bigger collections

Closed this issue · 9 comments

For the big Compact Memory collection the chances don't look good.

Example http://beta.digitalisiertedrucke.de/resources/D34000:
broken fulltext link: http://www.compactmemory.de/index_p.aspx?tzpid=12&ID_0=12&ID_1=215&ID_2=9446&ID_3=29342
Current working fulltext link: http://sammlungen.ub.uni-frankfurt.de/2861615

I replaced the URLs where possible for the collections that have over 50 hits (shown after clicking "Enthaltene Titel anzeigen"). This wasn't possible for documentation, project, search, and home pages etc. that no longer have an equivalent, or for URLs that have no description and aren't self-explanatory. Where it wasn't possible to replace the URLs, I left them as they were.
Because the edited file is bzipped again, a git diff isn't possible, but I have a table on my local computer where I noted the changes in short form.
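
For anyone who wants to review the changes without the table, the two compressed versions can still be compared by decompressing them in memory. A minimal sketch, assuming both versions are available locally as UTF-8 text inside bzip2 files (the function name `diff_bz2` and the file paths are made up for illustration):

```python
import bz2
import difflib


def diff_bz2(old_path, new_path):
    """Return unified-diff lines between two bzip2-compressed text files."""
    # Assumption: both files contain UTF-8 text once decompressed.
    with bz2.open(old_path, "rt", encoding="utf-8") as f:
        old_lines = f.readlines()
    with bz2.open(new_path, "rt", encoding="utf-8") as f:
        new_lines = f.readlines()
    return list(
        difflib.unified_diff(old_lines, new_lines, fromfile=old_path, tofile=new_path)
    )
```

Printing the result with `print("".join(diff_bz2("old.bz2", "new.bz2")))` gives output in the same format as a regular `git diff`, so the replaced URLs show up as `-`/`+` line pairs.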

Two bigger collections have changed their system for referencing titles: the above-mentioned compactmemory.de, now at the Digital Library of the Goethe University Frankfurt, and literatur-des-judentums.de, which is also at the Digital Library of the Goethe University Frankfurt (different link).
Compact Memory formerly had combined IDs (for example http://www.compactmemory.de/index_p.aspx?tzpid=68&ID_0=68&ID_1=982&ID_2=27356&ID_3=78917) and now has only a single, completely new one (for example http://sammlungen.ub.uni-frankfurt.de/cm/periodical/titleinfo/377570). It would be easier to drop the old datasets and get new ones than to work out how to map the links to the bibliographic descriptions.
We have OPAC entries for literatur-des-judentums.de, but they don't refer to the HEBIS-Verbundkatalog (for example, http://www.literatur-des-judentums.de/opac/?ppn=013823973). We could, however, take the "HEBIS number" and replace the old URLs with URLs to the Verbundkatalog.
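
The replacement for literatur-des-judentums.de could look roughly like the sketch below. It assumes that the `ppn` query parameter of the old OPAC URLs is the "HEBIS number"; the target pattern `CATALOGUE_TEMPLATE` is a pure placeholder, since the actual Verbundkatalog URL scheme is not given here and would have to be filled in.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical placeholder: NOT the real Verbundkatalog URL pattern.
CATALOGUE_TEMPLATE = "https://catalogue.example.org/ppn/{ppn}"


def extract_ppn(url):
    """Return the value of the 'ppn' query parameter, or None if absent."""
    values = parse_qs(urlparse(url).query).get("ppn")
    return values[0] if values else None


def to_catalogue_url(url, template=CATALOGUE_TEMPLATE):
    """Rewrite an old OPAC URL to the catalogue pattern; leave other URLs unchanged."""
    ppn = extract_ppn(url)
    return template.format(ppn=ppn) if ppn else url
```

The number is kept as a string so that leading zeros (as in `013823973`) are not lost, and URLs without a `ppn` parameter are returned as they are.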

@ChristophEwertowski Perhaps it makes sense to attach the table as a CSV file here.

Since it doesn't belong in the repository, I will add it as a .txt file in this comment. CSV isn't permitted, but whoever wants to can simply change the file extension and edit it as a CSV file. Because I wasn't planning to publish it, the notes on the changed URLs are in German.
51-ersetzte_Links.txt

I think the improvements @ChristophEwertowski made are a good start. Please go ahead and commit your changes.

Regarding the two collections that have changed IDs, we might replace the broken link with a search for the document title, e.g. http://sammlungen.ub.uni-frankfurt.de/cm/search/quick?query=%C3%9Cber+das+Wort+avatiga for http://beta.digitalisiertedrucke.de/resources/D34000. It leads directly to the document, at least in this case...
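
Building such a search link from a title is mostly a matter of URL-encoding. A minimal sketch, using the search base from the example above (the function name `title_search_url` is made up, and the title string is inferred from the encoded query):

```python
from urllib.parse import quote_plus

# Search base taken from the example link in this comment.
SEARCH_BASE = "http://sammlungen.ub.uni-frankfurt.de/cm/search/quick?query="


def title_search_url(title):
    """Return a quick-search URL for the given document title."""
    # quote_plus encodes non-ASCII characters as UTF-8 percent escapes
    # and turns spaces into '+'.
    return SEARCH_BASE + quote_plus(title)
```

Calling `title_search_url("Über das Wort avatiga")` reproduces the link given above for D34000. Whether the search lands directly on the document for other titles would still have to be checked case by case.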

As an addition to our offline discussion: The PPN or PICA production numbers aren't used for the frontend or in the link of the fulltext.

For the Compact Memory collection it's easier than for the other collection, because Compact Memory contains only journals, which mostly have distinct titles. Nevertheless, a search for the title is better than nothing.

The situation is much better than before. +1