Repair URLs for bigger collections

Question

Closed this issue 8 years ago · 9 comments

URLs from some collections are broken. We can probably repair many of them systematically.
Examples:

Answer 1 · 2017-02-07T09:30:22.000Z

For the big Compact Memory collection the chances don't look good.

Example http://beta.digitalisiertedrucke.de/resources/D34000:
broken fulltext link: http://www.compactmemory.de/index_p.aspx?tzpid=12&ID_0=12&ID_1=215&ID_2=9446&ID_3=29342
Currently working fulltext link: http://sammlungen.ub.uni-frankfurt.de/2861615

Answer 2 · 2017-02-09T13:51:34.000Z

I replaced the URLs where possible for the collections that have over 50 hits (shown after clicking "Enthaltene Titel anzeigen"). This wasn't possible for documentation, project, search, and home pages etc. that no longer have an equivalent, or for URLs that have no description and aren't self-explanatory. Where it wasn't possible to replace the URLs, I left them as they were.
Because the edited file is bzipped again, it isn't possible to do a git diff, but I have a table on my local computer where I noted the changes in short form.
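
Since git cannot show a line diff for the compressed file directly, the old and new versions can be decompressed and compared outside of git. The following is only a minimal sketch: the function name `diff_bz2` is hypothetical, and it assumes the bzipped file is UTF-8 text that fits in memory.

```python
import bz2
import difflib


def diff_bz2(old_path, new_path):
    """Return a unified diff (list of lines) between two bzip2-compressed text files."""
    # Assumption: both files are UTF-8 encoded text and small enough to load fully.
    with bz2.open(old_path, "rt", encoding="utf-8") as f:
        old_lines = f.readlines()
    with bz2.open(new_path, "rt", encoding="utf-8") as f:
        new_lines = f.readlines()
    # unified_diff yields header lines plus '-'/'+' lines for every change.
    return list(
        difflib.unified_diff(old_lines, new_lines, fromfile=old_path, tofile=new_path)
    )
```

Calling `diff_bz2` with the file as it was before the edit and the file after the edit would list every replaced URL as a `-`/`+` pair.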

Two bigger collections have changed their system for referencing titles in the collection: the above-mentioned compactmemory.de, now at the Digital Library of the Goethe University Frankfurt, and literatur-des-judentums.de, which is also at the Digital Library of the Goethe University Frankfurt (different link).
Compact Memory formerly used combined IDs (for example http://www.compactmemory.de/index_p.aspx?tzpid=68&ID_0=68&ID_1=982&ID_2=27356&ID_3=78917) and now uses a single, completely new one (for example http://sammlungen.ub.uni-frankfurt.de/cm/periodical/titleinfo/377570). It would be easier to drop the old datasets and fetch new ones than to work out how to map the new links onto the existing bibliographic descriptions.
We have OPAC entries for literatur-des-judentums.de, but they don't refer to the HEBIS-Verbundkatalog (for example, http://www.literatur-des-judentums.de/opac/?ppn=013823973). We could, however, take the "HEBIS number" and replace the old URLs with URLs to the Verbundkatalog.
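
The replacement described above could be sketched roughly as follows. This assumes that the `ppn` query parameter in the old OPAC URL carries the "HEBIS number". The target pattern `CATALOG_TEMPLATE` is a made-up placeholder, because the actual Verbundkatalog URL format is not given in this thread, and the function names are hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# HYPOTHETICAL placeholder: replace with the real Verbundkatalog URL pattern.
CATALOG_TEMPLATE = "https://catalog.example.org/record?ppn={ppn}"


def extract_ppn(old_url: str) -> Optional[str]:
    """Return the value of the 'ppn' query parameter, or None if it is missing."""
    values = parse_qs(urlparse(old_url).query).get("ppn")
    # Keep the value as a string so that leading zeros survive.
    return values[0] if values else None


def catalog_url(old_url: str, template: str = CATALOG_TEMPLATE) -> Optional[str]:
    """Build a catalogue URL from an old OPAC URL; None if no PPN is present."""
    ppn = extract_ppn(old_url)
    return template.format(ppn=ppn) if ppn else None


print(extract_ppn("http://www.literatur-des-judentums.de/opac/?ppn=013823973"))
# prints 013823973
```

URLs without a `ppn` parameter return `None`, so they can be left untouched in the same way as the other URLs that could not be replaced.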

Answer 3 · 2017-02-10T09:12:42.000Z

@ChristophEwertowski Perhaps it makes sense to attach the table as a CSV file here.

Answer 4 · 2017-02-10T14:48:10.000Z

Since it doesn't belong in the repository, I will attach it to this comment as a .txt file. CSV attachments aren't permitted, but anyone who wants to can simply change the file extension and edit it as a CSV file. Because I wasn't planning to publish it, the notes on the changed URLs are in German.
51-ersetzte_Links.txt

Answer 5 · 2017-02-15T10:02:00.000Z

I think the improvements @ChristophEwertowski made are a good start. Please go ahead and commit your changes.

Regarding the two collections that have changed IDs, we might replace the broken link with a search for the document title, e.g. http://sammlungen.ub.uni-frankfurt.de/cm/search/quick?query=%C3%9Cber+das+Wort+avatiga for http://beta.digitalisiertedrucke.de/resources/D34000. It leads directly to the document – at least in this case...
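
A minimal sketch of how such a search link could be generated from a title. The base URL is copied from the example above, while the function name `title_search_url` is hypothetical. Whether the search returns the right document for every title is not guaranteed.

```python
from urllib.parse import urlencode

# Quick-search endpoint, taken from the example URL in this comment.
SEARCH_BASE = "http://sammlungen.ub.uni-frankfurt.de/cm/search/quick"


def title_search_url(title: str, base: str = SEARCH_BASE) -> str:
    """Return a quick-search URL that queries for the given document title."""
    # urlencode percent-encodes the UTF-8 bytes and turns spaces into '+'.
    return f"{base}?{urlencode({'query': title})}"


print(title_search_url("Über das Wort avatiga"))
# prints http://sammlungen.ub.uni-frankfurt.de/cm/search/quick?query=%C3%9Cber+das+Wort+avatiga
```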

Answer 6 · 2017-02-15T11:19:31.000Z

As an addition to our offline discussion: the PPNs (PICA production numbers) aren't used in the frontend or in the fulltext links.

Answer 7 · 2017-02-15T12:46:41.000Z

For the Compact Memory collection it's easier than for the other collection, because Compact Memory contains only journals, which mostly have distinct titles. Nevertheless, a search for the title is better than nothing.

Answer 8 · 2017-03-02T07:50:31.000Z

Functional review at http://test.digitalisiertedrucke.de/ necessary.

Answer 9 · 2017-03-02T09:29:22.000Z

The situation is much better than before. +1