distributed-text-services/workshops

Hack Idea: Convert existing DBNL.nl TEI corpora to DTS compliance


epoz commented

Brief Description

The Digital Library for Dutch Literature (DBNL) contains a substantial collection of TEI-encoded texts. All the rights-free texts can be downloaded in a single zip file.
The idea is to explore and document what is necessary to make such a collection available as a DTS-compliant collection: which scripts, if any, need to be developed to transform the texts, and possibly whether alternative DTS tools should be developed to expedite the process.
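As a starting point for discussion, here is a minimal Python sketch of one such script: it walks a zip of TEI files and builds a dictionary shaped like a DTS collection listing. Everything specific in it is an assumption and not something taken from DBNL or checked against the DTS specification: the function names, the identifier scheme (`collection_id/filename-stem`), and the key names `member` and `totalChildren`, which should be verified against the DTS version being targeted. The `@context` entry is left out on purpose for the same reason. The title lookup ignores XML namespaces because it is not stated here whether the DBNL files use one.

```python
import pathlib
import xml.etree.ElementTree as ET
import zipfile


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}title'."""
    return tag.rsplit("}", 1)[-1]


def first_title(xml_bytes):
    """Return the text of the first non-empty <title> element, or None."""
    root = ET.fromstring(xml_bytes)
    for elem in root.iter():
        if local_name(elem.tag) == "title":
            text = "".join(elem.itertext()).strip()
            if text:
                return text
    return None


def build_collection(zip_file, collection_id, collection_title):
    """Build a DTS-like collection listing from a zip of TEI files.

    zip_file may be a path or a binary file-like object.
    The output key names are assumptions to be checked against the DTS spec.
    """
    members = []
    with zipfile.ZipFile(zip_file) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".xml"):
                continue
            try:
                title = first_title(archive.read(name))
            except ET.ParseError:
                # Files that are not well-formed (or that rely on DTD
                # entities) are skipped; a real script should report them.
                continue
            stem = pathlib.PurePosixPath(name).stem
            members.append(
                {
                    "@id": f"{collection_id}/{stem}",  # hypothetical id scheme
                    "@type": "Resource",
                    "title": title or stem,
                }
            )
    return {
        "@id": collection_id,
        "@type": "Collection",
        "title": collection_title,
        "totalChildren": len(members),
        "member": members,
    }
```

Calling `build_collection("dbnl_tei.zip", "dbnl", "DBNL rights-free texts")` (the file name is a placeholder) and passing the result to `json.dumps` would give a first listing to compare with what a DTS client expects. The number of files skipped by the `ParseError` branch would also be a quick indicator of how much clean-up the corpus needs before conversion.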

Coding skills needed

The initiator of this suggestion prefers to code in Python, but it remains to be seen how much coding is needed. It might be more a matter of documentation and text analysis than coding.
