Diplomatic transcription extraction routine for the Haṭhapradīpikā-editing project.
- XSLT 1.0 processor,
- grep, sed, uniq, sort, date
- the teiHeader of the witness transcriptions is optimized for direct upload to saktumiva,
- the CSV database of stemmatically relevant readings can be exported with the matrix editor to recreate the corresponding NEXUS file, which in turn can be examined with SplitsTree 5.3.0, cf. splitsnetwork.
- diplomatic text extraction:
  - browse the wit_texts directory for the transcriptions,
  - run x-wit-texts-all.sh hp_1.1-20.xml to recreate them from the same input file.
- readings extraction into the CSV database:
  - check out hp_1.1-20_stemmapoint-readings.csv for the database,
  - run x-wit-readings-csv.sh hp_1.1-20.xml to recreate it from the same input file.
- Changed paragraph elements <p> to line group elements <lg>, added <l> elements and xml:id attributes,
- purged not yet collated witnesses from <listWit>,
- added collated witness "YC" to <listWit>,
- added wit attribute with value "ceteri" to lem elements without wit attribute.
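
The last change in the list above (giving every lem element that has no wit attribute the value "ceteri") could be scripted roughly as follows. This is only a minimal sketch in Python, not the routine used in this project: it assumes the input is in the TEI namespace, and the sample fragment, its xml:id value and the siglum "J10" are made up for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # assumption: the file uses the TEI namespace


def mark_ceteri(root):
    """Set wit="ceteri" on every <lem> without a wit attribute; return the count."""
    changed = 0
    for lem in root.iter(f"{{{TEI_NS}}}lem"):
        if "wit" not in lem.attrib:
            lem.set("wit", "ceteri")
            changed += 1
    return changed


# Hypothetical sample; only the siglum "YC" is taken from this README.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<lg xml:id="sample_verse"><l>
<app><lem>reading one</lem><rdg wit="#YC">reading two</rdg></app>
<app><lem wit="#J10">reading three</lem><rdg wit="#YC">reading four</rdg></app>
</l></lg>
</body></text></TEI>"""

root = ET.fromstring(SAMPLE)
n = mark_ceteri(root)
print(n, "lem element(s) updated")
```

Running the function a second time changes nothing, since every lem then carries a wit attribute.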
- encode verses in custom environments with references mapped to xml:id attribute:
\usepackage{xparse}
%%% define environments and commands
\NewDocumentEnvironment{tlg}{O{}O{}}{\begin{verse}}{॥#1\hskip-4pt ॥\\ \end{verse}}
\NewDocumentCommand{\tl}{m}{#1}
%%% TEI mapping
\TeXtoTEIPat{\begin {tlg}[#1][#2]}{<lg xml:id="#1">}
\TeXtoTEIPat{\end {tlg}}{</lg>}
\TeXtoTEI{tl}{l}
- Create mappings for commands used in the apparatus, like \om:
%%% TEI mapping
\TeXtoTEIPat{\om }{}
- The reading with the most witnesses could be encoded with ceteri or a similar shorthand for better readability. This, however, is only unambiguous under two conditions:
  - ceteri can only be used once per app-command,
  - witnesses that omit the given verses altogether must be excluded separately from the scope of ceteri (not applicable to the current sample?).
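
To illustrate the two conditions, here is a small sketch of how ceteri could be expanded into explicit sigla for a single apparatus entry. It is written in Python purely as an illustration and is not part of the extraction scripts; the function name, the argument layout and the sigla A to E are hypothetical.

```python
def resolve_ceteri(all_witnesses, readings, omitting=()):
    """Expand "ceteri" within one app entry.

    all_witnesses: every declared siglum, in display order.
    readings: one list of sigla per lem/rdg of the entry.
    omitting: witnesses that lack the verse and must stay outside ceteri.
    """
    uses = sum(1 for sigla in readings if "ceteri" in sigla)
    if uses > 1:
        # Condition 1: ceteri is ambiguous if it occurs more than once.
        raise ValueError("ceteri may appear only once per app")
    named = {s for sigla in readings for s in sigla if s != "ceteri"}
    # Condition 2: omitting witnesses are excluded separately.
    excluded = named | set(omitting)
    rest = [w for w in all_witnesses if w not in excluded]
    resolved = []
    for sigla in readings:
        explicit = [s for s in sigla if s != "ceteri"]
        resolved.append(explicit + rest if "ceteri" in sigla else explicit)
    return resolved


example = resolve_ceteri(
    ["A", "B", "C", "D", "E"],
    [["ceteri"], ["B"], ["C"]],
    omitting=["E"],
)
print(example)
```

In this example ceteri stands for A and D only: B and C are named in other readings, and E is excluded because it omits the verse.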
- Make sure all witnesses, including -ac and -pc sigla, are declared in the preamble of the .tex file.
- ekdosis is still in development; it might take some time until all of its features can be used to their full potential, at which point they should replace some of the workarounds.
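
The declaration check mentioned above (all witnesses, including -ac and -pc sigla, declared in the preamble) could be automated along the following lines. This is a rough Python sketch under stated assumptions: sigla are declared with \DeclareWitness, \DeclareHand or \DeclareShorthand and referenced through a wit= option, as described in the ekdosis documentation. The regular expressions are deliberately simple and will miss unusual markup, and the sample .tex content is hypothetical.

```python
import re

# Assumption: the first argument of these commands is the siglum.
DECL_RE = re.compile(r"\\Declare(?:Witness|Hand|Shorthand)\s*\{([^}]*)\}")
# Matches wit={A, B} as well as wit=A inside an optional argument.
WIT_RE = re.compile(r"wit\s*=\s*(?:\{([^}]*)\}|([^,\]\s]+))")


def undeclared_sigla(tex_source):
    """Return the sorted sigla used in wit= options but never declared."""
    declared = {m.group(1).strip() for m in DECL_RE.finditer(tex_source)}
    used = set()
    for m in WIT_RE.finditer(tex_source):
        value = m.group(1) if m.group(1) is not None else m.group(2)
        used.update(s.strip() for s in value.split(",") if s.strip())
    return sorted(used - declared)


# Hypothetical sample: "J10-pc" is used but not declared.
SAMPLE_TEX = r"""
\DeclareWitness{YC}{YC}{hypothetical description}
\DeclareWitness{J10}{J10}{hypothetical description}
\DeclareWitness{J10-ac}{J10\textsuperscript{ac}}{hypothetical description}
\begin{document}
\app{\lem[wit={YC, J10-pc}]{reading one}\rdg[wit=J10-ac]{reading two}}
\end{document}
"""

missing = undeclared_sigla(SAMPLE_TEX)
print(missing)
```

An empty result would mean that every siglum found in a wit= option has a matching declaration.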