/jp

Primary Language: Perl

FLOW IN: data/*.txt

  1. Tokenize each file with http://lindat.mff.cuni.cz/services/morphodita/api/tokenize.
  2. Tag with http://lindat.mff.cuni.cz/services/morphodita/api/tag -> morp_temp; process with http://lindat.mff.cuni.cz/services/udpipe/api/process -> udpipe_temp.
  3. Find every sentence in udpipe_temp that contains a problematic token (a token whose id has the form \d-\d) and extract the whole sentence -> edited/ (before).
  4. Edit the sentences and generate a new dependency tree for each with http://lindat.mff.cuni.cz/services/parsito/api/parse -> edited/#parsito (after).
  5. Replace the original sentences in udpipe_temp with the edited sentences -> udpipe_temp2.
  6. Check and merge the columns of morp_temp with udpipe_temp2 -> t1.
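
Step 3 can be sketched roughly as below. This is a minimal illustration, not the script used in this repository: it assumes udpipe_temp holds CoNLL-U text (tab-separated columns, sentences separated by blank lines), the helper names split_sentences and has_range_id are hypothetical, and the sample sentences are made up with every column except the id and form left as "_". The pattern \d+-\d+ is used instead of \d-\d so that multi-digit ids are also caught.

```perl
use strict;
use warnings;

# Split CoNLL-U text into sentence blocks (blocks are separated by blank lines).
sub split_sentences {
    my ($conllu) = @_;
    return grep { /\S/ } split /\n\s*\n/, $conllu;
}

# Return 1 if any token line in the block has a range id such as "1-2".
sub has_range_id {
    my ($block) = @_;
    for my $line (split /\n/, $block) {
        next if $line =~ /^#/;              # skip sentence-level comment lines
        my ($id) = split /\t/, $line;       # the id is the first column
        return 1 if defined $id && $id =~ /^\d+-\d+$/;
    }
    return 0;
}

# Made-up sample input: the first sentence contains a range id, the second does not.
my $sample = join("\n",
    "# sent_id = 1",
    "1-2\tAbys\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\tAby\t_\t_\t_\t_\t_\t_\t_\t_",
    "2\tbys\t_\t_\t_\t_\t_\t_\t_\t_",
    "3\tmohl\t_\t_\t_\t_\t_\t_\t_\t_",
    "",
    "# sent_id = 2",
    "1\tAhoj\t_\t_\t_\t_\t_\t_\t_\t_",
    "2\t.\t_\t_\t_\t_\t_\t_\t_\t_",
) . "\n";

my @sentences   = split_sentences($sample);
my @problematic = grep { has_range_id($_) } @sentences;

print scalar(@problematic), " of ", scalar(@sentences),
    " sentences contain a range id\n";
```

The whole sentence block is kept rather than only the offending line, since step 4 re-parses complete sentences and step 5 swaps them back in as units.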