/pedit

programs for editing and analyzing texts

Primary LanguageCGNU General Public License v3.0GPL-3.0

pedit

Tools and Programs for text manipulation and editing.

tools

trslt

Transliteration tool in order to replace characters to change encoding.

todo

  • read from input file, write to output file
  • allow multibyte keys: automaton search. If key contains subkey with same

start, prioritize longer keys. In other cases of ambiguity, prioritize keys later in file.

pedit

Structure

  • sqlite database
  • c library
  • cli interactive tools
  • user interface (tui & gui)

cli tools

text init

In: text file, word separators file (for the first test version)

  • initialize database if no database yet
  • create text and text version
  • create node + text items
  • create word part and word for each text item

word editor

Edits and rearranges words:

  • delete, add, modify words
  • edit pre-word and post-word
  • split text item into words
  • combine word parts of different words into a single word

dictionary editor

Edit dictionary objects: lemmas, categories and values, word classes…

morphological text analyzer

For each word, specify lemmas and movin category values

text version importer

Insert for a given text additional text versions. Use LCS or Ukkonen. Requires a separators file.

Racket GUI

  • Init module: done 2017-09-01

create database

  • Text module: done 2017-09-01

create languages, texts and text versions

  • Text Version module
    • display one or several text versions
    • add/remove nodes
    • add/move/rename/remove labels
    • add/modify/remove text items
    • edit cell data (pre, post)
    • horizontal module (tv in rows, nodes in columns)
    • vertical module (tv in columns, rows are lines containing one or more

    text items without fixed length)

    • permutations
      • nodes in different order
      • word parts in different order

Language module

edit categories, word classes, lemmas

Other functions

  • merge/split word parts in text items
  • assign word parts to words
  • analyze text by assigning lemmas and moving categories to words
  • import a text: create text, text version, nodes, text items, words, word

parts…

  • import a text version for an existing text

November 2017 to do items

Important for November

  • add/modify text items
  • delete nodes
  • add permutations and permutation editor
  • simple text import (at the end or first, without comparing with parallel

text)

  • text import with comparison with other texts
  • text export 1 line per node
  • format of text: size and font
  • windows standalone exe file
  • improve user interface
  • documentation
    • reasons for application
    • how to use it
    • how it works
  • add documentation to github wiki

Items that can be done later

  • Support for dictionary: edit word classes, categories
  • Assign lemmas and categories to words
  • Import: use dictionary
  • Export: fill as much as possible in the export table
  • Find ways to merge files…

plan

  • text edit (for 11-01)
  • permutations (for 11-04)
  • import / lcs (for 11-07)
  • export (for 11-09)
  • windows (for 11-10)
  • docs (for 11-13)