Tools and Programs for text manipulation and editing.
Transliteration tool in order to replace characters to change encoding.
- read from input file, write to output file
- allow multibyte keys: automaton search. If key contains subkey with same
start, prioritize longer keys. In other cases of ambiguity, prioritize keys later in file.
- sqlite database
- c library
- cli interactive tools
- user interface (tui & gui)
In: text file, word separators file (for the first test version)
- initialize database if no database yet
- create text and text version
- create node + text items
- create word part and word for each text item
Edits and rearranges words:
- delete, add, modify words
- edit pre-word and post-word
- split text item into words
- combine word parts of different words into a single word
Edit dictionary objects: lemmas, categories and values, word classes…
For each word, specify lemmas and movin category values
Insert for a given text additional text versions. Use LCS or Ukkonen. Requires a separators file.
- Init module: done 2017-09-01
create database
- Text module: done 2017-09-01
create languages, texts and text versions
- Text Version module
- display one or several text versions
- add/remove nodes
- add/move/rename/remove labels
- add/modify/remove text items
- edit cell data (pre, post)
- horizontal module (tv in rows, nodes in columns)
- vertical module (tv in columns, rows are lines containing one or more
text items without fixed length)
- permutations
- nodes in different order
- word parts in different order
edit categories, word classes, lemmas
- merge/split word parts in text items
- assign word parts to words
- analyze text by assigning lemmas and moving categories to words
- import a text: create text, text version, nodes, text items, words, word
parts…
- import a text version for an existing text
- add/modify text items
- delete nodes
- add permutations and permutation editor
- simple text import (at the end or first, without comparing with parallel
text)
- text import with comparison with other texts
- text export 1 line per node
- format of text: size and font
- windows standalone exe file
- improve user interface
- documentation
- reasons for application
- how to use it
- how it works
- add documentation to github wiki
- Support for dictionary: edit word classes, categories
- Assign lemmas and categories to words
- Import: use dictionary
- Export: fill as much as possible in the export table
- Find ways to merge files…
- text edit (for 11-01)
- permutations (for 11-04)
- import / lcs (for 11-07)
- export (for 11-09)
- windows (for 11-10)
- docs (for 11-13)