affix-count

A simple counter of prefixes, suffixes, and infixes in a plain-text or XML file, with some support for allomorphy.

For plain text:

python3 affcount3.py filename sequence of affixes

For text in xml files:

python3 affcount3-xml.py filename sequence of affixes

where filename is the name of the text file, and

sequence of affixes is a space-separated list of prefixes, suffixes, or infixes.

A prefix ends with a dash, a suffix starts with one, and an infix both starts and ends with one, for example re-, -ness, -bloody-. We remove the dashes before we search for the forms.
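The dash convention can be sketched as follows. This is a minimal illustration, not the code in affcount3.py: the function names `classify_affix` and `count_affix` are hypothetical, and the tokenization (lowercased runs of word characters), the rule that an infix must occur strictly inside a word, and the rule that a word identical to the form is skipped are all assumptions.

```python
import re


def classify_affix(spec):
    """Return (kind, form) for a dash-marked affix such as 're-' or '-ness'."""
    form = spec.strip("-")
    if not form:
        raise ValueError("empty affix: %r" % spec)
    if spec.startswith("-") and spec.endswith("-"):
        return "infix", form
    if spec.endswith("-"):
        return "prefix", form
    if spec.startswith("-"):
        return "suffix", form
    raise ValueError("affix needs a leading or trailing dash: %r" % spec)


def count_affix(text, spec):
    """Count the words in text that carry the form at the marked position."""
    kind, form = classify_affix(spec)
    form = form.lower()
    # Assumption: a word is a run of word characters, compared in lowercase.
    words = re.findall(r"\w+", text.lower())
    count = 0
    for word in words:
        if len(word) <= len(form):
            continue  # assumption: a bare form is not an affixed word
        if kind == "prefix" and word.startswith(form):
            count += 1
        elif kind == "suffix" and word.endswith(form):
            count += 1
        elif kind == "infix" and form in word[1:-1]:
            count += 1
    return count


sample = "Rewrite the happiness of absobloodylutely redo"
for spec in ("re-", "-ness", "-bloody-"):
    print(spec, count_affix(sample, spec))
```

Only the position of the form inside the word is checked, so any word that happens to start or end with the form is counted.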

affcount2.py is for Python 2, and affcount3.py is for Python 3, whatever that means in a messy PL. I assume here that python3 invokes Python 3, and plain python invokes Python 2. Change the commands according to your system.

Examples:

python affcount2.py README.md -xes -er file- -fix-

python3 affcount3.py README.md -xes -er file- -fix-

python3 affcount3-xml.py fn.xml -xes -er file- -fix-

python affcount2.py fn -ler -lar

For a Turkish text file fn, this prints the allomorphic count of the plural.

Caveat: These aren't really allomorphic counts because we don't do anything with semantics. We just look at the position of the "affixal form" in a word. For example, the word `kiler' (cellar) in Turkish is not plural, but it ends in -ler, so the program counts it as carrying the "suffix".
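The caveat can be seen in a short sketch. It is an illustration under stated assumptions, not the program's own code: `count_suffix` is a hypothetical helper, words are taken to be lowercased runs of word characters, and the allomorphic count is assumed to be the sum of the counts of the individual forms.

```python
import re


def count_suffix(text, form):
    """Count words that end in form, judged by position only (no semantics)."""
    words = re.findall(r"\w+", text.lower())
    return sum(1 for w in words if len(w) > len(form) and w.endswith(form))


# evler (houses) and kitaplar (books) are plurals; kiler (cellar) is not.
text = "evler kitaplar kiler"
allomorphs = ["ler", "lar"]  # the dashes of -ler and -lar already removed

per_form = {form: count_suffix(text, form) for form in allomorphs}
total = sum(per_form.values())  # assumption: allomorphic count = sum of forms

print(per_form)  # {'ler': 2, 'lar': 1}
print(total)     # 3, although only two of the words are plural
```

Filtering out such false positives would need a lexicon or a morphological analyzer, which is outside what a purely positional count can do.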