-
flatten_conllu.py: A script that takes a dependency treebank in UD format and "flattens" it by applying the following transformations:
- Words with the @conj relation take the label of their head
- Words with the @parataxis relation take the label of their head
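
The two transformations above can be sketched as follows. This is a minimal illustration, not the actual contents of flatten_conllu.py. It assumes that "label" means the DEPREL column (column 8) of a CoNLL-U row, that only exact "conj" and "parataxis" labels are flattened, and that chains (a conj attached to another conj) are followed up to the first head with a different label. The function name flatten_sentence is hypothetical.

```python
FLATTEN = {"conj", "parataxis"}  # relations replaced by the head's label

def flatten_sentence(rows):
    """Return a copy of one sentence with conj/parataxis labels flattened.

    rows: list of 10-column CoNLL-U token rows (each a list of strings).
    Only the DEPREL column (index 7) changes; HEAD is left untouched.
    """
    by_id = {row[0]: row for row in rows}
    original = {row[0]: row[7] for row in rows}

    def resolve(tok_id, seen=()):
        label = original[tok_id]
        head = by_id[tok_id][6]
        # Stop at a label we keep, at a head outside the sentence
        # (root "0" or "_"), or if a malformed tree contains a cycle.
        if label not in FLATTEN or head not in by_id or tok_id in seen:
            return label
        return resolve(head, seen + (tok_id,))

    return [row[:7] + [resolve(row[0])] + row[8:] for row in rows]
```

For example, in "John sings and dances", the token "dances" (conj, head "sings") would receive the label of "sings", which is root.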
-
calculate_accuracy_index.py: A script that does the following:
- Takes a -train.conllu file and builds a table of surface form, label and frequency
- Takes a -dev.corpus file and assigns each token the most frequent label for its surface form from the table
- Calculates the accuracy index
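
The steps above can be sketched as follows. This is a minimal illustration under several assumptions, not the actual script: the label is taken from the DEPREL column of CoNLL-U, the "accuracy index" is read as the fraction of tokens whose assigned label equals the gold label, and surface forms never seen in training get no label and count as errors. All function names are hypothetical.

```python
from collections import Counter, defaultdict

def read_tokens(lines):
    """Yield (surface_form, label) pairs from CoNLL-U lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # blank separators and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        yield cols[1], cols[7]

def build_table(pairs):
    """Map each surface form to a Counter of label frequencies."""
    table = defaultdict(Counter)
    for form, label in pairs:
        table[form][label] += 1
    return table

def most_frequent(table, form):
    """Most frequent label for a form, or None if the form is unseen."""
    counts = table.get(form)
    return counts.most_common(1)[0][0] if counts else None

def accuracy(table, pairs):
    """Fraction of (form, gold) pairs where the assigned label is gold."""
    total = correct = 0
    for form, gold in pairs:
        total += 1
        correct += most_frequent(table, form) == gold
    return correct / total if total else 0.0
```

With files on disk, usage might look like build_table(read_tokens(open(train_path, encoding="utf-8"))) followed by accuracy(table, read_tokens(open(dev_path, encoding="utf-8"))), where train_path and dev_path are placeholders.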
-
label_asf.py: A script that takes a sentence in Apertium stream format and assigns each surface form its most frequent label from the labelled corpus.
$ python label_asf.py "string in Apertium stream format" labelled-corpus.conllu
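
The lookup described above can be sketched as follows. This is a minimal illustration, not the actual script. It assumes lexical units of the form ^surface/analysis/...$, ignores backslash-escaped characters, and takes a ready-made frequency table (surface form to Counter of labels) instead of reading the labelled corpus. The names surface_forms and label_stream, and the choice to return (form, label) pairs, are hypothetical.

```python
import re
from collections import Counter

# One lexical unit: "^", the surface form, optional "/analysis" parts, "$".
LEXICAL_UNIT = re.compile(r"\^([^/$^]+)(?:/[^$]*)?\$")

def surface_forms(stream):
    """Return the surface form of every lexical unit in the stream."""
    return LEXICAL_UNIT.findall(stream)

def label_stream(stream, table):
    """Pair each surface form with its most frequent label (or None).

    table: dict mapping surface form -> Counter of label frequencies.
    """
    labelled = []
    for form in surface_forms(stream):
        counts = table.get(form)
        label = counts.most_common(1)[0][0] if counts else None
        labelled.append((form, label))
    return labelled

# Toy table standing in for one built from labelled-corpus.conllu.
example_table = {
    "The": Counter({"det": 3}),
    "cat": Counter({"nsubj": 2, "obj": 1}),
}
example = label_stream("^The/the<det><def><sp>$ ^cat/cat<n><sg>$", example_table)
```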