Borja Navarro Colorado | University of Alicante
This INTELE webinar shows, through a series of examples, how to exploit the ELTeC corpus for literary studies. Except for the last one, the examples are implemented and explained in COLAB notebooks, so you can run them yourself. They cover the following topics:
- how to open and process the ELTeC corpus with Python in COLAB;
- how to extract information annotated in XML;
- how to analyze the ELTeC corpus with basic NLP techniques;
- and finally a simple proposal to overcome language barriers.
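As a rough illustration of the first and third topics, the sketch below parses a TEI document with Python's standard library, collects the paragraph text, and counts word frequencies. It is not taken from the webinar notebooks: the `SAMPLE` document is a made-up miniature standing in for a real ELTeC file, and `paragraphs` and `tokenize` are hypothetical helper names. It assumes only that ELTeC files use the standard TEI namespace.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Standard TEI namespace; ELTeC files are assumed to use it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Made-up miniature document standing in for a real ELTeC file.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt><title>Sample novel</title></titleStmt></fileDesc></teiHeader>
  <text><body>
    <div type="chapter">
      <head>Chapter 1</head>
      <p>The sea was calm. The ship left the harbour.</p>
      <p>Nobody saw the ship again.</p>
    </div>
  </body></text>
</TEI>"""


def paragraphs(root):
    """Return the plain text of every <p> inside <body>, in document order."""
    return ["".join(p.itertext()).strip()
            for p in root.iterfind(".//tei:body//tei:p", TEI_NS)]


def tokenize(text):
    """Lowercase and split on non-word characters (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


# For a file on disk or in COLAB storage: root = ET.parse(path).getroot()
root = ET.fromstring(SAMPLE)
paras = paragraphs(root)
tokens = [tok for para in paras for tok in tokenize(para)]
freqs = Counter(tokens)

print(len(paras), "paragraphs,", len(tokens), "tokens")
print(freqs.most_common(2))
```

A real analysis would probably replace the naive tokenizer with a proper NLP pipeline; the regular expression is used here only to keep the example free of external dependencies.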
Development version:
-
Stable versions:
- Official: https://zenodo.org/communities/eltec
- TEIpublisher: https://teipublisher.com/exist/apps/eltec/index.html
- GAMS: http://glossa.uni-graz.at/context:eltec
- TextGRID (testing): https://dev.textgridrep.org/browse/3thgt.0
- Extracting author and gender from one collection (ELTeC-SPA)
- Extracting author and gender from two (or more) collections (ELTeC-SPA and ELTeC-ENG)
- Extracting code switching
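The three extraction tasks listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code of the webinar notebooks. It assumes that author gender is recorded as the `key` attribute of an `authorGender` element in the namespace `http://distantreading.net/eltec/ns`, and that code switching is marked with `<foreign xml:lang="...">`; check both against the files you actually download. The documents, author names, and the `COLLECTIONS` dictionary are invented placeholders.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = {
    "tei": "http://www.tei-c.org/ns/1.0",
    "eltec": "http://distantreading.net/eltec/ns",  # assumed ELTeC namespace
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Template for invented sample documents (not real corpus data).
TEMPLATE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:eltec="http://distantreading.net/eltec/ns">
  <teiHeader>
    <fileDesc><titleStmt>
      <title>{title}</title><author>{author}</author>
    </titleStmt></fileDesc>
    <profileDesc><textDesc>
      <eltec:authorGender key="{gender}"/>
    </textDesc></profileDesc>
  </teiHeader>
  <text><body><p>{body}</p></body></text>
</TEI>"""

# Hypothetical stand-in for two collections; real use would read files from disk.
COLLECTIONS = {
    "ELTeC-SPA": [
        TEMPLATE.format(title="Novela uno", author="Autora A", gender="F",
                        body='Dijo <foreign xml:lang="fr">mon ami</foreign> y se fue.'),
        TEMPLATE.format(title="Novela dos", author="Autor B", gender="M",
                        body="No hay otra lengua en este texto."),
    ],
    "ELTeC-ENG": [
        TEMPLATE.format(title="Novel one", author="Author C", gender="F",
                        body='She whispered <foreign xml:lang="la">carpe diem</foreign>.'),
    ],
}


def author_and_gender(root):
    """Return (author, gender key) from the TEI header; gender is None if absent."""
    author = root.findtext(".//tei:titleStmt/tei:author", default="", namespaces=NS)
    gender_el = root.find(".//tei:textDesc/eltec:authorGender", NS)
    gender = gender_el.get("key") if gender_el is not None else None
    return author.strip(), gender


def foreign_spans(root):
    """Return (language, text) for each <foreign> element in the body."""
    return [(el.get(XML_LANG), "".join(el.itertext()).strip())
            for el in root.iterfind(".//tei:body//tei:foreign", NS)]


rows = []
for collection, docs in COLLECTIONS.items():
    for xml_text in docs:
        root = ET.fromstring(xml_text)
        author, gender = author_and_gender(root)
        rows.append((collection, author, gender, foreign_spans(root)))

# Number of novels per (collection, author gender) pair.
gender_counts = Counter((coll, gender) for coll, _, gender, _ in rows)

for row in rows:
    print(row)
print(gender_counts)
```

Because every collection shares the same encoding scheme, the same two functions work unchanged on each one; only the list of input files differs.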
This is only an example of how to extract stylometric relations between novels written in different languages. Unfortunately, it cannot be run in COLAB.
The inter-lingual representation is based on WordNet synsets, and the stylometric relations were extracted with the R package "Stylo".
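To make the idea of a synset-based inter-lingual representation more concrete, here is a toy Python sketch. It is not the webinar's pipeline, which relied on WordNet and on Stylo in R. The `LEXICON` below is a hypothetical hand-made mapping, and the concept IDs are invented labels rather than real WordNet synset identifiers. The sketch also ignores word-sense ambiguity, which a real WordNet lookup would have to handle. It only shows the principle: once words are replaced by language-independent concepts, texts in different languages share one vocabulary and can be compared with an ordinary distance measure.

```python
import math
from collections import Counter

# Hypothetical toy lexicons: word -> language-independent concept ID.
# The IDs are invented labels, not real WordNet synset identifiers.
LEXICON = {
    "en": {"house": "C_HOUSE", "sea": "C_SEA", "love": "C_LOVE", "night": "C_NIGHT"},
    "es": {"casa": "C_HOUSE", "mar": "C_SEA", "amor": "C_LOVE", "noche": "C_NIGHT"},
}


def to_concepts(tokens, lang):
    """Replace each known word with its concept ID; drop words not in the lexicon."""
    lexicon = LEXICON[lang]
    return [lexicon[tok] for tok in tokens if tok in lexicon]


def relative_freqs(concepts):
    """Relative frequency of each concept (expects a non-empty list)."""
    total = len(concepts)
    return {concept: n / total for concept, n in Counter(concepts).items()}


def cosine_distance(p, q):
    """1 - cosine similarity between two non-empty frequency profiles."""
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in set(p) | set(q))
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return 1.0 - dot / (norm_p * norm_q)


english = relative_freqs(to_concepts("the house by the sea at night".split(), "en"))
spanish_close = relative_freqs(to_concepts("la casa junto al mar de noche".split(), "es"))
spanish_far = relative_freqs(to_concepts("amor amor noche".split(), "es"))

d_close = cosine_distance(english, spanish_close)
d_far = cosine_distance(english, spanish_far)
print(d_close, d_far)
```

In this toy setting the English sentence and its Spanish counterpart map to the same concepts, so their distance is close to zero, while the second Spanish text is further away. Stylo offers its own distance measures and clustering methods; cosine distance is used here only because it is short to write down.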
Some results: