The Turkish "de/da" clitic is a conjunction meaning "as well", "too", or "also", and in that role it is always written as a separate word. The homonymous "-de/-da" is a locative suffix meaning "at" or "in" and is attached to the word: for example, "araba" (car) with the suffix "-da" becomes "arabada" (in the car). Because the two forms look identical, the conjunction is commonly confused with the locative suffix and incorrectly attached to the preceding word. This project focuses on this common class of Turkish spelling errors, namely the spelling of the "de/da" and "ki" clitics.
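To make the ambiguity concrete, here is a minimal sketch that only locates the tokens a checker would have to look at. It is not the model used in this project: the function name `find_de_da_candidates` is hypothetical, and the sketch uses plain string matching, ignores the "te/ta" variants, and cannot decide whether an attached "de/da" is a legitimate locative suffix or a wrongly joined conjunction.

```python
import re


def find_de_da_candidates(sentence):
    """Return (token, kind) pairs for tokens that involve "de"/"da".

    kind is "separate" for a standalone de/da token (conjunction usage)
    and "attached" for a longer word ending in de/da. An attached form
    may be a correct locative suffix or a misspelled conjunction; this
    illustrative sketch does not try to tell them apart.
    """
    candidates = []
    # \w+ is Unicode-aware for str in Python 3, so Turkish letters are kept.
    for token in re.findall(r"\w+", sentence.lower()):
        if token in ("de", "da"):
            candidates.append((token, "separate"))
        elif len(token) > 2 and token.endswith(("de", "da")):
            candidates.append((token, "attached"))
    return candidates


if __name__ == "__main__":
    # "Ben de" = "me too" (conjunction); "arabada" = "in the car" (locative).
    print(find_de_da_candidates("Ben de arabada kaldım."))
    # "Bende" could be a locative ("on me") or a misspelling of "Ben de".
    print(find_de_da_candidates("Bende geldim."))
```

The second sentence shows why simple rules are not enough: "bende" is flagged as attached, but only the context can tell whether it should have been written "ben de".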
A detailed explanation of the project is available in the project document.
git clone https://github.com/asumansaree/TurkishSpellChecker
cd TurkishSpellChecker
To test "de/da" separation:
cd Data/
Edit test_sentences_de.txt (or the corresponding test_sentences file for another model) and save it
cd ../Test
python3 test_for_de_separation.py
The output is printed to the terminal (or, if you are using Colab, directly below the cell)
For any problems or questions, contact me at asumansaree@gmail.com
- "Detecting Clitics Related Orthographic Errors in Turkish", Proceedings of Recent Advances in Natural Language Processing, pages 71–76, Varna, Bulgaria, Sep 2–4, 2019.