TurkishSpellChecker

Estimating Correct Writing of "de/da" and "ki" Clitics for Turkish: Conjunction or Locative Suffixes

Primary language: Jupyter Notebook. License: MIT.



🌟 Project Description

In Turkish, the clitic "de/da" is a conjunction when it is written as a separate word, and it has the same meaning as "as well", "too", or "also" in English. The homonymous forms "-de" and "-da" are also used as locative suffixes meaning "at" or "in". For example, the word "araba" (car) with the suffix "-da" ("arabada") means "in the car". Although the conjunction "de/da" must always be written separately, it is commonly confused with the locative suffix and incorrectly attached to the preceding word. This project focuses on this common class of spelling errors in Turkish, namely the spelling of the "de/da" and "ki" clitics.
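To make the ambiguity concrete, here is a minimal sketch in Python of how the tokens that need a decision could be located in a sentence. It is only an illustration, not the model used in this project: the function name `find_candidates` and the labels `"separate"` and `"attached"` are hypothetical, and the sketch only finds candidates without deciding whether their spelling is correct.

```python
import re

# Surface forms discussed above: the conjunction and the locative suffix share them.
FORMS = ("de", "da")


def find_candidates(sentence):
    """Return (index, token, kind) for every token whose de/da spelling is ambiguous.

    kind is "separate" when de/da stands alone after another word (written as a
    conjunction) and "attached" when a longer word ends in de/da (written as a suffix).
    Hypothetical helper for illustration only; it does not judge correctness.
    """
    tokens = re.findall(r"\w+", sentence.lower())
    candidates = []
    for i, tok in enumerate(tokens):
        if tok in FORMS and i > 0:
            candidates.append((i, tok, "separate"))
        elif len(tok) > 2 and tok.endswith(FORMS):
            candidates.append((i, tok, "attached"))
    return candidates


# "Ben de arabada kaldım" ("I, too, stayed in the car")
print(find_candidates("Ben de arabada kaldım"))
# [(1, 'de', 'separate'), (2, 'arabada', 'attached')]
```

Both candidates in the example are spelled correctly. A classifier such as the one described in this project would have to use the surrounding context to tell a correctly attached locative suffix ("arabada") from a conjunction that was wrongly attached to the preceding word.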

A detailed explanation of the project is available in the project document.

📜 Installation

git clone https://github.com/asumansaree/TurkishSpellChecker
cd TurkishSpellChecker

🤗 Usage

For "de/da" separation testing

cd Data/

Edit test_sentences_de.txt (or the corresponding test_sentences file for one of the other models) and save it

cd ../Test
python3 test_for_de_separation.py

🚀 Sample Output

The output is printed to the terminal (or, if you are using Colab, directly in the notebook).

💬 Contact

For any problems or questions, contact me at asumansaree@gmail.com

๐Ÿ™ References

  • "Detecting Clitics Related Orthographic Errors in Turkish", Proceedings of Recent Advances in Natural Language Processing, pages 71–76, Varna, Bulgaria, Sep 2–4, 2019.