

Awesome Kurdish

(last updated on 12/06/2021)

A curated list of awesome resources, tools and scientific papers for Kurdish language technology

I do my best to keep this page comprehensive, but the list may still miss some of the fantastic projects, small and big, on Kurdish language processing. If you know of one that is missing, please let me know by email or through our community on Gitter.

Are you interested in contributing to Kurdish language processing? Check out this post to see how you can do so.

Development

Resources

Corpora

Parallel corpora

Dictionaries, terminologies and ontologies

Check out a comprehensive list of Kurdish dictionaries and beware of copyright issues in the following projects:

Datasets

Other resources

Word Embeddings:

Tools

Fundamental processing

Machine translation

Named-entity recognition

Libraries

Other

In addition to these, you can find further information in other repositories and pages as follows:

Research

The references below are based on the data collected for the paper KLPT – Kurdish Language Processing Toolkit. Each row is identified by its citation key; the full entries are in the bibliography file.

| Reference | Year | Field | Dialects |
|---|---|---|---|
| esmaili2013sorani | 2013 | Dialectology | Sorani, Kurmanji |
| hassani2016automatic | 2016 | Dialectology | Sorani, Kurmanji |
| malmasi2016subdialectal | 2016 | Dialectology | Sorani |
| al2017kurdish | 2017 | Dialectology | Sorani, Kurmanji, Gorani |
| amani:hal-03262435 | 2021 | Dialectology | Kurdish, Zazaki & Gorani |
| mohammed2012automatic | 2012 | Information retrieval and Text mining | Sorani |
| esmaili2012challenges | 2012 | Information retrieval and Text mining | Sorani |
| littell2016named | 2016 | Information retrieval and Text mining | Sorani |
| hassani2017method | 2017 | Information retrieval and Text mining | Sorani, Kurmanji |
| esmaili2014towards | 2014 | Information retrieval and Text mining | Sorani, Kurmanji |
| jaf2016simple | 2016 | Information retrieval and Text mining | Sorani |
| rashid2017robust | 2017 | Information retrieval and Text mining | Sorani |
| rashid2017automatic | 2017 | Information retrieval and Text mining | Sorani |
| saeed2018improving | 2018 | Information retrieval and Text mining | Sorani |
| mustafa2018kurdish | 2018 | Information retrieval and Text mining | Sorani |
| saeed2018evaluation | 2018 | Information retrieval and Text mining | Sorani |
| ahmadi2019wergor | 2019 | Information retrieval and Text mining | Sorani |
| mahmudi2021automated | 2021 | Information retrieval and Text mining | Sorani |
| esmaili2013building | 2013 | Lexical resources | Sorani |
| aliabadi2014towards | 2014 | Lexical resources | Sorani |
| aliabadi2014semi | 2014 | Lexical resources | Sorani |
| ataman2018bianet | 2018 | Lexical resources | Kurmanji |
| ahmadi2019towards | 2019 | Lexical resources | Sorani, Kurmanji, Gorani |
| abdulrahman2019developing | 2019 | Lexical resources | Sorani |
| abdulrahman2020using | 2020 | Lexical resources | Sorani |
| veisi2020toward | 2020 | Lexical resources | Sorani |
| ahmadi2020corpus | 2020 | Lexical resources | Sorani |
| ahmadi-2020-building | 2020 | Lexical resources | Zaza, Gorani |
| ahmadi2020leveraging | 2020 | Lexical resources | Sorani |
| veisi2021jira | 2021 | Lexical resources | Sorani |
| hassani2017kurdish | 2017 | Machine Translation | Sorani, Kurmanji |
| kaka2018english | 2018 | Machine Translation | Sorani |
| ahmadi2020machine | 2020 | Machine Translation | Sorani |
| goyal2021flores | 2021 | Machine Translation | 101 languages incl. Sorani |
| amini2021central | 2021 | Machine Translation | Sorani |
| baban1995programmable | 1995 | Morphological and syntactic analysis | Sorani |
| walther2010developing | 2010 | Morphological and syntactic analysis | Sorani |
| walther2010fast | 2010 | Morphological and syntactic analysis | Kurmanji |
| salavati2013stemming | 2013 | Morphological and syntactic analysis | Sorani |
| jaf2014stemmer | 2014 | Morphological and syntactic analysis | Sorani |
| jaf2016chapter | 2016 | Morphological and syntactic analysis | Sorani |
| gokirmak2017dependency | 2017 | Morphological and syntactic analysis | Kurmanji |
| salavati2018building | 2018 | Morphological and syntactic analysis | Sorani |
| mustafa2018kurdish | 2018 | Morphological and syntactic analysis | Sorani |
| ahmadi2020towards | 2020 | Morphological and syntactic analysis | Sorani |
| ahmadi-2020-tokenization | 2020 | Morphological and syntactic analysis | Sorani, Kurmanji |
| mohammed2012uniqueness | 2012 | Optical character recognition | Sorani |
| mohammed2013handwritten | 2013 | Optical character recognition | Sorani |
| shaltookisentiment | 2016 | Optical character recognition | Sorani |
| zarro2017recognition | 2017 | Optical character recognition | Sorani |
| yaseen2018kurdish | 2018 | Optical character recognition | Sorani |
| dinler2018kurdish | 2018 | Optical character recognition | Sorani |
| kaka2017building | 2017 | Other | Sorani |
| mahmudi2021automatic | 2021 | Other | Sorani |
| hashim2018kurdish | 2018 | Sign language recognition | Sorani |
| kamal-hassani-2020-towards | 2020 | Sign language recognition | Sorani |
| daneshfar2009implementation | 2009 | Speech recognition | Sorani |
| barkhoda2009comparison | 2009 | Speech recognition | Sorani |
| bahrampour2009implementation | 2009 | Speech recognition | Sorani |
| hassani2011kurdish | 2011 | Speech recognition | Sorani |
| dinler2017formant | 2017 | Speech recognition | Kurmanji |
| dinler2018extraction | 2018 | Speech recognition | Sorani, Kurmanji |
| qader2019kurdish | 2019 | Speech recognition | Sorani |
| ahmadi-2020-klpt | 2020 | Toolkits | Sorani, Kurmanji |
| de2021multilingual | 2021 | Named-entity recognition | Kurmanji |

Cite this repository

If you find the provided data useful for your project, feel free to use it, and please cite the following paper:

@inproceedings{ahmadi-2020-klpt,
    title = "{KLPT} {--} {K}urdish Language Processing Toolkit",
    author = "Ahmadi, Sina",
    booktitle = "Proceedings of Second Workshop for NLP Open Source Software (NLP-OSS)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.nlposs-1.11",
    doi = "10.18653/v1/2020.nlposs-1.11",
    pages = "72--84"
}
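
As a minimal sketch of how the entry above might be used, the LaTeX document below cites it by its key, `ahmadi-2020-klpt`. It assumes the entry has been saved to a file named `references.bib` next to the document. That file name is hypothetical and is not taken from this repository, and the `plain` bibliography style is only one possible choice.

```latex
\documentclass{article}

\begin{document}

% Cite the KLPT paper through the key of the BibTeX entry above.
Our experiments rely on the Kurdish Language Processing
Toolkit~\cite{ahmadi-2020-klpt}.

% `references.bib` is an assumed file name that holds the entry above.
\bibliographystyle{plain}
\bibliography{references}

\end{document}
```

Compiling with `pdflatex`, then `bibtex`, then `pdflatex` twice more should resolve the citation.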