Amharic-NLP-Tools-in-JAVA

This repository contains implementations of various Natural Language Processing (NLP) tasks and tools specifically for the Amharic language using Java. The goal is to provide a comprehensive set of tools to facilitate NLP research and development for Amharic.

Primary language: Java. License: MIT.

List of Amharic text preprocessing tasks:

  • Tokenization
  • Sentence Segmentation
  • Sentence boundary detection
  • Normalization
      • Character normalization
      • Abbreviation substitution
      • Removal of strange characters, words, and symbols
      • Removal of emojis
      • Removal of emoticons
      • Removal of punctuation
      • Conversion of emoticons to words
      • Conversion of emojis to words
  • Stop Word Removal
  • Noun phrase chunking
  • Lemmatization
  • Stemming
  • Named Entity Recognition
  • POS Tagging
  • Word-sense disambiguation
  • Co-reference resolution
  • Entity linking
  • Terminology extraction
  • Discourse parsing
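
To make the first few tasks concrete, here is a minimal Java sketch of character normalization, sentence segmentation, and tokenization. It is an illustration under stated assumptions, not necessarily how this repository implements them: the class and method names are hypothetical, and the homophone mapping (ሐ/ኀ to ሀ, ሠ to ሰ, ዐ to አ, ፀ to ጸ) is one commonly used convention that may differ from the rules used here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; names are illustrative and not taken from this repository.
class AmharicPreprocessSketch {

    // Assumed homophone mapping: {variant series base, canonical series base}.
    // Each series covers the seven basic orders (offsets 0 to 6).
    private static final int[][] HOMOPHONE_SERIES = {
        {0x1210, 0x1200}, // ሐ series -> ሀ series
        {0x1280, 0x1200}, // ኀ series -> ሀ series
        {0x1220, 0x1230}, // ሠ series -> ሰ series
        {0x12D0, 0x12A0}, // ዐ series -> አ series
        {0x1340, 0x1338}, // ፀ series -> ጸ series
    };

    // Character normalization: replace each variant letter with its canonical form.
    static String normalizeCharacters(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            for (int[] pair : HOMOPHONE_SERIES) {
                int offset = c - pair[0];
                if (offset >= 0 && offset <= 6) {
                    c = (char) (pair[1] + offset);
                    break;
                }
            }
            out.append(c);
        }
        return out.toString();
    }

    // Sentence segmentation: split after the Ethiopic full stop (U+1362),
    // the Ethiopic question mark (U+1367), or ASCII '!' and '?'.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        for (String part : text.split("(?<=[\u1362\u1367!?])")) {
            String s = part.trim();
            if (!s.isEmpty()) {
                sentences.add(s);
            }
        }
        return sentences;
    }

    // Tokenization: split on whitespace, Ethiopic punctuation (U+1361 to U+1368),
    // and ASCII punctuation. Punctuation marks are discarded, not kept as tokens.
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        for (String t : sentence.split("[\\s\u1361-\u1368\\p{Punct}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "ሐሳብ። ሰላም፡ዓለም።" written with Unicode escapes to avoid source-encoding issues.
        String raw = "\u1210\u1233\u1265\u1362 \u1230\u120B\u121D\u1361\u12D3\u1208\u121D\u1362";
        for (String sentence : splitSentences(normalizeCharacters(raw))) {
            System.out.println(tokenize(sentence));
        }
    }
}
```

Normalizing before segmenting keeps later steps (stop word removal, stemming, lookup in lexicons) from treating spelling variants of the same word as different tokens. A real tokenizer would likely need extra rules for abbreviations, numbers, and mixed-script text, which this sketch does not handle.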