- Tokenization
- Sentence Segmentation
- Sentence boundary detection
- Normalization
  - Character normalization
  - Abbreviation substitution
  - Removal of strange characters, words, and symbols
  - Removal of emojis
  - Removal of emoticons
  - Removal of punctuation
  - Conversion of emoticons to words
  - Conversion of emojis to words
- Stop-word removal
- Noun phrase chunking
- Lemmatization
- Stemming
- Named Entity Recognition
- POS Tagging
- Word-sense disambiguation
- Co-reference resolution
- Entity linking
- Terminology extraction
- Discourse parsing
Bushra-KB/Amharic-NLP-Tools-in-JAVA
This repository contains implementations of various Natural Language Processing (NLP) tasks and tools specifically for the Amharic language using Java. The goal is to provide a comprehensive set of tools to facilitate NLP research and development for Amharic.
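As a rough illustration of three of the listed tasks (character normalization, sentence segmentation, and tokenization), here is a minimal sketch in plain Java. It is not taken from this repository: the class name `AmharicPreprocessSketch`, its method names, and the homophone mapping are assumptions made for the example. The mapping folds a few Ethiopic homophone series onto one canonical series by code point offset, which is one common convention and may differ from what a given tool does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only; not part of the repository.
public class AmharicPreprocessSketch {

    // {variant base, canonical base} for first-order Ethiopic syllables.
    // Assumption: only the first 7 orders of each series are folded.
    private static final int[][] HOMOPHONE_BASES = {
        {0x1210, 0x1200}, // HHA series -> HA series
        {0x1280, 0x1200}, // XA series  -> HA series
        {0x1220, 0x1230}, // SZA series -> SA series
        {0x12D0, 0x12A0}, // pharyngeal A series -> glottal A series
        {0x1340, 0x1338}, // TZA series -> TSA series
    };

    // Sentence terminators: Ethiopic full stop (U+1362),
    // Ethiopic question mark (U+1367), and ASCII '?' and '!'.
    private static final String SENTENCE_END = "\u1362\u1367?!";

    // Token delimiters: whitespace, Ethiopic punctuation U+1361..U+1368,
    // and ASCII punctuation.
    private static final String TOKEN_DELIMITERS = "[\\s\u1361-\u1368\\p{Punct}]+";

    /** Replaces homophone variants with their canonical counterparts. */
    static String normalize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            for (int[] pair : HOMOPHONE_BASES) {
                int offset = c - pair[0];
                if (offset >= 0 && offset < 7) {
                    c = (char) (pair[1] + offset);
                    break;
                }
            }
            out.append(c);
        }
        return out.toString();
    }

    /** Splits text into sentences, keeping the terminator with each sentence. */
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            current.append(c);
            if (SENTENCE_END.indexOf(c) >= 0) {
                addIfNotBlank(sentences, current);
            }
        }
        addIfNotBlank(sentences, current); // trailing text without a terminator
        return sentences;
    }

    /** Splits a sentence into word tokens, dropping punctuation. */
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        for (String piece : sentence.split(TOKEN_DELIMITERS)) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    private static void addIfNotBlank(List<String> target, StringBuilder buffer) {
        String s = buffer.toString().trim();
        if (!s.isEmpty()) {
            target.add(s);
        }
        buffer.setLength(0);
    }

    public static void main(String[] args) {
        String text = "ሠላም ዓለም። እንዴት ነህ፧";
        for (String sentence : splitSentences(normalize(text))) {
            System.out.println(sentence + " -> " + tokenize(sentence));
        }
    }
}
```

Normalizing before segmentation keeps later steps such as stop-word removal or stemming from treating spelling variants of the same word as different tokens. A real pipeline would likely need a fuller mapping and handling for abbreviations and numerals.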
Language: Java. License: MIT.