bangla-nlp

There are 72 repositories under the bangla-nlp topic.

  • sagorbrur/bnlp

    BNLP is a natural language processing toolkit for the Bengali language.

    Language: Jupyter Notebook
  • csebuetnlp/banglabert

    This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla", accepted in Findings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: NAACL-2022.

    Language: Python
  • csebuetnlp/banglanmt

    This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16 - November 20, 2020.

    Language: Python
  • menon92/DL-Sneak-Peek

    Deep learning Bangla resources with TensorFlow

    Language: Jupyter Notebook
  • csebuetnlp/BanglaNLG

    This repository contains the official release of the model "BanglaT5" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaNLG: Benchmarks and Resources for Evaluating Low-Resource Natural Language Generation in Bangla".

    Language: Python
  • sagorbrur/bangla-bert

    Bangla-Bert is a pretrained BERT model for the Bengali language.

    Language: Jupyter Notebook
  • lifeparticle/Bengali-Alphabet

    ✍️ Bengali Alphabet (বাংলা বর্ণমালা)

    Language: JavaScript
  • banglanlp/bnlp-resources

    Awesome datasets for Bangla language computing.

    Language: Python
  • menon92/BangalASR

    Transformer based Bangla Speech Recognition

    Language: Jupyter Notebook
  • menon92/BanglaTranslator

    Bangla Machine Translator

    Language: Python
  • Sigmakib2/Nirmol

    Nirmol is an open-source dataset and API for detecting Bangla slang words. Detect offensive/bad/slang words in Bangla/Bengali/Banglish sentences. A helpful API and dataset for developers and researchers.

    Language: JavaScript
  • asraf-patoary/bnltk

    BNLTK (Bangla Natural Language Processing Toolkit): a Python package for NLP in Bangla

    Language: Python
  • zabir-nabil/bangla-news-rnn

    Bangla news classification and generation

    Language: Jupyter Notebook
  • MahirMahbub/Contextual-Spell-Checker-For-Bangla

    Automatic Context-Sensitive Spelling Correction for Bangla Text Using BERT and Levenshtein Distance

    Language: Python
  • csebuetnlp/banglaparaphrase

    This repository contains the code, data, and associated models of the paper titled "BanglaParaphrase: A High-Quality Bangla Paraphrase Dataset", accepted in Proceedings of the Asia-Pacific Chapter of the Association for Computational Linguistics: AACL 2022.

    Language: Python
  • KSMubasshir/bd-newspaper-crawlers

    A collection of Bangla newspaper and blog crawlers. Can be used to mine Bangla text data for natural language processing tasks.

    Language: Python
  • menon92/Bangla-Word2Vec

    Bangla word2vec using the skip-gram approach

    Language: Jupyter Notebook
  • Foysal87/bn_nlp

    Bangla NLP toolkit.

    Language: Python
  • Ayubur/bangla-sentiment-analysis-datasets

    Various Bangla datasets for sentiment analysis of Bangla text

  • Botbang/avro-bangla-autocorrect-dictionary-enriched

    The default autocorrect dictionary included in the Avro Bangla keyboard doesn't contain enough words, so this is my approach to enriching the dictionary. This file contains the correct spelling of commonly used Bangla words.

  • aparnadutta/code-mixed-lid

    Word-level language identification for Bangla-English code-mixed social media data, using a BiLSTM with subword embeddings.

    Language: Python
  • AsifulNobel/Metsys

    Chatbot Solution for Resource-Poor Languages. Contains code and data for Journal Article 'Focused domain contextual AI chatbot framework for resource poor languages'.

    Language: Python
  • asiff00/bangla-pdf-ocr

    Bangla PDF to text converter that works on Windows, macOS, and Linux without any extra downloads or configurations.

    Language: Python
  • Foysal87/NLP-colab-trainer

    A collection of Colab trainers for NLP tasks.

  • MISabic/NER-Bangla-Dataset

    Dataset for Bangla named entity recognition

  • ayan-cs/bangla-ocr-transformer

    Implementation of the paper 'Towards Full page Offline Bangla Handwritten Text Recognition using Image-to-Sequence Architecture'. For details, please read the README section.

    Language: Python
  • csebuetnlp/BanglaSocialBias

    This is the official repository containing all the code used to generate the results reported in the paper titled "Social Bias in Large Language Models For Bangla: An Empirical Study on Gender and Religious Bias".

    Language: Jupyter Notebook
  • ambideXtrous9/Bengali-News-Summarization-BanglaT5

    Bengali News Summarization - BengaliGPT & T5

    Language: Python
  • BengaliAI/bengaliAnalyzer

    This module helps analyze Bengali sentences. It can analyze various entities, perform non-contextual PoS tagging, and return the lemmas present in a sentence.

    Language: Jupyter Notebook
  • Kabir5296/banglanlptoolkit

    Bangla NLP toolkit: Bangla text normalization, punctuation generation, and augmentation for Bangla NLP tasks. This project is also available on PyPI.

    Language: Python
  • rafayetrafi/BanglaMusicStylo-A-Stylometric-Dataset-of-Bangla-Music-Lyrics

    With the rapid growth of the Bangla music industry, a huge volume of Bangla songs is produced every day. An immense number of producers, lyricists, singers, and artists are involved in producing songs across different genres. Among the many genres of Bangla music, classical, folk, baul, modern music, Rabindra Sangeet, Nazrul Geeti, film music, rock music, and fusion music have gained the highest popularity. Lyricists try to express their feelings and views on a situation or subject through their writing, so each lyricist has their own dictionary of thoughts to put into music lyrics. In this paper, we present “BanglaMusicStylo”, the very first stylometric dataset of Bangla music lyrics. We have collected 2824 Bangla song lyrics by 211 lyricists in digital form. All the lyrics are stored in text format for further use. This dataset could be used for stylometric analysis such as authorship attribution, linguistic forensics, gender identification from textual data, Bangla music genre classification, vandalism detection, and emotion classification. Having identified the significant research opportunities in this area, we have formalized this dataset for use in stylometric analysis.

  • Mufassir-Chowdhury/BnPC

    This is the official repository of the paper titled "BnPC: A Gold Standard Corpus for Paraphrase Detection in Bangla, and its Evaluation", accepted at the 17th Workshop on Building and Using Comparable Corpora (BUCC 2024), co-located with LREC-COLING 2024. It contains the code and the dataset.

    Language: Jupyter Notebook
  • ShawonAshraf/bangla-nlp-tutorial

    Code repository for a series of articles on natural language processing, written in Bangla

    Language: Jupyter Notebook
  • alvi-khan/Bangla-HealthNER

    The data and code of 'NERvous About My Health: Constructing a Bengali Medical Named Entity Recognition Dataset', published in Findings of the Association for Computational Linguistics: EMNLP 2023.

    Language: Python
  • csebuetnlp/BanglaContextualBias

    This is the official repository containing all the code used to generate the results reported in the paper titled "An Empirical Study on the Characteristics of Bias upon Context Length Variation for Bangla", accepted in Findings of the Association for Computational Linguistics: ACL 2024.

    Language: Jupyter Notebook