nlproc
There are 55 repositories under the nlproc topic.
huggingface/knockknock
🚪✊Knock Knock: Get notified when your training ends with only two additional lines of code
MIND-Lab/OCTIS
OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at the EACL 2021 demo track)
jina-ai/agentchain
Chain together LLMs for reasoning & orchestrate multiple large models for accomplishing complex tasks
feralvam/easse
Easier Automatic Sentence Simplification Evaluation
KennethEnevoldsen/augmenty
Augmenty is an augmentation library based on spaCy for augmenting texts.
ahmedbesbes/media-agent
Scrape data from social media and chat with it using Langchain
majumderb/recipe-personalization
EMNLP 2019: Generating Personalized Recipes from Historical User Preferences
kasnerz/reffix
A tool for fixing a BibTeX reference list using DBLP API
UBC-NLP/turjuman
TURJUMAN, a neural toolkit for translating from 20 languages into Modern Standard Arabic (MSA).
StatguyUser/TextFeatureSelection
Python library for feature selection on text features. It provides a filter method, a genetic algorithm, and TextFeatureSelectionEnsemble for improving text classification models.
HendrikStrobelt/LMdiff
A diff tool for language models
thunlp/HiddenKiller
Code and data of the ACL-IJCNLP 2021 paper "Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger"
ahmedbesbes/keywords-extractor-with-bert
A Streamlit app to extract keywords using KeyBERT
yoseflaw/nerindo
Named Entity Recognition with BiLSTM, CRF, and Attention-based models implemented in PyTorch for Indonesian News.
ahmedbesbes/anonymizer
Text anonymization app with Streamlit and spaCy
RichardLitt/thesis
My thesis on "Open Source Code and Low Resource Languages" for an MSc in Language Science and Technology at Saarland University
ahmedbesbes/multi-label-sentiment-classifier
How to build a multi-label sentiment classifier with Tez and PyTorch
coastalcph/lexlms
LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development
thunlp/BkdAtk-LWS
Code and data of the ACL 2021 paper "Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution"
CharlyWargnier/S4_wiki_topic_grapher
Leverage the power of the Google Natural Language API to retrieve entity relationships from Wikipedia URLs or topics! Get interactive NetworkX graphs of connected entities!
Yangyi-Chen/MAYA
Code base for the EMNLP 2021 paper, "Multi-granularity Textual Adversarial Attack with Behavior Cloning".
michelecafagna26/cider
Pythonic wrappers for the CIDEr and CIDEr-D evaluation metrics. Provides CIDEr as well as CIDEr-D (CIDEr Defended), which is more robust to gaming effects. We also add the option to replace the original PTBTokenizer with the spaCy tokenizer (no Java dependency, but slower).
Lingwars/GAPLEN
Natural Language Processing Learning Group (Grupo de Aprendizaje de Procesamiento del Lenguaje Natural), launched by Lingwars
vgtomahawk/Charmanteau-CamReady
Code for "CharManteau: Character Embedding Models For Portmanteau Creation" (EMNLP 2017). Varun Gangal*, Harsh Jhamtani*, Graham Neubig, Eduard Hovy, Eric Nyberg
bubblspace/AIOne
MLOne Powered by AIEdX. Machine Learning Course for Everyone. Tier1 Basic
elleros/DSHealth2019_loinc_embeddings
Code and Word2Vec embeddings of LOINC codes for KDD 2019 DSHealth paper "Evaluation of Embeddings of Laboratory Test Codes for Patients at a Cancer Center": https://arxiv.org/abs/1907.09600
gsarti/svevo-letters-analysis
Topic Modeling and Sentiment Analysis on the Italo Svevo Epistolary Corpus
dsfsi/vukuzenzele-nlp
The dataset contains editions of the South African government magazine Vuk'uzenzele. Data was scraped from PDFs that have been placed in the data/raw folder. The PDFs were obtained from the Vuk'uzenzele website.
jmacwan/POSPair
Simplifying representation for Natural Language Processing
DemoVersion/nlp_common_codes
Some of my code for Natural Language Processing
dsfsi/gov-za-multilingual
The dataset contains cabinet statements from the South African government. Data was scraped from the government's website: https://www.gov.za/cabinet-statements
dsfsi/PuoBERTa
A RoBERTa-based language model specially designed for Setswana, using the new PuoData dataset.
diyclassics/latincy-book
An always-a-work-in-progress combination of documentation and demo notebooks for working with the LatinCy models
dsfsi/za-mavito
DSFSI South African Terminology Lists and Lexicon Project
jplasser/CNEP
CNEP (Contrastive Notes Events Pre-training), Contrastive Learning with Clinical Notes and Events Data Pre-training from MIMIC-III
mhmdsabry/BERT_with_Residual_vs_Highway
Comparing the residual stream and the highway stream in transformers (BERT).