This repository houses the Natural Language Processing (NLP) projects I have completed (other than those built with Spark & Databricks).
To view and use the models, head over to my Hugging Face portfolio: huggingface.co/DunnBC22
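
The fine-tuned models listed below can typically be loaded straight from the Hub with the Hugging Face Transformers `pipeline` API. A minimal sketch for one of the text-classification checkpoints (the repo name is a placeholder, not an actual checkpoint; substitute any model ID from the portfolio):

```python
from transformers import pipeline

# Placeholder repo name; substitute any text-classification checkpoint from huggingface.co/DunnBC22.
classifier = pipeline("text-classification", model="DunnBC22/<model-repo-name>")

# Label names and scores depend on the chosen checkpoint.
print(classifier("This phone exceeded my expectations."))
```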
## Multiclass Classification

Project Name | Model Checkpoint | Accuracy | Macro F1 Score | Macro Precision | Macro Recall |
---|---|---|---|---|---|
Apple iPhone SE Reviews* | bert-base-uncased | 0.9712 | 0.9561 | 0.9538 | 0.9598 |
Apple iPhone SE Reviews* | microsoft/mpnet-base | 0.9460 | 0.7242 | 0.7007 | 0.7594 |
CNN News Articles | distilbert-base-uncased | 0.9643 | 0.9640 | - | - |
Hate & Offensive Speech* | bert-base-uncased | 0.9213 | 0.9161 | 0.9241 | 0.9144 |
Hate & Offensive Speech* | bert-large-uncased | 0.9869 | 0.9863 | 0.987 | 0.9857 |
Hate & Offensive Speech* | distilbert-base-uncased | 0.9607 | 0.9592 | 0.9613 | 0.9579 |
Hate & Offensive Speech* | diptanu/fBERT | 0.9607 | 0.9581 | 0.9596 | 0.9571 |
Hate & Offensive Speech* | GroNLP/hateBERT | 0.941 | 0.9351 | 0.951 | 0.9273 |
Password Strength | microsoft/codebert-base | 0.9975 | 0.9963 | 0.9948 | 0.9978 |
Malicious URLs | microsoft/codebert-base | 0.7279 | 0.4611 | 0.5436 | 0.4422 |
Malicious URLs | microsoft/codebert-base-mlm | 0.7322 | 0.4303 | 0.6034 | 0.4233 |
Malicious URLs | microsoft/deberta-base-mnli | 0.7353 | 0.4533 | 0.5684 | 0.4315 |
Malicious URLs (Using PEFT) | roberta-large | 0.7160 | 0.4374 | 0.5237 | 0.4190 |
Malicious URLs | albert-base-v2 | 0.7267 | 0.4521 | 0.5508 | 0.4294 |
## Binary Classification

Project Name | Transformer Checkpoint | Accuracy | F1 Score | Precision | Recall |
---|---|---|---|---|---|
Malignant Comments - BERT-Base* | bert-base-uncased | 0.972 | 0.759 | 0.6918 | 0.8406 |
Malignant Comments - I-BERT* | kssteven/ibert-roberta-base | 0.9741 | 0.7773 | 0.7084 | 0.861 |
Mental Health Classification | google/canine-c | 0.9226 | 0.9096 | 0.9113 | 0.9079 |
OnionOrNot | distilbert-base-uncased | 0.9224 | 0.9218 | - | - |
Spam Filter (Larger Dataset) | distilbert-base-uncased | 0.9845 | 0.9848 | - | - |
Spam Filter (Smaller Dataset) | distilbert-base-uncased | 0.9907 | 0.9906 | - | - |
Tweet About a Disaster or Not? - ALBERT* | albert-base-v2 | 0.9138 | 0.7752 | 0.8204 | 0.7348 |
Tweet About a Disaster or Not? - DeBERTa* | microsoft/deberta-v3-small | 0.9050 | 0.7453 | 0.7453 | 0.7453 |
Tweet About a Disaster or Not? - DistilBERT* | distilbert-base-uncased | 0.9138 | 0.7752 | 0.8204 | 0.7348 |
Tweet About a Disaster or Not? - ERNIE* | nghuyong/ernie-2.0-base-en | 0.9156 | 0.7876 | 0.8436 | 0.7386 |
Tweet About a Disaster or Not? - ELECTRA* | bhadresh-savani/electra-base-emotion | 0.8857 | 0.7246 | 0.7991 | 0.6628 |
Tweet About a Disaster or Not? - RoBERTa* | roberta-base | 0.8989 | 0.7569 | 0.8211 | 0.7020 |
## Multilabel Classification

Project Name | Model Checkpoint | Subset Accuracy | F1 Score | ROC-AUC |
---|---|---|---|---|
Go Emotions | distilbert-base-uncased | 0.2184 | 0.3328 | 0.6102 |
Research Articles | distilbert-base-uncased | 0.6977 | 0.8395 | 0.8909 |
Review Sentiments (with DistilBert) | distilbert-base-uncased | 0.5787 | 0.8697 | 0.9107 |
Review Sentiments (with Bert) | bert-base-uncased | 0.5967 | 0.8737 | 0.9146 |
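
Unlike the single-label tasks above, a multilabel head scores every label independently with a sigmoid, and subset accuracy only counts a prediction as correct when all labels match, which is why it sits well below the F1 scores. A minimal inference sketch, assuming a placeholder repo name:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder repo name; substitute one of the multilabel checkpoints above.
checkpoint = "DunnBC22/<multilabel-model-repo>"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

inputs = tokenizer("I am thrilled and a little nervous.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Each label gets an independent sigmoid probability; 0.5 is a common threshold.
probs = torch.sigmoid(logits)[0]
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > 0.5]
print(predicted)
```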
## Token Classification

Project Name | Overall Accuracy | Overall F1 Score | Overall Precision | Overall Recall | Multilingual? |
---|---|---|---|---|---|
Babelscape WikiNeural Joined Dataset | 0.994704 | 0.995886 | 0.995711 | 0.996060 | Yes |
BC2GM-IOB (EMBO-BLURB) | 0.9736 | 0.7765 | 0.7521 | 0.8025 | No |
EMBO-BLURB with LoRA | 0.9584 | 0.8136 | 0.7999 | 0.8278 | No |
DFKI-SLT/few-nerd | 0.9498 | 0.8041 | 0.8203 | 0.7886 | No |
NCBI Disease | 0.9825 | 0.8359 | 0.8064 | 0.8677 | No |
TNER Bio NLP 2004 | 0.9367 | 0.7169 | 0.6628 | 0.7805 | No |
Stromberg NLP - Twitter (SeqEval) | 0.9860 | 0.9824 | 0.9828 | 0.9820 | No |
Stromberg NLP - Twitter PoS_v2 | 0.9853 | 0.8931 | 0.9296 | 0.8931 | No |
Stromberg NLP - Twitter PoS (SqueezeBERT Transformer) | 0.9771 | 0.7765 | 0.8046 | 0.7785 | No |
WikiNeural - BERT-Base | 0.9912 | 0.9145 | 0.9380 | 0.9261 | No |
WikiNeural - Amazon's BORT | 0.9709 | 0.7050 | 0.7868 | 0.7437 | No |
WikiNeural - FNet-Base | 0.8521 | 0.8934 | 0.8722 | 0.9853 | No |
WikiNeural - Funnel Transformer | 0.9856 | 0.8722 | 0.9102 | 0.8908 | No |
WikiNeural - I-BERT-Base | 0.9909 | 0.9107 | 0.9360 | 0.9232 | No |
WikiNeural - MEGA-Base | 0.9619 | 0.6312 | 0.7324 | 0.6781 | No |
WikiNeural - RoBERTa-Base | 0.9910 | 0.9124 | 0.9352 | 0.9237 | No |
WikiNeural - SqueezeBERT | 0.9803 | 0.8278 | 0.8866 | 0.8562 | No |
WikiNeural - XLNet-Base | 0.9904 | 0.9068 | 0.9324 | 0.9194 | No |
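
These NER and PoS models follow the standard token-classification pattern in Hugging Face Transformers. A minimal sketch with a placeholder repo name; `aggregation_strategy="simple"` merges word pieces back into whole entity spans:

```python
from transformers import pipeline

# Placeholder repo name; substitute any token-classification checkpoint from the portfolio.
ner = pipeline(
    "token-classification",
    model="DunnBC22/<token-classification-model-repo>",
    aggregation_strategy="simple",  # group sub-word tokens into complete entities
)

print(ner("My name is Sarah and I work at Acme Corp in Berlin."))
```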
## Sentiment Analysis

Project Name | Model Checkpoint | Accuracy | Macro F1 Score | Macro Precision | Macro Recall |
---|---|---|---|---|---|
Emotions Sentiment Analysis | distilbert-base-uncased | 0.935 | 0.935 | - | - |
Financial Sentiment Analysis - Original | distilbert-base-uncased | 0.8425 | 0.8470 | - | - |
Financial Sentiment Analysis - Updated (v1.5) | distilbert-base-uncased | 0.8529 | 0.8564 | - | - |
Financial Sentiment Analysis_v2 | google/fnet-base | 0.8117 | 0.7472 | 0.7588 | 0.7394 |
Financial Sentiment Analysis_v3 | google/fnet-large | 0.8618 | 0.8209 | 0.8084 | 0.8401 |
News About Gold - BORT* | amazon/bort | 0.8770 | 0.7791 | 0.8463 | 0.7539 |
News About Gold - BERT-Base* | bert-base-uncased | 0.9139 | 0.8758 | 0.8885 | 0.8647 |
News About Gold - Funnel* | funnel-transformer/medium-base | 0.9172 | 0.8854 | 0.8853 | 0.8859 |
News About Gold - MEGA* | mnaylor/mega-base-wikitext | 0.5014 | 0.3283 | 0.4548 | 0.3835 |
News About Gold - MPNet-Base* | microsoft/mpnet-base | 0.9068 | 0.8351 | 0.831 | 0.8406 |
News About Gold - SqueezeBERT* | squeezebert/squeezebert-uncased | 0.9168 | 0.8749 | 0.8822 | 0.8684 |
News About Gold - YOSO* | uw-madison/yoso-4096 | 0.4456 | 0.2272 | 0.3240 | 0.2912 |
Twitter Sentiment Analysis | distilbert-base-uncased | 0.8466 | 0.8471 | - | - |
Twitter Sentiment Analysis_v2 | bert-base-uncased | 0.8474 | 0.788 | 0.8132 | 0.7747 |
Twitter Sentiment Analysis_v3 | vinai/bertweet-base | 0.8588 | 0.8151 | 0.8463 | 0.7961 |
- Metrics are macro-averaged only when all four values (accuracy, F1 score, precision, and recall) are displayed.
## Language Detection

Project Name | Accuracy | Macro F1 Score | Macro Precision | Macro Recall |
---|---|---|---|---|
Language Detection of Tweets | 0.9992 | 0.9992 | 0.9992 | 0.9992 |
Language Detection- 10k | 0.9971 | 0.9977 | 0.9981 | 0.9974 |
Language Detection-20k | 0.9883 | 0.9882 | 0.9887 | 0.9879 |
## Semantic Similarity

Project Name | Accuracy | F1 Score | Precision | Recall | Average Precision |
---|---|---|---|---|---|
Semantic Similarity of Quora Pairs Dataset - Base | 85.93 | 82.89 | 77.43 | 89.18 | 87.13 |
Semantic Similarity of Quora Pairs Dataset - Large | 88.72 | 85.22 | 80.72 | 90.25 | 89.75 |
- Metrics shown for Semantic Similarity are measured using cosine similarity.
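
The scores above come from thresholding the cosine similarity between sentence embeddings. A minimal sketch, assuming the checkpoints are `sentence-transformers` bi-encoders and using a placeholder repo name:

```python
from sentence_transformers import SentenceTransformer, util

# Placeholder repo name; substitute the actual semantic-similarity checkpoint.
model = SentenceTransformer("DunnBC22/<semantic-similarity-model-repo>")

embeddings = model.encode(
    ["How do I learn Python quickly?", "What is the fastest way to learn Python?"],
    convert_to_tensor=True,
)

# A pair is predicted to be a duplicate when this score clears a tuned threshold.
print(util.cos_sim(embeddings[0], embeddings[1]).item())
```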
## Text Summarization

Project Name | Rouge1 | Rouge2 | RougeL | RougeLsum |
---|---|---|---|---|
Flan-T5 - Text Summarization-Data Dataset (1 Epoch) | 43.6615 | 20.349 | 40.1032 | 40.1589 |
Flan-T5 - Text Summarization-Data Dataset (6 Epochs) | 43.5994 | 0.4446 | 40.132 | 40.1692 |
LED - Text Summarization-Data Dataset (4 Epochs) | 43.3689 | 19.9885 | 39.9887 | 40.0679 |
CNN News Text Summarization | 0.834343 | 0.793822 | 0.823824 | 0.823778 |
Text Summarization BBC News (with Pegasus Transformer) | 0.584474 | 0.463574 | 0.408729 | 0.408431 |
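
A minimal usage sketch for the summarization checkpoints (placeholder repo name; the generation lengths are illustrative):

```python
from transformers import pipeline

# Placeholder repo name; substitute one of the summarization checkpoints above.
summarizer = pipeline("summarization", model="DunnBC22/<summarization-model-repo>")

article = "Replace this string with the full article text to be summarized..."
summary = summarizer(article, max_length=128, min_length=30)[0]["summary_text"]
print(summary)
```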
## Machine Translation

Project Name | Transformer Checkpoint | Bleu | Rouge1 | Rouge2 | RougeL | RougeLsum | Meteor |
---|---|---|---|---|---|---|---|
English to French | facebook/mbart-large-50 | 35.1914 | 0.6420 | 0.4573 | 0.6070 | 0.6069 | 0.5917 |
English to German | facebook/mbart-large-50 | 35.5931 | 0.5803 | 0.3939 | 0.5439 | 0.5442 | 0.55 |
English to Spanish | facebook/mbart-large-50 | 41.4437 | 0.6751 | 0.4977 | 0.6372 | 0.6376 | 0.6479 |
BioMedical EN to IT Translation | facebook/mbart-large-50 | 38.9893 | 0.6826 | 0.4737 | 0.6586 | 0.6585 | 0.6270 |
Chinese to English Translation | Helsinki-NLP/opus-mt-zh-en | 45.2808 | 0.6201 | 0.4198 | 0.5927 | 0.5927 | - |
Korean to English | Helsinki-NLP/opus-mt-ko-en | 14.3395 | 0.4391 | 0.2022 | 0.3671 | 0.3671 | - |
Medical - German to English | Helsinki-NLP/opus-mt-de-en | 53.8812 | 0.7664 | 0.6284 | 0.7370 | 0.7370 | - |
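
A minimal usage sketch for the translation checkpoints (placeholder repo name). The mBART-50-based models expect mBART language codes such as `en_XX` and `fr_XX`, while the Helsinki-NLP/opus-mt models are single-pair and need no codes:

```python
from transformers import pipeline

# Placeholder repo name; substitute one of the translation checkpoints above.
translator = pipeline("translation", model="DunnBC22/<translation-model-repo>")

# src_lang/tgt_lang are only needed for the multilingual mBART-50 checkpoints.
result = translator("The weather is lovely today.", src_lang="en_XX", tgt_lang="fr_XX")
print(result[0]["translation_text"])
```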
## Question & Answer

Project Name | Exact Match | F1 Score |
---|---|---|
ML QA | 59.6146 | 73.3002 |
Answer Prediction Dataset | 65.7357 | 79.2835 |
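
These are extractive question-answering models, so exact match and F1 are computed on the predicted answer span. A minimal sketch with a placeholder repo name:

```python
from transformers import pipeline

# Placeholder repo name; substitute one of the extractive QA checkpoints above.
qa = pipeline("question-answering", model="DunnBC22/<qa-model-repo>")

result = qa(
    question="What does exact match measure?",
    context="Exact match counts a prediction as correct only when it matches the reference answer exactly.",
)
print(result["answer"], result["score"])
```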
## Generate Docstrings

Project Name | Model Checkpoint | Rouge1 | Rouge2 | RougeL | RougeLsum |
---|---|---|---|---|---|
CodeSearchNet Dataset to Generate Docstrings (Code T5 Project) | Salesforce/codet5-small | 0.3381 | 0.1541 | 0.3045 | 0.3214 |
Smol Dataset to Generate Docstrings | Salesforce/codet5-base | 0.4947 | 0.3661 | 0.4794 | 0.4791 |
Smol Dataset to Generate Docstrings | Salesforce/codet5-small | 0.38 | 0.2176 | 0.3554 | 0.3635 |
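
The CodeT5 checkpoints are sequence-to-sequence models that take source code as input and generate a docstring. A minimal sketch with a placeholder repo name (the exact input formatting used during training may differ):

```python
from transformers import pipeline

# Placeholder repo name; substitute one of the CodeT5 checkpoints above.
docstring_generator = pipeline("text2text-generation", model="DunnBC22/<codet5-docstring-repo>")

code_snippet = "def add(a, b):\n    return a + b"
print(docstring_generator(code_snippet, max_length=64)[0]["generated_text"])
```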
## Multiple Choice

Project Name | Accuracy |
---|---|
CosmosQA | 0.6000 |
Social IQa | 0.6128 |
Discourse Marker QA | 0.6207 |
Figurative Language | 0.8124 |
Strategy QA | 0.625 |
e-CARE | 0.7212 |
Vitamin C Fact Verification | 0.7240 |
Winowhy | 0.7118 |
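
Multiple-choice models score each (prompt, choice) pair separately and pick the choice with the highest logit. A minimal inference sketch with a placeholder repo name:

```python
import torch
from transformers import AutoModelForMultipleChoice, AutoTokenizer

# Placeholder repo name; substitute one of the multiple-choice checkpoints above.
checkpoint = "DunnBC22/<multiple-choice-model-repo>"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMultipleChoice.from_pretrained(checkpoint)

prompt = "Riley poured water on the campfire before leaving. Why?"
choices = ["To make the fire burn hotter.", "To put the fire out safely."]

# Encode every (prompt, choice) pair, then add a batch dimension of size 1.
inputs = tokenizer([prompt] * len(choices), choices, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**{k: v.unsqueeze(0) for k, v in inputs.items()}).logits

print(choices[logits.argmax().item()])
```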
## NLP Regression

Project Name | Mean Squared Error (MSE) | Root Mean Squared Error (RMSE) | Mean Absolute Error (MAE) |
---|---|---|---|
Edmunds Car Reviews - All Brands (with Bert-Base) | 0.2324 | 0.4820 | 0.3089 |
Edmunds Car Reviews - All Brands | 0.2232 | 0.4724 | 0.3150 |
Edmunds Car Reviews - Brands Headquartered in America | 0.2486 | 0.4986 | 0.3469 |
Edmunds Car Reviews - Brands Headquartered in Europe | 0.1999 | 0.4471 | 0.2824 |
Edmunds Car Reviews - Brands Not Headquartered in America or Europe | 0.2240 | 0.4733 | 0.3140 |
Episode Reviews/Rating - The Simpsons | 0.7632 | 0.8736 | 0.6622 |
Episode Reviews/Rating - The Simpsons & Other TV Shows | 0.3754 | 0.6127 | 0.4651 |
TMDB 5000 Movie Dataset | 0.7613 | 0.8725 | 0.6848 |
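
These regression models use a single-output head, so the prediction is a continuous rating rather than a class probability. A minimal sketch with a placeholder repo name:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder repo name; substitute one of the regression checkpoints above.
checkpoint = "DunnBC22/<regression-model-repo>"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)  # single regression output

inputs = tokenizer("Comfortable ride, but the infotainment system is sluggish.", return_tensors="pt")
with torch.no_grad():
    rating = model(**inputs).logits.squeeze().item()  # a continuous score, not a probability
print(rating)
```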
## Causal Language Modeling

Project Name | Perplexity |
---|---|
2000 Clean Medical Articles | 18.67 |
AG News (DistilGPT2 Version) | 31.53 |
AG News (GPT2 Version) | 22.92 |
US Economic News Articles | 31.41 |
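
Perplexity here (and in the next two sections) is the exponential of the mean cross-entropy loss on the evaluation set, e.g. the `eval_loss` reported by the Hugging Face `Trainer`. A quick worked example with a hypothetical loss value:

```python
import math

eval_loss = 2.54  # hypothetical mean cross-entropy loss from evaluation
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # 12.68
```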
## Causal Language Modeling for Chatbot

Project Name | Perplexity |
---|---|
Large Company's FAQs (Medium) v1 | 8.67 |
Large Company's FAQs (Large) v1 | 2.79 |
Large Company's FAQs v2 | 1.70 |
## Masked Language Modeling

Project Name | Perplexity |
---|---|
AG News | 5.95 |
Reddit Comments | 12.70 |
US Economic News Articles | 6.25 |
Footnotes:
- The ROUGE output format changed partway through these projects, so ROUGE values below 1 should be multiplied by 100 before comparing them to the values above 1.
- PoS stands for Part of Speech.
- Projects that are part of transformer comparisons on the same dataset are marked with an asterisk (*) at the end of their project name.