This repository houses the Natural Language Processing (NLP) projects I have completed (other than those built with Spark & Databricks).
To view and use the models, head over to my Hugging Face portfolio: huggingface.co/DunnBC22
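
The fine-tuned models listed below can typically be loaded straight from the Hub with the Hugging Face Transformers `pipeline` API. A minimal sketch for one of the text-classification checkpoints (the repo name is a placeholder, not an actual checkpoint; substitute any model ID from the portfolio):

```python
from transformers import pipeline

# Placeholder repo name; substitute any text-classification checkpoint from huggingface.co/DunnBC22.
classifier = pipeline("text-classification", model="DunnBC22/<model-repo-name>")

# Label names and scores depend on the chosen checkpoint.
print(classifier("This phone exceeded my expectations."))
```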
## Multiclass Classification

Project Name | Model Checkpoint | Accuracy | Macro F1 Score | Macro Precision | Macro Recall |
---|---|---|---|---|---|
Apple iPhone SE Reviews* | bert-base-uncased | 0.9712 | 0.9561 | 0.9538 | 0.9598 |
Apple iPhone SE Reviews* | microsoft/mpnet-base | 0.9460 | 0.7242 | 0.7007 | 0.7594 |
CNN News Articles | distilbert-base-uncased | 0.9643 | 0.9640 | - | - |
Hate & Offensive Speech* | bert-base-uncased | 0.9213 | 0.9161 | 0.9241 | 0.9144 |
Hate & Offensive Speech* | bert-large-uncased | 0.9869 | 0.9863 | 0.987 | 0.9857 |
Hate & Offensive Speech* | distilbert-base-uncased | 0.9607 | 0.9592 | 0.9613 | 0.9579 |
Hate & Offensive Speech* | diptanu/fBERT | 0.9607 | 0.9581 | 0.9596 | 0.9571 |
Hate & Offensive Speech* | GroNLP/hateBERT | 0.941 | 0.9351 | 0.951 | 0.9273 |
Password Strength | microsoft/codebert-base | 0.9975 | 0.9963 | 0.9948 | 0.9978 |
Malicious URLs | microsoft/codebert-base | 0.7279 | 0.4611 | 0.5436 | 0.4422 |
Malicious URLs | microsoft/codebert-base-mlm | 0.7322 | 0.4303 | 0.6034 | 0.4233 |
Malicious URLs | microsoft/deberta-base-mnli | 0.7353 | 0.4533 | 0.5684 | 0.4315 |
Malicious URLs (Using PEFT) | roberta-large | 0.7160 | 0.4374 | 0.5237 | 0.4190 |
Malicious URLs | albert-base-v2 | 0.7267 | 0.4521 | 0.5508 | 0.4294 |
## Binary Classification

Project Name | Transformer Checkpoint | Accuracy | F1 Score | Precision | Recall |
---|---|---|---|---|---|
Malignant Comments - BERT-Base* | bert-base-uncased | 0.972 | 0.759 | 0.6918 | 0.8406 |
Malignant Comments - I-BERT* | kssteven/ibert-roberta-base | 0.9741 | 0.7773 | 0.7084 | 0.861 |
Mental Health Classification | google/canine-c | 0.9226 | 0.9096 | 0.9113 | 0.9079 |
OnionOrNot | distilbert-base-uncased | 0.9224 | 0.9218 | - | - |
Spam Filter (Larger Dataset) | distilbert-base-uncased | 0.9845 | 0.9848 | - | - |
Spam Filter (Smaller Dataset) | distilbert-base-uncased | 0.9907 | 0.9906 | - | - |
Tweet About a Disaster or Not? - ALBERT* | albert-base-v2 | 0.9138 | 0.7752 | 0.8204 | 0.7348 |
Tweet About a Disaster or Not? - DeBERTa* | microsoft/deberta-v3-small | 0.9050 | 0.7453 | 0.7453 | 0.7453 |
Tweet About a Disaster or Not? - DistilBERT* | distilbert-base-uncased | 0.9138 | 0.7752 | 0.8204 | 0.7348 |
Tweet About a Disaster or Not? - ERNIE* | nghuyong/ernie-2.0-base-en | 0.9156 | 0.7876 | 0.8436 | 0.7386 |
Tweet About a Disaster or Not? - ELECTRA* | bhadresh-savani/electra-base-emotion | 0.8857 | 0.7246 | 0.7991 | 0.6628 |
Tweet About a Disaster or Not? - RoBERTa* | roberta-base | 0.8989 | 0.7569 | 0.8211 | 0.7020 |
## Multilabel Classification

Project Name | Model Checkpoint | Subset Accuracy | F1 Score | ROC-AUC |
---|---|---|---|---|
Go Emotions | distilbert-base-uncased | 0.2184 | 0.3328 | 0.6102 |
Research Articles | distilbert-base-uncased | 0.6977 | 0.8395 | 0.8909 |
Review Sentiments (with DistilBert) | distilbert-base-uncased | 0.5787 | 0.8697 | 0.9107 |
Review Sentiments (with Bert) | bert-base-uncased | 0.5967 | 0.8737 | 0.9146 |
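
Unlike the single-label tasks above, a multilabel head scores every label independently with a sigmoid, and subset accuracy only counts a prediction as correct when all labels match, which is why it sits well below the F1 scores. A minimal inference sketch, assuming a placeholder repo name:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder repo name; substitute one of the multilabel checkpoints above.
checkpoint = "DunnBC22/<multilabel-model-repo>"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

inputs = tokenizer("I am thrilled and a little nervous.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Each label gets an independent sigmoid probability; 0.5 is a common threshold.
probs = torch.sigmoid(logits)[0]
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > 0.5]
print(predicted)
```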
## Token Classification

Project Name | Overall Accuracy | Overall F1 Score | Overall Precision | Overall Recall | Multilingual? |
---|---|---|---|---|---|
Babelscape WikiNeural Joined Dataset | 0.994704 | 0.995886 | 0.995711 | 0.996060 | Yes |
BC2GM-IOB (EMBO-BLURB) | 0.9736 | 0.7765 | 0.7521 | 0.8025 | No |
EMBO-BLURB with LoRA | 0.9584 | 0.8136 | 0.7999 | 0.8278 | No |
DFKI-SLT/few-nerd | 0.9498 | 0.8041 | 0.8203 | 0.7886 | No |
NCBI Disease | 0.9825 | 0.8359 | 0.8064 | 0.8677 | No |
TNER Bio NLP 2004 | 0.9367 | 0.7169 | 0.6628 | 0.7805 | No |
Stromberg NLP - Twitter (SeqEval) | 0.9860 | 0.9824 | 0.9828 | 0.9820 | No |
Stromberg NLP - Twitter PoS_v2 | 0.9853 | 0.8931 | 0.9296 | 0.8931 | No |
Stromberg NLP - Twitter PoS (SqueezeBERT Transformer) | 0.9771 | 0.7765 | 0.8046 | 0.7785 | No |
WikiNeural - BERT-Base | 0.9912 | 0.9145 | 0.9380 | 0.9261 | No |
WikiNeural - Amazon's BORT | 0.9709 | 0.7050 | 0.7868 | 0.7437 | No |
WikiNeural - FNet-Base | 0.8521 | 0.8934 | 0.8722 | 0.9853 | No |
WikiNeural - Funnel Transformer | 0.9856 | 0.8722 | 0.9102 | 0.8908 | No |
WikiNeural - I-BERT-Base | 0.9909 | 0.9107 | 0.9360 | 0.9232 | No |
WikiNeural - MEGA-Base | 0.9619 | 0.6312 | 0.7324 | 0.6781 | No |
WikiNeural - RoBERTa-Base | 0.9910 | 0.9124 | 0.9352 | 0.9237 | No |
WikiNeural - SqueezeBERT | 0.9803 | 0.8278 | 0.8866 | 0.8562 | No |
WikiNeural - XLNet-Base | 0.9904 | 0.9068 | 0.9324 | 0.9194 | No |
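
These NER and PoS models follow the standard token-classification pattern in Hugging Face Transformers. A minimal sketch with a placeholder repo name; `aggregation_strategy="simple"` merges word pieces back into whole entity spans:

```python
from transformers import pipeline

# Placeholder repo name; substitute any token-classification checkpoint from the portfolio.
ner = pipeline(
    "token-classification",
    model="DunnBC22/<token-classification-model-repo>",
    aggregation_strategy="simple",  # group sub-word tokens into complete entities
)

print(ner("My name is Sarah and I work at Acme Corp in Berlin."))
```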
## Sentiment Analysis

Project Name | Model Checkpoint | Accuracy | Macro F1 Score | Macro Precision | Macro Recall |
---|---|---|---|---|---|
Emotions Sentiment Analysis | distilbert-base-uncased | 0.935 | 0.935 | - | - |
Financial Sentiment Analysis - Original | distilbert-base-uncased | 0.8425 | 0.8470 | - | - |
Financial Sentiment Analysis - Updated (v1.5) | distilbert-base-uncased | 0.8529 | 0.8564 | - | - |
Financial Sentiment Analysis_v2 | google/fnet-base | 0.8117 | 0.7472 | 0.7588 | 0.7394 |
Financial Sentiment Analysis_v3 | google/fnet-large | 0.8618 | 0.8209 | 0.8084 | 0.8401 |
News About Gold - BORT* | amazon/bort | 0.8770 | 0.7791 | 0.8463 | 0.7539 |
News About Gold - BERT-Base* | bert-base-uncased | 0.9139 | 0.8758 | 0.8885 | 0.8647 |
News About Gold - Funnel* | funnel-transformer/medium-base | 0.9172 | 0.8854 | 0.8853 | 0.8859 |
News About Gold - MEGA* | mnaylor/mega-base-wikitext | 0.5014 | 0.3283 | 0.4548 | 0.3835 |
News About Gold - MPNet-Base* | microsoft/mpnet-base | 0.9068 | 0.8351 | 0.831 | 0.8406 |
News About Gold - SqueezeBERT* | squeezebert/squeezebert-uncased | 0.9168 | 0.8749 | 0.8822 | 0.8684 |
News About Gold - YOSO* | uw-madison/yoso-4096 | 0.4456 | 0.2272 | 0.3240 | 0.2912 |
Twitter Sentiment Analysis | distilbert-base-uncased | 0.8466 | 0.8471 | - | - |
Twitter Sentiment Analysis_v2 | bert-base-uncased | 0.8474 | 0.788 | 0.8132 | 0.7747 |
Twitter Sentiment Analysis_v3 | vinai/bertweet-base | 0.8588 | 0.8151 | 0.8463 | 0.7961 |
- Metrics are macro-averaged only when all four values (accuracy, F1 score, precision, and recall) are displayed.
## Language Detection

Project Name | Accuracy | Macro F1 Score | Macro Precision | Macro Recall |
---|---|---|---|---|
Language Detection of Tweets | 0.9992 | 0.9992 | 0.9992 | 0.9992 |
Language Detection- 10k | 0.9971 | 0.9977 | 0.9981 | 0.9974 |
Language Detection-20k | 0.9883 | 0.9882 | 0.9887 | 0.9879 |
## Semantic Similarity

Project Name | Accuracy | F1 Score | Precision | Recall | Average Precision |
---|---|---|---|---|---|
Semantic Similarity of Quora Pairs Dataset - Base | 85.93 | 82.89 | 77.43 | 89.18 | 87.13 |
Semantic Similarity of Quora Pairs Dataset - Large | 88.72 | 85.22 | 80.72 | 90.25 | 89.75 |
- Metrics shown for Semantic Similarity are measured using cosine similarity.
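
The scores above come from thresholding the cosine similarity between sentence embeddings. A minimal sketch, assuming the checkpoints are `sentence-transformers` bi-encoders and using a placeholder repo name:

```python
from sentence_transformers import SentenceTransformer, util

# Placeholder repo name; substitute the actual semantic-similarity checkpoint.
model = SentenceTransformer("DunnBC22/<semantic-similarity-model-repo>")

embeddings = model.encode(
    ["How do I learn Python quickly?", "What is the fastest way to learn Python?"],
    convert_to_tensor=True,
)

# A pair is predicted to be a duplicate when this score clears a tuned threshold.
print(util.cos_sim(embeddings[0], embeddings[1]).item())
```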
## Text Summarization

Project Name | Rouge1 | Rouge2 | RougeL | RougeLsum |
---|---|---|---|---|
Flan-T5 - Text Summarization-Data Dataset (1 Epoch) | 43.6615 | 20.349 | 40.1032 | 40.1589 |
Flan-T5 - Text Summarization-Data Dataset (6 Epochs) | 43.5994 | 0.4446 | 40.132 | 40.1692 |
LED - Text Summarization-Data Dataset (4 Epochs) | 43.3689 | 19.9885 | 39.9887 | 40.0679 |
CNN News Text Summarization | 0.834343 | 0.793822 | 0.823824 | 0.823778 |
Text Summarization BBC News (with Pegasus Transformer) | 0.584474 | 0.463574 | 0.408729 | 0.408431 |
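
A minimal usage sketch for the summarization checkpoints (placeholder repo name; the generation lengths are illustrative):

```python
from transformers import pipeline

# Placeholder repo name; substitute one of the summarization checkpoints above.
summarizer = pipeline("summarization", model="DunnBC22/<summarization-model-repo>")

article = "Replace this string with the full article text to be summarized..."
summary = summarizer(article, max_length=128, min_length=30)[0]["summary_text"]
print(summary)
```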
## Machine Translation

Project Name | Transformer Checkpoint | Bleu | Rouge1 | Rouge2 | RougeL | RougeLsum | Meteor |
---|---|---|---|---|---|---|---|
English to French | facebook/mbart-large-50 | 35.1914 | 0.6420 | 0.4573 | 0.6070 | 0.6069 | 0.5917 |
English to German | facebook/mbart-large-50 | 35.5931 | 0.5803 | 0.3939 | 0.5439 | 0.5442 | 0.55 |
English to Spanish | facebook/mbart-large-50 | 41.4437 | 0.6751 | 0.4977 | 0.6372 | 0.6376 | 0.6479 |
BioMedical EN to IT Translation | facebook/mbart-large-50 | 38.9893 | 0.6826 | 0.4737 | 0.6586 | 0.6585 | 0.6270 |
Chinese to English Translation | Helsinki-NLP/opus-mt-zh-en | 45.2808 | 0.6201 | 0.4198 | 0.5927 | 0.5927 | - |
Korean to English | Helsinki-NLP/opus-mt-ko-en | 14.3395 | 0.4391 | 0.2022 | 0.3671 | 0.3671 | - |
Medical - German to English | Helsinki-NLP/opus-mt-de-en | 53.8812 | 0.7664 | 0.6284 | 0.7370 | 0.7370 | - |
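
A minimal usage sketch for the translation checkpoints (placeholder repo name). The mBART-50-based models expect mBART language codes such as `en_XX` and `fr_XX`, while the Helsinki-NLP/opus-mt models are single-pair and need no codes:

```python
from transformers import pipeline

# Placeholder repo name; substitute one of the translation checkpoints above.
translator = pipeline("translation", model="DunnBC22/<translation-model-repo>")

# src_lang/tgt_lang are only needed for the multilingual mBART-50 checkpoints.
result = translator("The weather is lovely today.", src_lang="en_XX", tgt_lang="fr_XX")
print(result[0]["translation_text"])
```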
## Question & Answer

Project Name | Exact Match | F1 Score |
---|---|---|
ML QA | 59.6146 | 73.3002 |
Answer Prediction Dataset | 65.7357 | 79.2835 |
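
These are extractive question-answering models, so exact match and F1 are computed on the predicted answer span. A minimal sketch with a placeholder repo name:

```python
from transformers import pipeline

# Placeholder repo name; substitute one of the extractive QA checkpoints above.
qa = pipeline("question-answering", model="DunnBC22/<qa-model-repo>")

result = qa(
    question="What does exact match measure?",
    context="Exact match counts a prediction as correct only when it matches the reference answer exactly.",
)
print(result["answer"], result["score"])
```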
## Generate Docstrings

Project Name | Model Checkpoint | Rouge1 | Rouge2 | RougeL | RougeLsum |
---|---|---|---|---|---|
CodeSearchNet Dataset to Generate Docstrings (Code T5 Project) | Salesforce/codet5-small | 0.3381 | 0.1541 | 0.3045 | 0.3214 |
Smol Dataset to Generate Docstrings | Salesforce/codet5-base | 0.4947 | 0.3661 | 0.4794 | 0.4791 |
Smol Dataset to Generate Docstrings | Salesforce/codet5-small | 0.38 | 0.2176 | 0.3554 | 0.3635 |
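
The CodeT5 checkpoints are sequence-to-sequence models that take source code as input and generate a docstring. A minimal sketch with a placeholder repo name (the exact input formatting used during training may differ):

```python
from transformers import pipeline

# Placeholder repo name; substitute one of the CodeT5 checkpoints above.
docstring_generator = pipeline("text2text-generation", model="DunnBC22/<codet5-docstring-repo>")

code_snippet = "def add(a, b):\n    return a + b"
print(docstring_generator(code_snippet, max_length=64)[0]["generated_text"])
```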
## Multiple Choice

Project Name | Accuracy |
---|---|
CosmosQA | 0.6000 |
Social IQa | 0.6128 |
Discourse Marker QA | 0.6207 |
Figurative Language | 0.8124 |
Strategy QA | 0.625 |
e-CARE | 0.7212 |
Vitamin C Fact Verification | 0.7240 |
Winowhy | 0.7118 |
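
Multiple-choice models score each (prompt, choice) pair separately and pick the choice with the highest logit. A minimal inference sketch with a placeholder repo name:

```python
import torch
from transformers import AutoModelForMultipleChoice, AutoTokenizer

# Placeholder repo name; substitute one of the multiple-choice checkpoints above.
checkpoint = "DunnBC22/<multiple-choice-model-repo>"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMultipleChoice.from_pretrained(checkpoint)

prompt = "Riley poured water on the campfire before leaving. Why?"
choices = ["To make the fire burn hotter.", "To put the fire out safely."]

# Encode every (prompt, choice) pair, then add a batch dimension of size 1.
inputs = tokenizer([prompt] * len(choices), choices, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**{k: v.unsqueeze(0) for k, v in inputs.items()}).logits

print(choices[logits.argmax().item()])
```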
## NLP Regression

Project Name | Mean Squared Error (MSE) | Root Mean Squared Error (RMSE) | Mean Absolute Error (MAE) |
---|---|---|---|
Edmunds Car Reviews - All Brands (with Bert-Base) | 0.2324 | 0.4820 | 0.3089 |
Edmunds Car Reviews - All Brands | 0.2232 | 0.4724 | 0.3150 |
Edmunds Car Reviews - Brands Headquartered in America | 0.2486 | 0.4986 | 0.3469 |
Edmunds Car Reviews - Brands Headquartered in Europe | 0.1999 | 0.4471 | 0.2824 |
Edmunds Car Reviews - Brands Not Headquartered in America or Europe | 0.2240 | 0.4733 | 0.3140 |
Episode Reviews/Rating - The Simpsons | 0.7632 | 0.8736 | 0.6622 |
Episode Reviews/Rating - The Simpsons & Other TV Shows | 0.3754 | 0.6127 | 0.4651 |
TMDB 5000 Movie Dataset | 0.7613 | 0.8725 | 0.6848 |
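
These regression models use a single-output head, so the prediction is a continuous rating rather than a class probability. A minimal sketch with a placeholder repo name:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder repo name; substitute one of the regression checkpoints above.
checkpoint = "DunnBC22/<regression-model-repo>"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)  # single regression output

inputs = tokenizer("Comfortable ride, but the infotainment system is sluggish.", return_tensors="pt")
with torch.no_grad():
    rating = model(**inputs).logits.squeeze().item()  # a continuous score, not a probability
print(rating)
```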
## Causal Language Modeling

Project Name | Perplexity |
---|---|
2000 Clean Medical Articles | 18.67 |
AG News (DistilGPT2 Version) | 31.53 |
AG News (GPT2 Version) | 22.92 |
US Economic News Articles | 31.41 |
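
Perplexity here (and in the next two sections) is the exponential of the mean cross-entropy loss on the evaluation set, e.g. the `eval_loss` reported by the Hugging Face `Trainer`. A quick worked example with a hypothetical loss value:

```python
import math

eval_loss = 2.54  # hypothetical mean cross-entropy loss from evaluation
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # 12.68
```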
## Causal Language Modeling for Chatbot

Project Name | Perplexity |
---|---|
Large Company's FAQs (Medium) v1 | 8.67 |
Large Company's FAQs (Large) v1 | 2.79 |
Large Company's FAQs v2 | 1.70 |
## Masked Language Modeling

Project Name | Perplexity |
---|---|
AG News | 5.95 |
Reddit Comments | 12.70 |
US Economic News Articles | 6.25 |
Footnotes:
- The ROUGE output format changed partway through these projects, so ROUGE values below 1 should be multiplied by 100 before comparing them to the values above 1.
- PoS stands for Part of Speech.
- Projects that are part of transformer comparisons on the same dataset are marked with an asterisk (*) at the end of their project name.