Implementation of various Fuzzy Search Algorithms for String Matching
- The dataset's 'Consolidated Codes' sheet contains codes for several products.
- The objective of this repository is to link each entry in the 'Dynamic Data' sheet to its corresponding 'Code' and 'Description'.
- Libraries
- Import the required libraries
- Required Functions
- Helper functions for data cleaning, standardization, stop word removal, punctuation removal, and spelling correction
- Data Loading
- Data Pre-Processing
- Removing case sensitivity
- Test Data standardization
- Cleaning:
- Convert 'pct' to '%'
- Convert 'ppm' to '%'
- Remove numbers that are followed by a unit of measure
- Cleaning:
- Additional cleaning
- Stop Word Removal
- Spell Correction
- Model Comparison
- Sample Selection from Test Set
- Model 1
- Model 2
- Model 3
- Model 4
- Model 5
- Model Selection
- Model Prediction
- Results
- Save the output to an Excel file
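
The pre-processing steps listed above (lowercasing, converting 'pct' to '%', dropping numbers that carry a unit, punctuation removal, and stop word removal) can be sketched roughly as follows. This is only an illustration using the Python standard library: the `STOP_WORDS` and `UNITS` lists and the `clean_text` name are assumptions made for the example, not the repository's actual definitions, and spelling correction is left out.

```python
import re
import string

# Hypothetical lists for illustration; the repository's real lists are not shown.
STOP_WORDS = {"the", "of", "and", "for", "with", "in"}
UNITS = ("mg", "ml", "kg", "g", "l")

# Matches a number followed by one of the units above, e.g. "500 mg" or "2.5kg".
NUMBER_WITH_UNIT = re.compile(
    r"\b\d+(?:\.\d+)?\s*(?:" + "|".join(UNITS) + r")\b"
)

# Every punctuation character except '%', which is kept as a meaningful token.
PUNCTUATION = string.punctuation.replace("%", "")
PUNCT_TABLE = str.maketrans(PUNCTUATION, " " * len(PUNCTUATION))


def clean_text(text: str) -> str:
    """Apply the cleaning steps in order and return a normalized string."""
    text = text.lower()                      # remove case sensitivity
    text = re.sub(r"\bpct\b", "%", text)     # convert 'pct' to '%'
    text = NUMBER_WITH_UNIT.sub(" ", text)   # drop numbers that carry a unit
    text = text.translate(PUNCT_TABLE)       # replace punctuation with spaces
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)


print(clean_text("Solution of Sodium Chloride, 500 mg"))
print(clean_text("Acid 5 PCT"))
```

The same function would be applied to both the reference descriptions and the 'Dynamic Data' entries so that the matching step compares strings in the same normalized form.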
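
The outline does not name the five models, so the sketch below is not a reconstruction of any of them. It only illustrates the general idea of the model comparison and prediction steps: score a cleaned query against every reference description with interchangeable scorers and keep the best-scoring 'Code'. The `CODES` rows, the function names, and the choice of `difflib.SequenceMatcher` and token Jaccard similarity are all assumptions made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical rows standing in for the 'Consolidated Codes' sheet: (code, description).
CODES = [
    ("A100", "sodium chloride"),
    ("A200", "potassium chloride"),
    ("B300", "citric acid"),
]


def ratio_score(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] from difflib."""
    return SequenceMatcher(None, a, b).ratio()


def token_jaccard_score(a: str, b: str) -> float:
    """Word-level similarity: shared tokens divided by all distinct tokens."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def best_match(query, codes, scorer):
    """Return (code, description, score) for the highest-scoring reference row."""
    code, description = max(codes, key=lambda row: scorer(query, row[1]))
    return code, description, scorer(query, description)


# Compare two candidate scorers on the same queries.
for scorer in (ratio_score, token_jaccard_score):
    for query in ("sodim chloride", "citric acid powder"):
        print(scorer.__name__, query, best_match(query, CODES, scorer))
```

Character-level scoring tolerates misspellings such as "sodim", while token-level scoring is insensitive to word order and extra words. Running several scorers over a labelled sample, as the outline's sample selection step suggests, is one way to decide which to use for the final prediction.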