This notebook covers typical challenges that come up when you have a large dataset and need to find a specific customer name or address, or want to filter for a given date or period.
A typical business case: a client appears in your address list, and you are not sure whether this person or company occurs once or several times in your dataset. Duplicate entries can lead to, for example:
- extra costs when you send out marketing material, catalogues or presents,
- an unprofessional impression, which risks losing the contact,
- wasted storage space in your datasets, or "file corpses",
- multiple invoices, different customer numbers and several employees responsible for the same customer.
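As a rough illustration of the two tasks described above (spotting a customer that occurs more than once, and filtering for a period), here is a minimal sketch using only the Python standard library. The sample records, the field names (`id`, `name`, `city`, `created`) and the helper functions are all hypothetical and chosen for illustration; the actual Kaggle file and the notebook may use different columns and a different approach.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample records; the real dataset may use other column names.
customers = [
    {"id": 1, "name": "Acme GmbH", "city": "Berlin", "created": date(2021, 1, 15)},
    {"id": 2, "name": "  acme gmbh ", "city": "Berlin", "created": date(2021, 3, 2)},
    {"id": 3, "name": "Jane Doe", "city": "Hamburg", "created": date(2021, 2, 20)},
    {"id": 4, "name": "John Smith", "city": "Munich", "created": date(2020, 11, 5)},
]


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def find_duplicates(records, keys=("name", "city")):
    """Group record ids by normalized key fields; keep groups with more than one id."""
    groups = defaultdict(list)
    for record in records:
        group_key = tuple(normalize(record[k]) for k in keys)
        groups[group_key].append(record["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


def filter_by_period(records, start, end, field="created"):
    """Return records whose date field lies within [start, end], inclusive."""
    return [r for r in records if start <= r[field] <= end]


duplicates = find_duplicates(customers)
in_q1_2021 = filter_by_period(customers, date(2021, 1, 1), date(2021, 3, 31))
```

Normalizing before grouping only catches differences in case and whitespace. Misspellings or abbreviated company names would need fuzzy matching, which this sketch does not attempt.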
Notebook: ipynb
Dataset: The file used is from Kaggle
Setup: pyenv with python==3.9.4
To set up the environment, run the following commands:
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt