Given the example CSV file sample.csv, this script scans the records, compares them with each other, and gives them a score based on how likely they are to be duplicate contacts.
- Email: if the email is the same, both contacts are assumed to be the same. If only the username (i.e. the part before the @) is the same, the pair gets a high probability score.
- Name: the code takes into account that both names may be the same but written in a different order, or given only as initials.
- Zip code and address: the same zip code and address give a higher probability of a duplicate.
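
The rules above could be combined roughly as in the following sketch. It is only an illustration: the field names (email, name, zip, address), the helper functions, and the weights (60, 25, 15) are assumptions and are not taken from index.js. It also assumes the username is the part of the email before the @.

```javascript
// Hypothetical sketch: field names, helpers and weights are assumptions,
// not the actual logic in index.js.

// Split a name into lowercase tokens, ignoring punctuation such as "," or ".".
function nameTokens(name) {
  return (name || '')
    .toLowerCase()
    .replace(/[^\p{L}\s]/gu, ' ')
    .split(/\s+/)
    .filter(Boolean);
}

// Two tokens match if they are equal, or if one is a single-letter
// initial of the other ("j" matches "john").
function tokenMatches(a, b) {
  if (a === b) return true;
  if (a.length === 1) return b.startsWith(a);
  if (b.length === 1) return a.startsWith(b);
  return false;
}

// Names match if every token of one name pairs with a distinct token of
// the other, in any order. Pairing is greedy: longer tokens first, exact
// matches preferred over initials.
function namesMatch(nameA, nameB) {
  const tokens = nameTokens(nameA).sort((x, y) => y.length - x.length);
  const remaining = nameTokens(nameB);
  if (tokens.length === 0 || tokens.length !== remaining.length) return false;
  for (const token of tokens) {
    let i = remaining.indexOf(token);
    if (i === -1) i = remaining.findIndex((other) => tokenMatches(token, other));
    if (i === -1) return false;
    remaining.splice(i, 1);
  }
  return true;
}

// Lowercase, trim and collapse whitespace so "1 Main St" equals "1 main  st".
function normalize(value) {
  return (value || '').toLowerCase().trim().replace(/\s+/g, ' ');
}

// Return a score from 0 to 100; higher means more likely a duplicate.
function duplicateScore(a, b) {
  const emailA = normalize(a.email);
  const emailB = normalize(b.email);
  // Same email: assume the same contact.
  if (emailA && emailA === emailB) return 100;

  let score = 0;
  const userA = emailA.split('@')[0];
  const userB = emailB.split('@')[0];
  if (userA && userA === userB) score += 60; // same username, different domain
  if (namesMatch(a.name, b.name)) score += 25; // same name, any order or initials
  const sameZip = normalize(a.zip) !== '' && normalize(a.zip) === normalize(b.zip);
  const sameAddress =
    normalize(a.address) !== '' && normalize(a.address) === normalize(b.address);
  if (sameZip && sameAddress) score += 15; // same zip code and address
  return score;
}
```

With these assumed weights, calling duplicateScore on two records that share only a name adds 25 points, while records with identical emails short-circuit to 100 regardless of the other fields.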

To run the script:

node index.js