cblevins/us-post-offices

Levenshtein distance/similarity

bmschmidt opened this issue · 1 comment

Congrats on the book! And on getting this whole dataset out.

I was playing with this as a sample in a webGL library I'm working on (will send a copy when I get your data on the web, not just localhost) and got a little confused by the GNIS.dist field. Maybe worth clarifying that this isn't actually Levenshtein distance but normalized Levenshtein similarity: since 0 means totally different and 1 means exactly the same, the direction is flipped relative to a distance, so it's a measure of similarity.
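
To make the distinction concrete, here is a minimal Python sketch of the two quantities. It assumes the similarity is normalized by the length of the longer string, which is one common convention; how the dataset actually computes its field is not stated in this thread, and the function names `levenshtein_distance` and `normalized_similarity` are hypothetical.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b` (0 = identical)."""
    # Keep the shorter string as `b` so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]: 1 = identical, 0 = totally different.
    Assumption: normalize by the longer string's length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein_distance(a, b) / longest
```

Under this assumed normalization, a larger distance means a smaller similarity, so a value near 1 signals a close match rather than a large number of edits.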

Thanks!! I was kind of hoping you'd plug the dataset into your webGL stuff!

And great catch! I struggled with exactly how to explain that field, but I think you're right: similarity is a better description than distance. I just pushed changes to that field name in the code and main files, and hopefully clarified it a bit in the README.