Define a granularity at which location data is not sensitive
We'd break significant aspects of the web if we hid a device's country (or legal jurisdiction) or time zone from websites. On the other hand, the user's current house number or even city block is too sensitive to reveal by default. Where's the border between those two kinds of locations?
I tentatively propose that city-level information is safe, and I believe Apple's Private Relay uses that as its default. We might encourage UAs to have a user control to make their location even more granular. Is "city level" a roughly 20-mile-square granularity, or does the area increase in less-populated areas?
Good question! A few thoughts:
- It seems like population is one key determinant — you used the word "city", for which Wikipedia says "working definitions for small-city populations start at around 100,000 people".
- It seems like we'd need to back off to some kind of hierarchical geographic classification in between cities and countries, but surely that varies by jurisdiction. For the US, people not in cities are in any case in states (min. pop. >500K).
Maybe we want something like "country, or a smaller geographical area within a well-established hierarchy within a country, provided that area's population is larger than P"? And then a threshold P somewhere between 100,000 and 500,000 matches my intuition.
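To make the proposed rule concrete, here is a minimal sketch of how a UA might pick the reportable area. It assumes the UA already has the user's location as a finest-to-coarsest list of areas with population figures; the `Area` type, the `coarsen` function, and all place names and populations are hypothetical and only illustrate the threshold P.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Area:
    """One level of a geographic hierarchy (hypothetical type)."""
    name: str
    population: int


def coarsen(hierarchy: Sequence[Area], threshold: int) -> Area:
    """Return the finest area whose population exceeds `threshold`.

    `hierarchy` is ordered from finest to coarsest, and its last entry
    is the country, which is always allowed regardless of population.
    """
    if not hierarchy:
        raise ValueError("hierarchy must contain at least the country")
    for area in hierarchy[:-1]:
        if area.population > threshold:
            return area
    return hierarchy[-1]  # fall back to the country


# Made-up example data, not real census figures.
example = [
    Area("Exampleton", 40_000),
    Area("Example County", 250_000),
    Area("Example State", 3_000_000),
    Area("Exampleland", 50_000_000),
]

print(coarsen(example, 100_000).name)  # Example County
print(coarsen(example, 500_000).name)  # Example State
```

With P = 100,000 the small town is hidden behind its county; raising P to 500,000 pushes the same user up to the state level, which is the trade-off the threshold controls.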
I (as a random individual) think location data is always sensitive. Of course, that statement is useless to anyone who wants to do anything, so here are some more useful thoughts:
- In the privacy threat model, it might be worth adding a layer between "leaked with no user interaction" (non-sensitive information) and "leaked only with express user consent" (sensitive information). For instance, Firefox's ETP shield lights up when it is blocking content on the page. I feel like adding a new class of, say, "semi-sensitive information", putting the slightly anonymized location information in that class, and saying that user agents should only leak semi-sensitive information with a noticeable UX change might aid transparency while keeping usability.
- In the specific area of time zones, I like the Brave Fingerprinting 3.0 idea of only leaking the UTC offset by default, which could unbreak a lot of use cases while not leaking too much information.
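
As a rough illustration of the UTC-offset idea (a sketch only, not Brave's implementation), the function below reduces a timezone-aware time to its offset in minutes. The name `utc_offset_minutes` is hypothetical, and the two fixed-offset zones are stand-ins for two real zones that happen to share an offset on a given date.

```python
from datetime import datetime, timedelta, timezone


def utc_offset_minutes(local_time: datetime) -> int:
    """Return only the UTC offset of `local_time`, in minutes.

    The zone's name is deliberately discarded, so two zones with the
    same current offset produce the same value.
    """
    offset = local_time.utcoffset()
    if offset is None:
        raise ValueError("expected a timezone-aware datetime")
    return int(offset.total_seconds() // 60)


# Stand-ins for two different zones that are both at UTC+01:00.
zone_a = timezone(timedelta(hours=1), "Zone A")
zone_b = timezone(timedelta(hours=1), "Zone B")

a = datetime(2024, 1, 15, 12, 0, tzinfo=zone_a)
b = datetime(2024, 1, 15, 12, 0, tzinfo=zone_b)

print(utc_offset_minutes(a))                            # 60
print(utc_offset_minutes(a) == utc_offset_minutes(b))   # True
```

A site that only sees the offset can still render local times correctly, but it cannot tell the two zones apart until their daylight-saving rules diverge.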
(Sorry about the edits 😕, I really should form my thoughts better before hitting submit.)