Point out the privacy risks of timestamps in object IDs
mleonhard opened this issue · 3 comments
mleonhard commented
Non-random object IDs have privacy issues. Developers need to learn about these before they choose to use non-random IDs over simple random IDs. How about adding a PRIVACY section to the readme?
Timeflake encodes the precise time that a user created an object. Timestamps in IDs can reveal:
- User timezone.
- Geographic location: If the client software creates multiple associated IDs at the same time (like an article and embedded media), then the differences in timestamps of the IDs can reveal the latency of the client's network connection to the server. This reveals user geographic location. This can also happen if the client creates a single ID and the server adds an additional timestamp to the object.
- User identity (de-anonymizing):
  - Most Android apps include Google's libraries for working with push notifications, and some iOS apps that use Google Cloud services also load them. These libraries automatically load Google Analytics, which records the name of every screen the user views in the app and sends it to Google. So Google knows that userN switched from screen "New Post" to screen "Published Post" at time K.
  - Some ISPs record and sell user behavior data. For example, SAP knows that userN made a request to appM's API at time K.
  - Even if the posting app does not share its user behavior data with third parties, the user could post and then immediately switch to an app that does share user behavior data. This provides data points like "userN stopped using an app that does not record analytics at time K".
  - Operating systems (Android, Windows, macOS) send user behavior data to their respective companies.
  - Browsers and browser extensions send user behavior data to many companies. Data points like "userN visited a URL at example.com at time K" can end up in many databases and be sold.
  - Posting times combined with traffic analysis can perfectly de-anonymize users.
- How long the user took to write the post. This can happen if the app creates the ID when the user starts editing the post and also shares a timestamp of the publication or save time.
- Whether or not the user edited the post after posting it. This can happen if the post's displayed time doesn't match the timestamp in the ID.
- Whether or not the user prepared the post in advance and set it to post automatically. If the timestamp is very close to a round-numbered time like 21:00:00, it was likely posted automatically. If the posting platform does not provide such functionality, then the user must be using some third-party software or custom software to do it. This information can help de-anonymize the user.
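
To make the points above concrete, here is a minimal sketch of how anyone holding an ID could read the creation time back out of it. It assumes a 128-bit layout with a 48-bit millisecond Unix timestamp in the most significant bits followed by 80 random bits; the function names are hypothetical and this does not use the Timeflake library itself.

```python
import secrets
from datetime import datetime, timezone

# Assumed layout: 48-bit millisecond timestamp, then 80 random bits.
RANDOM_BITS = 80


def make_time_prefixed_id(unix_ms: int) -> int:
    """Build a 128-bit integer ID: timestamp in the high bits, random low bits."""
    return (unix_ms << RANDOM_BITS) | secrets.randbits(RANDOM_BITS)


def extract_unix_ms(id_value: int) -> int:
    """Recover the embedded creation time; no secret is needed to do this."""
    return id_value >> RANDOM_BITS


def seconds_from_round_time(unix_ms: int, period_s: int = 3600) -> float:
    """Distance in seconds to the nearest multiple of period_s (default: the hour)."""
    period_ms = period_s * 1000
    offset = unix_ms % period_ms
    return min(offset, period_ms - offset) / 1000


if __name__ == "__main__":
    # 2021-01-01 21:00:00.250 UTC, an illustrative "scheduled post" time.
    created_ms = 1_609_534_800_250
    public_id = make_time_prefixed_id(created_ms)

    leaked_ms = extract_unix_ms(public_id)
    print(datetime.fromtimestamp(leaked_ms / 1000, tz=timezone.utc))
    print(seconds_from_round_time(leaked_ms))
```

An observer who sees only `public_id` learns the creation time to the millisecond, and a very small distance to the hour suggests the post may have been scheduled. A fully random ID, for example `secrets.randbits(128)`, carries no such information.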
anthonynsimon commented
Thanks for doing this. I think it’s a fantastic idea to help increase awareness of the privacy implications.
Would it be ok for you if I paste your points above directly into the readme? I think it’s great as you wrote it.
mleonhard commented
Yes. You’re very welcome. And thank you for sharing your code.
anthonynsimon commented
I added this to the readme. Once again, thanks!