bdefore/protondb-data

Include anonymous user IDs

Opened this issue · 4 comments

People are using these numbers to represent distribution popularity. There are a couple of problems with trying to do that: the main one is that these are reporters, not all users, and the other is that individual users submit at different rates. The ProtonDB dump makes it hard to work out distinct user trends.

I realise you're probably using Steam IDs internally, but could you attach an anonymised user ID to these reports? For example, count up for each new distinct user, make a random map, or even just increment them all by some secret number.
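To make the "count up for each new distinct user" idea concrete, here is a minimal sketch. The function name `assign_sequential_ids` and the input shape (a plain list of internal IDs, one per report) are assumptions for illustration, not anything taken from the ProtonDB export code.

```python
from typing import Dict, Hashable, Iterable, List


def assign_sequential_ids(steam_ids: Iterable[Hashable]) -> List[int]:
    """Replace each internal ID with a counter assigned on first appearance.

    `steam_ids` is a hypothetical per-report list of internal IDs; the
    mapping itself is never exported, only the resulting counters.
    """
    mapping: Dict[Hashable, int] = {}
    anonymised: List[int] = []
    for sid in steam_ids:
        if sid not in mapping:
            # First report from this user: hand out the next counter value.
            mapping[sid] = len(mapping) + 1
        anonymised.append(mapping[sid])
    return anonymised


# Four reports from three distinct users.
print(assign_sequential_ids(["a", "b", "a", "c"]))
```

One caveat on the third option: adding a single secret number to every ID is undone for all users as soon as one real ID is matched to its exported value, so a counter or random map is probably the safer of the three.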

Some uniqueness could be derived, to a degree, by system information.

As for a unique identifier per user, I'm curious what others think about this. Reports made since October require Steam login so it is possible for those, but it would make exports arguably less anonymous than previously.

Also, if the goal is to make more accurate summaries of distro distribution, note that some users file reports across multiple distros.

Thanks for replying so quickly.

Rolling releases change their system information much more often, so trying to infer users from that seems unnecessarily messy.

That's where linking the reports to their Steam IDs (anonymously, one way or another) helps. A user can have a dozen installs, report under each, and it's still possible to work out who's using what. You could either count each user by their last report, or give them fractional usage based on how many reports they filed for each distro.
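The two counting schemes could look roughly like this. It is only a sketch: the `(user_id, timestamp, distro)` tuple layout and both function names are hypothetical, and the real dump would need to be mapped onto that shape first.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Report = Tuple[int, int, str]  # (anonymous user id, timestamp, distro), hypothetical layout


def share_by_last_report(reports: Iterable[Report]) -> Counter:
    """Count each user once, under the distro of their most recent report."""
    latest: Dict[int, Tuple[int, str]] = {}
    for user, ts, distro in reports:
        if user not in latest or ts >= latest[user][0]:
            latest[user] = (ts, distro)
    return Counter(distro for _, distro in latest.values())


def share_fractional(reports: Iterable[Report]) -> Dict[str, float]:
    """Give each user a total weight of 1, split across distros by report count."""
    per_user: Dict[int, Counter] = defaultdict(Counter)
    for user, _ts, distro in reports:
        per_user[user][distro] += 1
    totals: Dict[str, float] = defaultdict(float)
    for counts in per_user.values():
        n = sum(counts.values())
        for distro, c in counts.items():
            totals[distro] += c / n
    return dict(totals)


reports = [(1, 1, "Arch"), (1, 2, "Ubuntu"), (1, 3, "Arch"), (2, 1, "Ubuntu")]
print(share_by_last_report(reports))
print(share_fractional(reports))
```

Either way, a prolific reporter no longer counts for more than one user, which is the distortion the raw report counts suffer from.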

I think there are ways to anonymise Steam IDs securely. Obviously all bets are off if they name themselves in the report data, but I think that's a given.
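One common approach, sketched here as an assumption rather than a description of how ProtonDB works, is a keyed hash (HMAC-SHA256) over the Steam ID with a secret that never leaves the server. The function name `pseudonymise`, the key handling, and the 16-character truncation are all illustrative choices.

```python
import hashlib
import hmac


def pseudonymise(steam_id: str, secret_key: bytes, length: int = 16) -> str:
    """Return a stable, keyed pseudonym for a Steam ID.

    The same ID and key always give the same token, so reports can be
    grouped per user, but the token cannot be recomputed without the key.
    """
    digest = hmac.new(secret_key, steam_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


# Hypothetical values: a real key would be long, random, and kept private.
key = b"example-secret-kept-off-the-export"
print(pseudonymise("76561198000000000", key))
```

The key matters because Steam IDs are numeric and come from an enumerable range, so a plain unsalted hash could likely be reversed by hashing every candidate ID and comparing.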

@bdefore I would like some kind of anonymous id (anything will do, really) so that we can rely on it instead of system configuration, which is not perfect.