Bastian/bstats-metrics

GDPR infringement

Pingger opened this issue · 30 comments

Hi,

TL/DR: the default enabled: true is illegal

Please be aware, that in most cases the defaults of bStats are infringing the Laws outlined in the GDPR.
Most specifically: the collected stats are published by default.
According to the GDPR: You are required to ask the server admin and ALL clients connecting to it for consent to be included in those usage stats.

For further reading I can recommend the official website for the GDPR: https://gdpr.eu/gdpr-consent-requirements/

Greetings

PS: You might also want to rethink the default text "There is no performance penalty" as this is just fundamentally wrong! Better would be "There is no noticeable performance penalty"

The GDPR only regulates the collection, use, and storage of personal data. Under the GDPR, personal data is defined as any information relating to an identified or identifiable natural person.

Can you please elaborate where you think bStats collects personal data?

PS: You might also want to rethink the default text "There is no performance penalty" as this is just fundamentally wrong! Better would be "There is no noticeable performance penalty"

While I would say that this is nitpicking (since the impact is really really minor), it's technically correct and I will change the sentence.

The timings with which a specific server has how many users, and implicitly how long those users play. The server is uniquely identified with the uuid.

PS: in addition just by connecting to your server and thus exposing the IP of the minecraft server. the IP is very specifically a personally identifiable information

A minecraft server is not a person

It is not possible to identify specific individuals using just the raw player count data. Therefore, the player count data can be considered anonymized.

See Recital 26 that explicitly states that the GDPR is not applicable to anonymous data:

The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable. This Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes.

Source: https://gdpr.eu/recital-26-not-applicable-to-anonymous-data/

PS: in addition just by connecting to your server and thus exposing the IP of the minecraft server. the IP is very specifically a personally identifiable information

IP addresses from servers cannot be used to identify specific individuals, therefore the GDPR does not apply in this case. This is in contrast to IP addresses that come directly from users, which are considered personal data under the GDPR. Most sources are referring to this type of IP address when they discuss the use of IP addresses as personal data.

However, there is an edge case where users may run Minecraft servers locally on their own personal devices, rather than on a dedicated server. This is unintended and I will try to implement heuristics to identify these devices (e.g. by checking if they are running a desktop operating system such as Windows) and not send data in these cases.

I have created a new issue specifically for data from user devices and will prioritize addressing it: #113
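For illustration, the heuristic mentioned above could look roughly like the following. This is only a minimal sketch, not bStats code: the class `DesktopOsHeuristic` and its methods are hypothetical, and treating Windows (except "Windows Server" editions) and macOS as personal devices is an assumption that will misclassify some setups.

```java
import java.util.Locale;

// Hypothetical helper, not part of bStats: guesses whether the JVM runs on a desktop OS.
final class DesktopOsHeuristic {
    private DesktopOsHeuristic() {}

    /** Returns true if the given os.name value looks like a desktop operating system. */
    static boolean looksLikeDesktop(String osName) {
        if (osName == null) {
            return false;
        }
        String name = osName.toLowerCase(Locale.ROOT);
        // Assumption: Windows counts as a personal device unless it is a
        // "Windows Server" edition; macOS always counts as a personal device.
        if (name.contains("windows")) {
            return !name.contains("server");
        }
        return name.contains("mac");
    }

    /** Returns false when the current JVM appears to run on a desktop OS. */
    static boolean shouldSendMetrics() {
        return !looksLikeDesktop(System.getProperty("os.name"));
    }
}
```

A check like this is cheap but coarse: a home server on Linux would still be counted, and a dedicated machine running a desktop Windows edition would be excluded.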

However, there is an edge case where users may run Minecraft servers locally on their own personal devices, rather than on a dedicated server. This is unintended and I will try to implement heuristics to identify these devices (e.g. by checking if they are running a desktop operating system such as Windows) and not send data in these cases.

Hold up this is still ridiculous.

Are you allowed to store the IP addresses on your bStats server without publishing them? Probably not. But you can identify them uniquely, but not identify them, just by their unique server ID from bStats. If you reasonably stream incoming data to an aggregate value (and do not remember all the values belonging to a specific server separately) it becomes positively impossible to ever identify anything or anyone based on what you have stored.

To your final note, in the event that a user is running the Minecraft server on their personal home network or computer, they are specifically and intentionally deciding to publish their IP address for the world to see (in order for players to connect to it). bstats, in that case, is only seeing data that is already publicly accessible.

Maybe there are some pretty unique setups that might put bstats in a bad spot for compliance, but those are edges of edge cases. Personally, enabling by default isn't what I prefer, but that's for entirely different reasons.

It may violate the GDPR or it may not. I am not completely comfortable with this edge case, so I will try to eliminate it. bStats was never intended to collect data from user devices and excluding these edge cases is technically feasible.

To your final note, in the event that a user is running the Minecraft server on their personal home network or computer, they are specifically and intentionally deciding to publish their IP address for the world to see (in order for players to connect to it). bstats, in that case, is only seeing data that is already publicly accessible.

(I am on your side, but this is not accurate - If I publish my real life address myself, it is still forbidden for anyone to take it and do something with it, under GDPR. The voluntariness is irrelevant, there must be specific processing and storage consent per purpose.)

Well, it's a matter of how and what is being stored and processed. I don't think we need to go much further into it here since Bastian has made the decision to address the issue, but you're right that my comment was incomplete and misleading.

IP addresses from servers cannot be used to identify specific individuals,

Yes, they can! They even are specifically mentioned! https://gdpr.eu/eu-gdpr-personal-data/

[...]Fortunately, the GDPR provides several examples in Recital 30 that include:

Internet protocol (IP) addresses; [...]

Maybe not in america, but in the EU, you, as a server owner, have to be identifiable. Additionally you, as a minecraft-server-hoster, have to be identifiable by your users, since minecraft-server collect login- and other stats locally, in respect of data protection laws.

Additionally the website is forcefully loading parts of it from google and other servers (read third-party-servers), without previous user consent (read cookie-terror-banner). (and yes usage of a third-party CDN requires user consent. in case you didn't notice or aren't from the EU, on many websites quite a bit doesn't load until you clicked a button indicating that you read the privacy notice). You might want to be aware, that you mustn't store or send any user data, anonymized or not, from EU residents outside the EU without explicit user consent. Specifically not in the US at the moment, as outlined by the European Court of justice, as can easily be researched here: https://en.wikipedia.org/wiki/EU%E2%80%93US_Privacy_Shield and here https://en.wikipedia.org/wiki/Trans-Atlantic_Data_Privacy_Framework

(@cafestifflered)
To your final note, in the event that a user is running the Minecraft server on their personal home network or computer, they are specifically and intentionally deciding to publish their IP address for the world to see (in order for players to connect to it). bstats, in that case, is only seeing data that is already publicly accessible.

What about a local Server, that isn't accessible from the internet? The only thing leaking the IP would be bStats (or other plugins) ...

But just in general: For ANY data you collect that is not required for basic operation, be it personally identifiable or not, or 'legitimate interest' data, the user has to be informed AHEAD of time and given the opportunity to refuse. Refusing must be as easy as agreeing, as Google found out a few months ago when they lost their case over their cookie banner, which had an easy "Agree" button but required clicking into a submenu to disagree.

The timings with which a specific server has how many users, and implicitly how long those users play. The server is uniquely identified with the uuid.

PS: in addition just by connecting to your server and thus exposing the IP of the minecraft server. the IP is very specifically a personally identifiable information

The IP would fall under legitimate interest, since it's only used for ratelimiting purposes as far as I know

The timings with which a specific server has how many users, and implicitly how long those users play. The server is uniquely identified with the uuid.
PS: in addition just by connecting to your server and thus exposing the IP of the minecraft server. the IP is very specifically a personally identifiable information

The IP would fall under legitimate interest, since it's only used for ratelimiting purposes as far as I know

While it does fall under legitimate interest, as stated before, you still have to inform the user ahead of time, with an easy option to object.

My personal suggestion, for whatever it is worth without knowing the future technical plans, would be to make an attempt to ditch storing the IP and keep all the other data (and not storing the provided data points in relation to the server ID, just aggregating them).

I don't think any of the other data falls under GDPR, particularly if forbidding plugin authors from doing so in ToS.

Additionally the website is forcefully loading parts of it from google and other servers (read third-party-servers), without previous user consent (read cookie-terror-banner).

100% agreement here. There is no reason for bStats to load this stuff from third-party CDNs instead of serving the data itself. The code is older than the GDPR and I never updated it. This is of course no excuse and I will address this issue soon and remove all third-party scripts and fonts from bStats.

Yes, they can! They even are specifically mentioned! https://gdpr.eu/eu-gdpr-personal-data/

As previously mentioned, these kinds of sources typically refer to IP addresses from user devices when discussing their use. Server IP addresses are a rare exception that does not impact most readers, so they are not explicitly distinguished. Server IP addresses are not personal data.

Maybe not in america, but in the EU, you, as a server owner, have to be identifiable. Additionally you, as a minecraft-server-hoster, have to be identifiable by your users, since minecraft-server collect login- and other stats locally, in respect of data protection laws.

I do not have access to this information and am not reasonably able to access it. From your source:

Indirect identification means you cannot identify an individual through the information you are processing alone, but you may be able to by using other information you hold or information you can reasonably access from another source.


While it does fall under legitimate interest, as stated before, you still have to inform the user ahead of time, with an easy option to object.

This is just not true. For legitimate interest the user does not have to have an option to opt-out. Consent is just one of 6 options when you are allowed to process personal data. See https://gdpr.eu/article-6-how-to-process-personal-data-legally/ (Quote: "Processing shall be lawful only if and to the extent that at least one of the following applies"). This is even what your link in the issue text starts with:

Contrary to popular belief, the EU GDPR (General Data Protection Regulation) does not require businesses to obtain consent from people before using their personal information for business purposes. Rather, consent is just one of the six legal bases outlined in Article 6 of the GDPR. Businesses must identify the legal basis for their data processing.

@MartijnMuijsers

to make an attempt to ditch storing the IP and keep all the other data

IP addresses are necessary for ratelimiting. This is nothing that can reasonably be removed. The data is stored for as short a time as possible (at most 30 minutes - the ratelimit duration).
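As a rough illustration of what short-lived IP storage for ratelimiting can look like, here is a minimal sketch. It is not the actual bStats backend: the class `IpRateLimiter` is hypothetical, and allowing one accepted submission per IP per 30-minute window is an assumption. Only the 30-minute upper bound comes from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not bStats' actual backend code.
final class IpRateLimiter {
    // 30 minutes, the maximum retention stated in the thread.
    static final long WINDOW_MILLIS = 30L * 60L * 1000L;

    // Maps an IP address to the time of its last accepted submission.
    private final Map<String, Long> lastAccepted = new HashMap<>();

    /** Returns true if a submission from this IP is accepted at time nowMillis. */
    synchronized boolean tryAccept(String ip, long nowMillis) {
        purgeExpired(nowMillis);
        if (lastAccepted.containsKey(ip)) {
            return false; // still inside the rate-limit window
        }
        lastAccepted.put(ip, nowMillis);
        return true;
    }

    /** Drops every IP whose window has elapsed, so no entry outlives the window. */
    private void purgeExpired(long nowMillis) {
        lastAccepted.values().removeIf(t -> nowMillis - t >= WINDOW_MILLIS);
    }

    /** Number of IPs currently held in memory (as of the last purge). */
    synchronized int trackedIps() {
        return lastAccepted.size();
    }
}
```

Passing the current time in as a parameter keeps the sketch testable; in this design the IP is only ever a key in an in-memory map and is removed as soon as its window has passed and the next request arrives.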

and not storing the provided data points in relation to the server ID, just aggregating them

This is already how it works. Data is only aggregated and not linked to a server id or ip.
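To make "only aggregated" concrete, here is a minimal sketch of aggregate-only storage. It does not reflect bStats' real data model: the class `AggregateChart` and the two example metrics (Java version and player count) are assumptions chosen for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of aggregate-only storage; not bStats' actual data model.
final class AggregateChart {
    // Count of servers per reported value (e.g. a Java version).
    // No server ID or IP address is stored next to the counts.
    private final Map<String, Long> counts = new TreeMap<>();
    private long totalPlayers = 0;

    /** Folds one submission into the aggregates; the submission itself is not kept. */
    synchronized void record(String javaVersion, int playerCount) {
        counts.merge(javaVersion, 1L, Long::sum);
        totalPlayers += playerCount;
    }

    /** Number of submissions that reported the given Java version. */
    synchronized long serversReporting(String javaVersion) {
        return counts.getOrDefault(javaVersion, 0L);
    }

    /** Sum of all reported player counts. */
    synchronized long totalPlayers() {
        return totalPlayers;
    }
}
```

Once a submission has been folded into the counters, there is nothing left that links a value back to the server that sent it.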

IP addresses are necessary for ratelimiting. This is nothing that can reasonably be removed. The data is stored for as short a time as possible (at most 30 minutes - the ratelimit duration).

Ah I see okay. Then it's certainly legitimate interest. (Otherwise plenty a web server in general would run afoul of GDPR.)

This is already how it works. Data is only aggregated and not linked to a server id or ip.

Oh, awesome. Good insight in general.

I trust you to make the right call. I would just humbly request to keep collecting as much of the standard data from personal systems as you feel you can. Maybe it's that I ran a server off of a Windows machine for years myself, or that I've referenced the global used version and cores from the website occasionally feeling that it also represented those types of servers, but I feel that it counts to have it represented.

A web server can use an IP as a legitimate interest, because without an IP address it wouldn't know where to send its response. In other words: the IP is required to provide its basic functionality.

On the other hand: bStats is not required for the basic functionality of almost all, if not all, plugins. But maybe I'm blaming the wrong guy. Maybe I should instead go to every plugin dev using this library, while it is not required for basic functionality. But I thought it would be a much better idea to ask for the change at the source.

You see, a plugin using bStats is functionally the same as using a CDN on a website according to the Reasoning of the European Court of Justice. (because it transmits Data to a third-party)

A web server can use an IP as a legitimate interest, because without an IP address it wouldn't know where to send its response. In other words: the IP is required to provide its basic functionality.

On the other hand: bStats is not required for the basic functionality of almost all, if not all, plugins. But maybe I'm blaming the wrong guy. Maybe I should instead go to every plugin dev using this library, while it is not required for basic functionality. But I thought it would be a much better idea to ask for the change at the source.

You see, a plugin using bStats is functionally the same as using a CDN on a website according to the Reasoning of the European Court of Justice. (because it transmits Data to a third-party)

Not to disprove any points, just some of my views on the matter. While bStats is not required for basic functionality, it does serve as a unified system for data collection. If bStats weren't a thing, many plugins would devise their own ways to achieve the same, which would likely lead to worse privacy and security for servers.

For many plugin authors the information that bStats collects is very important for deciding what versions to support, etc. Disabling bStats by default would likely lead to a steep drop in transmitted metrics, and thus a drop in their reliability.

Of course it is important to comply with privacy guidelines, so I feel a middle ground will have to be found

Just to be clear, I am not against a unified data collection system. My suggestion would be "nagging" the OPs until one makes the choice to either allow the stats or not.
e.g.
2 config values:
enabled: true/false (default false)
admin-responded: true/false (default false)
And asking the OPs and the console on each join to run either /bstats yes or /bstats no, which sets the 2 config values accordingly (until an admin has responded with either; that is what the second config value would be for).
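The proposal above can be sketched as a small state holder. This is a hypothetical illustration, not part of bStats: the class `MetricsConsent` and its methods are made up for this example, and reading and writing the actual config file is left out.

```java
// Hypothetical sketch of the proposed two-flag consent state; not part of bStats.
final class MetricsConsent {
    private boolean enabled = false;        // proposed config value "enabled", default false
    private boolean adminResponded = false; // proposed config value "admin-responded", default false

    /** Handles the argument of "/bstats yes" or "/bstats no"; returns false for anything else. */
    boolean handleCommand(String argument) {
        if ("yes".equalsIgnoreCase(argument)) {
            enabled = true;
        } else if ("no".equalsIgnoreCase(argument)) {
            enabled = false;
        } else {
            return false; // unknown argument, state is left unchanged
        }
        adminResponded = true;
        return true;
    }

    /** True until an admin has answered with either yes or no. */
    boolean shouldNag() {
        return !adminResponded;
    }

    /** Metrics are only sent after an explicit "yes". */
    boolean shouldSendMetrics() {
        return enabled && adminResponded;
    }
}
```

With this state, nothing is transmitted before an admin has answered, and the reminder stops after the first answer regardless of which one it was.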

maybe I should instead go to every plugin dev using this library, while it is not required for basic functionality.

While I agree with you that privacy is important and often neglected on the internet, I recommend you to rethink your priorities: you are talking to/about hobby developers with good intentions who do their work for no profit and no financial interest (ignore all those sometimes dubious paid plugins). On the other hand, the internet is full of million, billion and even trillion dollar companies (and countries! Think Edward Snowden) that violate the GDPR on a daily basis and disregard user privacy for their financial gain. These companies collect much more information, which is also much more sensitive. These are also the entities from which the data is actually worth protecting.

I know this is a "whataboutism" argument, and I appreciate you pointing out obvious problems like the unnecessary use of CDNs. But please keep in mind that I and most plugin developers do this with the best of intentions and for no financial gain. We (hobby developers) are not the ones people with privacy concerns should worry about.

In my attempt to provide a PR with the previously suggested functionality, I noticed an issue: with multiple plugins using bStats, every plugin would nag the admins. In Bukkit, BungeeCord and Velocity this is quite easy to resolve (I just check whether the command is already defined). Sponge unfortunately doesn't have documentation for their Command API. (Further investigation shows Sponge already has a proper way to handle this!)

So the big question: should I go ahead?

In my attempt to provide a PR with the previously suggested functionality, I noticed an issue: with multiple plugins using bStats, every plugin would nag the admins. In Bukkit, BungeeCord and Velocity this is quite easy to resolve (I just check whether the command is already defined). Sponge unfortunately doesn't have documentation for their Command API. (Further investigation shows Sponge already has a proper way to handle this!)

So the big question: should I go ahead?

I wonder if it's really such a bad thing to have all plugins nag the admin, sure it would be annoying, but on the other hand it does push the admin to act, and in addition it gives insight in which plugins send their data to BStats, which currently is not really clear (as far as I know).

No, what I mean is: you agree once, and all plugins with bStats are automatically enabled. But while consent or objection has not yet been given, every plugin would just print the same text "on top of each other", asking for the same exact command... but for now that can easily be worked around.

No, what I mean is: you agree once, and all plugins with bStats are automatically enabled. But while consent or objection has not yet been given, every plugin would just print the same text "on top of each other", asking for the same exact command... but for now that can easily be worked around.

I get what you mean, my (perhaps simplistic) reasoning is just more spam = more urgency to make a choice. Anyway, regardless of that, it sounds like a good addition (perhaps with clickable buttons in chat that run the command).

Hi, I'm a professional amateur lawyer who has had way too much exposure to legal issues thanks to a small group of people. (I really don't like these people.)

TL/DR: the default enabled: true is illegal

Please be aware, that in most cases the defaults of bStats are infringing the Laws outlined in the GDPR.
Most specifically: the collected stats are published by default.
According to the GDPR: You are required to ask the serveradmin and ALL Clients connecting to it to be included in those usage stats.

tl;dr: no.

Surely Bastian could add some features to allow a more privacy-friendly use of bStats. Being able to decide per plugin/server would be especially great.

But the core question of whether this feature is "illegal" is misrepresented.
It is not bStats or Bastian who is responsible for consent, but the plugin developer or the server operator.
Similar situation with Google Analytics:
A website uses Google Analytics to collect statistics. For this purpose, Google analyzes and aggregates the collected data. In the case of Google, these are further used for AdSense.

Or with bStats:
A plugin uses bStats to collect statistics. For this purpose, bStats analyzes and aggregates the collected data. In the case of bStats, these are published anonymously.

In both cases, in my opinion, the responsibility for obtaining consent lies not with Google or bStats, but with the operator of the website or the developer of the plugin.

The accusation that bStats is acting illegally with this is therefore incorrect in my opinion.

And a server IP is not a suitable way to identify a person. Certainly, there would be the edge case that the server could be operated on a private computer and thus a private IP address could be passed on, but such a case differentiation would not be possible without doubt and would involve an unreasonable effort.

I agree with you about the CDN thing, but he promised improvements there.

btw. Opt-out via command or GUI through bStats, is the completely wrong approach in my opinion. bStats excels at being very lightweight and having minimal impact on the server.

An opt-in or opt-out via command or GUI should therefore be regulated in a separate plugin and not in the API.

Kind regards
Translated with DeepL https://www.deepl.com/app/?utm_source=android&utm_medium=app&utm_campaign=share-translation

Disclaimer: I'm neither a lawyer nor have I studied law. Should already be visible based on my wording lol.

So the big question: should I go ahead?

No. At present, I do not plan to switch to an opt-in system.

Edit: A previous version of the comment said "opt-out". This was a typo and has been corrected to "opt-in".

tbh I don't see any personally identifiable information being collected here by bStats. Server IPs will always be public as soon as the server goes live, unless it is hosted locally and not exposed to the internet.

PS. The configs for plugins that have options to disable bStats are not created until the server is booted up. There should be an opt-in/out for this, or rather, the plugin should not enable bStats by default on its first server boot.

  1. IP is personally identifiable as per the GDPR: https://www.gdpreu.org/the-regulation/key-concepts/personal-data/
  2. markusmarkusz what now, are you a professional lawyer or not? ("professional" doing something as a paying job, "amateur" not doing something as a paying job, the exact opposite of "professional")
  3. Again, the issue is, that you can't opt-out before the data is transmitted. The implementing plugin developer is at fault for just copying bStats over. But it would be smart to update bStats to avoid Plugin-Devs infringing on the GDPR
  4. Regarding the comparison to Google Analytics: You ever wondered what those "Cookie"-Nag-Screens are? Those are to get consent from the user! Which I am requesting bStats-Plugins to also adhere to!
  5. Bastian what do you mean "At present, I do not plan to switch to an opt-out system."?! You currently have an opt-out system. That is where the entire issue lies.
  6. Saying "most users aren't affected" isn't a valid excuse according to the GDPR reasonings

I proposed several solutions to the issue and even offered to implement some of those.

Closing this, because of entire lack of concern by Owner.

I seriously despise people only reading the part they want and ignoring everything else (looking at the repo-owner, his tagging of this repo really describes their mental capacity)

  1. Bastian what do you mean "At present, I do not plan to switch to an opt-out system."?! You currently have an opt-out system. That is where the entire issue lies.

Sorry, for the confusion. I meant to write "opt-in". I have no plans to switch to an opt-in system.

  1. IP is personally identifiable as per the GDPR: https://www.gdpreu.org/the-regulation/key-concepts/personal-data/

I'm not going to discuss this again.

Read your own source:

There is a common assumption that according to the GDPR, all organizations must obtain consent in order to process personal data, but this is not the case. Consent is just one of the options that companies have, as this article has shown, and in fact, it is not always the best option.

bStats has a legitimate interest in processing and storing IP addresses for ratelimiting purposes. When a legitimate interest exists, active consent is not necessary.