Adjusting data sent to api.wordpress.org
Closed this issue · 12 comments
Hi Yahnis,
It seems to me that when a plugin's updates are being handled by this class, the plugin's data should be stripped from what is sent to https://api.wordpress.org/plugins/update-check/(version)
in the built-in WordPress plugin updates check.
This avoids any issues from wordpress.org's API sending back unwanted/clashing data, and also deals with the unwanted data leak (for most people, I'd assume; for some websites depending on the nature of the website it might be considered a GDPR or other privacy-law issue; and commercial plugin vendors might also consider it an unwanted leak of their full active customer list) of a list of their non-wordpress.org plugins being sent to wordpress.org.
So, I'd propose that an appropriate hook be used to remove the entry for plugins using the class, since api.wordpress.org does not need to receive that information. In the call to the above URL, in the body (of the POST
request), there's a plugins
key, which is itself an array. The array key is the plugin relative path.
I'm happy to create a patch if it's something you'd consider merging.
David
As well as removing the entry from the plugins
array, the entry should be removed from the active
array too.
It seems there are two separate issues here.
Legal: PUC doesn't send that data to api.wordpress.org
. WordPress does. WordPress will do that even if you're not using PUC, and even if your plugin is inactive, which means you can't fully prevent it by using a filter. So if this is a legal issue for someone, they should take it up with WordPress core, not me.
Technical: I thought about this suggestion for a bit and couldn't come up with any practical drawbacks. I would be willing to merge something like this. It might be nice to include a custom filter that can turn off this behaviour since it's technically a backwards-incompatible change, but I'm not sure even that's necessary. I don't recall anyone actually wanting to still include their plugin in the default update requests while they're using PUC.
Hi,
Thank you. Yes, whether it's a legal issue for anyone isn't really our question. Like you, I've been working with WP plugins for a long time, and I can't think of why anyone would need this info to be included when they're using an external updates server.
Here's a PR: #579
N.B. I've never developed any custom themes, and don't have a testing environment for that. The above code worked for me in my plugins.
And if someone wants some mu-plugin code to filter this for all third-party plugins and themes on their site unconditionally (whether active or not), then this should work:
add_filter('http_request_args', function ($args, $url) {
$parsed_url = parse_url($url);
if (!isset($parsed_url['host']) || 'api.wordpress.org' !== strtolower($parsed_url['host'])) return $args;
//Uncomment these lines if you also want to remove your site URL from the data sent
//$args['user-agent'] ='WordPress';
//$args['headers'] = array();
foreach (array('plugins', 'themes') as $entity) {
$removed_keys = array();
if (!empty($args['body'][$entity])) {
$items = json_decode($args['body'][$entity], true);
foreach ($items[$entity] as $key => $item) {
// https://make.wordpress.org/core/2021/06/29/introducing-update-uri-plugin-header-in-wordpress-5-8/
if (!empty($item['UpdateURI']) && !preg_match('#^(https?://)?w(ordpress)?\.org#', $item['UpdateURI']) && 'false' !== $item['UpdateURI']) {
unset($items[$entity][$key]);
$removed_keys[] = $key;
}
}
}
if (!empty($removed_keys) && !empty($items['active'])) {
foreach ($removed_keys as $removed_key) {
unset($items['active'][$removed_key]);
}
$args['body'][$entity] = json_encode($items);
}
}
return $args;
}, 10, 2);
David
Thanks for merging. Since there's not been a release since February, any chance of one happening in the near future so that we can pull in all the commits since then without needing to use master?
I refactored some things, adjusted code style, and hopefully improved theme-related compatibility. It probably needs more testing. If you have time, take a look / try it out.
And sure, if I don't forget, I'll make a release later this week.
Thank you! Only 3 minor comments:
-
You added a further check on the precise URL. My reason for not doing that was that I can't imagine any reason, ever, why a non-wordpress.org plugin should be included in results sent back to wordpress.org. The code already carried out precise checks for the plugin file and only touched such data. So, it seemed to me unnecessary to narrow it down further, as the code will change nothing unless the precise plugin data is identified. Even if some future API is invented that's not an updates check, it still doesn't need non-wordpress.org plugins to be included in the data sent to wordpress.org. (Or if it does for someone, the filter is there).
-
For the same reasons, I don't see a reason to require a
1
in an API version number check. Since the plugin filename is precisely checked, and since it's known not to belong to wordpress.org, what harm can come out of omitting it? -
The filter name was changed to
remove_from_core_update_check
- but it's not a core update check, so that seems confusing.
Regarding precise checks: To me, this is a compatibility concern. Heuristics like "change only what you meant to change" and "minimize impact area". Sure, the additional checks are probably not necessary. But why leave extra room for errors when there's an easy way to avoid at least some? I don't want PUC to randomly start breaking things on people's sites if two years from now wordpress.org suddenly introduces a new search API that uses a field like ['themes']['keyword']
and some users happen to have installed a theme named "Keyword".
As for the new filter name, I'm not entirely happy about it either. I thought the original was too generic, so the new one mentions an "update check" which is performed by "core" (and the API is an update-check
API). However, it could be confused with a WordPress (core) version check. Maybe something like remove_from_default_update_checks
?
I haven't looked at and have no experience with the data format for themes. For plugins, the whole .php
filename from parent directory downward is there, so I thought that for plugins, it was precisely targetted: removing any references to that plugin from the data (whether it's an update check or not). But I agree in general, that being conservative is good.
Ah, I see, core as in WordPress core performing the check; I didn't think of that. Yes, I think your new suggestion is a good one.
A new release that includes this change is out now.
Thanks; I've tested it (plugins only), and it works for me.
Thanks to both for you for suggesting/making this change. With all the issues coming out around Matt/WPE recently, it is a good move to limit unnecessary data being sent to wordpress.org.
All right, I'll close this as completed. If you run into any bugs with this implementation, feel free to reopen or create a new issue.