Anonymous usage telemetry for the conda client

Primary language: Python. License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause).

anaconda-anon-usage

Simple, anonymous telemetry for conda

This package augments the request header data that conda delivers to package servers during index and package requests. Specifically, three randomly generated tokens are appended to the user-agent string that conda already sends with each request.

These tokens are designed to reveal no personally identifying information, yet they enable Anaconda to better disaggregate individual user patterns from our access logs. Use cases include:

  • Counting the number of conda clients on a network
  • Providing more accurate estimates of package popularity
  • Other statistical analyses of conda usage patterns; e.g., environment count, frequency of updates, etc.

This package is installed as a dependency of certain Anaconda-branded packages, such as Navigator. While we ask that you allow us to gather this data to help us improve our user experience, the additional behavior can be disabled with a single conda config command.

Installation

You will likely not need to install anaconda-anon-usage yourself, as it will come as a dependency of other Anaconda packages such as anaconda-navigator. Nevertheless, it can readily be installed as follows:

conda install -n base anaconda-anon-usage

This package has no dependencies other than conda itself. It employs a conda pre-command plugin to modify the user-agent string.

Explaining the behavior

The easiest way to verify that the package is active is to run conda info and examine the user-agent line. The user-agent string will look something like this (split over two lines for readability):

conda/22.11.1 requests/2.28.1 CPython/3.10.4 Darwin/22.2.0
OSX/13.1 aau/0.1.0 c/sgzzP8ytS_aywmqkDTJ69g s/3ItHH93LRUmCoJZkMOiD3g e/hCmim1vFSbinlm4waR6dZw

The first five tokens constitute the standard user-agent string that conda normally sends with HTTP requests, and are similar to what a standard web browser such as Chrome, Safari, or Edge sends every time a page is requested. The last four tokens, however, are generated by the anaconda-anon-usage package:

  • The version of the anaconda-anon-usage package, aau/0.1.0.
  • A client token c/sgzzP8ytS_aywmqkDTJ69g, generated once by the conda client and saved in the ~/.conda directory, so that the same value is delivered as long as that directory is maintained.
  • A session token s/3ItHH93LRUmCoJZkMOiD3g, generated afresh every time conda is run.
  • An environment token e/hCmim1vFSbinlm4waR6dZw, generated uniquely for each separate conda environment (-n <name> or -p <prefix>).
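To make the layout of the user-agent string concrete, here is a minimal Python sketch that splits it into its name/value fields. The function name parse_user_agent is hypothetical and is not part of conda or anaconda-anon-usage; the sketch only assumes that every token has the form name/value, as in the example above.

```python
def parse_user_agent(user_agent: str) -> dict[str, str]:
    """Split a user-agent string into {name: value} pairs.

    Assumes every whitespace-separated token has the form name/value.
    """
    fields = {}
    for token in user_agent.split():
        name, sep, value = token.partition("/")
        if sep:  # ignore anything that is not name/value
            fields[name] = value
    return fields


ua = (
    "conda/22.11.1 requests/2.28.1 CPython/3.10.4 Darwin/22.2.0 "
    "OSX/13.1 aau/0.1.0 c/sgzzP8ytS_aywmqkDTJ69g "
    "s/3ItHH93LRUmCoJZkMOiD3g e/hCmim1vFSbinlm4waR6dZw"
)
fields = parse_user_agent(ua)

# The four fields contributed by anaconda-anon-usage
for key in ("aau", "c", "s", "e"):
    print(key, fields.get(key))
```

A field that is missing from the string simply comes back as None from fields.get.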

Here is an easy way to see precisely what is being shipped to the upstream server on Unix:

conda clean --index --yes
conda search -vvv fakepackage 2>&1 | grep 'User-Agent'

This produces output like the following:

> User-Agent: conda/22.11.1 requests/2.28.1 CPython/3.10.4 Darwin/22.2.0 OSX/13.1 aau/0.1.0 c/sgzzP8ytS_aywmqkDTJ69g s/3ItHH93LRUmCoJZkMOiD3g e/hCmim1vFSbinlm4waR6dZw

Anonymous token design

These three tokens are designed so that they do not reveal identifying information about the user or the host. Specifically:

  • Each token is generated from a uuid4, which uses os.urandom data, and is encoded with URL-safe base64.
  • The client token is saved in ~/.conda/aau_token, so that it can be read with every conda command. If for some reason the token cannot be created or read, the c/ token will be omitted.
  • Similarly, the environment token is saved in $CONDA_PREFIX/etc/aau_token, so it can be read with every conda command applied to that environment. If for some reason the token cannot be created or read, the e/ token will be omitted.
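The scheme described in this list can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the package's actual code: the function names random_token and saved_token are hypothetical, and stripping the base64 padding is inferred from the 22-character tokens shown in the examples above.

```python
import base64
import uuid
from pathlib import Path
from typing import Optional


def random_token() -> str:
    """Return a 22-character URL-safe token built from a random uuid4."""
    raw = uuid.uuid4().bytes  # 16 bytes drawn from os.urandom
    # Assumption: the trailing "==" padding is dropped, giving 22 characters
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def saved_token(path: Path) -> Optional[str]:
    """Read the token stored at `path`, creating it on first use.

    Returns None if the token can be neither read nor created, in which
    case the caller would omit the corresponding c/ or e/ field.
    """
    try:
        if path.exists():
            return path.read_text().strip()
        token = random_token()
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(token)
        return token
    except OSError:
        return None
```

Under this sketch, a session token would be a fresh random_token() on every run, while the client and environment tokens would come from saved_token(Path.home() / ".conda" / "aau_token") and from the etc/aau_token file under the environment prefix, respectively.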

In short, these tokens were designed so that they cannot be used to recover an underlying username, hostname, or environment name. The underlying purpose of these tokens is disaggregation: to distinguish between different users, sessions, and/or environments for analytics purposes. This works because the probability that two different users will produce the same tokens is vanishingly small.
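To put a rough number on "vanishingly small", the following sketch applies the standard birthday-bound approximation. It relies on the fact that a version-4 UUID carries 122 random bits (128 bits minus 6 fixed version and variant bits); the function name collision_probability and the choice of one billion tokens are illustrative assumptions, not figures from Anaconda.

```python
import math

RANDOM_BITS = 122  # a version-4 UUID has 128 bits, 6 of which are fixed


def collision_probability(n_tokens: int, bits: int = RANDOM_BITS) -> float:
    """Birthday-bound estimate that any two of n_tokens tokens coincide."""
    pairs = n_tokens * (n_tokens - 1) / 2
    # 1 - exp(-pairs / 2**bits), computed stably for very small exponents
    return -math.expm1(-pairs / 2**bits)


# Hypothetical population of one billion independently generated tokens
print(collision_probability(10**9))
```

Even for a billion tokens, the estimate is well below one in a quintillion.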

Disabling

Because this package is delivered as a dependency of other Anaconda packages, you may not be able to remove it from your conda environment. You may, however, disable the delivery of the three tokens with the following command:

conda config --set anaconda_anon_usage off

(false or no may also be used). With this setting in place, the three random tokens are removed and only the package version remains; e.g.

user-agent : conda/22.11.1 requests/2.28.1 CPython/3.10.4 Darwin/22.2.0 OSX/13.1 aau/0.1.0

To re-enable it, you may use the command

conda config --set anaconda_anon_usage on

(true or yes may also be used).