Privacy-respecting usage analytics collection.

Fideslog: Privacy-Respecting Usage Analytics

A part of the greater Fides ecosystem.

Overview

Fideslog is the API server, developer SDK, and supporting infrastructure that gives Ethyca an understanding of how users interact with fides tooling. Analytics are used only as a factor in Ethyca's internal product roadmap decisions and as insight into product adoption. Information collected by fideslog is received via HTTPS request, stored in a secure database, and never shared with third parties unless required by law.

Using Fideslog

Ethyca's mission is to build trustworthy software. Fideslog requires that analytics data be collected only after receiving explicit consent from the user. Each SDK library exposes convenience functions, methods, and/or constants to help applications meet this requirement. While the specific workflow to establish user consent will be unique to each application, the process must generally adhere to the following principles:

  1. Notify the user of Ethyca's request for analytics data as soon as possible
  2. Describe the intended usage of the collected data clearly, completely, and concisely
  3. Provide a simple and accessible mechanism for a user to opt out of data collection
  4. Do not repeatedly ask for consent; store the user's decision as state within the application or its configuration
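
The fourth principle can be sketched as follows. This is a minimal illustration and not part of any fideslog SDK: the `analytics_opt_out` key, the JSON state file, and both function names are hypothetical, chosen only to show a decision being stored once and reused.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def load_opt_out(state_file: Path) -> Optional[bool]:
    """Return the stored decision, or None if the user has not been asked yet."""
    if not state_file.exists():
        return None
    return json.loads(state_file.read_text()).get("analytics_opt_out")


def resolve_opt_out(state_file: Path, ask_user: Callable[[], bool]) -> bool:
    """Ask for consent at most once, then reuse the stored decision.

    `ask_user` is a hypothetical callback that returns True when the user
    consents to analytics collection.
    """
    stored = load_opt_out(state_file)
    if stored is not None:
        return stored  # A decision already exists, so do not prompt again.

    opt_out = not ask_user()
    state_file.write_text(json.dumps({"analytics_opt_out": opt_out}))
    return opt_out
```

An application would call `resolve_opt_out` at startup and skip all analytics submission when it returns `True`.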

The Fideslog API

The fideslog API server exposes the endpoints that handle interactions with analytics event data. For a complete list of available endpoints, as well as request requirements and response details, see the automatically generated API spec documentation.

The Fideslog SDKs

The official fideslog SDK libraries are the recommended means by which to automate the submission of analytics data to the fideslog API server from any application. For language-specific documentation, best practice recommendations, and code examples, see the dedicated README for each library.

Getting Started

Installation

The simplest way to run the API server locally is via Docker and Make. Ensure that both tools are installed, and clone this repository. Then, from the repository's root directory, run the following command:

make api

By default, this will start an instance of the fideslog API server on localhost:8080, and attach to the container. Log output will be written to stdout within the container instance.

📝 Note The API server will error when using only the provided fideslog.toml configuration file and no additional environment variables. See Enabling Database / S3 Access for Local Development below for the configuration changes necessary to ensure a successful connection to the supporting database.

Configuration

The recommended way to configure the fideslog API server is with a fideslog.toml configuration file, but local environment variables may be used to override any values. The fideslog API server will look for a configuration file in the following locations (ordered by priority):

  1. A path defined by a FIDESLOG__CONFIG_PATH environment variable
  2. The current working directory
  3. The parent directory of the current working directory
  4. The user's $HOME directory

If no configuration file is found in any of the above locations, the default configuration values (specified in the table below) will be used. If any environment variables are set, they will continue to override the defaults.
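
The lookup order above can be sketched in Python. This is an illustration of the documented priority only, not the server's actual implementation. The function name is hypothetical, and the handling of `FIDESLOG__CONFIG_PATH` (accepting either a file path or a directory containing fideslog.toml) is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

CONFIG_FILE_NAME = "fideslog.toml"


def find_config_file(
    env: Mapping[str, str] = os.environ,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the first configuration file found, in documented priority order."""
    cwd = cwd or Path.cwd()
    home = home or Path.home()

    candidates = []
    explicit = env.get("FIDESLOG__CONFIG_PATH")
    if explicit:
        candidates.append(Path(explicit))  # 1. Explicit path from the environment
    candidates += [cwd, cwd.parent, home]  # 2-4. cwd, its parent, then $HOME

    for candidate in candidates:
        # Assumption: a directory candidate is expected to contain fideslog.toml.
        path = candidate / CONFIG_FILE_NAME if candidate.is_dir() else candidate
        if path.is_file():
            return path
    return None  # Nothing found: the caller falls back to the default values.
```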

Options

| Name | Configuration File Section | Environment Variable Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- | --- | --- |
| account | [database] | FIDESLOG__DATABASE_ACCOUNT | String | Yes | | The Snowflake account in which the fideslog database can be found. Ethyca employees may access this value internally. |
| database | [database] | FIDESLOG__DATABASE_DATABASE | String | No | "raw" | The name of the Snowflake database in which analytics events should be stored. |
| db_schema | [database] | FIDESLOG__DATABASE_DB_SCHEMA | String | No | "fides" | The Snowflake database schema to target. |
| encryption_key | [database] | FIDESLOG__DATABASE_ENCRYPTION_KEY | String | No | "fides" | The AES encryption key to use when encrypting user email addresses at rest. |
| password | [database] | FIDESLOG__DATABASE_PASSWORD | String | Yes | | The password associated with the Snowflake account for the configured user. Ethyca employees may access this value internally. |
| role | [database] | FIDESLOG__DATABASE_ROLE | String | No | "event_writer" | The permissions with which to access the specified Snowflake database. |
| user | [database] | FIDESLOG__DATABASE_USER | String | Yes | | The ID of the user with which to authenticate to Snowflake. Ethyca employees may access this value internally. |
| warehouse | [database] | FIDESLOG__DATABASE_WAREHOUSE | String | No | "fides_log" | The Snowflake data warehouse in which the fideslog database can be found. |
| destination | [logging] | FIDESLOG__LOGGING_DESTINATION | String | No | "stdout" | The absolute path to a file or directory in which logs should be stored. If a directory is passed, a fideslog.log file will be created in that directory. |
| level | [logging] | FIDESLOG__LOGGING_LEVEL | String | No | "INFO" | The desired logging level. Accepts DEBUG, INFO, WARNING, ERROR, or CRITICAL. Case insensitive. |
| host | [server] | FIDESLOG__SERVER_HOST | String | No | "0.0.0.0" | The hostname on which the API server should respond. |
| hot_reload | [server] | FIDESLOG__SERVER_HOT_RELOAD | Boolean | No | False | Whether or not to automatically apply code changes during local development. |
| port | [server] | FIDESLOG__SERVER_PORT | Integer | No | 8080 | The port number on which the API server should listen. |
| request_rate_limit | [server] | FIDESLOG__SERVER_REQUEST_RATE_LIMIT | String | No | "100/minute" | The number of requests allowed per IP address per unit of time. |
| bucket_name | [storage] | FIDESLOG__STORAGE_BUCKET_NAME | String | Yes | | The name of the bucket in which to store event data. |
| region_name | [storage] | FIDESLOG__STORAGE_REGION_NAME | String | No | | The AWS region to be used. Optional if the default AWS environment variable is used. |
| aws_access_key_id | [storage] | FIDESLOG__STORAGE_AWS_ACCESS_KEY_ID | String | No | | The AWS access key to be used. Optional if the default AWS environment variable is used. |
| aws_secret_access_key | [storage] | FIDESLOG__STORAGE_AWS_SECRET_ACCESS_KEY | String | No | | The AWS secret access key to be used. Optional if the default AWS environment variable is used. |
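
The environment variable names in the table follow a regular pattern: `FIDESLOG__`, the upper-cased section, an underscore, and the upper-cased option name. The sketch below shows that pattern and how an override might be layered over file values. Both function names are hypothetical, and the server's real loading logic may differ (for example, in how it converts types).

```python
import os
from typing import Any, Dict, Mapping


def env_var_name(section: str, option: str) -> str:
    """Build the override variable name for a section/option pair."""
    return f"FIDESLOG__{section.upper()}_{option.upper()}"


def apply_env_overrides(
    config: Mapping[str, Mapping[str, Any]],
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Dict[str, Any]]:
    """Return a copy of `config` with any matching environment values applied.

    Only options already present in `config` are considered, and overrides
    are kept as raw strings; type conversion is deliberately left out.
    """
    merged = {section: dict(options) for section, options in config.items()}
    for section, options in merged.items():
        for option in options:
            value = env.get(env_var_name(section, option))
            if value is not None:
                options[option] = value
    return merged
```

For example, `env_var_name("server", "port")` yields `FIDESLOG__SERVER_PORT`, matching the table.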

Example Configuration File

⚠️ WARNING Never commit changes to the included fideslog.toml file to version control!
# fideslog.toml

[database]
account = "--REDACTED--"
database = "raw"
db_schema = "fides"
password = "--REDACTED--"
role = "event_writer"
user = "--REDACTED--"
warehouse = "fides_log"

[logging]
level = "info"

[server]
host = "localhost"
hot_reload = true
port = 8080

[storage]
bucket_name = "fideslog-test"

Enabling Database / S3 Access for Local Development

The account, user, and password configuration options described above must be populated for the fideslog API server to connect to the supporting database. For S3 access, bucket_name, region_name, aws_access_key_id, and aws_secret_access_key are also required. These values are available only to Ethyca employees, internally. For convenience, the included fideslog.env file automates the process of populating the required values as environment variables, as long as the user's local environment includes the following:

# Add to .zshrc, .bash_profile, etc.

export SNOWFLAKE_ACCOUNT="--REDACTED--"
export SNOWFLAKE_DB_USER="--REDACTED--"
export SNOWFLAKE_DB_PASSWORD="--REDACTED--"
export FIDESLOG__STORAGE_AWS_DEFAULT_REGION="--REDACTED--"
export FIDESLOG__STORAGE_AWS_ACCESS_KEY_ID="--REDACTED--"
export FIDESLOG__STORAGE_AWS_SECRET_ACCESS_KEY="--REDACTED--"
export FIDESLOG__STORAGE_BUCKET_NAME="--REDACTED--"
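
A quick pre-flight check can confirm that these variables are set before running `make api`. This is a sketch only: the variable names are copied from the list above, and `unset_variables` is a hypothetical helper rather than anything shipped with fideslog.

```python
import os
from typing import List, Mapping

# Names copied from the shell profile snippet above.
EXPECTED_VARIABLES = (
    "SNOWFLAKE_ACCOUNT",
    "SNOWFLAKE_DB_USER",
    "SNOWFLAKE_DB_PASSWORD",
    "FIDESLOG__STORAGE_AWS_DEFAULT_REGION",
    "FIDESLOG__STORAGE_AWS_ACCESS_KEY_ID",
    "FIDESLOG__STORAGE_AWS_SECRET_ACCESS_KEY",
    "FIDESLOG__STORAGE_BUCKET_NAME",
)


def unset_variables(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the expected variables that are missing or empty in `env`."""
    return [name for name in EXPECTED_VARIABLES if not env.get(name)]


if __name__ == "__main__":
    for name in unset_variables():
        print(f"Not set: {name}")
```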

Deployment

The creation of a new tag in this repository will trigger the deployment workflow via GitHub Actions.

In general, tags are only created as part of creating a new release. All releases must include a changelog. Any breaking changes to the API and/or SDK libraries will result in a new major version release/tag. To ensure compatibility, any pull requests resulting in breaking API changes must also include updates to all SDK libraries.

Learn More

The Fides core team is committed to providing a variety of documentation to help get you started using Fideslog. All interactions are governed by the Fides Code of Conduct.

Support

Join the conversation on Slack and Twitter!

Contributing

We welcome and encourage all types of contributions and improvements! Please see our contribution guide for opening issues about bugs, new features, and security or experience enhancements.

Read about the Fides community or dive into the development guides for information about contributions, documentation, code style, testing, and more. Ethyca is committed to fostering a safe and collaborative environment, and all interactions are governed by the Fides Code of Conduct.

License

The Fides ecosystem of tools (Fideslog, Fidesops, and Fidesctl) is licensed under the Apache Software License Version 2.0. Fides tools are built on Fideslang, the Fides language specification, which is licensed under CC BY 4.0.

Fides is created and sponsored by Ethyca: a developer tools company building the trust infrastructure of the internet. If you have questions or need assistance getting started, let us know at fides@ethyca.com!