astronomer/astronomer-cosmos

Sensitive information being logged by ProfileConfig

Closed this issue · 4 comments

Context
While using Airflow connections via Cosmos’ profile mappings, we are seeing sensitive information dumped into the Airflow logs. When run_command is executed, ensure_profile() of ProfileConfig is called and the content of the profile is logged. See the code here: https://github.com/astronomer/astronomer-cosmos/blob/main/cosmos/config.py#L295

Since we are using the GoogleCloudServiceAccountDict mapping with a service account connection, the content of the service account keyfile is written to the Airflow logs.

Goal
We can always extend ProfileConfig ourselves and override that logging behaviour; however, I would like to confirm whether I am doing something I shouldn't, because logging credentials certainly seems like an odd practice.

Thanks in advance.
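
For illustration, here is a minimal sketch of the kind of redaction an override could apply before the profile content reaches the logs. It is not Cosmos code: the `redact` helper, the `REDACTED` marker, and the sample profile are all hypothetical, and the profile layout only loosely follows a BigQuery service-account-json profile.

```python
REDACTED = "***"  # hypothetical placeholder written in place of secret values


def redact(profile, secret_keys):
    """Return a copy of `profile` with the values of `secret_keys` masked at any depth."""
    if isinstance(profile, dict):
        return {
            key: REDACTED if key in secret_keys else redact(value, secret_keys)
            for key, value in profile.items()
        }
    if isinstance(profile, list):
        return [redact(item, secret_keys) for item in profile]
    # Scalars (str, int, None, ...) are returned unchanged.
    return profile


# Hypothetical profile content, shaped like a BigQuery service-account-json profile.
profile = {
    "my_profile": {
        "target": "dev",
        "outputs": {
            "dev": {
                "type": "bigquery",
                "method": "service-account-json",
                "project": "my-project",
                "keyfile_json": {
                    "client_email": "svc@my-project.iam.gserviceaccount.com",
                    "private_key_id": "abc123",
                    "private_key": "-----BEGIN PRIVATE KEY-----...",
                },
            }
        },
    }
}

# Only the masked copy would be passed to the logger; the original stays intact.
safe = redact(profile, {"private_key_id", "private_key"})
```

The original dictionary is left untouched, so the real values can still be written to the profile file while only the masked copy is logged.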

To address the issue of sensitive information being logged by ProfileConfig when using Airflow connections via Cosmos’ profile mappings, you can utilize the use_mock_values parameter in the ensure_profile method. By setting use_mock_values=True, the method will not log sensitive details, which is particularly useful when dealing with GoogleCloudServiceAccountDict mappings. This solution does not require extending or overriding ProfileConfig, providing a straightforward way to prevent the logging of sensitive information without altering the existing codebase. For implementation, refer to the ensure_profile method in the Cosmos repository.


@jsborralho thank you very much for reporting this issue!

Could you confirm which version of Cosmos you're using?

Cosmos supports declaring fields as sensitive. In those cases the value is neither hard-coded in the profile nor printed, but passed through an environment variable:

secret_fields = ["private_key_id", "private_key"]

You can see this being successfully used in Postgres:

secret_fields = [
    "password",
]
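
As a rough sketch of the idea behind secret_fields (not the actual Cosmos implementation), secret values can be moved into environment variables while the profile only carries a dbt `env_var` reference. The `split_secrets` helper and the `DEMO_CONN_` prefix below are hypothetical names chosen for illustration.

```python
def split_secrets(conn_type, fields, secret_fields):
    """Split profile fields into loggable profile values and secret environment variables.

    Secret fields are replaced in the profile by a dbt `env_var` Jinja reference,
    and their real values are returned separately so they can be exported to the
    environment of the dbt process instead of being written to the profile.
    """
    profile_values = {}
    env_vars = {}
    for name, value in fields.items():
        if name in secret_fields:
            # Hypothetical naming scheme for the environment variable.
            env_name = f"DEMO_CONN_{conn_type}_{name}".upper()
            env_vars[env_name] = str(value)
            profile_values[name] = "{{ env_var('" + env_name + "') }}"
        else:
            profile_values[name] = value
    return profile_values, env_vars


# Hypothetical Postgres-style connection fields.
fields = {"host": "db.example.com", "user": "dbt", "password": "s3cret"}
profile_values, env_vars = split_secrets("postgres", fields, ["password"])
```

With this split, logging `profile_values` is safe because it only contains the reference, while `env_vars` would be handed to the subprocess environment and never logged.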

Would you like to contribute to improving the behaviour in GoogleCloudServiceAccountDictProfileMapping?

Hello @tatiana, thanks for the quick support.

Will do.
I will take a deeper look at the secret_fields logic, since it's quite unfamiliar to me.

Thank you!

That's great, thank you, @jsborralho :)