log2timeline/dftimewolf

Google Cloud logging libraries misconfiguration

AnatoliyNelyubin opened this issue · 7 comments

After a clean installation of all requirements with pip install -r requirements.txt in a new Python environment:

Successfully installed adal-1.2.7 aiohttp-3.8.1 aiosignal-1.2.0 altair-4.2.0 amqp-5.1.1 async-timeout-4.0.2 attrs-21.4.0 azure-common-1.1.28 azure-core-1.24.2 azure-identity-1.10.0 azure-mgmt-compute-23.1.0 azure-mgmt-core-1.3.1 azure-mgmt-monitor-3.1.0 azure-mgmt-network-20.0.0 azure-mgmt-resource-21.1.0 azure-mgmt-storage-20.0.0 azure-storage-blob-12.13.0 beautifulsoup4-4.11.1 billiard-3.6.4.0 boto3-1.24.31 botocore-1.27.31 cachetools-4.2.4 celery-5.2.7 certifi-2022.6.15 cffi-1.15.1 chardet-4.0.0 charset-normalizer-2.1.0 click-8.1.3 click-didyoumean-0.3.0 click-plugins-1.1.1 click-repl-0.2.0 cmd2-2.4.2 colorlog-6.6.0 cryptography-3.3.2 deprecated-1.2.13 docker-5.0.3 ecdsa-0.18.0 entrypoints-0.4 filelock-3.7.1 frozenlist-1.3.0 future-0.18.2 gax-google-logging-v2-0.8.3 gax-google-pubsub-v1-0.8.3 gcloud-0.18.3 google-api-core-1.32.0 google-api-python-client-2.53.0 google-auth-1.35.0 google-auth-httplib2-0.1.0 google-auth-oauthlib-0.5.2 google-cloud-bigquery-1.20.0 google-cloud-core-1.7.3 google-cloud-datastore-2.0.0 google-cloud-error-reporting-1.5.2 google-cloud-logging-2.3.1 google-cloud-pubsub-1.7.0 google-cloud-storage-2.2.1 google-crc32c-1.3.0 google-gax-0.12.5 google-resumable-media-2.3.3 googleapis-common-protos-1.56.0 grpc-google-iam-v1-0.12.4 grpc-google-logging-v2-0.8.1 grpc-google-pubsub-v1-0.8.1 grpcio-1.47.0 grr-api-client-3.4.6.post0 grr-response-proto-3.4.6.post0 httplib2-0.20.4 idna-2.10 isodate-0.6.1 jinja2-3.1.2 jmespath-1.0.1 jsonschema-4.7.2 kombu-5.2.4 kubernetes-24.2.0 libcloudforensics-20220718 libcst-0.4.7 markupsafe-2.1.1 msal-1.18.0 msal-extensions-1.0.0 msrest-0.7.1 msrestazure-0.6.4 multidict-6.0.2 mypy-extensions-0.4.3 netaddr-0.8.0 networkx-2.8.3 numpy-1.23.1 oauth2client-4.1.3 oauthlib-3.2.0 packaging-21.3 pandas-1.4.3 ply-3.8 portalocker-2.5.1 prettytable-3.3.0 prometheus-client-0.14.1 prompt-toolkit-3.0.30 proto-plus-1.19.6 protobuf-3.12.2 psq-0.2.1 psutil-5.9.1 pyasn1-0.4.8 pyasn1-modules-0.2.8 pycparser-2.21 pycryptodome-3.15.0 
pycryptodomex-3.15.0 pyjwt-1.7.1 pyopenssl-21.0.0 pyparsing-2.4.7 pypdf2-2.6.0 pyperclip-1.8.2 pyrsistent-0.18.1 python-dateutil-2.8.2 pytz-2022.1 pyyaml-6.0 redis-4.3.4 requests-2.25.1 requests-oauthlib-1.3.1 rsa-4.8 s3transfer-0.6.0 six-1.16.0 soupsieve-2.3.2.post1 sshpubkeys-3.3.1 timesketch-api-client-20210602 timesketch-import-client-20210602 toolz-0.12.0 turbinia-20220216 typing-extensions-4.3.0 typing-inspect-0.7.1 uritemplate-4.1.1 urllib3-1.26.10 vine-5.0.0 vt-py-0.14.0 wcwidth-0.2.5 websocket-client-1.3.3 werkzeug-1.0.1 wrapt-1.12.1 xlrd-2.0.1 yarl-1.7.2

GCP logging in this venv stops working:

$ python -c 'import google.cloud.logging; client = google.cloud.logging.Client()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'google.cloud.logging' has no attribute 'Client'

In the installed version, google-cloud-logging-2.3.1, the attribute shouldn't be missing (see source). I suspect that another library is turning google.cloud.logging into a namespace package. Can you try installing in a new virtualenv and verify whether you get the same error?
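One way to test the namespace-package hypothesis is to look at what the interpreter actually loaded. The sketch below is only illustrative: `describe_module` is a hypothetical helper, not part of dftimewolf or google-cloud-logging. It reports whether the module was already registered before the import, where it was loaded from, and whether the expected attribute exists.

```python
import importlib
import sys


def describe_module(name, attribute):
    """Report where a module came from and whether it has an attribute.

    Hypothetical diagnostic helper, not part of any library mentioned here.
    """
    # A module that is already in sys.modules was registered earlier, either
    # by a previous import or by a .pth file processed at interpreter startup.
    preloaded = name in sys.modules
    module = importlib.import_module(name)
    return {
        "preloaded": preloaded,
        # Namespace packages have no __init__.py, so __file__ is None or unset.
        "file": getattr(module, "__file__", None),
        "path": list(getattr(module, "__path__", [])),
        "has_attribute": hasattr(module, attribute),
    }


if __name__ == "__main__":
    # Stand-in example using the standard library; in the affected venv one
    # would pass "google.cloud.logging" and "Client" instead.
    print(describe_module("json", "loads"))
```

Run in a fresh interpreter, a result with `preloaded` set and `has_attribute` unset would suggest that something other than the package's own `__init__.py` created the module object.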

Thanks for your answer! I am afraid the issues start as early as the installation phase. On a completely fresh Ubuntu 22.04, pip install -r requirements.txt does not install all packages properly:

Collecting typing-extensions>=4.0.1
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==. These do not:
    typing-extensions>=4.0.1 from https://files.pythonhosted.org/packages/ed/d6/2afc375a8d55b8be879d6b4986d4f69f01115e795e36827fd3a40166028b/typing_extensions-4.3.0-py3-none-any.whl#sha256=25642c956049920a5aa49edcdd6ab1e06d7e5d467fc00e0506c44ac86fbfca02 (from azure-core==1.24.2->-r requirements.txt (line 95))

After forcefully installing

pip install https://files.pythonhosted.org/packages/ed/d6/2afc375a8d55b8be879d6b4986d4f69f01115e795e36827fd3a40166028b/typing_extensions-4.3.0-py3-none-any.whl#sha256=25642c956049920a5aa49edcdd6ab1e06d7e5d467fc00e0506c44ac86fbfca02

there is another error:

Collecting googleapis-common-protos[grpc]<2.0.0dev,>=1.56.0
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==. These do not:
    googleapis-common-protos[grpc]<2.0.0dev,>=1.56.0 from https://files.pythonhosted.org/packages/e2/fd/d9efa2085bd762fba3a637eb3e36d76d72eb6e083d170aeaca65a75f1f9c/googleapis_common_protos-1.56.4-py2.py3-none-any.whl#sha256=8eb2cbc91b69feaf23e32452a7ae60e791e09967d81d4fcc7fc388182d1bd394 (from grpc-google-iam-v1==0.12.4->-r requirements.txt (line 421))

after installing:

pip install https://files.pythonhosted.org/packages/e2/fd/d9efa2085bd762fba3a637eb3e36d76d72eb6e083d170aeaca65a75f1f9c/googleapis_common_protos-1.56.4-py2.py3-none-any.whl#sha256=8eb2cbc91b69feaf23e32452a7ae60e791e09967d81d4fcc7fc388182d1bd394
Collecting googleapis-common-protos==1.56.4
  Downloading googleapis_common_protos-1.56.4-py2.py3-none-any.whl (211 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 211.7/211.7 KB 4.6 MB/s eta 0:00:00
Collecting protobuf<5.0.0dev,>=3.15.0
  Downloading protobuf-4.21.5-cp37-abi3-manylinux2014_x86_64.whl (408 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 408.4/408.4 KB 8.4 MB/s eta 0:00:00
Installing collected packages: protobuf, googleapis-common-protos
Successfully installed googleapis-common-protos-1.56.4 protobuf-4.21.5

This error appears:

python -c 'import google.cloud.logging; client = google.cloud.logging.Client()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'google.cloud.logging'

I noticed that the gax-google-logging-v2 package is what breaks the setup. Uninstalling it makes the import work again:

(forensic) anatolii@siftworkstation: ~/forensic/dftimewolf
$ pip list |grep logging
gax-google-logging-v2        0.8.3
google-cloud-logging         2.3.1
grpc-google-logging-v2       0.8.1
(forensic) anatolii@siftworkstation: ~/forensic/dftimewolf
$ python -c 'import google.cloud.logging; client = google.cloud.logging.Client()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'google.cloud.logging' has no attribute 'Client'
(forensic) anatolii@siftworkstation: ~/forensic/dftimewolf
$ pip uninstall gax-google-logging-v2
Found existing installation: gax-google-logging-v2 0.8.3
Uninstalling gax-google-logging-v2-0.8.3:
  Would remove:
    /home/anatolii/forensic/lib/python3.9/site-packages/gax_google_logging_v2-0.8.3-nspkg.pth
    /home/anatolii/forensic/lib/python3.9/site-packages/gax_google_logging_v2-0.8.3.dist-info/*
    /home/anatolii/forensic/lib/python3.9/site-packages/google/cloud/logging/v2/*
Proceed (y/n)? y
  Successfully uninstalled gax-google-logging-v2-0.8.3
(forensic) anatolii@siftworkstation: ~/forensic/dftimewolf
$ python -c 'import google.cloud.logging; client = google.cloud.logging.Client()'
(forensic) anatolii@siftworkstation: ~/forensic/dftimewolf

Is this package really necessary? What is using it?
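The "what is using it?" question can be answered from the installed package metadata. This is a minimal sketch using the standard-library `importlib.metadata`; the helper names (`normalize`, `requirement_name`, `direct_dependents`, `installed_requirements`) are made up for illustration, and the name parsing is a simplification that ignores environment markers.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a distribution name so that '-', '_' and '.' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Extract the bare project name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalize(match.group(1)) if match else None


def direct_dependents(target, installed):
    """Return the names in `installed` that declare `target` as a requirement.

    `installed` maps a distribution name to its list of requirement strings.
    """
    wanted = normalize(target)
    return sorted(
        name
        for name, requirements in installed.items()
        if wanted in {requirement_name(r) for r in requirements}
    )


def installed_requirements():
    """Build the name -> requirements mapping for the current environment."""
    mapping = {}
    for dist in metadata.distributions():
        if "Name" in dist.metadata:
            mapping[dist.metadata["Name"]] = dist.requires or []
    return mapping


if __name__ == "__main__":
    print(direct_dependents("gax-google-logging-v2", installed_requirements()))
```

This only reports direct dependents. Requirements that apply only to optional extras are counted too, so a hit means "declares it", not necessarily "installed because of it".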

A bit late to the party here, apologies. We moved away from pip and pipenv in favor of poetry, but this wasn't reflected in the docs (#675). Please use poetry to install dftimewolf; otherwise you might run into dependency conflicts.

It still doesn't work, even though I used poetry for a proper installation:

$ dftimewolf gcp_logging_collect koz-bad 'timestamp>="2022-11-28T19:00:00"'
Messages
  [ dftimewolf ] Debug log: /tmp/dftimewolf-run-20221128_195510_ximdl1al.log
  [ dftimewolf ] An unknown error occurred in module GCPLogsCollector: module 'google.cloud.logging' has no attribute 'Client'
  [ dftimewolf ] Critical error found. Aborting.
 
$ python3 -c 'import google.cloud.logging; client = google.cloud.logging.Client()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'google.cloud.logging' has no attribute 'Client'

$ pip uninstall gax-google-logging-v2
Found existing installation: gax-google-logging-v2 0.8.3
Uninstalling gax-google-logging-v2-0.8.3:
  Would remove:
    /home/anatolii/.cache/pypoetry/virtualenvs/dftimewolf-pBwI8mxo-py3.10/lib/python3.10/site-packages/gax_google_logging_v2-0.8.3-py3.10-nspkg.pth
    /home/anatolii/.cache/pypoetry/virtualenvs/dftimewolf-pBwI8mxo-py3.10/lib/python3.10/site-packages/gax_google_logging_v2-0.8.3.dist-info/*
    /home/anatolii/.cache/pypoetry/virtualenvs/dftimewolf-pBwI8mxo-py3.10/lib/python3.10/site-packages/google/cloud/logging/v2/*
Proceed (Y/n)? Y
  Successfully uninstalled gax-google-logging-v2-0.8.3

$ python3 -c 'import google.cloud.logging; client = google.cloud.logging.Client()'

$ dftimewolf gcp_logging_collect koz-bad 'timestamp>="2022-11-28T19:00:00"'
Messages
  [ dftimewolf       ] Debug log: /tmp/dftimewolf-run-20221128_200635_shjq4jkr.log
  [ GCPLogsCollector ] Downloaded logs to /tmp/tmp69hu7a6h.jsonl

$ wc -l /tmp/tmp69hu7a6h.jsonl
26593 /tmp/tmp69hu7a6h.jsonl

So the gax-google-logging-v2-0.8.3 package still causes trouble, even with a poetry install.
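Both uninstall listings above include a `-nspkg.pth` file. Python's `site` module processes `.pth` files in site-packages at startup, and older setuptools-style namespace packages ship files with this suffix to register their namespaces. To see which other distributions in an environment do the same, a small sketch like the following could be used (`list_nspkg_pth` is a hypothetical helper name):

```python
import sysconfig
from pathlib import Path


def list_nspkg_pth(site_dir):
    """Return the names of all *-nspkg.pth files directly inside site_dir."""
    return sorted(path.name for path in Path(site_dir).glob("*-nspkg.pth"))


if __name__ == "__main__":
    # "purelib" is the site-packages directory of the running interpreter.
    for name in list_nspkg_pth(sysconfig.get_paths()["purelib"]):
        print(name)
```

It only lists the files; whether a given one actually interferes with google.cloud.logging still has to be checked by hand.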

I dug into this a bit more, and it seems that gax-google-logging-v2-0.8.3 is a transitive (third- or fourth-level) dependency of Turbinia. They're currently changing the API client to drop all GCP requirements, which should remove this dependency and hopefully fix the problem. I still need to check whether the Turbinia recipes fail if this lib isn't installed.

Turbinia depends on psq, which has been unmaintained for the past 3 years and which in turn depends on gax-google-logging-v2, which has been unmaintained for even longer (2016?).
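A chain like this can be confirmed from the installed metadata with a breadth-first search over declared requirements. This is a rough sketch, not an official tool: `dependency_chain` and `installed_graph` are hypothetical names, and the search follows every declared requirement, including ones that only apply to optional extras.

```python
import re
from collections import deque
from importlib import metadata


def _name(requirement):
    """Return the normalized project name at the start of a requirement."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return re.sub(r"[-_.]+", "-", match.group(1)).lower() if match else None


def dependency_chain(root, target, graph):
    """Breadth-first search for a shortest chain from `root` to `target`.

    `graph` maps a normalized distribution name to the requirement strings
    it declares. Returns the chain as a list of names, or None if no chain
    exists.
    """
    root, target = _name(root), _name(target)
    queue = deque([[root]])
    seen = {root}
    while queue:
        chain = queue.popleft()
        if chain[-1] == target:
            return chain
        for requirement in graph.get(chain[-1], []):
            child = _name(requirement)
            if child and child not in seen:
                seen.add(child)
                queue.append(chain + [child])
    return None


def installed_graph():
    """Build the graph from the distributions in the current environment."""
    graph = {}
    for dist in metadata.distributions():
        if "Name" in dist.metadata:
            graph[_name(dist.metadata["Name"])] = dist.requires or []
    return graph


if __name__ == "__main__":
    print(dependency_chain("turbinia", "gax-google-logging-v2", installed_graph()))
```

In an environment where the chain described above holds, the printed list would start with turbinia and end with gax-google-logging-v2; `None` would mean no declared path was found.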

The best workaround for now is to pip uninstall gax-google-logging-v2, as @AnatoliyNelyubin recommended. I'm still investigating whether that will break the Turbinia modules.