SeaweedFS Ingestion Monitor

CLP Notification Monitor

CLP Log Ingestion Notification Monitor.

Development

  1. Create and enter the virtual environment (replace x with the minor version of your Python interpreter): python3.x -m venv v3x; . ./v3x/bin/activate
  2. Update pip: pip install --upgrade pip
  3. Install the project in editable mode along with the development dependencies: pip install -e ".[dev]" (the quotes keep shells such as zsh from expanding the brackets)

Start

To ingest files directly via the S3 API:

python start.py \
--seaweed-filer-endpoint $SEAWEEDFS_ENDPOINT \
--seaweed-s3-endpoint-url $SEAWEEDFS_S3_GATEWAY \
--filer-notification-path-prefix $NOTIFICATION_PATH_PREFIX \
--db-uri $DATABASE_URI \
--max-buffer-size 16777216 \
--min-refresh-frequency 5000 \
s3

To ingest files via mounted directories in the local filesystem:

python start.py \
--seaweed-filer-endpoint $SEAWEEDFS_ENDPOINT \
--seaweed-s3-endpoint-url $SEAWEEDFS_S3_GATEWAY \
--filer-notification-path-prefix $NOTIFICATION_PATH_PREFIX \
--db-uri $DATABASE_URI \
--max-buffer-size 16777216 \
--min-refresh-frequency 5000 \
fs --seaweed-mnt-prefix $SEAWEED_MNT_PREFIX
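
In both commands above, the shared options come first and the ingestion mode (s3 or fs) comes last, followed by any mode-specific option. The sketch below shows one way such a command line could be modeled with argparse subcommands. It is an illustration only, not the project's actual parser: the function name build_parser, the choice to mark every option as required, and the integer types are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown above (not the project's code)."""
    parser = argparse.ArgumentParser(prog="start.py")

    # Options shared by both ingestion modes; they precede the subcommand.
    parser.add_argument("--seaweed-filer-endpoint", required=True)
    parser.add_argument("--seaweed-s3-endpoint-url", required=True)
    parser.add_argument("--filer-notification-path-prefix", required=True)
    parser.add_argument("--db-uri", required=True)
    # Assumption: both numeric options are plain integers.
    parser.add_argument("--max-buffer-size", type=int, required=True)
    parser.add_argument("--min-refresh-frequency", type=int, required=True)

    # The ingestion mode is a subcommand, so it must come after the shared options.
    modes = parser.add_subparsers(dest="mode", required=True)
    modes.add_parser("s3")
    fs_mode = modes.add_parser("fs")
    # Only the fs mode takes a mount prefix.
    fs_mode.add_argument("--seaweed-mnt-prefix", required=True)
    return parser
```

With a parser shaped like this, passing --seaweed-mnt-prefix after s3 would be rejected, which matches the way the two commands above are written.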

To group jobs by folder name under a specific path prefix, add the following option:

--group-by-paths-under-prefix $GROUP_BY_PATH_UNDER_PREFIX
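
To make the idea of grouping by folder name concrete, here is a minimal sketch of how a group could be derived from a file path. The README does not describe the monitor's actual rule, so everything here is an assumption: the helper name group_for, the choice of the first folder below the prefix as the group, and the decision to leave files directly under the prefix ungrouped.

```python
from pathlib import PurePosixPath
from typing import Optional


def group_for(path: str, prefix: str) -> Optional[str]:
    """Return the first folder name under `prefix`, or None if `path` is ungrouped.

    Hypothetical helper for illustration; not taken from the project.
    """
    try:
        relative = PurePosixPath(path).relative_to(PurePosixPath(prefix))
    except ValueError:
        return None  # The path lies outside the prefix.
    parts = relative.parts
    # A group needs at least one folder plus a file name beneath the prefix.
    return parts[0] if len(parts) >= 2 else None
```

Under these assumptions, files such as /logs/app-a/2024/x.log and /logs/app-a/y.log would both map to the group app-a when the prefix is /logs.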

Contributing

Before submitting a pull request, run the following error-checking and formatting tools:

  • mypy: mypy src
    • mypy checks for typing errors. Resolve all of them; if an error cannot be resolved (e.g., it is caused by a third-party library), add a # type: ignore comment to silence it.
  • docformatter: docformatter -i src
    • This formats docstrings. You should review and add any changes to your PR.
  • Black: black src
    • This formats the code according to Black's code-style rules. You should review and add any changes to your PR.
  • ruff: ruff check --fix src
    • This performs linting according to PEPs. You should review and add any changes to your PR.

Note that docformatter should be run before Black so that Black has the last word on formatting.
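
The tools listed above can be chained in a small helper script so that the ordering is not left to memory. This is a sketch, not part of the project: the name run_checks and the injectable run parameter are hypothetical, the tools are assumed to be installed and on PATH, and the position of ruff and mypy after Black is an arbitrary choice (only the docformatter-before-Black order comes from this README).

```python
import subprocess
from typing import Callable, List, Sequence

# Commands from the list above, ordered so docformatter runs before Black.
CHECKS: List[List[str]] = [
    ["docformatter", "-i", "src"],
    ["black", "src"],
    ["ruff", "check", "--fix", "src"],
    ["mypy", "src"],
]


def run_checks(run: Callable[[Sequence[str]], int] = subprocess.call) -> List[str]:
    """Run every check in order and return the names of tools that exited nonzero."""
    failed: List[str] = []
    for command in CHECKS:
        # Keep going after a failure so one run reports every problem.
        if run(command) != 0:
            failed.append(command[0])
    return failed
```

Calling run_checks() from the repository root runs all four tools and returns the names of those that exited with a nonzero status. A nonzero status does not always indicate an error, since a formatter may use it to signal that it modified files, so review the resulting changes before adding them to your PR.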