The code is documented throughout with docstrings and inline comments. Refer to these for a deeper understanding.
- Python 3 (docker?)
- Azure Event Hubs namespace and hub
- Azure Block Blob storage account with container
- A running TimescaleDB instance (for setup, see below) with a table that supports the plugin's output
- A parser plugin for the messages in that event hub, available in the plugin folder
For each event hub, you need one separate storage container to save checkpoints.
The lookup described below is not implemented yet; for now, change the path in app.py.
The following locations are checked in order:
- Value of the environment variable $EH_READER_CONFIG, used as a path
- Home directory (Linux: ~/eh_reader_config.yml, Windows: $HOMEPATH\eh_reader_config.yml)
- Current directory (./eh_reader_config.yml)
- Linux: /etc/monitoring/eh_reader_config.yml or Windows: C:\monitoring\eh_reader_config.yml
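Since this lookup is still on the TODO list, here is a minimal sketch of how it could work. The function names `candidate_paths` and `find_config` are hypothetical and not part of app.py, and `Path.home()` is used as a stand-in for `$HOMEPATH` on Windows.

```python
import os
import platform
from pathlib import Path
from typing import List, Mapping, Optional

CONFIG_NAME = "eh_reader_config.yml"


def candidate_paths(env: Mapping[str, str], system: str, home: Path, cwd: Path) -> List[Path]:
    """Return the candidate config paths in the lookup order listed above."""
    paths = []
    # 1. Explicit path from the EH_READER_CONFIG environment variable, if set.
    if env.get("EH_READER_CONFIG"):
        paths.append(Path(env["EH_READER_CONFIG"]))
    # 2. Home directory, then 3. current directory.
    paths.append(home / CONFIG_NAME)
    paths.append(cwd / CONFIG_NAME)
    # 4. System-wide location, which differs per platform.
    if system == "Windows":
        paths.append(Path("C:/monitoring") / CONFIG_NAME)
    else:
        paths.append(Path("/etc/monitoring") / CONFIG_NAME)
    return paths


def find_config(env=None, system=None, home=None, cwd=None) -> Optional[Path]:
    """Return the first candidate that exists as a file, or None if none does."""
    env = os.environ if env is None else env
    system = platform.system() if system is None else system
    home = Path.home() if home is None else home
    cwd = Path.cwd() if cwd is None else cwd
    for path in candidate_paths(env, system, home, cwd):
        if path.is_file():
            return path
    return None
```

Calling `find_config()` without arguments uses the real environment. The parameters exist mainly so the lookup order can be tested without touching the actual home directory or /etc.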
DB_connection_string: "<timescale db connection string>"
EH_connection_string: "<event hub connection string>"
SA_connection_string: "<storage account connection string>"
SA_container_name: "<storage account container name>"
DB_writer_count: <number of threads that write to db>
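A small check after loading the file can catch a missing key early. The sketch below assumes the config has already been parsed into a dict (for example by a YAML loader) and that all five keys are required; `validate_config` is a hypothetical helper, not something app.py provides.

```python
REQUIRED_KEYS = (
    "DB_connection_string",
    "EH_connection_string",
    "SA_connection_string",
    "SA_container_name",
    "DB_writer_count",
)


def validate_config(config: dict) -> dict:
    """Raise ValueError if a key is missing or DB_writer_count is not usable."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))
    count = config["DB_writer_count"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(count, bool) or not isinstance(count, int) or count < 1:
        raise ValueError("DB_writer_count must be a positive integer")
    return config
```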
Expected connection string formats:
DB_connection_string:
"dbname=<name of db> user=<db user> password=<password for db user> host=<host of db> port=<db port>"
- by default, the user is postgres and the port is 5432
EH_connection_string:
- first part: look here
- add ";EntityPath=<name of event hub>"
- resulting: "Endpoint=...;SharedAccessKeyName=...;SharedAccessKey=...;EntityPath=..."
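Appending the EntityPath part by hand is easy to get wrong (doubled semicolons, adding it twice), so it can be wrapped in a small helper. This is only a sketch: `with_entity_path` is a hypothetical name, and the endpoint, key name and key in the usage line are placeholders.

```python
def with_entity_path(namespace_connection_string: str, hub_name: str) -> str:
    """Return the connection string with ;EntityPath=<hub_name> appended once."""
    base = namespace_connection_string.strip().rstrip(";")
    for part in base.split(";"):
        # partition splits on the first "=" only, so base64 padding in keys is safe.
        key, _, value = part.partition("=")
        if key.strip().lower() == "entitypath":
            if value != hub_name:
                raise ValueError(f"connection string already targets {value!r}")
            return base
    return f"{base};EntityPath={hub_name}"
```

Usage, with placeholder values: `with_entity_path("Endpoint=sb://example.servicebus.windows.net/;SharedAccessKeyName=reader;SharedAccessKey=abc123=", "telemetry")`.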
SA_connection_string:
- Create config file (see above)
- Run pip install -r requirements.txt
  (requirements.txt currently includes pylint, and black as formatter)
- Change the default path in app.py (you can also use the command line to pass the full path)
- Run app.py and enjoy!
- implement a config file finder (search different paths as shown in the above definition)
- create plugins for all formats that should be supported
- clean up requirements?