
pySigma Databricks backend

Primary language: Python. License: MIT.


Status: experimental, work in progress. Current caveats:

  • Although cidrmatch calls are generated in the output, you still need to provide the corresponding function as a UDF (I'll add an example later)
  • Keywords (text rules without a specific field) aren't supported yet
  • The backend requires more testing
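
Until an official example is available, the cidrmatch UDF mentioned above could look roughly like the following. This is a minimal sketch, not the backend's own implementation: the argument order (IP address first, CIDR range second) and the choice to return False for invalid or NULL input are assumptions, so check them against the queries the backend actually generates.

```python
import ipaddress


def cidrmatch(ip, cidr):
    """Return True if `ip` lies inside the `cidr` network, False otherwise.

    Assumption: the generated SQL calls cidrmatch(<field>, '<cidr>').
    Invalid or missing values are treated as "no match".
    """
    try:
        network = ipaddress.ip_network(cidr, strict=False)
        return ipaddress.ip_address(ip) in network
    except (ValueError, TypeError):
        return False


# On a cluster with an active SparkSession named `spark`, the function
# could then be registered for use from SQL, for example:
#   spark.udf.register("cidrmatch", cidrmatch, "boolean")
```

A plain Python UDF is the simplest option but not the fastest; for large tables a vectorized or SQL-native implementation may be preferable.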

pySigma Databricks Backend

This is the Databricks backend for pySigma. It provides the package sigma.backends.databricks with the DatabricksBackend class. Further, it contains the following processing pipelines in sigma.pipelines.databricks:

  • pipeline1: purpose
  • pipeline2: purpose

It supports the following output formats:

  • default: plain Databricks/Apache Spark SQL queries
  • dbsql: Databricks SQL queries with rule metadata (title, status) embedded as a comment
  • detection_yaml: YAML markup for my own detection framework
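
Selecting an output format could look roughly like this. It is a sketch under a few assumptions: that pySigma and this backend are installed, that DatabricksBackend can be constructed without arguments (as is usual for pySigma backends), and that no processing pipeline is applied. The rule and the convert_rule helper are made up for illustration.

```python
# Hypothetical example rule, used only for illustration.
EXAMPLE_RULE = """
title: Whoami execution
status: test
logsource:
    category: process_creation
    product: windows
detection:
    selection:
        CommandLine|contains: 'whoami'
    condition: selection
"""


def convert_rule(rule_yaml, output_format="default"):
    """Convert a Sigma rule (YAML text) with the Databricks backend.

    `output_format` is one of the formats listed above:
    "default", "dbsql" or "detection_yaml".
    """
    # Imported lazily so this module loads even without pySigma installed.
    from sigma.collection import SigmaCollection
    from sigma.backends.databricks import DatabricksBackend

    rules = SigmaCollection.from_yaml(rule_yaml)
    backend = DatabricksBackend()  # assumption: no arguments required
    return backend.convert(rules, output_format=output_format)
```

Calling convert_rule(EXAMPLE_RULE, "dbsql") should then return the converted output with the rule's title and status embedded as a comment.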

This backend is currently maintained by:

TODOs

  • Try to rewrite wildcard expressions like foo*bar into (startswith(field, "foo") and endswith(field, "bar"))
  • Fix escaping in the lower/upper functions; the backend currently produces output like lower('com\.objective-see\.lulu\.plist'), which is wrong
  • Fix rules like "Huawei BGP Authentication Failures"
  • Add support for all regexp modifiers, such as dotall, m/multiline, i/ignorecase, ...
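
The first TODO item, rewriting foo*bar, could be approached along these lines. This is only a sketch of the idea, not code from the backend: the rewrite_wildcard helper is hypothetical, it handles exactly one unescaped * with a non-empty prefix and suffix, and it ignores escaped wildcards such as \*.

```python
def rewrite_wildcard(field, pattern):
    """Rewrite a pattern like foo*bar into a startswith/endswith expression.

    Returns None when the pattern does not have exactly one `*` with a
    non-empty prefix and suffix, so the caller can fall back to the
    regular wildcard handling.
    """
    if pattern.count("*") != 1:
        return None
    prefix, suffix = pattern.split("*")
    if not prefix or not suffix:
        return None

    def quote(value):
        # Escape backslashes and double quotes for a double-quoted SQL string.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'

    return (
        f"(startswith({field}, {quote(prefix)}) "
        f"and endswith({field}, {quote(suffix)}))"
    )


print(rewrite_wildcard("field", "foo*bar"))
```

Note that this form is slightly more permissive than the wildcard when prefix and suffix can overlap: for the pattern ab*ba, the value aba satisfies both conditions although the wildcard would not match it. An additional length check on the field would close that gap.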