Hello World Processor for Apache NiFi Using Hatch

Step 1: Setting Up Your Environment

  1. Install Python: Make sure you have Python 3.11 or later installed on your system. You can check your Python version by running:

    python --version
  2. Installing Hatch for NiFi Python Processor Development

    Option 1: Using Homebrew (macOS)

    For macOS users who manage their packages with Homebrew, installing Hatch is straightforward:

    brew install hatch

    This method is simple and integrates well with other Homebrew-managed applications.

    Option 2: Using pipx

    Pipx is a tool that installs Python applications in isolated environments, ensuring they do not interfere with each other or with the system Python packages. It's ideal for command-line tools like Hatch that are run globally.

    1. Install pipx: If pipx isn't already installed, you can install it first using pip:

      python3 -m pip install --user pipx
      python3 -m pipx ensurepath

      The ensurepath command makes sure the location of the pipx binaries is added to your system’s PATH.

    2. Install Hatch using pipx:

      pipx install hatch

      This command installs Hatch in its own isolated environment but makes it accessible from anywhere in your terminal.

    Option 3: Using a Virtual Environment (venv)

    Using a virtual environment is good practice when you want to manage dependencies for a specific Python project without affecting other projects or the global Python environment.

    1. Create a virtual environment: Navigate to your project directory (or where you wish to store your environments), and run:

      python3 -m venv hatch-env

      This command creates a new directory called hatch-env that contains a local Python installation.

    2. Activate the virtual environment:

    • On macOS and Linux:

      source hatch-env/bin/activate
    • On Windows:

      .\hatch-env\Scripts\activate
    3. Install Hatch within the virtual environment:

      Once the virtual environment is activated, you can install Hatch using pip:

      pip install hatch

      This installation will be local to the hatch-env environment and won’t interfere with other projects.

  3. Choosing the Right Installation Method

    • Homebrew: Best for macOS users who prefer a simple, system-wide installation and manage most software through Homebrew.
    • pipx: Ideal for users who need to run Hatch as a command-line tool across multiple projects but want to keep it isolated from other Python packages.
    • venv: Recommended for users who are developing multiple Python projects on the same machine and need strict dependency management without cross-interference.

    Each method has its advantages depending on your setup and how you manage your development environments. For broader compatibility and ease of management, pipx is a particularly strong choice for tools like Hatch that are used across various projects.
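
As a quick sanity check after any of the installation methods above, a short script can confirm that the interpreter meets the 3.11 minimum and that the hatch executable is reachable on PATH. This is only a sketch; the function name check_environment is made up for illustration and is not part of Hatch or NiFi.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in this guide


def check_environment(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {version_info[0]}.{version_info[1]} found, "
            f"{MIN_PYTHON[0]}.{MIN_PYTHON[1]} or later is required"
        )
    # shutil.which returns None when the executable is not on PATH.
    if which("hatch") is None:
        problems.append("hatch was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("\n".join(issues) if issues else "Environment looks ready")
```

If Hatch was installed in a virtual environment (Option 3), run the script with that environment activated, since hatch is only on PATH while it is active.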

Step 2: Create the Project Structure

  1. Generate a New Project: Run the following command to create a new Python project with Hatch for the Hello World NiFi Python processor. It creates a new directory containing the project skeleton that the following steps turn into a NiFi processor project:

    hatch new hello-world-processor
  2. Navigate to Your Project Directory:

    cd hello-world-processor

Step 3: Configure the Project

  1. Update pyproject.toml: Add the necessary configuration for Hatch and the NAR plugin. You can either replace the contents of the pyproject.toml file that Hatch generated in the previous step with the following, or merge these settings into it:

    [build-system]
    requires = ["hatchling", "hatch-datavolo-nar"]
    build-backend = "hatchling.build"
    
    [project]
    name = "hello-world-processor"
    version = "0.0.1"
    dependencies = []
    
    [tool.hatch.build.targets.nar]
    packages = ["src/hello_world_processor"]
  2. Create the Processor Code: In the src/hello_world_processor directory, create a new Python file for your processor, for example, hello_processor.py, and write a simple processor script:

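    from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult


    class WriteHelloWorld(FlowFileTransform):
        # Tells NiFi which Java interface this Python class implements.
        class Java:
            implements = ["org.apache.nifi.python.processor.FlowFileTransform"]

        # Metadata that NiFi reads when it discovers the processor.
        class ProcessorDetails:
            version = "0.0.1-SNAPSHOT"

        def __init__(self, **kwargs):
            super().__init__(**kwargs)

        # Called for each incoming FlowFile: replace its contents with
        # "Hello World", add a "greeting" attribute, and route to "success".
        def transform(self, context, flowfile):
            return FlowFileTransformResult(
                relationship="success",
                contents="Hello World",
                attributes={"greeting": "hello"},
            )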

    See the NiFi Python Developer Guide for more information on Apache NiFi Python processor development.

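  3. Add an __init__.py: Make sure the src/hello_world_processor directory contains an __init__.py file so that Python treats it as a package. The hatch new command normally creates one, so you only need to add it if it is missing.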
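
The transform logic can be exercised outside NiFi before packaging. The nifiapi module is only available inside NiFi's Python runtime, so the sketch below registers minimal stand-in classes under that module name. These stubs are assumptions made for local testing and are not the real NiFi classes; the processor body is the same as in the step above.

```python
import sys
import types


# --- Stand-ins for NiFi's API (assumptions for local testing only) ---
class _StubFlowFileTransform:
    def __init__(self, **kwargs):
        pass


class _StubFlowFileTransformResult:
    def __init__(self, relationship, attributes=None, contents=None):
        self.relationship = relationship
        self.attributes = attributes
        self.contents = contents


# Register the stubs so "from nifiapi.flowfiletransform import ..." resolves.
_pkg = types.ModuleType("nifiapi")
_mod = types.ModuleType("nifiapi.flowfiletransform")
_mod.FlowFileTransform = _StubFlowFileTransform
_mod.FlowFileTransformResult = _StubFlowFileTransformResult
_pkg.flowfiletransform = _mod
sys.modules["nifiapi"] = _pkg
sys.modules["nifiapi.flowfiletransform"] = _mod

# --- Processor code, unchanged from hello_processor.py ---
from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult


class WriteHelloWorld(FlowFileTransform):
    class Java:
        implements = ["org.apache.nifi.python.processor.FlowFileTransform"]

    class ProcessorDetails:
        version = "0.0.1-SNAPSHOT"

    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def transform(self, context, flowfile):
        return FlowFileTransformResult(
            relationship="success",
            contents="Hello World",
            attributes={"greeting": "hello"},
        )


# This processor ignores its inputs, so None is enough for a smoke test.
result = WriteHelloWorld().transform(context=None, flowfile=None)
print(result.relationship, result.contents, result.attributes)
```

In the project itself, the inline class could be replaced by an import of WriteHelloWorld from the hello_world_processor package, placed after the stub registration. A check like this only covers the Python logic; it says nothing about how the processor behaves once loaded by NiFi.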

Step 4: Build the NAR File

  1. Build the NAR:

    hatch build --target nar

    This command will package your processor along with any dependencies into a NAR file located in the dist directory.
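
To confirm what the build produced, the dist directory can be inspected from Python. The sketch below relies on a NAR being a ZIP-format archive and simply lists whatever entries it finds, without assuming any particular internal layout. The helper name list_nar_entries is made up for illustration.

```python
import zipfile
from pathlib import Path


def list_nar_entries(dist_dir="dist"):
    """Map each .nar file in dist_dir to the sorted list of its entry names."""
    entries = {}
    for nar_path in sorted(Path(dist_dir).glob("*.nar")):
        with zipfile.ZipFile(nar_path) as archive:
            entries[nar_path.name] = sorted(archive.namelist())
    return entries


if __name__ == "__main__":
    # Run from the project root after "hatch build --target nar".
    for name, members in list_nar_entries().items():
        print(name)
        for member in members:
            print("  ", member)
```

An empty result means no .nar file was found in the directory, which usually points to a failed build or to running the script from the wrong directory.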

Step 5: Deploy the NAR

  1. Deploy the NAR to Apache NiFi:

    • Ensure you have Apache NiFi 2.0.0-M3 or later installed.
    • Copy the generated NAR file to the lib directory of your NiFi installation.
  2. Restart Apache NiFi:

    • Restart NiFi to load the new processor.
  3. Verify Installation:

    • Open the NiFi web interface.
    • You should be able to add your new processor to a data flow. NiFi lists it under its class name, WriteHelloWorld.
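
The copy step can be scripted. This sketch assumes a NIFI_HOME environment variable pointing at the NiFi installation, which is a convention chosen here and not something NiFi or this guide requires; deploy_nars is likewise a hypothetical helper. It copies every .nar file from dist into the lib directory named above and leaves restarting NiFi to you.

```python
import os
import shutil
from pathlib import Path


def deploy_nars(dist_dir, nifi_home):
    """Copy all .nar files from dist_dir into <nifi_home>/lib."""
    lib_dir = Path(nifi_home) / "lib"
    if not lib_dir.is_dir():
        raise FileNotFoundError(f"No lib directory at {lib_dir}")
    copied = []
    for nar_path in sorted(Path(dist_dir).glob("*.nar")):
        # shutil.copy2 returns the path of the newly created file.
        copied.append(Path(shutil.copy2(nar_path, lib_dir)))
    return copied


if __name__ == "__main__":
    nifi_home = os.environ.get("NIFI_HOME")  # assumed variable, see above
    if nifi_home is None:
        print("Set NIFI_HOME to your NiFi installation directory first")
    else:
        for target in deploy_nars("dist", nifi_home):
            print(f"Copied {target}")
```

Failing early when the lib directory is missing guards against a mistyped installation path silently creating a stray file.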

This guide provides a starting point for developing Python processors in Apache NiFi using Hatch. Modify the processor code as needed for more complex functionality.

Acknowledgements: I liberally used the Datavolo Hatch documentation and the NiFi Python Developer Guide to create this guide. Thank you to the authors of these resources for their valuable work in moving us all forward!