leondz/garak

Bug in Leakreplay Probe

bleszily opened this issue · 2 comments

I want to report an issue that was discovered in the leakreplay.py file for the leakreplay probe.
I ran Garak with the command: python -m garak -m rest --generator_option_file restConfig.json -d guardrail.BinaryGuardrailDetector --probes leakreplay [and other probes]

All other probes were successful but leakreplay threw an error.

Error: Error:

 Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in runcode
  File "/opt/bleszily/garak/garak/main.py", line 14, in <module>
    main()
  File "/opt/bleszily/garak/garak/__main.py", line 9, in main
    cli.main(sys.argv[1:])
  File "/opt/bleszily/garak/garak/cli.py", line 513, in main
    command.pxd_run(
  File "/opt/bleszily/garak/garak/command.py", line 229, in pxd_run
    pxd_h.run(
  File "/opt/bleszily/garak/garak/harnesses/pxd.py", line 61, in run
    h.run(model, [probe], detectors, evaluator, announce_probe=False)
  File "/opt/bleszily/garak/garak/harnesses/base.py", line 108, in run
    attempt_results = probe.probe(model)
                      ^^^^^^^^^^^^^^^^^^
  File "/opt/bleszily/garak/garak/probes/base.py", line 219, in probe
    attempts_completed = self._execute_all(attempts_todo)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/bleszily/garak/garak/probes/base.py", line 197, in _execute_all
    result = self._execute_attempt(this_attempt)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/bleszily/garak/garak/probes/base.py", line 161, in _execute_attempt
    this_attempt = self._postprocess_hook(this_attempt)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/bleszily/garak/garak/probes/leakreplay.py", line 68, in postprocesshook
    attempt.messages[idx][-1]["content"] = re.sub(
                                           ^^^^^^^
  File "/root/.conda/envs/garak/lib/python3.11/re/__init.py", line 185, in sub
    return _compile(pattern, flags).sub(repl, string, count)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected string or bytes-like object, got 'NoneType'

When I investigated the error:
I discovered an error in leakreplay.py probe [TypeError related to an expected string or bytes-like object but receiving NoneType]. I think the issue is because of an operation attempting to perform a substitution using regular expressions on a value that is unexpectedly None.
From my analysis, the error suggests that there is a lack of proper validation or error handling when handling text content that might be None:

Location: /opt/bleszily/garak/garak/probes/leakreplay.py
Function: _postprocess_hook
The use of re.sub expects a string or bytes-like object, but it receives None, leading to a TypeError.

Proposed Fix:
I think we can add a check in the leakreplay.py module to ensure that the variable used in the substitution operation is not None before attempting to process it.
Here is an update I added:

def _postprocess_hook(self, attempt: Attempt) -> Attempt:
    for idx, thread in enumerate(attempt.messages):
        # Ensure content is not None before applying regex
        if thread and thread[-1]["content"] is not None:
            attempt.messages[idx][-1]["content"] = re.sub(
                "</?name>", "", thread[-1]["content"]
            )
        else:
            # Handle None or empty thread case by logging or assigning a default string
            logging.warning(f"No content to process for message index {idx}. Setting default empty string.")
            if thread:
                attempt.messages[idx][-1]["content"] = ""
            else:
                # If thread is entirely absent, log this as it might indicate a larger issue
                logging.error(f"Thread at index {idx} is missing or malformed.")
    return attempt
Updated the _postprocess_hook method with added checks to prevent the TypeError

I had to import logging also.

After updating the leakreplay.py file with these changes, it works fine.

Thanks, will take a look

We haven't reproduced this yet but it looks like it could be high priority. Will get back to you.