dgtlmoon/changedetection.io

[feature] Support multiline regex in text filtering


It seems that the Trigger/wait for text, Ignore lines containing, and Block change-detection while text matches fields in the Text filtering section do not support multiline regex.

Narrowed it down to:

def strip_ignore_text(content, wordlist, mode="content"):
    i = 0
    output = []
    ignore_text = []
    ignore_regex = []
    ignored_line_numbers = []
    for k in wordlist:
        # Is it a regex?
        res = re.search(PERL_STYLE_REGEX, k, re.IGNORECASE)
        if res:
            ignore_regex.append(re.compile(perl_style_slash_enclosed_regex_to_options(k)))
        else:
            ignore_text.append(k.strip())
    for line in content.splitlines(keepends=True):
        i += 1
        # Always ignore blank lines in this mode. (when this function gets called)
        got_match = False
        for l in ignore_text:
            if l.lower() in line.lower():
                got_match = True
        if not got_match:
            for r in ignore_regex:
                if r.search(line):
                    got_match = True
        if not got_match:
            # Not ignored, and should preserve "keepends"
            output.append(line)
        else:
            ignored_line_numbers.append(i)
    # Used for finding out what to highlight
    if mode == "line numbers":
        return ignored_line_numbers
    return ''.join(output)

The function iterates over the content line by line and tests each regex against each line separately, so a pattern can never match across a line break:

for line in content.splitlines(keepends=True):

The function could be reworked to use re.finditer/re.findall on the whole content instead.
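
A rough sketch of what whole-content matching might look like, under some assumptions: the function name `strip_ignore_regex_whole` is made up for illustration, it only handles already-compiled regexes (not the plain-text entries), and it treats every line touched by a match as ignored. It is not the project's implementation.

```python
import re
from bisect import bisect_right


def strip_ignore_regex_whole(content, patterns):
    """Hypothetical helper: drop every line touched by a whole-content match."""
    lines = content.splitlines(keepends=True)
    if not lines:
        return content

    # Character offset at which each line starts inside `content`.
    starts = []
    pos = 0
    for line in lines:
        starts.append(pos)
        pos += len(line)

    ignored = set()
    for pattern in patterns:
        for m in pattern.finditer(content):
            # Map the match span back to the range of lines it covers.
            first = bisect_right(starts, m.start()) - 1
            last = bisect_right(starts, max(m.start(), m.end() - 1)) - 1
            ignored.update(range(first, last + 1))

    # Preserve "keepends" for the lines that survive.
    return ''.join(l for idx, l in enumerate(lines) if idx not in ignored)


text = "a\nSTART\nmid\nEND\nb\n"
print(repr(strip_ignore_regex_whole(text, [re.compile(r"START.*?END", re.DOTALL)])))
```

One thing to watch for with this approach: without the m flag, `^` and `$` anchor to the whole content rather than to each line, so anchored filters would behave differently here than under per-line matching.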

It COULD be reworked, but that might break existing filters. What are your thoughts on how to handle that?

Unless I'm missing something, it would only break regex filters that have the s or m flag set (currently those flags have no effect) or regexes that match \n in the middle of the pattern (currently such regexes match nothing). Everything else should behave the same.
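
To illustrate the two cases, here is a small standalone check. It assumes the s flag ends up as `re.DOTALL` on the compiled pattern; the sample content and variable names are made up.

```python
import re

content = "begin\nmiddle\nend\n"
lines = content.splitlines(keepends=True)

# Case 1: a pattern relying on the s flag (DOTALL) to span lines.
dotall = re.compile(r"begin.*end", re.DOTALL)
# Case 2: a pattern with a literal newline in the middle.
newline = re.compile(r"begin\nmiddle")

# Per-line matching, as the current loop does: neither pattern can match.
per_line_dotall = any(dotall.search(line) for line in lines)
per_line_newline = any(newline.search(line) for line in lines)

# Whole-content matching: both patterns match.
whole_dotall = dotall.search(content) is not None
whole_newline = newline.search(content) is not None

print(per_line_dotall, per_line_newline, whole_dotall, whole_newline)
```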

Another option is to match on the whole content only when the s or m flag is set, and otherwise use the current implementation.


Yes! That's what I'm thinking. Any downsides?

Downsides are that you have to support two versions of text filtering, and that filters that already set the s or m flags could break.
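
For reference, the flag-based option discussed above could be sketched roughly like this. The name `ignored_line_numbers_hybrid` is hypothetical, and the sketch assumes the s and m flags show up as `re.DOTALL` / `re.MULTILINE` in the compiled pattern's `.flags`. It only returns the 1-based ignored line numbers, similar to the "line numbers" mode.

```python
import re
from bisect import bisect_right

# Flags that opt a pattern in to whole-content matching (assumption).
WHOLE_CONTENT_FLAGS = re.DOTALL | re.MULTILINE


def ignored_line_numbers_hybrid(content, patterns):
    """Hypothetical helper: per-line matching unless s or m is set."""
    lines = content.splitlines(keepends=True)
    if not lines:
        return []

    # Character offset at which each line starts inside `content`.
    starts = []
    pos = 0
    for line in lines:
        starts.append(pos)
        pos += len(line)

    ignored = set()
    for pattern in patterns:
        if pattern.flags & WHOLE_CONTENT_FLAGS:
            # New behaviour: match against the whole content.
            for m in pattern.finditer(content):
                first = bisect_right(starts, m.start()) - 1
                last = bisect_right(starts, max(m.start(), m.end() - 1)) - 1
                ignored.update(range(first, last + 1))
        else:
            # Current behaviour: test each line on its own.
            for idx, line in enumerate(lines):
                if pattern.search(line):
                    ignored.add(idx)

    return sorted(idx + 1 for idx in ignored)


text = "a\nSTART\nmid\nEND\nb\n"
print(ignored_line_numbers_hybrid(text, [re.compile(r"START.*END", re.DOTALL)]))
```

The two branches are exactly the "two versions" mentioned as a downside: filters without the flags keep their current per-line behaviour, while filters that already set s or m switch to the new one.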