[BUG]: Sanitizing regex does not exclude string literals

Question

taldcroft opened this issue 6 months ago · 3 comments

Commit 4b2d89c introduces a regression when an expression includes a string literal containing any of the newly forbidden characters. This breaks our production code when we upgrade numexpr to 2.8.7.

Example:

>>> import numexpr as ne
>>> ne.__version__
'2.8.7'
>>> import numpy as np

>>> x = np.array(['a', 'b'], dtype=bytes)
>>> ne.evaluate("x == 'b'")
array([False,  True])

>>> ne.evaluate("x == 'b:'")
Traceback (most recent call last):
  Cell In[6], line 1
    ne.evaluate("x == 'b:'")
  File ~/miniconda3/envs/numexpr/lib/python3.10/site-packages/numexpr/necompiler.py:975 in evaluate
    raise e
  File ~/miniconda3/envs/numexpr/lib/python3.10/site-packages/numexpr/necompiler.py:872 in validate
    _names_cache[expr_key] = getExprNames(ex, context, sanitize=sanitize)
  File ~/miniconda3/envs/numexpr/lib/python3.10/site-packages/numexpr/necompiler.py:721 in getExprNames
    ex = stringToExpression(text, {}, context, sanitize)
  File ~/miniconda3/envs/numexpr/lib/python3.10/site-packages/numexpr/necompiler.py:281 in stringToExpression
    raise ValueError(f'Expression {s} has forbidden control characters.')
ValueError: Expression x == 'b:' has forbidden control characters.

Answer 1 · 2024-01-18T21:02:00.000Z

This could be fixed by first replacing the content within quotes and only then matching against the blacklist. I will fix this and add some tests.
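A minimal sketch of that idea, not the actual numexpr patch: blank out quoted string literals first, then run the forbidden-character check on what is left. The names `FORBIDDEN_RE`, `STRING_LITERAL_RE`, and `check_expression` are hypothetical, and the forbidden pattern here is a simplified stand-in rather than the regex numexpr really uses.

```python
import re

# Hypothetical stand-in for the blacklist; NOT numexpr's real pattern.
FORBIDDEN_RE = re.compile(r"[;:]|__")

# A single- or double-quoted string literal, allowing backslash escapes.
_SINGLE = r"'(?:\\.|[^'\\])*'"
_DOUBLE = r'"(?:\\.|[^"\\])*"'
STRING_LITERAL_RE = re.compile(f"{_SINGLE}|{_DOUBLE}")


def check_expression(expr: str) -> None:
    """Raise ValueError if a forbidden token appears outside a string literal."""
    # Replace every complete string literal with an empty one so its
    # contents cannot trigger the blacklist.
    stripped = STRING_LITERAL_RE.sub("''", expr)
    if FORBIDDEN_RE.search(stripped):
        raise ValueError(f"Expression {expr} has forbidden control characters.")


# The expression from the report passes, because ':' is inside quotes.
check_expression("x == 'b:'")

# A forbidden character outside the quotes is still rejected.
try:
    check_expression("x == 'b'; y")
except ValueError as err:
    print(err)
```

An unterminated quote is not matched as a literal, so anything after it is still checked, which keeps the sketch on the conservative side.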

Answer 2 · 2024-01-24T10:09:27.000Z

Thanks, looking forward to the next release! Looks like this can be closed now?

Answer 3 · 2024-01-24T10:14:00.000Z

Yes ^-^
