kevinsteves/pan-python

panconf produces invalid set output

dhalperi opened this issue · 6 comments

Hey @kevinsteves - thanks for this lib.

The actual CLI on devices does not produce valid output if users have newlines, quotes, etc in their descriptions, etc. So we recommend our Batfish users to use Panconf to fetch XML and then convert it to SET via panconf: https://pybatfish.readthedocs.io/en/latest/formats.html#from-panorama-preferred

We're running into an issue where XML nodes that contain complicated text (like JSON) make output files that are unparseable. Panconf is better than the device CLI, but it has similar issues. I think the culprit is this code:

def __quote_arg(self, s):
# XXX string with " etc.
if '"' in s:
return "'%s'" % s
if ' ' in s:
return '"%s"' % s
return s

It clearly will do the wrong thing on, e.g., a cell that contains only '" (single-quote double-quote) -- it will output ''"' (single/single/double/single) which is still not a valid string (three single-quotes).

Is this desired behavior for some reason? If not, would you be open to a PR that does some escaping to produce parseable output? Do you have an opinion for how that output should look?

Thanks!

I attempted to duplicate the CLI config mode set output, recognizing that in some cases it could not be validly used as set input.

admin@PA-220# show address test
set address test ip-netmask 10.1.1.2
set address test description 'foo'"bar'

Besides quote and double quote in a string, and multi-line strings, what are the other invalid cases?

We could add an option that when a set string needs to be quoted, to make it a JSON string.

I attempted to duplicate the CLI config mode set output, recognizing that in some cases it could not be validly used as set input.

I figured as much, and that makes sense. (It would be nice if they changed the CLI behavior, and we could match that :p, but I figure that's a much larger product decision.)

Besides quote and double quote in a string, and multi-line strings, what are the other invalid cases?

Come to think of it, the quotes is probably all of it. For us, the issue crops up when users have JSON and other things like that in their strings, plus the examples you showed. (I think we had various other heuristic strategies, but none of them worked.)

Even multi-line strings are fine if we can match up quotes.

We could add an option that when a set string needs to be quoted, to make it a JSON string.

That sounds like a great idea to me, and what I was planning on doing if I forked. I think it would very likely solve our problem.

https://github.com/kevinsteves/pan-python/tree/br0

Take a look at this, and let me know if that solves your use case, or have any input.

This works pretty well, but, e.g., if the input was \t. it left that tab in there.

A more robust implementation might be to search for any whitespace: if re.search('[\s"\']', s): # use json.dumps

Good suggestion. Changed to use re.search().

The other thing it would be helpful to also escape would be []. This will enable us to distinguish between [ a, b ] (a list) and [ab] the string, which can occur in places like url-filtering.