mpenning/ciscoconfparse

Config parsing breaks with `ignore_blank_lines=True` (object parent assignments are missed)

CrackerJackMack opened this issue · 9 comments

I was struggling with an issue for a good while today why the first line of every section of my config was being truncated. Turns out it's really a fatal flaw in many of the routines. In many places there is for idx, obj in enumerate(self._list): where it is expected that idx == obj.linenum. Especially during __init__. I say this because many functions call iter_with_comments or iter_no_comments and pass in obj.linenum as the the begin_index parameter. Thus creating the relationship between list index and the config line number.

I'm submitting this as an issue instead of a pull request directly since I'm not sure of the intended use of the iter_*_comments functions (they are public api). So it is a conundrum on fixing this issue. Wasn't sure the best way to go about fixing it, but am willing to do so with some guidance on the matter.

_reassign_linenums also using the enumeration method to re-assign line numbers. I guess it could be said that after the initial parsing is done that you could call _reassign_linenums but that seem bruteforce-ish. Would seem more logical to handle that in the insert/append functions?

This seems to fix it for my use case (without changing the default) but could be breaking a public API. /me shrugs

     def iter_with_comments(self, begin_index=0):
-        for idx, obj in enumerate(self._list):
-            if (idx>=begin_index):
+        for obj in self._list:
+            if obj.linenum >= begin_index:
                 yield obj

     def iter_no_comments(self, begin_index=0):
-        for idx, obj in enumerate(self._list):
-            if (idx>=begin_index) and (not obj.is_comment):
+        for obj in self._list:
+            if obj.linenum >= begin_index and not obj.is_comment:
                 yield obj

Hi Kevin, please add an example config which shows how status quo is broken; perhaps the best way to do this is as a python triple-quoted string. Also include how you are parsing (i.e. with the factory option)?

I was using all defaults on ciscoconfparse==1.1.23 passing in the filename as config.

I'm trying to sanitize a reproducible config, this one is 4101 lines long (wc -l)

https://gist.github.com/CrackerJackMack/409b6b2f9e0f0caf0ced

make sure you download the raw as those spaces can be gobbled up in copy and paste

Great bug report; thank you. This problem was introduced when I re-wrote the library in 0.9.x... then, I added the explicit ignore_blank_lines option to resolve the concern in PR #3 (but the problem already seemed to exist at that time). BTW, your gist mentioned that loopback1's ipv6 address is missed; I'm not sure if you realized it, but loopback2's ipv4 address is broken too.

Your proposed modifications to iter_with_comments() and iter_no_comments() passes unit tests, but I'd like to take some time to consider the best way to fix this. The actual problem is that child assignments to parent objects break in the config you provided; specifically the second child never gets assigned to the parent, even if the first child is a comment. The problem seems to be triggered if there are two blank lines before a parent line.

I'm including a minimal, complete example to reproduce the problem in this bug report...

import os
from ciscoconfparse import CiscoConfParse

config = """!
!
router ospf
 area 0 
!
router isis
 net 10.10.10.10.10
 address-family ipv4 unicast
  default-metric 10
 exit-address-family


!
interface loopback 1
 ip address 192.168.0.1/32
 ipv6 address fe80::30/128
 ipv6 enable
 ipv6 router isis
 !! BUG The "ipv6 address" line never has a parent assigned to it.
 !! This problem happens for every 2nd child of a parent, if the config has two balnk
 !! lines before the parent
!
interface loopback 2
 port-name brocade config example
 ip address 192.168.1.1/32
 !! BUG The "ip address" line never has a parent assigned to it.
!
!
end
"""

cfg_list = config.split(os.linesep)

broken = CiscoConfParse(cfg_list, ignore_blank_lines=True)
for obj in broken.ConfigObjs:
    print obj

I just pushed 1.1.24, which should fix the problem. Can you test on your side and confirm?

FYI... I also pushed 1.2.0 this morning, which includes optimizations giving me between 25% and 24,000% parsing speed improvement (depending on python version and the type of configs)... if you have Cisco ASAs, it goes into beast-mode when you parse as CiscoConfParse('/path/to/config', syntax='asa', factory=True)... reduced a 20k-line ASA config parse from two minutes, forty seconds in 1.1.24, to 0.65 second in 1.2.0.

I updated to 1.2.1 as you asked and removed all but the config filename. Noticed not only the huge speedups but my original issue has been fixed as well. Thanks for working on this! This project has really been a life saver.