delvelabs/batea

Fails to parse nmap output files

Closed this issue · 5 comments

❯ batea -h
Usage: batea [OPTIONS] [NMAP_REPORTS]...

  Context-driven asset ranking based using anomaly detection

Options:
  -c, --read-csv FILENAME
  -x, --read-xml FILENAME
  -n, --n-output INTEGER
  -A, --output-all
  -L, --load-model FILENAME
  -D, --dump-model FILENAME
  -f, --input-format TEXT
  -v, --verbose
  -oM, --output-matrix FILENAME
  -h, --help                     Show this message and exit.
❯ batea -v -x output_3.xml
/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/sklearn/ensemble/iforest.py:478: RuntimeWarning: invalid value encountered in true_divide
  -depths
/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/sklearn/ensemble/iforest.py:478: RuntimeWarning: invalid value encountered in true_divide
  -depths
{
    "report_info": [
        {
            "number_of_hosts": 1,
            "features": [
                "ip_octet_0",
                "ip_octet_1",
                "ip_octet_2",
                "ip_octet_3",
                "port_count",
                "open_port_count",
                "low_port_count",
                "tcp_port_count",
                "named_service_count",
                "software_banner_count",
                "max_banner_length",
                "is_windows",
                "is_linux",
                "http_server_count",
                "database_count",
                "windows_domain_admin_count",
                "windows_domain_member_count",
                "port_entropy",
                "hostname_length",
                "hostname_entropy"
            ]
        }
    ],
    "host_info": [
        {
            "rank": "1",
            "host": "x.x.x.113",
            "score": NaN,
            "hostname": null,
            "os": {
                "vendor": "Linux",
                "family": "Linux",
                "type": "general purpose",
                "name": "Linux 3.10 - 4.8",
                "accuracy": 96
            },
            "features": {
                "ip_octet_0": x,
                "ip_octet_1": x,
                "ip_octet_2": x,
                "ip_octet_3": 113.0,
                "port_count": 3.0,
                "open_port_count": 3.0,
                "low_port_count": 3.0,
                "tcp_port_count": 3.0,
                "named_service_count": 3.0,
                "software_banner_count": 3.0,
                "max_banner_length": 7.0,
                "is_windows": 0.0,
                "is_linux": 1.0,
                "http_server_count": 2.0,
                "database_count": 0.0,
                "windows_domain_admin_count": 0.0,
                "windows_domain_member_count": 0.0,
                "port_entropy": 1.584962500721156,
                "hostname_length": 0.0,
                "hostname_entropy": 0.0
            }
        }
    ]
}
❯ batea -v -x output.xml
Traceback (most recent call last):
  File "/home/drdinosaur/.pyenv/versions/3.8.0/bin/batea", line 11, in <module>
    load_entry_point('batea', 'console_scripts', 'batea')()
  File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/drdinosaur/batea/batea/__main__.py", line 60, in main
    report.hosts.extend([host for host in xml_parser.load_hosts(file)])
  File "/home/drdinosaur/batea/batea/__main__.py", line 60, in <listcomp>
    report.hosts.extend([host for host in xml_parser.load_hosts(file)])
  File "/home/drdinosaur/batea/batea/core/nmap_parser.py", line 29, in load_hosts
    host = self._generate_host(child)
  File "/home/drdinosaur/batea/batea/core/nmap_parser.py", line 37, in _generate_host
    ports=self._find_ports(subtree))
  File "/home/drdinosaur/batea/batea/core/nmap_parser.py", line 52, in _find_ports
    for port in host.find("ports").findall("port"):
AttributeError: 'NoneType' object has no attribute 'findall'
❯ batea -v -x output_2.xml
Traceback (most recent call last):
  File "/home/drdinosaur/.pyenv/versions/3.8.0/bin/batea", line 11, in <module>
    load_entry_point('batea', 'console_scripts', 'batea')()
  File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/drdinosaur/batea/batea/__main__.py", line 60, in main
    report.hosts.extend([host for host in xml_parser.load_hosts(file)])
  File "/home/drdinosaur/batea/batea/__main__.py", line 60, in <listcomp>
    report.hosts.extend([host for host in xml_parser.load_hosts(file)])
  File "/home/drdinosaur/batea/batea/core/nmap_parser.py", line 29, in load_hosts
    host = self._generate_host(child)
  File "/home/drdinosaur/batea/batea/core/nmap_parser.py", line 37, in _generate_host
    ports=self._find_ports(subtree))
  File "/home/drdinosaur/batea/batea/core/nmap_parser.py", line 52, in _find_ports
    for port in host.find("ports").findall("port"):
AttributeError: 'NoneType' object has no attribute 'findall'

The tool handles an output file containing one host, but errors out on files containing 9 or 10 hosts.

Hey, is it possible to send me the XML file so I can find out what happens? It runs fine on my side.

Sorry, forgot to reply. Here's one that didn't work: https://pastebin.com/V5d9VnST

Hey, I'm really sorry, I was on vacation without a computer. Is it possible to repost it or send me a private link? I am now fully back and available to work on it.

Thanks

The problem is fixed: we now have a simple check for hosts that don't have any ports.
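
For anyone hitting the same traceback, here is a minimal sketch of what such a check could look like. It is not the actual patch that went into batea: `find_ports` is a hypothetical standalone stand-in for the `_find_ports` method seen in the traceback, and the sample XML is made up for illustration. The idea is that `Element.find("ports")` returns `None` when a `<host>` has no `<ports>` child, so the result needs to be tested before calling `findall` on it.

```python
import xml.etree.ElementTree as ET


def find_ports(host):
    """Return the <port> elements of a host, or an empty list if it has none.

    Hypothetical helper: guards against hosts with no <ports> child,
    which is what triggers the AttributeError in the traceback above.
    """
    ports = host.find("ports")
    if ports is None:
        return []
    return ports.findall("port")


# Made-up report: the first host has two ports, the second has no <ports> element.
SAMPLE = """<nmaprun>
  <host>
    <address addr="10.0.0.1" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22"/>
      <port protocol="tcp" portid="80"/>
    </ports>
  </host>
  <host>
    <address addr="10.0.0.2" addrtype="ipv4"/>
  </host>
</nmaprun>"""

root = ET.fromstring(SAMPLE)
port_ids = {
    host.find("address").get("addr"): [p.get("portid") for p in find_ports(host)]
    for host in root.findall("host")
}
print(port_ids)
```

With this guard, a host without any ports simply yields an empty list instead of raising, so the remaining hosts in the report still get parsed.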