libusrxml: An Awk repository from serhepopovych

Simple XML-like format parsing libarary and useful utilities written in awk(1)
==============================================================================

Data format is specific to define traffic control rules for given network
service provider subscriber (called just "user" for simplicity).

This code however can be an example on how to parse XML or similar data using
awk(1) from shell scripts.

Document format
---------------

Document contains multiple user definitions each of which defines interface
user connected, set of IPv4/IPv6 networks/addresses assigned to user, zones
with directions and allocated bandwidth.

Zones, directions and bandwidths are grouped together within pipe data
structure.

  Zones can be "local" for traffic belonging to local networks (e.g.
  country/city IXPs, peering connections, etc), "world" for rest of
  networks and "all" to treat everything equally.

  Directions can be "in" for incoming traffic to user (download), "out"
  for outgoing traffic from user (upload) or "all" to apply for both
  "in" and "out" directions.

  Bandwidth gives allocated bandwidth for direction in zone. It is
  specified in Kbit/s.

Here is typical example of document format library is able to parse:

<user WN2019011501>
	<pipe 1>
		<zone local>
		<dir all>
		<bw 102400Kb>
	</pipe>
	<pipe 2>
		<zone world>
		<dir in>
		<bw 10240Kb>
	</pipe>
	<pipe 3>
		<zone world>
		<dir out>
		<bw 5120Kb>
	</pipe>
	<if eth0.4094>
	<net 203.0.113.130/30>
		<src 203.0.113.129>
		<mac 02:11:22:33:44:55>
	</net>
	<net 192.0.2.128/30>
		<via 203.0.113.130>
	</net>
</user>

This document describes

Library API
-----------

There are three main API functions provided by libusrxml.awk module:

  BEGIN{
      # Prepare parser by initializing internal variables. This should be
      # very first routine called from the library.
      # Usually called from BEGIN{} awk(1) program section.

      h = init_usrxml_parser("program name");
      if (h < 0)
          exit 1;
  }

  {
      # Takes single line of data and parses it to internal data structures.
      # Usually called from main {} awk(1) program section.

      if (run_usrxml_parser(h, line) < 0)
          exit 1;
  }

  END{
      # Destroy parser data structures for given handle @.
      # Usually called from END{} awk(1) program section.

      if (fini_usrxml_parser(h) < 0)
          exit 1;
  }

All these functions return zero or userid + 1 on success or less than zero on
error. One can use usrxml_errno() to get integer value representing error code
and special constant-like defines from init_usrxml_consts() to determine exact
reason.

See libusrxml.awk init_usrxml_parser() function for USRXML schema to
associative arrays elements mapping that is available after user parsed.

There is two additional functions defined in the library to print <user>
that can be used only with successfuly parsed document:

  # Use friendly output format with each <tag> placed on their
  # own newline and tabs for indendation.
  if (print_usrxml_entry(h, userid) < 0)
      exit 1;

  # Put everything on single line. This format is useful for machine
  # processing (e.g. search with grep(1)).
  if (print_usrxml_entry_oneline(h, userid) < 0)
      exit 1;

These functions can be used as run_usrxml_parser() callback to print user
entry once it is ready:

  if (run_usrxml_parser(h, line, "print_usrxml_entry") < 0)
      exit 1;

Examples
--------

There is users_xml2lst.awk and users_lst2xml.awk binaries that parse
document and output it with each <user> tag on single line or normal
format.

Assuming you have installed these scripts under /netctl/bin you can
use following command sequence to verify document correctness:

  $ cat >/tmp/usr.xml <<EOF
<user WN2019011502>
	<pipe 1>
		<zone world>
		<dir all>
		<bw 20480Kb>
	</pipe>
	<if eth0.4094>
	<net 203.0.113.97/32>
		<src 203.0.113.1>
		<mac 02:11:22:33:44:55>
	</net>
</user>
EOF

  $ /netctl/bin/users_xml2lst.awk /tmp/usr.xml | \
  /netctl/bin/users_lst2xml.awk

These tools can also be used for document validation from shell.
serhepopovych/libusrxml