jhthorsen/net-isc-dhcpd

parser changes

rfdrake opened this issue · 7 comments

I want to see if I can reduce the number of method calls in the children
processing. One way to do it might be to build a combined regex
beforehand. Then instead of doing a for loop we do an "if" statement
and try to derive which regex matched it. (Maybe named capture?)

Then call the appropriate method based on that.

Problems with this: children is dynamic depending on the type of parent,
so the regexes could be built once but they would need to be stored in
cache somewhere and the parent object determines which one gets used.

Are you sure the regex is dependent on the parent? I would guess the children that can be created is dependent on the parent, but that the regex (tokenizer) is the same.

It looks like the next big thing is NetAddr::IP->new() being slow. We're to the point where that is eating up 20s out of a total 46s.

These two captured_to_args queries are slower because they call NetAddr::IP. It would be nice if we could still have the benefits of NetAddr::IP but hold off on doing the validation/creation of it until a function was called.

I could change to treating them as strings but the API is written for them to be NetAddr::IP objects.

 # spent 10.3s making 80000 calls to Net::ISC::DHCPd::Config::Range::captured_to_args, avg 129µs/call 
# spent 9.04s making 80000 calls to Net::ISC::DHCPd::Config::Subnet::captured_to_args, avg 113µs/call

@rfdrake: Maybe you could create a dummy NetAddr::IP?

Something like

package Net::ISC::DHCPd::IP;
use strict;
use warnings;
use NetAddr::IP;
sub new { my $class = shift; bless {x=>[@_]}, $class }
sub can { NetAddr::IP->can($_[1]); }
sub isa { $_[1] eq 'NetAddr::IP'; }
sub AUTOLOAD {
  my $self = shift;
  my $obj = NetAddr::IP->new(@{ $self->{x} });
  %$self = %$obj;
  bless $self, 'NetAddr::IP';
  our $AUTOLOAD =~ /::(\w+)$/;
  $self->$1(@_);
}

It's pretty ugly, but it's 10 times faster during construction time:

timethese(-3, {
  'NetAddr_IP_mask' => sub { NetAddr::IP->new("127.0.0.1")->mask; },
  'Net_ISC_DHCPd_IP_mask' => sub { Net::ISC::DHCPd::IP->new("127.0.0.1")->mask; },
  'NetAddr_IP' => sub { NetAddr::IP->new("127.0.0.1"); },
  'Net_ISC_DHCPd_IP' => sub { Net::ISC::DHCPd::IP->new("127.0.0.1"); },
});

Benchmark: running NetAddr_IP, NetAddr_IP_mask, Net_ISC_DHCPd_IP, Net_ISC_DHCPd_IP_mask for at least 3 CPU seconds...
NetAddr_IP:  3 wallclock secs ( 2.90 usr +  0.33 sys =  3.23 CPU) @ 92518.89/s (n=298836)
NetAddr_IP_mask:  3 wallclock secs ( 2.97 usr +  0.22 sys =  3.19 CPU) @ 56624.14/s (n=180631)
Net_ISC_DHCPd_IP:  3 wallclock secs ( 3.12 usr +  0.00 sys =  3.12 CPU) @ 1184221.79/s (n=3694772)
Net_ISC_DHCPd_IP_mask:  3 wallclock secs ( 2.99 usr +  0.18 sys =  3.17 CPU) @ 41964.98/s (n=133029)

Maybe the package could be even smarter and have the ability to string overload back to the original input? That way you don't have to create objects when re-constructing the config file.

@rfdrake: Hehe, cool that you created NetAddr::IP::LazyInit :) Could you replace "jhthorsen" with L<Jan Henning Thorsen|https://metacpan.org/author/JHTHORSEN> ?

I did look at it a while back. I thought it would be the correct way to implement a DHCP parser, but it was too complicated for me to figure out. It would be interesting to implement it and find out how much faster it would be.

Closing this because it's super old.